One of the things these sites check for is the anonymity level of the proxy. Generally they do this by simply checking the X-Forwarded-For and Via HTTP headers of the request your browser sends to the Web.
If an X-Forwarded-For header is present, the proxy is marked as transparent, because the value of the header is your Internet Protocol (IP) address. Traffic through the proxy can be traced back to you.
You might not want this.
The Via header is a little more complex. It lists all the proxies your browser request went through. If there is more than one, but no X-Forwarded-For header, an X-Forwarded-For header may still have been recorded in a downstream server (downstream from the viewpoint of the last proxy your request went through, upstream from you). A classic Man-in-the-Middle (MITM) information disclosure scenario.
To recap: no X-Forwarded-For is good, no X-Forwarded-For and no Via header (or a Via header with only one hop) is better. This is enough to rate a proxy "Anonymous".
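Here is a rough sketch of that rating logic in Python. It assumes the headers arrive as a dictionary keyed the way ASP's ServerVariables names them (HTTP_X_FORWARDED_FOR, HTTP_VIA). The function names and the "anonymous-multihop" label are my own inventions for illustration, not anything the public scanners use.

```python
def count_via_hops(via_value):
    """Count the comma-separated entries in a Via header value."""
    # Each proxy appends one entry; entries are separated by commas.
    return len([hop for hop in via_value.split(",") if hop.strip()])


def classify_proxy(variables):
    """Rate a proxy from the request headers the target server saw.

    `variables` is assumed to map server-variable names to strings.
    """
    # Any X-Forwarded-For value means the client address leaked.
    if variables.get("HTTP_X_FORWARDED_FOR", "").strip():
        return "transparent"

    hops = count_via_hops(variables.get("HTTP_VIA", ""))
    if hops <= 1:
        # No Via at all, or a single hop: the "better" case.
        return "anonymous"

    # No X-Forwarded-For, but several proxies in the chain: an earlier
    # hop may still have seen it. Hypothetical label for that case.
    return "anonymous-multihop"
```

For example, `classify_proxy({"HTTP_VIA": "1.1 proxy-a"})` rates the proxy "anonymous", while any non-empty HTTP_X_FORWARDED_FOR value makes it "transparent" no matter what Via says.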
"High Anonymity" usually means a proxy will do SSL. "Elite" means "pay me $20 a month for access".
Rather than do my checking through the public proxy scanners I decided to put a page on www.mrhinkydink.com to do the checking for me. I found the following ASP code on the Web:
<% ' Dump every server variable (including the HTTP_* request headers), one per line
For Each Key In Request.ServerVariables
    Response.Write Key & ": " & Request.ServerVariables(Key) & " <br><br>"
Next %>
I threw that on the mrhinkydink.com Web server and it worked fine. It was exactly what I needed.
But I had second thoughts. I didn't want to host a public proxy scanning server. That might be a Bad Thing. GoDaddy might frown on that sort of activity. They have been known to boot people for less.
While mulling this predicament over I accidentally discovered, through a Google code search, that this exact same code is on literally hundreds of servers worldwide.
God bless the Google Bots!
The absolute best part of this fluke is that the requisite HTTP headers aren't buried in HTML crap, which makes it a snap to pick one URL out at random and grep the results.
After all, it wouldn't be right to hammer just one of these sites to death with traffic.
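Since the page is nothing but "Key: value" pairs separated by `<br><br>`, pulling the headers out takes only a few lines. This is a minimal Python sketch of the idea, not the script I actually run. The function name is made up, and it assumes the response body contains only the output of that loop.

```python
def parse_server_variables(body):
    """Turn the 'Key: value <br><br>' dump into a dictionary."""
    variables = {}
    for chunk in body.split("<br><br>"):
        # Split on the first ': ' only, since values may contain colons.
        key, separator, value = chunk.partition(": ")
        if separator:
            variables[key.strip()] = value.strip()
    return variables
```

Feeding it something like `"REMOTE_ADDR: 203.0.113.7 <br><br>HTTP_VIA: 1.1 proxy-a <br><br>"` gives back a dictionary with REMOTE_ADDR and HTTP_VIA as keys, which is all the proxy check needs.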
So I made an "A" list of about 25 servers, based on how busy the servers are (and a lot of them are in dark, dusty, unused corners of the Interwebs) and how long they've been around (from Netcraft's point of view). The proxy validation script picks one at random, and if that site is down for some reason, it tries again using the mrhinkydink.com page as a backup.
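The pick-one-then-fall-back logic looks roughly like this in Python. Treat it as a sketch under assumptions: `fetch` is a hypothetical callable you supply (for instance, a wrapper around urllib that sends the request through the proxy under test) which returns the page body or raises OSError on failure, and the function name is invented for illustration.

```python
import random


def fetch_headers_page(candidates, backup_url, fetch, rng=random):
    """Try one randomly chosen echo page, then the backup page.

    Returns a (url, body) tuple for the first page that responds.
    """
    # Spread the load: never hit the same third-party site every time.
    primary = rng.choice(candidates)
    for url in (primary, backup_url):
        try:
            return url, fetch(url)
        except OSError:
            # Site down or unreachable; move on to the next option.
            continue
    raise RuntimeError("no header echo page could be reached")
```

Passing `fetch` in as a parameter keeps the selection logic separate from the networking, so it can be exercised with a fake fetcher before it ever touches a real server.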
The Security Dude inside me says this is an Information Disclosure vulnerability, and it is. Besides the HTTP header details, this code also reveals "too much information" about a Web server's capabilities and architecture, which could be very valuable to an attacker. This blog posting should really be an Advisory (it would make a great Nessus plug-in - give me a byline if you write one), but the... ummm... uh... Script Kiddie inside me needs this service and, after all, there are only hundreds (not thousands or millions) of sites with this code hanging out.
At the end of the day (GAWD I hate that phrase) it's a "blame the programmers" problem (don't get me started on Web programmers). It is "sample code", which should never be placed on a production server.
And yet... I did it myself.
However, you may now accuse me of "security by obscurity" (a lesser offense, IMHO) because I didn't reveal the URL.
Maybe you can Google it.
The Googlebots are EVERYWHERE!