I need to test the speed of some proxy server. Here’s a little script on how I achieved this.
I have a text-file ‘proxy.list’ which looks like: (I took out the last two digits of the ip).
...[lot's of ip's]...
193.196.*.*:3124 Germany
143.205.*.*:3127 Austria
64.161.*.*:3128 United States
.....
Here is the script which will run through the list of all proxy and will download 5 test pages from a specific site. Then it will determine the period of time which is needed for execution. It’s create/append to a file ‘time.list’ which will contain the needed Information to determine the best proxies. You also need to create a subdirectory called ‘raw_proxy’ where the raw html code is save that you retrieve from the proxies. The files are named ”raw_proxy/$ip.$port.$i.tmp’ where $i is the i.th test page you downloaded. I need to keep those files to determine if the proxy send me the right file or e.g. a login page .
#!/bin/bash
size=$(cat proxy.list | wc -l)
while read proxy
do
#determine the first parameter (IP:Port)
ad=$(echo $proxy | awk '{print $1}')
ip=${ad%:*} #extract ip
port=${ad#*:} #extract port
#set and export the proxy settings
http_proxy=$ip:$port && HTTP_PROXY=$http_proxy && export http_proxy HTTP_PROXY
#save start timestamp
start=$(date +%s)
#download 5 pages #(yes I know 'seq' but I'm on Mac and I needed a sth. quick&dirty)
for i in $(echo "1 2 3 4 5")
do
#use wget to retrive the page. We want to try 1 time and set some specific timeouts. + we force to use a Mozilla User agent to hide that we are using wget.
wget -O "raw_proxy/$ip.$port.$i.tmp" --tries=1 --dns-timeout=10 --connect-timeout=8 --read-timeout=15 -U "Mozilla/5.0 (compatible; Konqueror/3.2; Linux)" "http://www.yourTestPage.com/$i.txt" &> /dev/null
done
#save end timestamp
end=$(date +%s)
#calculate the difference
diff=$(( end - start ))
#append this info to time.list
echo -e "$ip:$port\t$diff" >> time.list
#to have a nice and shiny output I use figlet, this is optional, if you don't want it comment out next 3 lines or just remove ' | figlet'
clear
echo "PC: #"$size" - "$diff"s" | figlet
sleep 1
size=$(( size-1 ))
done < proxy.list
If you used figlet your output looks like this:
____ ____ _ _ __ _____ _ ____ ___
| _ \ / ___|_ _| || |_ / /_|___ // | |___ \ / _ \ ___
| |_) | | (_) |_ .. _| '_ \ |_ \| | _____ __) | | | / __|
| __/| |___ _ |_ _| (_) |__) | | |_____| / __/| |_| \__ \
|_| \____(_) |_||_| \___/____/|_| |_____|\___/|___/
It shows how much proxies need to be checked and shows the last execution time.
After the script has finished you need to get a list of which proxy was best.
This is the command line which evaluates everything and gives me back a list of ip's sorted by access time. It also removes all proxies where the downloaded page had a size of 0B.
#command line to list proxy with lowest time to download
clear && while read line; do ip=${line% *};time=$(echo $line | awk '{print $2}');ip=${ip%:*};echo -e $ip"\t"$time"\t"$(ls -alshr raw_proxy/ | grep 1.tmp | grep $ip | awk '{print $6}'); done < <(tr -s ' ' < time.list | sort -n -r -k2 | cut -d' ' -f3) | grep -v "0B"
This is the output:
201.22.*.* 43 52K
196.213.*.* 43 13K
....
147.102.*.* 1 2,1K
132.227.*.* 1 2,1K
....
130.75.*.* 1 52K
If you know the filesize the you can append a
| grep "52K"
to the last command to show only files which have the right size.
This is it 😉
I know that out there are better and fast implementations to do this but ...
but it was fun