tails of fortune 1 (aka tof1)

Avoid the Gates of Hell. Use Linux.
– unknown source


When you write scripts, debug code or deploy software, there comes a point where the boredom outweighs the fun. That is the time to install fortune and get a smile in the time it takes to press a key.

And if you want your bash scripts to output something nice after they finish their work, just put this on the line after your shebang.

#!/usr/bin/env bash
hash fortune 2>/dev/null && trap fortune SIGINT SIGTERM EXIT
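(hash exits non-zero when fortune is not in your PATH, so the trap is only installed when fortune is actually available.)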

And if you like it colorful and crazy:

#!/usr/bin/env bash
function fun {
    hash fortune && hash cowsay && hash toilet && clear && fortune | cowsay -f apt | toilet --gay -f term
}
trap fun SIGINT SIGTERM EXIT

Be sure to install fortune, cowsay and toilet.
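On a Debian-based box that is something like the following (the package names are my assumption; Debian ships fortune as 'fortune-mod', and Homebrew has formulas for all three on OS X):

apt-get install fortune-mod cowsay toilet
brew install fortune cowsay toilet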

PDFSam command line on OS X

Today I struggled with the usage of PDFSam on my Mac.
I just needed the command line tool, not the GUI. On Linux it was

apt-get install pdfsam

On OS X I downloaded the .dmg from www.pdfsam.org and created this tiny helper in my ~/.profile. (An alias can't forward arguments via "$@", so a shell function is the cleaner way.)

pdfsam() {
    java -Dlog4j.configuration=console-log4j.xml \
        -classpath /Applications/pdfsam-2.2.1.app/Contents/Resources/Java/pdfsam-2.2.1.jar \
        org.pdfsam.console.ConsoleClient "$@"
}

On the command line I can now use it the same way I do on Linux.
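For example, merging two documents should look roughly like this; the concat syntax is my recollection of the pdfsam-console documentation, so verify it against your version:

pdfsam -f input1.pdf -f input2.pdf -o merged.pdf concat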

running parallel bash tasks on OS X

How often have you needed to process a huge number of small files, where each single task uses only a little CPU and memory?
Today I needed a script which does exactly this.

I have a MySQL table which contains the filenames of files located on my hard drive.
I created a little script which processes a single file in under 3 seconds. Unfortunately, for 10,000+ files this would take more than 8 hours.

So what if I could run them in parallel, with a maximum of 10 tasks executing at once? This would really speed up the computation!

Luckily, in 2005 Ole Tange merged the command line tools xxargs and parallel into a single tool, 'parallel' (now GNU parallel).
With this great tool there is no need to write a complicated script to accomplish such tasks.
First you need to install it using Homebrew.

brew install parallel

After that I had to add its path to my .profile:

PATH=$PATH:/usr/local/Cellar/parallel/20110822/bin
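(Depending on your Homebrew version this step may be unnecessary; brew normally symlinks installed binaries into /usr/local/bin.)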

Here’s the basic usage:

 $> echo -ne "1\n2\n3\n" | parallel -j2 "echo the number is {.}"

This echoes the numbers 1, 2 and 3 to stdout, with a maximum of 2 echos running in parallel.
Here's the output:

the number is 1
the number is 3
the number is 2

As you can see, printing a 3 can outspeed printing a 2 😉

So here is my one-liner to process all my files:

 $> mysql -uroot -p[secretPW] my_database \
    < <(echo "SELECT filename FROM files") \
    | grep -v 'filename' | parallel -j10 "./processFile.sh {.}"

After using this, it took only 37 minutes to process my 10,000+ files 🙂
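As an aside, mysql can suppress the column header itself with -N (--skip-column-names), which makes the grep unnecessary; a sketch of that variant, assuming the same table and script:

 $> mysql -N -uroot -p[secretPW] my_database \
    -e "SELECT filename FROM files" \
    | parallel -j10 "./processFile.sh {.}"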

iTunes Sharing over ssh

Today I realized that I didn't have a single song on my notebook's hard disk. Thanks to Last.FM 🙂
Unfortunately "Simplify Media" has been acquired by Google Inc. and they do not offer a similar service yet. So I needed a solution to stream my iTunes library from home to the office. I found a great solution by Robert Harder which works like a charm (source).
This is the bash script:

#!/bin/sh
# advertise a Bonjour/DAAP service locally so iTunes can see the tunneled share
dns-sd -P "Home iTunes" _daap._tcp local 3689 localhost.local. \
    127.0.0.1 "Arbitrary text record" &
PID=$!
# forward the DAAP port (3689) from home over a compressed ssh tunnel
ssh -C -N -L 3689:localhost:3689 username@dyndns_name.dyndns.org
# when the tunnel ends, stop the Bonjour advertisement
kill $PID
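While the tunnel is up, the library shows up under 'Shared' in iTunes at the office as 'Home iTunes', since iTunes discovers anything advertised as _daap._tcp via Bonjour.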

test proxy speed with bash and wget

I needed to test the speed of some proxy servers. Here's a little script showing how I achieved this.
I have a text file 'proxy.list' which looks like this (I masked the last two octets of each IP):

...[lots of IPs]...
193.196.*.*:3124       Germany
143.205.*.*:3127      Austria
64.161.*.*:3128         United States
.....

Here is the script. It runs through the whole list of proxies, downloads 5 test pages from a specific site through each one, and measures how long that took. It creates/appends to a file 'time.list' which will contain the information needed to determine the best proxies. You also need to create a subdirectory called 'raw_proxy' where the raw HTML retrieved through the proxies is saved. The files are named 'raw_proxy/$ip.$port.$i.tmp', where $i is the i-th test page you downloaded. I keep those files to determine whether the proxy sent me the right file or, e.g., a login page.
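The only setup needed beforehand (directory name as described above):

mkdir -p raw_proxy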

#!/bin/bash
size=$(wc -l < proxy.list)
while read proxy
do
    #determine the first parameter (IP:Port)
    ad=$(echo $proxy | awk '{print $1}')
    ip=${ad%:*}   #extract ip
    port=${ad#*:} #extract port
    #set and export the proxy settings
    http_proxy=$ip:$port && HTTP_PROXY=$http_proxy && export http_proxy HTTP_PROXY
    #save start timestamp
    start=$(date +%s)
    #download 5 pages (yes, I know 'seq', but I'm on a Mac and needed something quick & dirty)
    for i in 1 2 3 4 5
    do
        #use wget to retrieve the page. We try only once, set some specific timeouts, and force a Mozilla user agent to hide that we are using wget.
    	wget -O "raw_proxy/$ip.$port.$i.tmp" --tries=1 --dns-timeout=10 --connect-timeout=8 --read-timeout=15 -U "Mozilla/5.0 (compatible; Konqueror/3.2; Linux)" "http://www.yourTestPage.com/$i.txt" &> /dev/null
    done
    #save end timestamp
    end=$(date +%s)
    #calculate the difference
    diff=$(( end - start ))
    #append this info to time.list
    echo -e "$ip:$port\t$diff" >> time.list
    #for nice and shiny output I use figlet; this is optional, if you don't want it comment out the next 3 lines or just remove ' | figlet'
    clear
    echo "PC: #"$size" - "$diff"s" | figlet
    sleep 1
    size=$(( size-1 ))
done < proxy.list

If you used figlet, your output looks like this:

                                                           
 ____   ____       _  _    __  _____ _           ____   ___      
|  _ \ / ___|_   _| || |_ / /_|___ // |         |___ \ / _ \ ___ 
| |_) | |   (_) |_  ..  _| '_ \ |_ \| |  _____    __) | | | / __|
|  __/| |___ _  |_      _| (_) |__) | | |_____|  / __/| |_| \__ \
|_|    \____(_)   |_||_|  \___/____/|_|         |_____|\___/|___/

It shows how many proxies still need to be checked and the last execution time.

After the script has finished, you need to work out which proxies were best.
This is the command line which evaluates everything and gives me back a list of IPs sorted by access time. It also removes all proxies where the downloaded page had a size of 0B.

#command line to list the proxies with the lowest download time
clear && while read line; do ip=${line% *};time=$(echo $line | awk '{print $2}');ip=${ip%:*};echo -e $ip"\t"$time"\t"$(ls -alshr raw_proxy/ | grep 1.tmp | grep $ip | awk '{print $6}'); done < <(tr -s ' ' < time.list | sort -n -r -k2 | cut -d' ' -f3) | grep -v "0B"

This is the output:

201.22.*.*	43	52K
196.213.*.*	43	13K
....
147.102.*.*	1	2,1K
132.227.*.*	1	2,1K
....
130.75.*.*	1	52K

If you know the file size, you can append a

 | grep "52K"

to the last command to show only files which have the right size.
This is it 😉
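If you prefer the evaluation logic spread over a few lines, here is a rough equivalent as a sketch; it assumes time.list holds 'ip:port' plus the time in seconds, and that ls -lh prints '0B' for empty files as it does on OS X:

while read -r ad t; do
    ip=${ad%:*}
    #size of the first test page fetched through this proxy
    size=$(ls -lh raw_proxy/ | grep "$ip" | grep '1.tmp' | awk '{print $5}')
    [ "$size" != "0B" ] && echo -e "$ip\t$t\t$size"
done < <(sort -n -r -k2 time.list)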

I know there are better and faster implementations of this out there ...
but it was fun.

debian root set display

It's been a while since my last post …
Here is just a short solution to a problem I had with some Debian boxes:

Problem:

emacs /etc/inetd.conf
Invalid MIT-MAGIC-COOKIE-1 key
emacs: Cannot connect to X server :0.0.
Check the DISPLAY environment variable or use `-d'.
Also use the `xhost' program to verify that it is set to permit
connections from your machine.
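The X server only accepts clients that present the matching MIT-MAGIC-COOKIE-1, and root does not have the cookie of the user who owns the display. Merging that user's .Xauthority into root's solves the problem.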

Solution:

cb0@localhost:~$ su -
root@localhost:~# xauth merge ~cb0/.Xauthority #replace cb0 with your username
root@localhost:~# export DISPLAY=:0.0
#now you are able to open emacs
root@localhost:~# emacs .bashrc