I have a list that I use for making gnuplots. The structure is the following:
wordId wordWeight
e.g. 101 34.342 means that the word with id 101 has a calculated weight of 34.342.
The bad thing is that this list is unordered. Now I want to go through this list and get the greatest weight and its corresponding word id. Bash doesn't seem to be the best tool for this, so I wrote my first awk script.
Here it is; it prints the word with the greatest weight from the file toPlot.stats.
awk 'BEGIN { max = 0; w = -1 } { if ($2 >= max) { max = $2; w = $1 } } END { print "id", w, "has greatest weight with", max }' toPlot.stats
and the output:
id 1545 has greatest weight with 28199.40186090438
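One thing to watch with a scan like this is the starting value: with max set to 0, a file containing only negative weights would never update the result. Below is a minimal sketch of a variant that seeds the maximum from the first record instead. The four sample lines and the temporary file are made up for illustration; only the weight of id 1545 is taken from the output above.

```shell
# Hypothetical sample data in the same "wordId wordWeight" layout.
tmp=$(mktemp)
printf '%s\n' '101 34.342' '7 2.5' '1545 28199.40186090438' '12 0.75' > "$tmp"

# Take the first record as the initial maximum, then keep any strictly larger weight.
# Adding 0 forces a numeric comparison even if a field does not look like a number.
result=$(awk 'NR == 1 || $2 + 0 > max + 0 { max = $2; w = $1 }
              END { print "id", w, "has greatest weight with", max }' "$tmp")

echo "$result"
rm -f "$tmp"
```

Because the comparison is a strict >, ties resolve to the first id seen, whereas the >= in the one-liner keeps the last one.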
My file has 1,722,913 lines; execution time:
real 0m1.618s
user 0m1.576s
sys 0m0.032s
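If the whole list should end up ordered by weight, and not just the top entry reported, the standard sort utility is one possible route. This is only a sketch on made-up sample lines in the same layout, not something I timed on the real file. It assumes the weights use a dot as the decimal separator, which is why the locale is pinned to C.

```shell
# Hypothetical sample data in the same "wordId wordWeight" layout.
tmp=$(mktemp)
printf '%s\n' '101 34.342' '7 2.5' '1545 28199.40186090438' '12 0.75' > "$tmp"

# Sort on the second field only (-k2,2), numerically (n), largest first (r).
sorted=$(LC_ALL=C sort -k2,2nr "$tmp")

# The first line of the sorted list is the entry with the greatest weight.
top=$(printf '%s\n' "$sorted" | head -n 1)

echo "$sorted"
rm -f "$tmp"
```

Sorting touches every line more than once, so for the single maximum the awk scan remains the cheaper choice; the sorted output is more useful when the ordered list itself is fed into the plot.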