sort a hashmap, fast

Imagine you have an hash map where key is an String and value is a integer.
Now you want to sort based on the value of each entry in this hashmap in descanding order.

Here is how I do it:

List<Map.Entry<String, Integer>> list = new Vector<Map.Entry<String, Integer>>(myHashMap.entrySet());
java.util.Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
      public int compare(Map.Entry<String, Integer> e, Map.Entry<String, Integer> e1) {
          return (e.getValue().equals(e1.getValue()) ? 0 : (e.getValue() > e1.getValue() ? 1 : -1));
}}); 
 

Now the list contains the sorted entrys from hashmap ‘THashMap myHashMap<String,Integer>’.
Iterate over it or copy it back to ‘myHashMap’ if you need to.

Hope this helps …

trove4j

For those who don’t know there is some nice gnu package you can use anywhere a normal HashMap is needed.

Just

import gnu.trove.THashMap;

and use it. The only thing that differs is speed and memory usage.

Currently I’m making a differential analysis for over 1800 documents. This means comparing all documents agains each other.
approx. (1800*1799)/2=1619100 comparisons. The average file size is 0.8 MB

With HashMap<String,Integer> I’ll use about 13Gig of Ram and need about 3h20min
With THashMap<String,Integer> I’ll never need more than 8 Gig and need about 2h40min

The THashMap is part of trove4j library.

It is descibed on their webpage:

“The Trove library provides high speed regular and primitive collections for Java.”

For more Infos visit their webpage at: http://trove4j.sourceforge.net/