sort a hashmap, fast

Imagine you have a hash map where the key is a String and the value is an Integer.
Now you want to sort the entries of this hashmap by value, in descending order.

Here is how I do it:

// requires java.util.ArrayList, java.util.Comparator, java.util.List and java.util.Map
List<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(myHashMap.entrySet());
java.util.Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
    public int compare(Map.Entry<String, Integer> e, Map.Entry<String, Integer> e1) {
        // compare e1 against e (reversed) so the largest value comes first
        return e1.getValue().compareTo(e.getValue());
    }
});
 

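Now the list contains the sorted entries from the hashmap ‘THashMap<String,Integer> myHashMap’.
Iterate over it, or copy it into a LinkedHashMap if you need a map that keeps this order (a plain hash map such as ‘myHashMap’ does not preserve insertion order).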
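
For completeness, here is a minimal self-contained sketch of the whole round trip, assuming Java 8 or newer and a plain java.util.HashMap as input. The class and method names (SortByValueDemo, sortByValueDescending) are made up for this example; it is one way to do it, not the only one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names are illustrative, not from any library.
final class SortByValueDemo {

    // Returns a new insertion-ordered map holding the entries sorted by value, largest first.
    static Map<String, Integer> sortByValueDescending(Map<String, Integer> source) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(source.entrySet());
        // Swapping b and a reverses the natural order, giving a descending sort.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        // LinkedHashMap remembers insertion order, so the sorted order survives.
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> myHashMap = new HashMap<>();
        myHashMap.put("apple", 5);
        myHashMap.put("fig", 2);
        myHashMap.put("pear", 7);
        System.out.println(sortByValueDescending(myHashMap)); // prints {pear=7, apple=5, fig=2}
    }
}
```

Entries with equal values end up in whatever order the source map happened to iterate them, so don't rely on the order of ties.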

Hope this helps …

java decompilers

Because I have no recent backup of the big Java project I was working on, I have to decompile the class files that are in its jar.
I found 3 different Java decompilers for Mac OS X.

  1. MacJAD (google for Download)
    – only able to open single class files
  2. JarInspector (http://www.codeland.org/)
    + can open jar files
    + correct 
    – unable to decompile the anonymous inner classes I use
    – seems to be confused by nested catch and try blocks 
  3. JD-GUI (http://java.decompiler.free.fr/)  
    + can open jar files
    + decompiles anonymous inner classes about 90% of the time

A big problem all tools share is that if you have nested iterations (for/while loops), they all rename the variable you iterate over to ‘i$’. So if your code looks like this:

while(iterator.hasNext()) {
...
    for(int varA; ....){
        for(int varB; ....){.....}
    }
...
}

the decompiled code will end up looking like this:

while(i$.hasNext()) {
...
    for(int i$; ....){
        for(int i$; ....){.....}
    }
...
}

which isn’t really funny …
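
As far as I know, ‘i$’ is simply the name that older javac versions give to the hidden iterator behind a for-each loop, which would explain why every nested loop ends up with the same one. When repairing decompiled code by hand, one way out is to give each iterator its own name. Below is a minimal sketch of that idea; the class, method and variable names are made up for illustration and do not come from any decompiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; all names are illustrative.
final class NestedLoopDemo {

    // Nested for-each loops as you would write them in the original source.
    static List<String> withForEach(List<String> outer, List<Integer> inner) {
        List<String> out = new ArrayList<>();
        for (String s : outer) {
            for (Integer n : inner) {
                out.add(s + n);
            }
        }
        return out;
    }

    // The same loops with explicit iterators. Each iterator gets a distinct
    // name (outerIt, innerIt) instead of the shared 'i$', so the code compiles.
    static List<String> withIterators(List<String> outer, List<Integer> inner) {
        List<String> out = new ArrayList<>();
        for (Iterator<String> outerIt = outer.iterator(); outerIt.hasNext();) {
            String s = outerIt.next();
            for (Iterator<Integer> innerIt = inner.iterator(); innerIt.hasNext();) {
                Integer n = innerIt.next();
                out.add(s + n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b");
        List<Integer> numbers = Arrays.asList(1, 2);
        // Both variants visit the same pairs in the same order.
        System.out.println(withForEach(letters, numbers).equals(withIterators(letters, numbers)));
    }
}
```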

My conclusion is that JarInspector seems to be the best, but if you need inner class support, use JD-GUI for the parts of the code that contain inner classes.

trove4j

For those who don’t know: there is a nice gnu package you can use anywhere a normal HashMap is needed.

Just

import gnu.trove.THashMap;

and use it. The only thing that differs is speed and memory usage.

Currently I’m running a differential analysis for over 1800 documents. This means comparing all documents against each other:
approx. (1800*1799)/2 = 1619100 comparisons. The average file size is 0.8 MB.
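
The number comes from the usual pair formula n*(n-1)/2, since every document is compared with every other one exactly once. Here is a small sketch to double-check the arithmetic, once with the formula and once by counting the pairs in a loop. The class and method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative.
final class PairCountDemo {

    // Closed form: number of unordered pairs among n items.
    static long pairCount(int n) {
        return (long) n * (n - 1) / 2;
    }

    // Brute-force check: visit every pair (i, j) with i < j exactly once.
    static long pairCountByLooping(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(pairCount(1800));          // prints 1619100
        System.out.println(pairCountByLooping(1800)); // prints 1619100
    }
}
```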

With HashMap<String,Integer> I use about 13 GB of RAM and need about 3h20min.
With THashMap<String,Integer> I never need more than 8 GB and need about 2h40min.

The THashMap is part of the trove4j library.

It is described on their webpage:

“The Trove library provides high speed regular and primitive collections for Java.”

For more info visit their webpage at: http://trove4j.sourceforge.net/