I did my first profiling of Crinch’s hashing code this evening, and it’s been interesting. Guava’s new hash package gives me something to benchmark against.
My methodology, briefly, was to implement the murmur3 32-bit hash, then use it to hash a million randomly populated POJOs that look like this:
private static class Pojo {
    String strVal;
    int intVal;
    long longVal;
    boolean boolVal;
}
Hashing them all with Crinch using:
private static class PojoSource implements HashSource<Pojo> {
    public void sourceData(Pojo pojo, WriteStream out) {
        out.writeChars(pojo.strVal);
        out.writeInt(pojo.intVal);
        out.writeLong(pojo.longVal);
        out.writeBoolean(pojo.boolVal);
    }
}
And hashing them all with Guava using:
private static class PojoFunnel implements Funnel<Pojo> {
    @Override
    public void funnel(Pojo pojo, Sink into) {
        into.putString(pojo.strVal)
            .putInt(pojo.intVal)
            .putLong(pojo.longVal)
            .putBoolean(pojo.boolVal);
    }
}
The hashes were stored in a million-element int[], and I recorded the median number of milliseconds required by Crinch and by Guava over 10 repetitions. Yes, this is a very rough first investigation, but you have to start somewhere. Code for the test case is available at:
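Separately from that test case, here is a rough sketch of the median-over-repetitions timing idea described above. It is not the original benchmark code: `MedianTimer`, `median`, and `medianMillis` are names made up for illustration, and the loop body in `main` is only a placeholder where a real hash call would go.

```java
import java.util.Arrays;

final class MedianTimer {

    /** Median of the samples; for an even count, the mean of the two middle values. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    /** Runs the task the given number of times and returns the median elapsed milliseconds. */
    static long medianMillis(Runnable task, int repetitions) {
        long[] elapsed = new long[repetitions];
        for (int i = 0; i < repetitions; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = (System.nanoTime() - start) / 1000000L;
        }
        return median(elapsed);
    }

    public static void main(String[] args) {
        final int[] hashes = new int[1000000];
        long ms = medianMillis(new Runnable() {
            public void run() {
                for (int i = 0; i < hashes.length; i++) {
                    hashes[i] = i * 31; // placeholder: a real run would hash the i-th POJO here
                }
            }
        }, 10);
        System.out.println("median ms: " + ms);
    }
}
```

Taking the median rather than the mean keeps a single slow repetition (JIT warm-up or a GC pause, say) from dominating the reported figure.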
The variance between repetitions was very low and on my machine (running OpenJDK 6) the times were consistently:
This is a much larger disparity than I anticipated. So I attached a profiler and used method sampling to profile the call tree (instrumentation would have provided more complete information but may have skewed the timings). It appears that much (though not all) of the disparity is due to the time spent in Crinch’s WriteStream.writeChars() method compared to Guava’s equivalent Sink.putString() method.
Here’s a barely legible screenshot of the call tree breakdown:

Both methods take a single CharSequence, iterate over its characters and hash them individually, but Crinch has an optimized path for the common case where a String is passed to the method; Guava does not. There is further scope for optimizing Crinch here too, which I may investigate.
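To illustrate the general shape of such a fast path, here is a minimal sketch. It is an assumption about how one might be written, not Crinch’s actual implementation: `CharWriter` and `CharSink` are hypothetical names, and `CharSink.accept` stands in for whatever per-character mixing step the hash performs.

```java
final class CharWriter {

    /** Hypothetical per-character consumer; stands in for the hash's mixing step. */
    interface CharSink {
        void accept(char c);
    }

    static void writeChars(CharSequence seq, CharSink sink) {
        if (seq instanceof String) {
            // Fast path: copy the characters out once, then walk a plain array
            // instead of calling charAt() through the CharSequence interface.
            String s = (String) seq;
            char[] buf = new char[s.length()];
            s.getChars(0, s.length(), buf, 0);
            for (int i = 0; i < buf.length; i++) {
                sink.accept(buf[i]);
            }
        } else {
            // General path: works for any CharSequence (StringBuilder, CharBuffer, ...).
            for (int i = 0, n = seq.length(); i < n; i++) {
                sink.accept(seq.charAt(i));
            }
        }
    }
}
```

Both branches feed the same characters in the same order, so the resulting hash is unchanged; only the access pattern differs. Whether the array copy actually pays off depends on the JVM and the string lengths involved, so it would need measuring.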
It’s too early to draw any conclusions about the relative overhead of hashing in Guava compared to that of Crinch. I’ll want to revisit the comparison when I’ve found time to build a patched version of Guava containing some appropriate optimizations.