Latest Tweets

 

Can the GVM library be used on Android?

…say to cluster points in a map based on the zoom?

(information about the GVM Java library)

I was just asked this question by email, and thought I’d respond via a blog post in case the same query recurs. This short answer is: I certainly hope so, it’s exactly the kind of task the library was designed for, but there are some caveats:

  1. I’ve not personally used the library on Android - there could be some minor incompatibilities due mismatches between the standard Java APIs and Android (though I don’t anticipate any). Should such an issue arise, I’m sure it could be patched very quickly.

  2. As standard, I would typically use the implementation in the com.tomgibara.cluster.gvm.dbl package that uses double coordinates. But since double arithmetic may perform really poorly on handsets (I don’t know) it may be necessary to use alternative coordinate types for better performance (the float based com.tomgibara.cluster.gvm.flt for example). Note that whatever primitive coordinate-type you use, it will not avoid all double arithmetic.

  3. How many points do you have device side? Making point clustering worthwhile when panning and zooming freely around a map implies to me that there are lots of points. A natural question is how do these points find their way into the application in the first place? Downloading 1M points in response to a query and then clustering them into 30 groups for display is probably not a good use of the device’s resources. Of course GVM could be used server side too, to provide a quick result to the user while lots more results streamed in for clustering in near-real-time.

Finally, I’d add the comment that GVM is very well suited to exactly this sort of application because of the flexibility that it provides around cluster keys: each time a cluster is enlarged or merged, the application is given the opportunity to modify the key - this means that the cluster key can be used to maintain all the salient state for the application (and no more, so performance can remain good).

In this online demonstration the cluster is keyed against the most populous city in the cluster but this could easily have been something else entirely.

A memory efficient Map

Building on the perfect minimal hash I’ve developed for Java String objects, I’ve developed an memory efficient implementation of Java’s Map interface.

View the map source code in SVN

The map implementation is not general purpose, it’s specifically targeted to situations where:

  • The map is keyed by strings; the valid keys are known ahead of time.
  • The map values are, on average, anticipated to be fairly dense (ie. you don’t have situation where the map could contain any of several thousand keys, but only contains one value)
  • There will be many map instances with the same keys.
  • Performance comparable to that of a ‘HashMap’ is acceptable.

Even given these constraints, I frequently encounter situations where all of these apply:

  • High volume HTTP services where values need to be parsed out of the query string and bundled into a map.
  • Data querying APIs where query parameters need to be supplied via Map for substitution into a query.
  • Proxy implementations of setter/getter interfaces that are backed by a Map of values.

The implementation stores the set of valid keys in an object that can be shared between ‘like’ maps. The keys are hashed perfectly so that the map values can be stored directly into a single object array; no entry objects are required. This means that the memory required to store large numbers of similar maps can . Furthermore, none of the basic map operations (put/get/contains etc.) performs any memory allocation. This means that for many applications both memory overhead and object churn is minimal.

From the javadocs:

Unlike regular Map implementations, there is no default, or copy, constructor. Instead the map must be constructed around a set of possible String keys. Only the keys supplied to a constructor may be used in the maps it constructs. Re-using constructors (by repeatedly using them to create ParameterMap instances) minimizes the memory required to store the map. Furthermore, the only map operations that allocate new objects are those methods that required to do so by the Map interface; in-short, object creation strictly minimized. This makes the class ideal for environments where reducing GC overhead is important.

The source code is under the Apache 2.0 licence, built with Maven, and is available from my website’s google code project:

http://code.google.com/p/tomgibara/