Latest Tweets

 

A dynamically textured emblem in Android

I thought I’d take some time out to write-up some details of how I produced this morphing icon effect that I demonstrated in my previous post. The video is grainy, so it’s a little difficult to tell, but the result is a sharp, glossy looking emblem that appears to switch material periodically.

Initial appearance of icon

Initial animation

This is the first emblem that the user sees and it matches the application icon. I want to promote an identity for the application and strengthen its association with the launcher icon. What isn’t visible in the screenshot is the animation with which the emblem first appears. Using an XML anim resource, I combine an AlphaAnimation and a ScaleAnimation to the icon to make it slowly rise-out from the background. At the same time I use another alpha animation, combined with scaling and translation to pop a shadow behind. The aim is to give the impression that the icon sits slightly above the background. It’s also worth remarking that since this activity is the one used to launch the application, the system will probably be animating the activity onto the screen simultaneously. For this reason, I’d say it’s not worth doing anything too complex; any fine details will be missed and flashy animations may be overwhelming when combined with an activity transition.

I have to say at this point that the code isn’t finished yet, there are a number of small flaws that should be fixed, but I don’t know if there’ll be time before the first release. This is demonstrated in the screenshot above; I’m not dithering my bitmaps (which are ARGB_8888 for compositing purposes) for an RGB565 window (which was the default for Android until Gingerbread). You can often get away without dithering, but where bitmaps contain smooth gradients, the artifacts become distracting.

Cycling through textures

After this first presentation of the emblem, the background of the icon changes and the emblem begins to cycle through any number of different textures.

Various emblem textures

These textures are actually being rendered by the phone (using a genetically based algorithm) while the previously generated texture is being displayed. This is all done off-screen. Although the texture rendering is pretty efficient (thanks, in large-part, to the efficiency of the Android 2D APIs) it’s more work than one could safely commit to on the main application thread. So I use an AsyncTask, but I am considering switching to a Thread with a Looper because message-passing is probably a better model for what’s going on here, with the UI sending a message (indicating that it wants to update the emblem) to a worker which posts back another message (containing the bitmap). Currently the AsyncTask applies its new emblem in its onPostExecute() method, but this turns out to be a bit ugly if you want precise control over when the new emblem actually appears in the UI.

Texturing a disc

How the textures are rendered is a whole other post (series of posts probably) but it’s enough to know that that textures are rendered as a square image:

A sample texture

A circular mask has already been created (as a PNG resource this is all white, but with alpha).

The texture mask

And these are combined using the Canvas.drawBitmap() method, and a Paint that controls the composition.

Texture and mask combined

One way to do this is by drawing the texture Bitmap onto the mask bitmap, using the SRC_IN Porter-Duff rule (this basically preserves the alpha channel of the destination image, but replaces the colour channels with those of the source image). But in this case, the texture is being rendered into a fresh Bitmap anyway, so rather than repeatedly producing fresh masks (typically with the Bitmap.copy() method), I draw the mask onto the texture using the DST_IN mode. With the result that all the colour information from the texture is preserved with only the alpha channel being the mask (note that Android has an ALPHA_8 bitmap configuration especially for masking).

There’s a huge amount you can do with the Xfermode options that Android provides and it’s really worth becoming familiar with them.

UI

Now I have got as far as periodically generating a textured disc, but I still need to apply the logo to the emblem and display the result. Android makes this mostly very simple; the emblem is an ImageView with the white “M” as the image, and the disc as a background Drawable. So the disc appears behind the M with the View drawing doing the composition for us.

Simply, this “M” becomes a Bitmap within an ImageView:

Foreground

which is given this other Bitmap as a background:

Background

to end-up looking like this:

Combined

Because any view Animation will be applied to both the ImageView contents and background together, the two can be easily manipulated as one unit in the Java code and layout XML. This ImageView is then overlayed, using a FrameLayout, with another imageView that displays the shadow (I could have applied the shadow as a background on the FrameLayout, but I need to animate it independently in the initial animation).

Generating the shaded overlay

To give a greater impression of physical depth to the emblem, you can see shading on the image of the “M”, both within it and, more significantly, around it. Creating this image actually proved bothersome. First I used POV-Ray to render the icon at the correct size. Then I needed to separate out the highlights and lowlights in such a way that the lightest and darkest parts of the image had the greatest opacity. At that point I couldn’t find any way of achieving this using GIMP. After 30 minutes of hunting around for the right filter, I wrote a throw-away Java program to do the job in just 5. It simply:

  1. reads a PNG,
  2. finds the median lightness,
  3. computes a new opacity for each pixel based on its distance from the median
  4. changes every pixel to white or black (depending it it’s lighter/darker than the median)
  5. saves the result as a second PNG

The 2D classes in Java SE aren’t as slick as the corresponding APIs in Android, but they’re more than adequate for tasks like that.

More…

This post has described in broad terms how (with a little assistance from desktop Java), the Android View, Animation and graphics APIs were combined to create this morphing icon effect. But this is only part of the work. I also needed to:

  • use Drawables to provide smooth transitions between the textures
  • give visual feedback on user interactions: in the form of clicking, focusing long-pressing
  • adapt the logo for very light textures by simulating a glowing effect.

I’ll try to cover these in later posts.

Returning dynamically generated resources

As promised in an earlier post about intents I’m sharing a tip on how to return dynamically generated resources from a ContentProvider — though what’s written here applies equally well to an Activity returning a result.

The basic scenario is this: You have implemented a ContentProvider and you want to use it to return something that is too big to fit into a cursor, say an image. It’s clear from the relevant Android platform javadocs, though perhaps not from the general ContentProvider documentation, that this is done by implementing openFile and returning a ParcelFileDescriptor.

But now suppose that the image resource is being rendered on the fly. How do you implement openFile?

As per the contact of openFile you need to return a ParceFileDescriptor which means that the resource must be returned via an open file (or possibly a socket, which we’ll ignore). This means that the image must first be written to a file, which is then opened as a ParceFileDescriptor. The interesting question here is when do you delete the file?

You can’t wait until the calling process has finished reading the file because (a) you won’t get notified when that occurs and (b) it may never exhaust the stream anyway. You can’t rely on the calling process to delete the file for you either, also, the whole idea behind wrapping the resource in this way is to guard it from direct access by other processes. You could delete the file after a fixed time period — assuming that the caller will have finished reading the file by that time. But this is unnecessarily complex, because the answer turns out to be very simple:

You delete the file immediately after you’ve opened the ParcelFileDescriptor but before you return it from the openFile method. Here are the relevant lines of the PlantProvider class from my Daisy Garden application; the try/finally block is there to ensure the file gets deleted no matter what — the last thing we want to do is leak precious storage space.

The reason this works is down to the way that Linux filesystems operate: directories maintain links to files, when a process opens a file a new link is created, closing a file or removing it from a directory removes a link. When there are no links to a file, the file is deleted.

So by opening the file in our application, we create a link to it. Then ‘deleting’ the file actually unlinks it, but the file won’t really be deleted until the file descriptor is discarded (ie. the last link is removed). This will happen automatically at some point after the calling application has finished accessing the file and any associated ParcelFileDescriptor objects have become garbage.

So it’s a very simple solution, but only if you know something about how the filesystem can be expected to work on Android devices.

An Intent to Pick Daisies

One of my favourite things about the Android platform is how easily application can share data using Intents and ContentProviders. They provide a universal basis for Android applications to interoperate and I try to expose them whenever it’s practical, even for something as singular as my Daisy Garden application.

I’ve just published a new version of Daisy Garden and with it, Intents and ContentProviders that enable the users of other Android applications to choose flowers they have grown in Daisy Garden — assuming that the developers choose to support it. And the great thing is that supporting third-party intents is very easy to do (and even if no other developers use them, they will be heavily used to integrate my own applications).


An example of what the user sees when picking a flower.

The Java code embedded below is a self-contained Activity that responds to a tap on the screen by inviting the user to pick a flower and then displaying the flower as a bitmap. There are two basic parts: forming an Intent for the user to pick a flower and forming a Uri to the flower’s bitmap. Both of these aspects can be customized; messaging to the user, and the bitmap size for example.

It’s very simple — more than half the code is boilerplate and comments. And this is what’s really good: in a well structured Android application this functionality is simple to expose too — I’d go as far as saying it’s almost automatic, so there is very little reason not to share (non-sensitive) data between applications.

Having said all this, there is an interesting technical detail here that meant that I had to give some thought to exposing the bitmaps. Daisy Garden is generating the flower bitmaps “on-the-fly” when they are requested from the ContentProvider and it’s not obvious how to share these unless you know a little about how Linux filesystems work. I’ll try to find some time to outline the technique in a subsequent post.

Java code follows:

Double hashing comparison

After my previous post comparing Java’s regular random number generator to its cryptographic SHA1PNG relation, Benjamin Manes suggested using double-hashing has a way of generating multiple and provided an implementation for me to test.

The initial results appear to be excellent. It yielded by far the shortest test time (approx. 4s compared to 18s for the standard Java random number generator) and in this limited test exhibited a distribution that is almost indistinguishable from that of SHA1PNG.

Chart of Bloom filter false-positive rates

Multi-hashing effectiveness

Bloom filters are very easy to implement. Unfortunately, there doesn’t seem to be very much information on the web that describes how actually generate all those independent hash functions you’ll need, especially over arbitrarily typed key values.

The Problem

A standard Java hash value is an int. If you need to generate say 10 hash values over a range of 1 million indices an int just wont do - you just don’t have enough bits (by my mental arithmetic you’d need about 200bits). But short of requiring that any developer who wants to use a Bloom filter implements their own good quality multi-hash implementation, the int returned by Object.hashCode() is all we have.

A Possible Solution

My approach in the past has been to use a SecureRandom instance seeded with the object hash code (or byte data from the object if its available), making repeated calls to nextInt(int) to obtain as many hash values as I need. This is almost the best you can do to create a multiplicity of well-distributed hash values within a specified range.

It is nevertheless slow, SecureRandom objects are expensive to create and use. An obvious alternative is to consider using a regular Random object, the questions are:

  1. Does it impair the effectiveness of the Bloom filter (by significantly increasing the false positive rate)?
  2. How much faster is it than using a SecureRandom?

Results

This is based on a small test application I wrote that inserted increasing numbers of elements into a Bloom filter while estimating the false-positive rate at each step. Up to 10,000 integers were stored using 4 hashes in a basic Bloom filter containing approximately 57,000 bits. The test was repeated using both a regular Random implementation and a SecureRandom “SHA1PRNG” implementation.

The median runtime of the SecureRandom test was approximately 61 seconds, whereas the corresponding time for the Random test was approximately 18 seconds.

The chart below plots, for both classes, the increase in false positive rate that occurs as more elements are added to the filter. The plots are similar:

Chart of Bloom filter false positive rates

The source code used to perform these tests, and the resulting data used to compile these charts is available.

Conclusion

The answers appear to be:

  1. not much
  2. about 3 times faster

So it is plausible that using the Random class may prove an effective way of extending the standard Object.hashCode() values to meet the requirements of probabilistic data structures such as Bloom filters.

Compact Approximators

Compact approximators are a generalization of Bloom filters, see Wikipedia for introductory reading. They replace the boolean values that compose a Bloom filter with the elements of a lattice. Lattices are algebraic structures that occur frequently in mathematics and in computing.

In the informal vocabulary of a programmer, the bit values 0 and 1 form a lattice because you can and them to get the lesser of two values and or them to get the greater of two values. Sets provide another common example of lattice algebra in programming. In this case ‘anding’ two sets is to form their intersection, and ‘oring’ them is to form their union.

In practice, a compact approximator operates something like a map, except that you can’t iterate over the keys and the values must be the elements of a lattice. As with the Bloom filter, it has the advantage that its storage size can strictly bounded. And like a Bloom filter, it’s a probabilistic data structure:

Bloom filters provide a guarantee that, though there may be false positives, there will never be a false negative. This is to say, a Bloom filter may ‘erroneously’ assert it contains an element that was never added, but it will never assert that it doesn’t contain an element that was. How does this property carry over to compact approximators?

Due to their structure, lattices impose a ‘partial ordering’ on their elements. The nature of this ordering depends wholly on the lattice structure. In the case of a lattice generated by the bits 1 and 0, its simply the case that 1 is greater than 0. Or in the case of a lattice generated over the subsets of a set: one element is larger than another if it is a superset of that other element. A compact approximator guarantees that though it may return a larger value than the one you stored for a given key, it will never return a smaller value.

What does this mean in practice?

Suppose you have distributed hash table over modestly sized cluster of machines. The hash table has redundancy, so that key values may be available from more than one machine in the cluster. Assuming, not unreasonably, that identifying which machine can respond with a key’s value is a relatively expensive operation that may involve multiple network requests, it would be useful to cache which machines have values for which keys. A compact approximator could help here.

We create a compact approximator (CA) whose potential keys range over those of the distributed hash table (remember, the CA doesn’t actually persist the keys) and whose values are taken from a lattice over the set of all machine addresses in the cluster (ie. a value is a set of machine addresses). Every time a key value lookup is successfully performed on a machine, its address is added to the CA under that key. Retrieving a key from the CA will return all the machines which have successfully returned values for the key - though it may return a few more. This data can be used as a first best effort to locate an appropriate machine, if the machine doesn’t actually have the data, a full lookup can be performed instead. The expectation is that you will have sufficiently few wrong results from the CA that the time saved by avoiding lookups will dominate the time wasted in misdirected requests.

This is just one simple scenario, there are many; compact approximators were first identified as a way to accelerate unicode text searches. There are a number of algorithms that I’m looking to apply them to so I’ve added an implementation to a Java library for compact data representations that I’m putting together. It is an early, but fully featured implementation.

Bloom filters and Hashing

Across various software libraries I’ve produced, I have a small number of ad-hoc Bloom filter implementations. What started out as a small task to rationalize them into a single more general implementation has turned out to be trickier than I expected.

Already having a target interface for the BloomFilter implementation, and though equipped with my powerful BitVector class, it turns out that the real barrier to producing a general Bloom filter implementation is producing a useful API that generalizes Object.hashCode() efficiently.

This task, one I expected to be simple, has bogged me down, and I still don’t think much of my API: there are many infidelities. One of the core requirements is that the API needs to permit hash values to be accessed quickly (say passing an Object in and getting an int back) because hashing is often performance critical. Conversely, some hashes (say cryptographic hashes like SHA-1) need to return values larger than any Java primitive. You can’t accommodate these into a single method signature. In addition, many useful data structures require the generation of multiple hashes over a specific numeric range (Bloom filters are not the only example, Cuckoo hashes provide another), even though the common case will probably remain creation of single hash values.

It’s early days, but the hashing package source-code is available.

At present it defines a core Hash interface that can generate individual hash values for objects. This is extended by MultiHash which can generate a HashList to return multiple hash values. Each Hash declares a HashRange which is the range that the hash values it generates may span. To facilitate the implementation of general hashing algorithms, a HashSource interface provides a standard way to decompose Objects into byte streams. Finally a Hashing class provides static utility methods.

The source code for the Bloom filter implementation is also available

BitVector: a Java class for bit twiddling

Prior to implementing a BloomFilter along the lines of the interface described in the Guava project issue tracker, I’ve reviewed my sprawling personal collection of Java libraries to identify the all of the ad-hoc bit-processing related code I’ve written with a view to unifying it all into something more reusable.

To my surprise, much of the code didn’t actually deal with tricky bitwise processing, but with the practicalities of efficiently moving bit data between different (bytes, ints, longs, byte arrays, IO streams, Strings, BigIntegers etc). All of those bits have to come from somewhere I guess.

Of course, there was plenty of bit twiddling, but another obvious sticking point was supporting competing requirements for mutability and immutability, as is common when one is seeking to design APIs that are both robust and efficient.

So I’ve spent a bit of time combining everything I need for various binary processing tasks (bitwise operations, (im)mutability control, conversion to standard Java types) and the result is a single stand-alone class called BitVector.

The source code for the class is available under the Apache 2.0 licence from:

http://tomgibara.googlecode.com/svn/trunk/crinch/crinch-bits/src/main/java/com/tomgibara/crinch/bits/BitVector.java

The class is mostly complete for my needs, though I haven’t yet begun to employ it in any of my projects.

For a better idea of what the class provides, the class Javadoc is included below.


BitVector stores fixed-length bit sequences of arbitrary length and provides a number of bit-wise operations, and methods for exposing the bit sequences as established java types.

In keeping with Java standards, bits are operated-on and exposed as big-endian. This means that, where bit sequences are input/output from methods, the least-significant bit is always on the right and the most significant bit is on the left. So, for example, the toString() method will contain the most significant bit in the character at index 0 in the string. Naturally, In the cases where this class is used without externalizing the bit representation, this distinction is irrelevant.

A consequence of this is that, in methods that are defined over ranges of bits, the from and to parameters define the rightmost and leftmost indices respectively. As per Java conventions, all from parameters are inclusive and all to parameters are exclusive.

Instances of this class may be aligned (see isAligned() and aligned(). Many operations will execute more efficiently on aligned instances. Instances may also be immutable (see isMutable(), mutable() and immutable()). In addition to a number of copying operations, the class also supports views; views create new BitVector instances that are backed by the same bit data, this allows applications to expose mutable BitVector instances ‘safely’ via immutable views.

The class extends Number which allows it to be treated as an extended length numeric type (albeit, one that doesn’t support any arithmetic operations). In addition to the regular number value methods on the interface, an bigIntValue() method is available that returns the BitVector as a positive, arbitrarily sized integer. Combined with the fromBigInteger() method, this allows, with some loss of performance, a range of arithmetic calculations to be performed.

The class also implements Iterable and provides methods to obtain ListIterator as one would for a List, but stops short of implementing the List interface. This allows a BitVector to be used with a range of Java language constructs and standard library classes.

The class is Serializable and Cloneable too (clones are shallow and essentially behave as a view of the original instance). For instances where more control over serialization is needed, read(InputStream) and write(OutputStream) methods are available, though better performance may result from calling toByteArray() and managing the writing outside this class.

In addition to all of the above methods outlined above, a full raft of bitwise operations are available, including:

  • set, and, or, xor operations over a variety of inputs
  • shifts, rotations and reversals
  • tests for bit-wise intersection, containment and equality
  • tests for all-zero and all-one ranges
  • counting the number ones/zeros in a range
  • searches for first/last ones/zeros in a given range
  • searches for next/previous ones/zeros from a given index
  • copying and viewing bit ranges
  • obtaining bits as Java primitive types

Most such methods are available in both an operation specific version and operation parameterized version to cater for different application needs.

Performance should be adequate for most uses. There is scope to improve the performance of many methods, but none of the methods operate in anything more than linear time and inner loops are mostly ‘tight’. An implementation detail (which may be important on some platforms) is that, with some exceptions, none of the methods perform any object creation. The exceptions are: methods that require an object to be returned, floatValue(), doubleValue(), and situations where an operation is applied with overlapping ranges of bits, in which case it may be necessary to create a temporary copy of the BitVector.

The class is marked as final to ensure that immutable instances can be safely used in security sensitive code (eg. within a PrivilegedAction).

A memory efficient Map

Building on the perfect minimal hash I’ve developed for Java String objects, I’ve developed an memory efficient implementation of Java’s Map interface.

View the map source code in SVN

The map implementation is not general purpose, it’s specifically targeted to situations where:

  • The map is keyed by strings; the valid keys are known ahead of time.
  • The map values are, on average, anticipated to be fairly dense (ie. you don’t have situation where the map could contain any of several thousand keys, but only contains one value)
  • There will be many map instances with the same keys.
  • Performance comparable to that of a ‘HashMap’ is acceptable.

Even given these constraints, I frequently encounter situations where all of these apply:

  • High volume HTTP services where values need to be parsed out of the query string and bundled into a map.
  • Data querying APIs where query parameters need to be supplied via Map for substitution into a query.
  • Proxy implementations of setter/getter interfaces that are backed by a Map of values.

The implementation stores the set of valid keys in an object that can be shared between ‘like’ maps. The keys are hashed perfectly so that the map values can be stored directly into a single object array; no entry objects are required. This means that the memory required to store large numbers of similar maps can . Furthermore, none of the basic map operations (put/get/contains etc.) performs any memory allocation. This means that for many applications both memory overhead and object churn is minimal.

From the javadocs:

Unlike regular Map implementations, there is no default, or copy, constructor. Instead the map must be constructed around a set of possible String keys. Only the keys supplied to a constructor may be used in the maps it constructs. Re-using constructors (by repeatedly using them to create ParameterMap instances) minimizes the memory required to store the map. Furthermore, the only map operations that allocate new objects are those methods that required to do so by the Map interface; in-short, object creation strictly minimized. This makes the class ideal for environments where reducing GC overhead is important.

The source code is under the Apache 2.0 licence, built with Maven, and is available from my website’s google code project:

http://code.google.com/p/tomgibara/

Minimal Perfect Hash for Strings

A minimal perfect hash is simply a bijection between a set of N objects and the first N integers. See the wikipedia page about perfect hashing for more information.

I’m looking to produce a fast and memory efficient map implementation over String keys and I want to use a perfect hash to pack values without risking collisions.

Casting around online for a Java implementation led me to Sux4J (this rather unpromising sounding library looks excellent and I’m going to take a closer look at it soon) but the implementations are too generic for my purposes.

So I’ve fashioned my own minimal perfect hashing algorithm for Java strings.I don’t know anything about the theory of such algorithms, but my approach was to build a data-structure as follows:

  1. input an array of ‘hashable’ strings
  2. order the strings by their standard Java hashcode
  3. store the hashcodes in an integer array
  4. identify spans of duplicate hashcodes
  5. if there are no duplicates then do nothing more
  6. else for each set of strings with the same hashcode
    1. build a decision tree that can distinguish the strings
    2. store a pointer to the tree for each string in the set

The gist of this approach is outlined visually in the following diagram. Here, the hashes for the strings “cat” and “dog” collide, so a small decision tree needs to be consulted in the case that either of these two are hashed to confirm which value should be returned.

There are some nuances in the actual implementation. For example, Strings are actually ordered by length and character sequence too; this assists the building of fast decision trees. Also each duplicated string stores its offset within the span to avoid scanning back to find the start of the span (I think there are better space/time trade-offs around this that I may investigate in the future.).

However, these complexities only come into play when there are collisions of the String hash function. The common case, even for fairly large numbers of strings, is that there are no collisions. This reduces the time complexity to a binary search over the integer array of hashcodes — not as fast as only taking the hashcode, but still fast. And even the exceptional case is still fairly fast since it typically needs to test candidate string’s length to determine which minimal hash value it should be assigned.

Note that the algorithm is designed to examine as little about the string as is necessary to determine the perfect hash value (this helps to make it fast). Furthermore, the algorithm does not keep a reference to the list of potential candidates for hashing (this helps to reduce memory overhead). As a consequence, the algorithm cannot necessarily distinguish valid candidates; though they are frequently identified as such, invalid string arguments may generate valid hashes.

The source code for this implementation is fully documented and is under the Apache 2.0 licence. It is available through my website’s Google code project