I just committed some rough source code for, amongst other things, ImageUtil.
You’ll find it in my Google Code SVN repository.
You need this code to create the colour charts I posted a couple of weeks ago.
It’s inside a Maven project, but you can probably just cherry pick this file and have things work:
And watch out for this possible issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4886732
I haven’t been able to track down all of the situations it occurs in.
This is a post about the API design of BitVector. I don’t think I’ve ever written a class which carried so many methods and here I try to give some justifications.
Binary data, in the form of an ordered sequence of bits, is an extremely fundamental software abstraction. Perhaps as a consequence of this there’s frequently a lot of it, and there are often many operations that need to be performed on it.
The large number of useful operations possible over binary data is one reason that BitVector has many methods, but there are others.
Developers spend most of their time processing binary data as byte sequences (generally byte arrays in most languages). Grouping bits into bytes, and more generally into longer words (such as ints and longs), has the significant benefit of improving performance. An operation on a subsequence of 64 bits could be done with a single operation on a long. Addresses are shorter too; I can’t imagine any developer hankering for bit level memory addresses. And when a given word length neatly accommodates the data, everything is much simpler.
But some situations arise where fixed word lengths just don’t fit. Universal codes are one example: bit-length can vary dramatically from one value to the next, and though lengths are frequently less than a single byte long, they are unbounded. In this case, and in others, there isn’t an easy way to avoid bit-level processing.
So we’re still in the situation where we might have lots of data, and lots of things to do with it, but now have the added pressure of reduced performance. Take the example of a simple loop over a byte array which sets every byte to zero. The same loop at bit level will take eight times longer. What is more, the language (Java in this case) isn’t going to give you any core primitives to help: exposing direct access to byte arrays can boost efficiency, but bit-level operations are necessarily going to be forced through a method call. And, even more significantly for performance, the work of setting a single bit will also require significantly more code than the simple assignment needed for a language primitive like byte.
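To make the comparison concrete, here is a minimal sketch of the two loops. It is not BitVector’s code: the class name BitLoopSketch and the bit layout (bit i lives in byte i/8, least significant bit first) are assumptions made purely for illustration.

```java
// Illustrative only: names and bit layout are assumptions, not BitVector's API.
final class BitLoopSketch {

    // Writes one bit: a shift, a mask and a read-modify-write,
    // compared with the single assignment a byte needs.
    static void setBit(byte[] data, int index, boolean value) {
        int mask = 1 << (index & 7);      // position of the bit within its byte
        if (value) {
            data[index >> 3] |= mask;     // set the bit
        } else {
            data[index >> 3] &= ~mask;    // clear the bit
        }
    }

    // Reads one bit using the same layout.
    static boolean getBit(byte[] data, int index) {
        return (data[index >> 3] & (1 << (index & 7))) != 0;
    }

    // Byte-level loop: one plain assignment per byte.
    static void clearBytewise(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            data[i] = 0;
        }
    }

    // Bit-level loop: eight times as many iterations, each a method call.
    static void clearBitwise(byte[] data) {
        for (int i = 0; i < data.length * 8; i++) {
            setBit(data, i, false);
        }
    }
}
```

Both loops leave the array in the same state; the second simply does far more work per byte to get there.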
This pressure on performance significantly changes the trade-offs made when designing a general purpose API for bit-level processing. Whereas the value of providing a method that, say, performs an element-wise XOR between two byte arrays is questionable (given the simplicity of implementing such a method), the value of the same method at bit level is much higher because very significant optimizations become available.
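As a hedged illustration of the kind of optimization meant here, the sketch below XORs a run of bits 64 at a time rather than one by one. The class name WordXorSketch and the storage layout (bit i held in word i/64, least significant bit first) are my assumptions for the example, not a description of how BitVector stores its bits.

```java
// Illustrative only: a word-at-a-time XOR over the first `length` bits.
final class WordXorSketch {

    // XORs the first `length` bits of src into dst, leaving later bits of dst untouched.
    // Both arrays must hold at least `length` bits.
    static void xor(long[] dst, long[] src, int length) {
        int fullWords = length >>> 6;              // number of complete 64-bit words
        for (int i = 0; i < fullWords; i++) {
            dst[i] ^= src[i];                      // 64 bits per operation
        }
        int remainder = length & 63;               // bits left over in a partial word
        if (remainder != 0) {
            long mask = (1L << remainder) - 1;     // selects only the low `remainder` bits
            dst[fullWords] ^= src[fullWords] & mask;
        }
    }
}
```

A caller working one bit at a time through get/set methods would need `length` iterations to do the same job; here the loop runs roughly `length / 64` times.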
So performance is another reason that BitVector has so many methods. Simplistically: more specialized methods allow for more optimizations(*). But there is another reason there are so many methods.
Prior to starting development of the bit package, and BitVector in particular, I reviewed my various code bases to see what I needed from such a package. What I found was that my code (which often exposed bit level data in thinly wrapped byte arrays) was frequently forced to make array copies solely to guarantee encapsulation. Additionally, bytes were being copied into other intermediate forms, and then copied back, just so that certain operations could be performed by methods that didn’t support the originating data representation.
I realized that, by creating a single class that supported encapsulation and provided these disparate methods, I could eliminate a great deal of the copying in my applications. And when there’s a lot of data, this gives a very large gain in performance.
Now, as one would expect when designing a class which is becoming overloaded with methods, I investigated ways of splitting the functionality into separate classes, and superficially this appears easy - the functionality of BitVector can be split along a number of clear lines and an API looking something like this is obvious:
v.asBits().setRange(0, 5, true);
v.asNumber().intValue();
v.asSequence().firstZero();
But a difficulty arises when BitVectors are aggregated in large numbers. For example, I had one small application that did lots of processing with 128 bit numbers. It stored a great many of these values (representing them as a class with two long fields). For BitVector to be usable in this context, and others like it, per-instance memory usage becomes a consideration: every ‘sub-view’ of a BitVector would require one of: storing a reference to the view in a field (with a reference back to the vector), OR creating a new view on each call, OR something more complicated.
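The first two options can be sketched as follows. Everything here is hypothetical (TinyVector, NumberView and both accessor names are invented for the example); it only shows where the extra memory or garbage comes from.

```java
// Illustrative only: two ways a vector might hand out a 'sub-view'.
final class ViewCostSketch {

    // A view needs a reference back to the vector it exposes.
    static final class NumberView {
        private final TinyVector owner;
        NumberView(TinyVector owner) { this.owner = owner; }
        long longValue() { return owner.bits; }
    }

    static final class TinyVector {
        final long bits;
        // Option 1: an extra field on every vector instance, used or not.
        private NumberView cached;

        TinyVector(long bits) { this.bits = bits; }

        // Option 1: the view is created once and then retained for the vector's lifetime.
        NumberView asNumberCached() {
            if (cached == null) {
                cached = new NumberView(this);
            }
            return cached;
        }

        // Option 2: no extra field, but a new object is allocated on every call.
        NumberView asNumberFresh() {
            return new NumberView(this);
        }
    }
}
```

With millions of small vectors, option 1 adds a reference field to each one (plus a retained view object once used), while option 2 turns every call into an allocation.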
I won’t go into detail, but however you tackle these options, nothing very good comes out of them: memory swells, or uneven performance occurs in ways that a developer wouldn’t anticipate. So I went for the low-tech approach - clear and logical method names together with documentation (still ongoing).
I’ll have to wait and see how it turns out.
(*) I also chose to provide more generalized methods - this decision is more questionable - and it was made on the basis that there is utility in the generalized form for some applications.
I took my son to a hospital appointment today. A standard part of any visit involves measuring the child’s height. My son was nervous, so I volunteered to go first and noticed that the measuring device used a Gray code. Afterwards, the nurse kindly let me take this picture.
A small plastic paddle that can rest on the top of the head ran along a vertical track attached to the wall. The track had printed along its length a standard Gray code which I infer was read by optical sensors housed by the paddle. The changing pattern of light and dark regions encodes the height of the paddle above the ground. The reliability of Gray codes in mechanical tracking is well known (to those with an interest in computer science I guess) but this was the first time I’d seen one visibly used.
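For anyone who hasn’t met Gray codes, the property that makes them reliable is that adjacent positions differ in exactly one bit, so a sensor caught between two positions can only be off by one step. I don’t know the exact code printed on the track; the sketch below assumes the standard binary-reflected Gray code, and the class name GrayCodeSketch is my own.

```java
// Illustrative only: the standard binary-reflected Gray code.
final class GrayCodeSketch {

    // Converts a position (a non-negative int) to its Gray code.
    static int encode(int n) {
        return n ^ (n >>> 1);
    }

    // Recovers the position from a Gray code by XOR-ing together
    // all right-shifted copies of the code (a prefix XOR).
    static int decode(int gray) {
        int n = gray;
        for (int shift = 1; shift < 32; shift <<= 1) {
            n ^= n >>> shift;
        }
        return n;
    }
}
```

For example, positions 3 and 4 are 011 and 100 in plain binary (three bits change at once), but their Gray codes are 010 and 110, which differ in a single bit.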
Naturally, the staff were curious about my curiosity, so I briefly explained that the pattern on the mounting was actually a code that was read by the paddle to compute the height; the staff had assumed the pattern was purely decorative. If, given the appearance of the code, this seems remarkable to you, you may not have sampled the gamut of NHS interior decor.
It’s been a while since I first shared my BitVector class and in that time it has grown larger and even more useful. Yet it remains a self-contained class that’s easy to use in any Java project. The source code is available under the Apache 2.0 licence.
Its killer benefit for me is that it makes it really easy to cast bits from one form into another, to operate on them, and to pass them around safely: all very cheaply. It’s so useful, I’d really like to see other developers using it, so I’ve started trying to write a code-based introduction to the class.
I’m not even half done, but I think the results so far are worth sharing. Don’t be intimidated! There are lots of methods but any given application will only need a small number of them. The benefit of locating all of the methods on a single class is that it makes things convenient & efficient.
It’s high time I made my open source Bloom filter implementation properly available by taking time out to introduce it.
Firstly, to get a copy of it, you’ll currently need to check it out of my Google Code project and build it using Maven:
svn checkout http://tomgibara.googlecode.com/svn/trunk/crinch
cd crinch
mvn install
It’s a multi-module build and you’ll need the following jars to make it work:
In the future I’ll probably produce a crinch-all.jar that conveniently aggregates these smaller jars.
All the Bloom filter related code is in the com.tomgibara.crinch.collections package, but here’s some code to get you off to a good start.
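The crinch classes themselves aren’t reproduced in this text. As a rough, library-independent reminder of what a Bloom filter does, here is a minimal sketch built on java.util.BitSet. None of these names come from crinch: BloomSketch, its methods and the simple double-hashing scheme are all assumptions for illustration, and the hashing is far too naive for real use.

```java
import java.util.BitSet;

// Illustrative only: NOT the crinch API. A tiny Bloom filter over a BitSet.
final class BloomSketch {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int hashCount;  // number of bit positions tested per element

    BloomSketch(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) throw new IllegalArgumentException();
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit index from two cheap hashes (assumed scheme).
    private int index(Object element, int i) {
        int h1 = element.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    // Adds an element; returns true if any bit changed.
    boolean add(Object element) {
        boolean changed = false;
        for (int i = 0; i < hashCount; i++) {
            int idx = index(element, i);
            if (!bits.get(idx)) {
                bits.set(idx);
                changed = true;
            }
        }
        return changed;
    }

    // False means definitely absent; true means possibly present.
    boolean mightContain(Object element) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(element, i))) return false;
        }
        return true;
    }
}
```

The asymmetry in mightContain() is the whole point: a Bloom filter never reports a false negative, but it can report false positives at a rate governed by its size and number of hashes.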
Permutations are a key abstraction that aren’t covered by the standard libraries and haven’t been well catered for outside of them (as far as I know).
My crinch library now addresses this with a new package for handling permutations. It’s efficiently implemented for speed and memory usage, robustly coded for use in security-sensitive contexts, and has a nice fluent API.
It’s under the Apache 2.0 license and available from my Google Code project*.
Here’s the gist…
(*) No jar binaries yet; take the source, or build via Maven.
A global postal coding system.
I’ve barely been at my home computer during the past couple of months as I work towards the conclusion of a project for a multinational company that has now occupied me for over a year.
Writing good software at this scale requires much effort, and sometimes the complexities almost become frustrations; modelling logistics and delivery in countries without postal codes is one such complexity.
And although I haven’t been at a computer, it hasn’t stopped me filling my notebooks with ideas that would resolve some of my frustrations. Like this, a system for coding small regions of the globe with a focus on human communicability: readily identified, unambiguous, 6 or 8 character codes which include a checksum that can catch the majority of basic transcription errors.
The fact that it’s based on a single continuous space-filling curve that loops through every region of the earth’s surface is a bonus.
The Android AdapterView classes are very efficiently implemented and provide developers with a solid basis for creating fast, smooth views over arbitrarily large datasets. Their basic APIs have remained stable since Android was introduced and yet, based on the applications I try out (and even some I use regularly), it seems that developers often make poor use of them.
This is the first of two posts that I’m hoping will improve this situation, by sharing some of the code I’ve developed for my own applications.
The performance of an AdapterView is mostly determined by the Adapter you provide it with. There are two core obligations when developing an Adapter that will provide a smooth user experience†:
It’s that simple… Except what happens when your items are records that have to be retrieved from a Web server? And, to compound things, what if displaying the record requires downloading a couple of images? These problems can actually be tackled separately. This post provides a base class that helps with 2 (I’ll provide code to help with 1 at a later date).
What makes 2 complex in this scenario is that there’s no time to do anything other than return a preliminary rendering of the View from the Adapter, so the work to produce a complete rendering of a View must be done on a background thread.
Unfortunately, what makes this still more complex is that the efficiency of classes like ListView and GridView is only possible because they recycle the Views they use to render items. This complicates the task of asynchronously updating the View because, during the interval in which the complete content for the View is being generated, the parent AdapterView may have assigned it another item to display (possibly many times). In this case the existing rendering task should probably be abandoned (its result certainly shouldn’t get assigned to the view) and instead a new rendering needs to commence.
Of course, this situation may itself be temporary: a user may briefly scroll an item off-screen and then back again; we don’t want to keep restarting the same rendering task without making any progress. To handle this efficiently, caching may be necessary and may need to be combined with partial updates. And, as if things weren’t complex enough, the re-appearing item may be assigned to a different View to the one that initially displayed it (even if its position on-screen does not appear to have changed for the user).
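The recycling hazard described above can be sketched without any Android classes. This is not ViewRenderer’s code: FakeView stands in for a real View, and the approach shown (remember which item a view is currently bound to, and discard results meant for any other item) is just one common way to guard against stale renders.

```java
// Illustrative only: plain Java stand-ins, no Android dependencies.
final class RecycleGuardSketch {

    // Stand-in for a recycled View; boundItem plays the role of a view tag.
    static final class FakeView {
        Object boundItem;
        String content = "";
    }

    // Main thread: the adapter (re)assigns the view to an item and shows a placeholder.
    static void prepare(FakeView view, Object item) {
        view.boundItem = item;
        view.content = "loading";
    }

    // Main thread, later: a background render for `item` has finished.
    // The result is applied only if the view still belongs to that item.
    static boolean update(FakeView view, Object item, String rendered) {
        if (!item.equals(view.boundItem)) {
            return false;   // the view was recycled for another item: drop the stale result
        }
        view.content = rendered;
        return true;
    }
}
```

If a view is prepared for item A and then recycled for item B before A’s render completes, the late result for A is rejected and the view keeps its placeholder until B’s render arrives.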
So, yes, there is some complexity around implementing a good Adapter, but many Android developers have had years to get this right! Hopefully this code will help: it handles all of this apparent complexity in what is a fairly simple base class called ViewRenderer.
It’s designed to be invoked within the getView() method of Adapter. In this approach the getView() method should be limited to:
- Reusing the supplied View, or inflating/constructing a new one (but not customizing it for the item).
- Calling the renderView() method of ViewRenderer to perform all subsequent customization of the View.
To make this work, you obviously need to customize the ViewRenderer with your own rendering logic. This is done by implementing the following three methods:
void prepare(View, Param, int): This method is called on the main application thread before the View is first displayed to the user. It’s your application’s opportunity to wipe down the View (since it may have previously presented another adapter item) and display a quick placeholder/loading view.
Render render(Param, int): This method is called on a background thread and does the slow work (e.g. drawing or loading) to convert the item into the resources needed to display it (e.g. a Bitmap or POJO).
void update(View, Render, int): This method is called on the main application thread to apply a new render to a previously prepared View.
And that’s pretty much it: everything is reduced to these three digestible chunks. The int that’s being passed into these methods specifies an optional rendering pass to accommodate progressive rendering. There are a few additional operational aspects that can be controlled: render caching, view tagging, thread priority and execution. Documentation for these and everything else can be found in the source code comments.
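To show how the three methods fit together, here is a stripped-down sketch in plain Java. It borrows the three method shapes named above but is otherwise hypothetical: RendererSketch, its constructor and this one-pass renderView() are my assumptions, with java.util.concurrent.Executor standing in for both the background thread and the main thread. The real ViewRenderer also deals with caching, recycling and multiple passes, none of which appear here.

```java
import java.util.concurrent.Executor;

// Illustrative only: the prepare / render / update flow, minus all the hard parts.
abstract class RendererSketch<V, P, R> {
    private final Executor background;  // stands in for a worker thread
    private final Executor main;        // stands in for the main application thread

    RendererSketch(Executor background, Executor main) {
        this.background = background;
        this.main = main;
    }

    // Quick work on the main thread: reset the view and show a placeholder.
    abstract void prepare(V view, P param, int pass);

    // Slow work off the main thread: turn the item into something displayable.
    abstract R render(P param, int pass);

    // Quick work on the main thread: apply the finished render to the view.
    abstract void update(V view, R render, int pass);

    // Hypothetical driver: a single pass (pass 0), no caching, no recycling checks.
    final void renderView(V view, P param) {
        prepare(view, param, 0);
        background.execute(() -> {
            R result = render(param, 0);
            main.execute(() -> update(view, result, 0));
        });
    }
}
```

In this sketch an adapter’s getView() would do nothing more than obtain a view and call renderView() with it; the subclass decides what a placeholder looks like and what the slow work is.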
The code is amenable to a number of improvements (some possibilities are noted in code comments), but I hope it’s useful and results in some smoother Android lists and grids. Here it is:
† Not creating garbage helps too.
When I posted action shots of my personal code motivation cards I didn’t realize they would be quite so hard to read. So here’s a clearer view:


And here is the text on the backs:

Never lose sight of the final goal: to deliver working software to engaged users.
Software with no users satisfies no one.
Focus on writing code and avoid distractions.
You cannot substitute longer hours for a singular attention to the task at hand.
Don’t write code simply because you understand code.
Take time to understand the problem you are solving and the way you are solving it.
Have high standards for the code you write.
If you fall short of them, make sure it’s for reasons you can justify.
Resist the temptation to sidestep difficult or dull features out of laziness.
Complete solutions always expose better approaches.
Document for yourself, not just for other developers.
You will forget, and regaining your understanding will be more difficult without it.
Naturally, with this degree of brevity, there’s not much room for nuance. But the cards aren’t strictly prescriptive; their purpose is to nudge me away from bad habits.
My coding motivation cards.
I wrote them for myself, but recently added some explanatory text to the back so that other people could know what they stand for.
I don’t actually need any motivation to code, I just sometimes need a little extra motivation to code better.