May 2012
1 post
3 tags
May 20th
April 2012
11 posts
4 tags
My first impression of the Chrome Web Store
I’ve been interested for a long time to find out how the chrome web store stacks up against Play (aka Android market) for app developers. So, largely by way of an experiment, I published my harmonica tuning tool on there today and I thought I’d capture some rambling observations in a blog post. In what follows: CWSDD = Chrome Web Store Developer Dashboard PADC = Play Android...
Apr 30th
1 note
2 tags
Experimenting with Harmonica Tunings
I’ve just published a small online tool investigate different harmonica tunings. http://www.tomgibara.com/harmonica/tunings/ As indicated by my previous two posts, I’ve recently become interested in learning about harmonica tunings. So I wrote a small tool that allowed me to analyze the chords that are available in a given tuning. Then I thought I’d share it, so I extended it...
Apr 27th
1 note
2 tags
Why an harmonica tuning?
I’ve just finished designing a new harmonica tuning. It’s difficult to say why. I think I just really enjoy finding elegant and balanced solutions to problems. A harmonica tuning provides a really excellent challenge - even if you’re mostly musically illiterate like me. Firstly there are a number of rigid constraints: there are fixed number of holes and physical laws...
Apr 19th
2 tags
Apr 19th
4 notes
2 tags
Binary digits
I just realized that when I count on my fingers, I count in binary. How often as adults do we count on our fingers? Today, for the first time I can remember, I did. It was a reflex, and when I looked down at my fingers, I found myself (to my own surprise!) counting in binary patterns on my fingers. My mind was jolted back to a dim memory: that in my teens, I made a concious decision that I...
Apr 18th
5 tags
Graphics utility code
I just committed some rough source code for, amongst other things, ImageUtil. You’ll find it in my Google code SVN repository You need this code to create the colour charts I posted a couple of weeks ago. It’s inside a Maven project, but you can probably just cherry pick this file and have things work: ...
Apr 15th
3 tags
Crinch 0.7.2 Released
Crinch has had another minor bump to version 0.7.2. This release fixes an issue that can prevent some character sequences being returned from tries.
Apr 12th
2 tags
Apr 10th
4 notes
2 tags
Apr 7th
2 tags
Apr 6th
1 note
3 tags
Only the stupid
I’m writing this based on very limited information, I only know what the BBC has reported here. But I feel that I don’t need to know more than this: A new law - which may be announced in the forthcoming Queen’s Speech in May - would not allow GCHQ to access the content of emails, calls or messages without a warrant. But it would enable intelligence officers to identify...
Apr 1st
1 note
March 2012
11 posts
3 tags
Mar 30th
1 note
3 tags
Crinch 0.7.1 Released
A quick post to say that the Crinch libraries have had a minor bump from 0.7 to 0.7.1. It was motivated by an urgent fix for a bug that impacted one of my clients. The bug was in the incomplete and undocumented record library, so it shouldn’t have any impact for other users of the library at the moment. The only new aspect to this release is that the source code for each library is now...
Mar 30th
2 tags
Eye Colour Variation
For reasons I won’t elaborate on now, I want to find out how eye color can be parameterized. After some fruitless online searches, I resorted to discovering it myself. I’m not looking perform a scientific study, so my steps are simple: sample some eye colours, analyze them with some plots, create a model and then validate it. The first two steps haven’t taken very long at...
Mar 28th
1 note
3 tags
Crinch 0.7 Released
Another weekend and another code release. I’ve released a new version of my crinch libraries. It now includes a documented API for binary coding. It currently supports a range of universal and non-universal codings. All I/O for the package is directed via the Crinch bits library which makes the library very flexible. Universal codes The library currently supports the following universal...
Mar 25th
3 tags
BitWriter performance
Crinch provides a number of different BitWriter implementations that can be used to accumulate written bits in memory. They each have advantages in different situations: ByteArrayBitWriter is fairly fast, but can’t be sized at a bit level. OutputStreamBitWriter can be used to write unbounded amounts of data, but is slower. IntArrayBitWriter has a fairly balanced performance profile. ...
Mar 15th
3 tags
All I want to do is tweak a method
My encodePositiveInt method is now going to allow calls with zero. So how should I change the method name to reflect this? encodeNonNegativeInt long encodeNonNegative overloaded encodeNonNegInt ugly encodeUnsignedInt misleading encodePositiveInt inaccurate Goddamit. What do I call this method?
Mar 14th
2 notes
5 tags
The impact of Views on BitVector performance
Following on from my post about the design of the BitVector class yesterday, the question was asked, do you really need operations over ranges when you could just use views? For example, the following two lines of code will have exactly the same result: v.setRange(100, 200, false); v.rangeView(100,200).set(false); Except that the second line will create an intermediate object. Putting aside...
Mar 11th
5 tags
Why does BitVector have so many methods?
This is a post about the API design of BitVector. I don’t think I’ve ever written a class which carried so many methods and here I try to give some justifications. Binary data, in the form of an order sequence of bits, is an extremely fundamental software abstraction. Perhaps as a consequence of this there’s frequently a lot of it, and there are often many operations that need...
Mar 10th
4 tags
Crinch 0.6 Released
I’m pleased to announce that version 0.6 of Crinch has been released. To accompany this release, I’ve added a dedicated Crinch project page on my website. This is an important release because it’s the first in which any of the Crinch libraries is stable enough to be properly documented and supported. This is the Crinch bits library. The good news is that the entire bit...
Mar 9th
6 tags
Generating De Bruijn sequences using BitVector
Seeing a Gray code being used yesterday, in turn made me think about De Bruijn sequences. Both enumerate all fixed-length words over set of symbols (frequently the binary symbols 0 and 1). But where Gray codes change one symbol at a time to generate the next word, De Bruijn sequences shift the whole word by one symbol. I’ve posted about De Bruijn sequences before. They have a special...
Mar 3rd
3 tags
Mar 3rd
3 notes
February 2012
2 posts
5 tags
BitVector: now with more bits!
It’s been a while since I first shared my BitVector class and in that time it has grown larger and even more useful. Yet it remains a self contained class that’s easy to use in any Java project. The source code is available under the Apache 2.0 licence. It’s killer benefit for me is that it makes it really easy to cast bits from one form into another and to operate on them, and...
Feb 29th
2 tags
“Sometimes the truth is best told with a lie.”
– Tom Gibara
Feb 12th
January 2012
8 posts
3 tags
Identifying uniqueness in O(n log α) bits
Note that my analysis of the algorithm here is in no way complete. This serves me as documentation; it’s public because I hope it may be of use to others. Though I have implemented this algorithm and am employing it daily, what follows may be inaccurate or just plain wrong. Problem Given a large table of records, stored on disk, how do we verify that every record is unique, (ie. that they...
Jan 28th
48 notes
3 tags
Jan 14th
3 notes
2 tags
An algorithm for painting clusters
As I posted yesterday, I needed an algorithm that could paint clusters so as to make them distinguishable from each other. Today, I implemented a simple, “good enough” algorithm. Inputs and Outputs The algorithm is fairly general, only requiring that each cluster has: has a size — the number of points it contains has a radius — gives a general indication of the...
Jan 13th
2 notes
5 tags
Jan 13th
12 notes
6 tags
A prefix metric on Strings
I’ve been investigating how clustering can be applied to sets of character strings in which common prefixes indicate similarity, and I came up with the following metric (probably known already, but no amount of online searching gave any useful results). A metric over infinite sequences We start buy considering the set \(S\) of sequences over some alphabet \(\Sigma\). We write...
Jan 7th
54 notes
4 tags
Living in Hash Table Ignorance
I’ve been living in hash table ignorance, always accepting the received wisdom that a load factor (the ratio of elements stored to available buckets) of around 0.75 provides a good trade-off between storage and performance but never stopping to think about what’s actually going on. For example, what happens at the limit, when the number of elements approaches the number of buckets in a...
Jan 6th
20 notes
4 tags
Benchmarking Crinch hashing against Guava
I did my first profiling of Crinch’s hashing code this evening at it’s been interesting. Guava’s new hash package gives me something against which I can benchmark. Methodology My methodology, briefly, was to implement the murmur3 32 bit hash then use the hash on a million randomly populated POJOs that look like this: private static class Pojo { String strVal; int...
Jan 3rd
12 notes
5 tags
Guava's hashing abstractions compared to Crinch's
Guava recently introduced a new hashing package and I’ve spent the evening benchmarking my hash library implementation against Guava’s. I’m interested in whether I can abandon my com.tomgibara.crinch.hashing package for com.google.guava.hash. It’s been fascinating to closely inspect an alternative abstraction of the same (fairly fundamental) concept; especially since...
Jan 2nd
11 notes
December 2011
5 posts
4 tags
Can the GVM library be used on Android?
…say to cluster points in a map based on the zoom? (information about the GVM Java library) I was just asked this question by email, and thought I’d respond via a blog post in case the same query recurs. This short answer is: I certainly hope so, it’s exactly the kind of task the library was designed for, but there are some caveats: I’ve not personally used the...
Dec 29th
12 notes
4 tags
Dec 29th
4 notes
3 tags
How to use Crinch's BloomFilter
It’s high time I made my open source Bloom filter implementation properly available by taking time out to introduce it properly. Firstly, to get a copy of it, you’ll currently need to check it out of my Google code project and build it using maven: svn checkout http://tomgibara.googlecode.com/svn/trunk/crinch cd crinch mvn install It’s a multi-module build and you’ll...
Dec 13th
4 notes
3 tags
Dec 6th
1 note
5 tags
Mapping numbers to numbers compactly
Imagine a situation where every word in a large corpus is assigned an arbitrary numeric identifier, and you need a data structure that can quickly return the word-frequency for any particular identifier. If you are not a programmer or you can’t imagine finding yourself in this situation you can safely stop reading now. I wasn’t in exactly in this situation, but a similar one that...
Dec 1st
3 notes
October 2011
4 posts
3 tags
Oct 21st
20 notes
6 tags
A Permutations package for Java
Permutations are a key abstraction that aren’t covered by the standard libraries and haven’t been well catered for outside of them (as far as I know). My crinch library now addresses this with a new package for handling permutations. It’s efficiently implemented for speed and memory usage, robustly coded for use in security sensitive contexts, and has a nice fluent API. ...
Oct 17th
23 notes
7 tags
Oct 12th
100 notes
3 tags
Clustering Angles
I was recently asked via email for a suggestion of how to adapt my GVM algorithm to cluster angular values. The example that was given in the email makes the problem clear: Correct clustering: { 1°, 359°, 355° }, { 85°, 89°, 96° } Incorrect clustering: { 1°, 85°, 89°, 96° }, { 359°, 355° } Unfortunately, directly adapting the algorithm so that it works with angular data is not possible...
Oct 4th
26 notes
August 2011
3 posts
2 tags
Aug 11th
4 tags
The Riots
Like many Britons, I have been watching the news of recent days with sadness, dismay and not a little anxiety. Listening to the seemingly ceaseless reports of gangs engaging in arson, looting and assault, it has been difficult not to succumb to an impotent anger; anger that lives are being overturned by the actions of a few selfish delinquents. It is inevitable that emotions will be swelled by...
Aug 9th
4 tags
Reservations about the Web Intents system
This is a repost of a comment I originally made on Paul Kinlan’s blog. Since it appears that moves to adopt Web Intents in Chrome and other browsers are progressing quickly, I felt that I should reiterate my reservations here. I’m excited about the prospect of an intent mechanism for the Web to parallel that provided by the Android framework. The Intent has demonstrated itself to be...
Aug 4th
33 notes
July 2011
5 posts
2 tags
A programmer's vocabulary
The word vocabulary has a special place in my memory… I recall as a small boy being totally unable to pronounce the word - I could get as far as the “vocab” part and would then stumble hopelessly on the second half of the word, and circumlocution was difficult because there are no good synonyms (that I’m aware of) for its everyday meaning. The only other word I recall...
Jul 22nd
3 notes
2 tags
Jul 19th
17 notes
3 tags
Jul 17th
8 notes
5 tags
Rendering Android Views in the background
The Android AdapterView classes are very efficiently implemented and provide developers with a solid basis for creating fast, smooth views over arbitrarily large datasets. Their basic APIs have remained stable since Android was introduced and yet, based on the applications I try-out (and even some I use regularly) it seems that developers often make poor use of them. This is the first of two...
Jul 15th
13 notes
2 tags
Jul 11th
11 notes