Android for Augmented Reality

Moseycode is an augmented reality barcode system that I’ve been developing for the Android platform. As the project’s profile slowly grows, I’ve had a number of people ask me about the suitability of Android for similar projects. I’ve decided to put my comments, such as they are - limited by my own narrow experience - into a post so that I can refer future enquiries here.

I started developing Moseycode when the Android SDK first became available - before any Android powered phones were available - so much of the early development of Moseycode was done using a PC, a webcam and the JMF framework. None of the image processing I’ve been focusing on has required high resolution capture, but webcams are really very problematic - especially since they usually lack any focus control. The upside is that if you get your algorithms working reliably with webcam captured footage, you should be pleasantly surprised by the results from anything else!

Though I’m a huge fan of the Android platform, I think it has some drawbacks at present for this sort of development:

  • Generally I would rate the image capture quality of the G1 as fairly poor. This is probably hardware related but I’d speculate that some of it may be due to underdeveloped drivers. I can’t comment on the image quality from other phones. The auto focus though is a major boon to barcode capture.

  • I like the SDK (I’m an experienced Java developer) but real-time image processing won’t fly with the current Dalvik interpreter. The C/C++ support in the form of the NDK is very focused on compatibility rather than features (which is definitely the right call), so consequently the available libraries are quite minimal. This might cause difficulties porting over existing image processing libraries (though I’m not sure - I haven’t had to try).

  • The camera APIs within the Android framework are weak. I always feel I have to qualify this by saying that (1) this is simply my opinion, check them out for yourself and form your own opinion, and (2) there are many reasons that poor APIs arise, most of which are not the responsibility of the engineers who craft them. That said, there are some serious structural problems in the APIs that thwart attempts to write portable and performant applications. Many of the problems are documented in the public issue tracker. My bête noire is issue 2794. I’ve got as far as posting an outline proposal for how things could be improved but I’m unlikely to find the time to attempt an implementation until the first release of Moseycode is out of the way.

  • There is no effective API available for parsing and rendering 3D object models. I don’t necessarily expect one to be provided within the standard framework; it’s simply unfortunate that, at this time, the platform is not sufficiently mature for any such libraries to be available. Perhaps if/when OpenGL ES is directly accessible from native code it will be possible to port an established C library.

All those niggles notwithstanding, it’s exciting being able to capture and process live image data on a mobile phone, something that’s so central to people’s lives. Combined with Android’s powerful abstractions that support collaborative applications, the possibilities are very promising.

I’ll end by saying that I’m currently working towards releasing Moseycode as an open-source platform that developers will be able to re-use at a number of levels. Android might be a more inviting target for augmented reality applications when that is available since it will provide a jumping-off point for developers to build their own applications.

Standardizing details of the Moseycode 2D to 3D mapping.

I couldn’t resist getting diverted this evening by an idea I’ve been turning over in my head for a good while now. Codenamed Moseyplane, the aim is to provide an extendable and reliably trackable surface for mobile augmented reality games.

In this case I’m trying to ensure that every portion of the plane contains multiple rings with a wide distribution of diameters. There’s a long way to go from this initial design. For example I need to tackle identifying locations within the plane in addition to the plane itself.

Slowly limping towards open-sourcing Moseycode. These are the projects that comprise the Android client library. There are changes to come… “shared” will probably change to “symbology” and a new project will be added to contain my really shoddy pose estimation code (I’m almost too embarrassed to open-source it).

Maven really makes this sort of work much easier.

Good progress on getting Moseycode’s realtime 3D rendering running under Android on my G1. The stuttering framerate is the result of incessant garbage collection triggered by the camera API. It’s frustrating that the camera API forces me to step down to 176x144 too - it’s not quite good enough for steady detection using my current algorithms.

YUV420 to RGB565 conversion in Android

Moseycode is an augmented reality barcode system that combines live camera data and 3D graphics. Due to limitations in the way that live camera data is rendered by the Android framework, if you want to overlay 3D graphics onto live camera data your only option is to convert the camera data into an OpenGL texture.

The method below does just that in Java – and it’s not impossibly slow either; it’s been optimized quite a lot. At the fallback camera preview dimensions of 176x144 it takes approximately 90ms to perform the conversion on my G1.

/**
 * Converts semi-planar YUV420 as generated for camera preview into RGB565
 * format for use as an OpenGL ES texture. It assumes that both the input
 * and output data are contiguous and start at zero. The conversion is
 * tackled two pixels at a time for greater speed, so the width must be even.
 * 
 * @param yuvs the array of YUV420 semi-planar data
 * @param width the number of pixels horizontally
 * @param height the number of pixels vertically
 * @param rgbs an array into which the RGB565 data will be written
 *             (two bytes per pixel)
 */
private void toRGB565(byte[] yuvs, int width, int height, byte[] rgbs) {
    //the end of the luminance data
    final int lumEnd = width * height;
    //points to the next luminance value pair
    int lumPtr = 0;
    //points to the next chrominance value pair
    int chrPtr = lumEnd;
    //points to the next pair of output bytes (one RGB565 value)
    int outPtr = 0;
    //the end of the current luminance scanline
    int lineEnd = width;

    while (true) {

        //skip back to the start of the chrominance values when necessary
        //(each row of chrominance pairs is shared by two luminance scanlines)
        if (lumPtr == lineEnd) {
            if (lumPtr == lumEnd) break; //we've reached the end
            //division here is a bit expensive, but it's only done once per scanline
            chrPtr = lumEnd + ((lumPtr >> 1) / width) * width;
            lineEnd += width;
        }

        //read the luminance and chrominance values
        final int Y1 = yuvs[lumPtr++] & 0xff; 
        final int Y2 = yuvs[lumPtr++] & 0xff; 
        final int Cr = (yuvs[chrPtr++] & 0xff) - 128; 
        final int Cb = (yuvs[chrPtr++] & 0xff) - 128;
        int R, G, B;

        //generate first RGB components
        B = Y1 + ((454 * Cb) >> 8);
        if(B < 0) B = 0; else if(B > 255) B = 255; 
        G = Y1 - ((88 * Cb + 183 * Cr) >> 8); 
        if(G < 0) G = 0; else if(G > 255) G = 255; 
        R = Y1 + ((359 * Cr) >> 8); 
        if(R < 0) R = 0; else if(R > 255) R = 255; 
        //NOTE: this assumes little-endian encoding
        //low byte: bits 2-4 of green above the top five bits of blue
        rgbs[outPtr++]  = (byte) (((G & 0x1c) << 3) | (B >> 3));
        rgbs[outPtr++]  = (byte) ((R & 0xf8) | (G >> 5));

        //generate second RGB components
        B = Y2 + ((454 * Cb) >> 8);
        if(B < 0) B = 0; else if(B > 255) B = 255; 
        G = Y2 - ((88 * Cb + 183 * Cr) >> 8); 
        if(G < 0) G = 0; else if(G > 255) G = 255; 
        R = Y2 + ((359 * Cr) >> 8); 
        if(R < 0) R = 0; else if(R > 255) R = 255; 
        //NOTE: this assumes little-endian encoding
        //low byte: bits 2-4 of green above the top five bits of blue
        rgbs[outPtr++]  = (byte) (((G & 0x1c) << 3) | (B >> 3));
        rgbs[outPtr++]  = (byte) ((R & 0xf8) | (G >> 5));
    }
}
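
For checking the optimized method against something easier to read, here is a minimal sketch of a slow, per-pixel reference conversion. It is not part of Moseycode: the class name Yuv420Reference and the method pixelToRgb565 are invented for illustration. It assumes the same layout the method above reads (a full-resolution luminance plane followed by interleaved Cr/Cb pairs, one pair per 2x2 block of pixels) and uses the same 8-bit fixed-point coefficients (359/256 ≈ 1.402 and so on).

```java
final class Yuv420Reference {

    /** Clamps a value to the 0..255 range of an 8-bit colour channel. */
    public static int clamp(int v) {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }

    /**
     * Converts the pixel at (x, y) of a semi-planar YUV420 frame to a
     * 16-bit RGB565 value. Hypothetical helper, for illustration only.
     */
    public static int pixelToRgb565(byte[] yuvs, int width, int height, int x, int y) {
        // one luminance byte per pixel
        final int lum = yuvs[y * width + x] & 0xff;
        // each 2x2 block of pixels shares one Cr/Cb pair (Cr first, as above)
        final int chr = width * height + (y >> 1) * width + (x & ~1);
        final int cr = (yuvs[chr] & 0xff) - 128;
        final int cb = (yuvs[chr + 1] & 0xff) - 128;

        // same fixed-point arithmetic as the optimized method
        final int r = clamp(lum + ((359 * cr) >> 8));
        final int g = clamp(lum - ((88 * cb + 183 * cr) >> 8));
        final int b = clamp(lum + ((454 * cb) >> 8));

        // 5 bits of red, 6 bits of green, 5 bits of blue
        return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3);
    }

    public static void main(String[] args) {
        // a 2x2 frame: four luminance bytes followed by one Cr/Cb pair
        final byte[] frame = new byte[2 * 2 * 3 / 2];
        java.util.Arrays.fill(frame, 0, 4, (byte) 255); // full luminance
        frame[4] = (byte) 128;                          // neutral Cr
        frame[5] = (byte) 128;                          // neutral Cb

        final int white = pixelToRgb565(frame, 2, 2, 1, 1);
        System.out.println(Integer.toHexString(white)); // prints "ffff"

        // little-endian output order: low byte first, then high byte
        final byte low = (byte) (white & 0xff);
        final byte high = (byte) (white >> 8);
        System.out.println(low + " " + high);
    }
}
```

Splitting the 16-bit value into a low and a high byte, as at the end of main, gives the same two bytes per pixel that the optimized method writes, so the two can be compared byte for byte over a whole frame.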