It was my pleasure speaking at the first LTS event last week about the complexity and issues related to cross device mobile development from the perspective of a Java/JVM developer.
When we created Codename One with the intent of delivering Java’s Write Once Run Anywhere (WORA) solution to all mobile devices we were keenly aware of many of the issues involved since we came from an established background of device development and porting. We too, however, were caught by surprise at some of the difficulties and complexities we ran into when trying to adapt such solutions for all of today's mobile devices.
Recap

In my talk I covered four major topics, moving from the easy stuff to the more complex issues: device density, device fragmentation, scene graphs/adapting to low level graphics, and JIT/AOT/GC issues.
I naturally simplified all of the above considerably in order to squeeze the general ideas into a 90 minute talk, but the concepts should give you some perspective on the difficulties involved in creating such a solution, as well as some perspective for building mobile applications (using cross platform tools or otherwise).
Device Fragmentation

My opinion is that device fragmentation is usually framed by developers as a comparison of Android to iOS, which is both unfair and no longer accurate. In the past iOS development was MUCH easier since it had just one or two resolutions, but these keep piling up. While the number of resolutions on Android is a few orders of magnitude larger than what iOS developers have to deal with, Android was constructed to deal with this reality from day one and iOS was not. Today there are really 3 major problems at the root of device fragmentation:
Bugs - there are serious device specific bugs in Android, mostly when dealing with specific hardware capabilities such as the camera. However, there are also odd VM behaviors that only appear on specific devices; I assume these relate to the specific Android version a particular vendor chose to use.
Lack of OS updates - as a Galaxy Nexus owner this stings. Google just announced that the device I bought from them directly 18 months ago will no longer be supported or updated. This is painful: we still have a VERY large Android 2.x install base years after 4.x came out. This in itself isn't as big a deal as the fact that OS bugs (of which there are many) just don't get fixed. Worse, new issues get introduced in newer OS versions and you can no longer use the backward compatible API effectively, e.g.: https://code.google.com/p/android/issues/detail?id=58385 (the workaround is obviously to adapt via version detection, which is hardly elegant or simple).
Vendors - Google is currently trying to keep hold of Android by moving its new features into the market app (Google Play) and offering services through apps. This is actually a great way to add features to older devices that are no longer updated (e.g. my Galaxy Nexus); however, this code is proprietary, hence it doesn't make it into Amazon's version of Android. Samsung is also trying to fork the OS more and more, and this approach is leading to a bad place.
Resolutions

Dealing with multiple resolutions is often perceived as the big challenge of Android and of cross platform development. It's really not that hard, since strategies for multi-resolution programming have been around for years. However, most multi-resolution development centered around desktop developers, who had relatively low variance in their target platforms. It's really easy to illustrate the problem with the iPhone 4, which has exactly double the density of the iPhone 3 (four times the pixels per square inch of screen!). Density is measured in PPI or DPI (points/dots per inch): the iPhone 4 and 5 have 326ppi, whereas modern Android devices can reach past 500! These are ridiculously high numbers, and density variation is far harder to deal with than resolution variation.
Normally when we build a touch oriented application we try to size elements for fingers: they should be big enough to touch and spaced far enough apart so you won't touch the wrong thing. However, if the pixels are too dense you will end up with a smaller physical surface and a cramped UI; if you err on the side of caution and size everything generously, less of your data will fit on the screen. Since the variation is so large, a developer needs to provide image alternatives for every density to enable the UI to adapt properly. This is the strategy taken by both Android and iOS; however, iOS has the advantage of supporting only two densities, Retina and non-Retina, since it has such strong control over its devices/market.
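To make the density arithmetic concrete, here is a small sketch converting a physical touch-target size to pixels at different densities. The helper name and the ~9mm target size are illustrative assumptions, not any platform's actual API:

```java
// Hypothetical sketch: converting a physical touch-target size to pixels
// using the device's reported PPI. Names here are illustrative, not a real API.
public class TouchTarget {
    // A roughly fingertip-sized touch target, in millimeters (an assumption)
    static final double TARGET_MM = 9.0;
    static final double MM_PER_INCH = 25.4;

    // Convert a size in millimeters to pixels for a given pixel density
    static int mmToPixels(double mm, int ppi) {
        return (int) Math.round(mm / MM_PER_INCH * ppi);
    }

    public static void main(String[] args) {
        // The same physical button needs very different pixel sizes:
        System.out.println(mmToPixels(TARGET_MM, 163)); // pre-Retina iPhone density
        System.out.println(mmToPixels(TARGET_MM, 326)); // iPhone 4/5 Retina
        System.out.println(mmToPixels(TARGET_MM, 500)); // high-density Android
    }
}
```

A button hardcoded to the pre-Retina pixel size would be roughly a third of the intended physical size on a 500ppi device, which is exactly the cramped-UI problem described above.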
We also have 9-patch (or 9-piece) images, which allow us to construct arbitrarily scalable borders and backgrounds without degradation, and layout managers to arrange elements as we need them.
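The 9-patch idea boils down to pure geometry: the four corners keep their original size, the edges stretch along one axis, and the center stretches along both. A minimal sketch of that region math (illustrative logic, not Android's actual NinePatch API):

```java
// Minimal sketch of 9-patch style scaling: corners stay fixed, edges stretch
// along one axis, the center stretches along both. Illustrative only.
public class NinePatch {
    // Returns {x, y, width, height} for each of the 9 destination regions,
    // given the target size and the fixed corner inset of the source image.
    static int[][] regions(int width, int height, int inset) {
        int cw = width - 2 * inset;   // stretched center width
        int ch = height - 2 * inset;  // stretched center height
        int[] xs = {0, inset, width - inset};
        int[] ys = {0, inset, height - inset};
        int[] ws = {inset, cw, inset};
        int[] hs = {inset, ch, inset};
        int[][] out = new int[9][];
        for (int row = 0; row < 3; row++)
            for (int col = 0; col < 3; col++)
                out[row * 3 + col] = new int[]{xs[col], ys[row], ws[col], hs[row]};
        return out;
    }
}
```

Because the corners are copied verbatim and only flat areas stretch, the image stays crisp at any target size.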
This whole scenario would seem like an ideal case for vector graphics; however, they aren't supported natively by Android or iOS and are only available via their WebKit implementations. The main reasoning behind this is battery life and performance: raster graphics are simple and very fast. They don't require CPU power to draw and thus don't waste battery life, which is a precious resource on mobile devices.
Scene Graph

There are two approaches a platform vendor can pick when building the graphics layer of a mobile OS: scene graph and immediate mode ("immediate mode" is used in many different and contradictory contexts; there is no clearly defined term for the opposite of a scene graph). Immediate mode is simple and should be very familiar to anyone who wrote graphics in Swing/AWT/Java2D, Win32 etc. Effectively the OS invokes your paint method instructing you to draw, you draw everything you need every time, and rinse and repeat. If you need to refresh the UI you invoke repaint to draw again. This has the huge advantage of simplicity over other solutions; it's really easy to get started and just draw arbitrary data.
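The immediate-mode flow can be sketched with Java2D (one of the APIs mentioned above), here drawing into an off-screen image so the example runs anywhere; nothing here is Codename One specific:

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Immediate mode in Java2D: every frame you repaint everything from
// scratch using simple drawing primitives.
public class ImmediateMode {
    static BufferedImage paintFrame(int w, int h) {
        BufferedImage frame = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = frame.createGraphics();
        g.setColor(Color.BLACK);   // clear the background...
        g.fillRect(0, 0, w, h);
        g.setColor(Color.RED);     // ...then redraw the whole scene, every time
        g.fillRect(10, 10, 40, 20);
        g.dispose();
        return frame;
    }
}
```

In a real app the OS would call this for you via a paint callback, and invoking repaint simply schedules another full redraw.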
A scene graph is arguably more advanced: instead of drawing elements you construct a tree (graph) describing the scene, e.g. to draw a rectangle you just add a rectangle object to the tree of objects. To rotate it you define a rotation transform and place it above the rectangle in the hierarchy. This is more complex, and it gets both harder and simpler when animations come into play. With animations, in immediate mode we developers bear all the responsibility, and we tend to be lazy and just move things around. A scene graph requires a bit more work, since you need to define a destination mutation for the tree, keyframes, and the way in which the mutation will take place. This is complex, but it ends up creating very refined animations from pre-existing pieces, which gives scene graph UIs a very fluid feel.
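As a toy illustration (not JavaFX or Core Animation), a scene graph is just a tree of nodes that a renderer traverses, accumulating each parent's transform on the way down; this sketch uses only translations to keep it short:

```java
import java.util.ArrayList;
import java.util.List;

// Toy scene graph sketch: rather than drawing directly, we build a tree
// of nodes and let a renderer walk it, applying parent transforms to children.
public class SceneGraph {
    static class Node {
        int dx, dy;                   // this node's translation transform
        String shape;                 // null for pure group/transform nodes
        List<Node> children = new ArrayList<>();
        Node(int dx, int dy, String shape) { this.dx = dx; this.dy = dy; this.shape = shape; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Traverse the tree, accumulating transforms down each branch
    static void render(Node n, int x, int y, List<String> out) {
        int ax = x + n.dx, ay = y + n.dy;
        if (n.shape != null) out.add(n.shape + "@" + ax + "," + ay);
        for (Node c : n.children) render(c, ax, ay, out);
    }

    static List<String> render(Node root) {
        List<String> out = new ArrayList<>();
        render(root, 0, 0, out);
        return out;
    }
}
```

Animating the rectangle means mutating its node (or a transform node above it) between frames; the renderer, owned by the platform, takes care of redrawing, which is what lets the vendor optimize the traversal.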
The real benefit of scene graphs is that they are optimized by the vendor and can potentially use the GPU very effectively. Unfortunately, because they provide a relatively high level of abstraction, it's really hard to tune your application properly with a scene graph and to detect when you are doing things that aren't quite optimal.
The problem with a scene graph is that it is inherently hard to port. While an immediate mode API will always have a set of simple APIs matching things such as drawString, drawImage etc., a scene graph will have a far more complex hierarchy, not to mention complex rules and logic around animations. So you can create a portable scene graph implementation (Flash and JavaFX are such examples), but you will find it very hard to port it onto a native scene graph API; e.g. JavaFX is written on top of OpenGL and not on top of Core Animation, which is iOS's scene graph API.
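To make the porting asymmetry concrete, here is a minimal sketch of what an immediate-mode drawing contract looks like; the interface and class names are illustrative, not Codename One's actual API:

```java
// Sketch of why an immediate-mode API is easy to port: the whole drawing
// contract is a handful of primitives that every platform can implement.
// Names are illustrative, not Codename One's actual API.
public class PortableGraphics {
    interface Graphics {
        void drawString(String s, int x, int y);
        void drawImage(Object img, int x, int y);
        void fillRect(int x, int y, int w, int h);
    }

    // A platform port maps each primitive to the matching native call;
    // this stub implementation just records the calls instead.
    static class LoggingGraphics implements Graphics {
        final StringBuilder log = new StringBuilder();
        public void drawString(String s, int x, int y) { log.append("string;"); }
        public void drawImage(Object img, int x, int y) { log.append("image;"); }
        public void fillRect(int x, int y, int w, int h) { log.append("rect;"); }
    }
}
```

Porting to a new platform means implementing this small interface once; porting a scene graph would mean reproducing an entire node hierarchy plus its animation semantics on every target.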
The real challenge here is with platforms such as iOS that are based on a scene graph; we can't port into that API since it's too platform specific. Luckily all platforms have gaming APIs, which are always very fast and efficient. Unfortunately these often don't include APIs that are important for applications but not essential for many games, such as text input or even string drawing. iOS is pretty good in that regard since it allows mixing its OpenGL implementation with its higher level graphics API, resulting in both great performance and access to all of the native APIs. Windows Phone 8 doesn't allow that and effectively forces you into a choice between its highly limiting scene graph API and using DirectX from C++ without fonts, text input or interaction with native widgets. This is obviously pretty limiting. Our current approach maps our immediate mode API on top of the Windows Phone 8 scene graph API, which is obviously very limited and problematic.

VMs, AOT & GC

Mobile JITs (just in time compilers) aren't optimized for performance; they are optimized for memory conservation, battery conservation and security. This is very evident in J2ME's pre-verification process, which saved both battery and size by moving a piece of the VM verifier offline. The same is true of Dalvik, which is a pretty great VM; it isn't the fastest VM for mobile, but it isn't trying to be, and we should not judge mobile performance based on our experience with desktop development.
The only way of getting Java code onto iOS is AOT (ahead of time) compilation, which allows us to package the fully compiled application as a single binary.
Since Objective-C isn't the fastest language in the world, Java is actually pretty performant in this case and compiles rather well using several open source solutions available today. The main remaining issue is the HUGE size of the JVM, and thus the overhead of even a stripped down Java application binary. The solution is to ruthlessly prune the application during AOT compilation to remove all unnecessary overhead and bring the size down; this also means tools such as reflection just don't work, since they effectively require packaging the whole VM without removing the symbol information! In this sense the Java API is often a much bigger hindrance to portability/size/performance than the Java language.
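To see why reflection defeats this pruning, compare a direct call, which an AOT compiler can trace statically, with a reflective lookup that is only a string until runtime (a small illustrative example, not tied to any particular AOT toolchain):

```java
import java.lang.reflect.Method;

// Why reflection defeats AOT pruning: the direct call below is statically
// reachable, so the compiler can keep it and discard everything unused.
// The reflective lookup is just a string at compile time, forcing the
// compiler to retain classes, methods and symbol tables "just in case".
public class ReflectionCost {
    public static String direct() { return "ok"; }

    public static void main(String[] args) throws Exception {
        String a = direct();                            // traceable at compile time
        Method m = ReflectionCost.class.getMethod("direct");
        String b = (String) m.invoke(null);             // only resolvable at runtime
        System.out.println(a + b);
    }
}
```

An aggressive pruner would either break the reflective path or have to ship the full class metadata, which is exactly the size overhead described above.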
GC is often maligned as a major performance cost; it is not free and it is not deterministic. However, it isn't very expensive either, and because of its nature GC cycles can effectively be tuned to points in the application where the CPU is idle. If you don't have CPU idle points you are doing something wrong (even in a high FPS game, CPU utilization should max out the GPU well before the CPU itself is maxed out). The big cost with GC is deallocating native resources which hold "peer" references. Since these resources need an additional GC cycle for the finalization process, which is problematic to begin with, they can free a lot of memory but can also trigger complex types of fragmentation and very slow cleanups. When working with resources that have a native element to them, it is highly recommended to hold on to them when possible to avoid a major GC penalty.
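That last recommendation can be sketched as a simple cache that reuses objects wrapping native peers instead of letting them churn through finalization; the NativeImage type here is a hypothetical placeholder for any resource with a native peer:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of holding on to native-peer resources: reuse the wrapper instead
// of re-allocating, which would leave the old peer waiting for a slow
// finalizer-driven cleanup. NativeImage is a hypothetical placeholder.
public class PeerCache {
    static class NativeImage {
        final String name;
        NativeImage(String name) { this.name = name; } // imagine a native allocation here
    }

    private final Map<String, NativeImage> cache = new HashMap<>();

    // Return the cached peer when possible rather than creating a new one
    NativeImage get(String name) {
        return cache.computeIfAbsent(name, NativeImage::new);
    }
}
```

Repeated lookups return the same wrapper, so the native peer is allocated once and never queued for finalization mid-session.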
Final Thoughts

I got some feedback both during and after the talk, most of which revolved around Codename One and performance. Codename One was initially designed to run on far weaker CPUs than the ones we have today, with heaps as low as 2MB. When running on a modern phone, e.g. an Android device that considers a 512MB heap low end, this might seem amusing, but it's a huge advantage since Codename One's efficient architecture really takes advantage of the additional horsepower. The main benefit is very low memory utilization, which in turn allows for more caching; overall, caching is pretty much the biggest benefit you can give to performance. You see this from CPU level caches all the way up to big data sharding strategies! Keeping your data as close as possible is a sure way to make your code fly.
The other thing we do well is just pushing out raw pixels as fast as possible. Codename One's graphics layer is actually very simple and does one thing well: pushing pixels fast. This mirrors something both Google & Apple figured out when building their platforms: they gave raster images a front and center place within their architectures because that is just the fastest, most battery efficient way to push pixels to the screen.