Android 4.4 & rendering pipeline improvements
#android   #androiddev   #projectbutter

Android 4.4 adds many new developer APIs, such as printing and animated transitions, that are extensively documented in the platform highlights and release notes. The Android team has also made numerous under-the-hood optimizations that applications automatically benefit from. Since I'm passionate about graphics, I would like to share with you some of the graphics-related optimizations brought to you by the Android framework team in 4.4.

Shared assets texture
Most Android applications load a similar set of assets, used by the framework to render Holo widgets. These assets, for instance a button's pressed state, have always been preloaded and shared between processes through zygote. This unfortunately didn't apply to the textures generated from those assets. Until 4.4, every app drawing a button would create a GPU copy of the button asset as a texture.

Starting with Android 4.4, the system generates at startup a single, shared texture containing most Holo assets. This has two major benefits.
First, each process will use a little bit less memory. This is particularly important to help Android run on devices with "only" 512 MB of RAM for instance.
The second benefit is improved sorting, batching and merging of drawing operations. Batching and merging were introduced in Android 4.3 and just got even better. For instance, text fields and buttons can now all be drawn together in a single draw call. This significantly reduces the number of state changes and calls to the OpenGL drivers.

To generate this shared texture, the system will compute an atlas of all the assets every time you get a new version of the framework. The atlas is computed based on the set of assets preloaded in zygote as well as your device's resolution and GPU characteristics. A Nexus 5 will not use the same atlas as a Nexus 7. The atlas generation process iterates through different algorithms to find one that works best on your device. I have attached an example of what the final shared texture looks like to this post (it's a bit old so the assets it contains do not match the new Android 4.4 style.)
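To give a rough idea of what computing an atlas involves, here is a minimal Python sketch. It is not the framework's code: it uses a simple "shelf" packer, and trying several candidate widths stands in for the different algorithms the real generator iterates through. The asset names and sizes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    name: str
    x: int
    y: int
    width: int
    height: int


def pack_shelves(assets, atlas_width):
    """Pack (name, width, height) assets into rows ("shelves") of a fixed-width atlas.

    Returns (placements, atlas_height). Tallest assets are placed first.
    """
    placements = []
    x = y = shelf_height = 0
    for name, w, h in sorted(assets, key=lambda a: a[2], reverse=True):
        if w > atlas_width:
            raise ValueError(f"{name} is wider than the atlas")
        if x + w > atlas_width:
            # The current shelf is full: start a new one below it.
            y += shelf_height
            x = shelf_height = 0
        placements.append(Placement(name, x, y, w, h))
        x += w
        shelf_height = max(shelf_height, h)
    return placements, y + shelf_height


def best_atlas(assets, candidate_widths):
    """Try each candidate width and keep the layout with the smallest area."""
    best = None
    for width in candidate_widths:
        placements, height = pack_shelves(assets, width)
        if best is None or width * height < best[0] * best[1]:
            best = (width, height, placements)
    return best


# Hypothetical assets: (name, width, height) in pixels.
HOLO_ASSETS = [
    ("btn_default", 64, 48),
    ("btn_pressed", 64, 48),
    ("textfield", 96, 32),
    ("checkbox", 32, 32),
]
```

Calling `best_atlas(HOLO_ASSETS, [128, 256])` compares a 128-pixel-wide layout against a 256-pixel-wide one and keeps whichever wastes less space. The real generator also takes the device's resolution and GPU characteristics into account, which this sketch ignores.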

Here is an animated example of batching and merging in action with the new shared texture: https://plus.google.com/109538161516040592207/posts/WHwzJinyFY6

Better merging of drawing operations
The merging code was improved to allow more operations to be merged together. 9-patches and scaled bitmaps in particular will be merged more frequently.
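Merging pays off when neighboring operations share GPU state such as the bound texture. The Python sketch below is only an illustration of that idea, not the renderer's actual code: it merges consecutive operations that use the same texture into one batch, and the texture names and quads are hypothetical. The real pipeline also reorders operations, which this sketch does not attempt.

```python
from itertools import groupby


def merge_draw_ops(ops):
    """Merge consecutive (texture, quad) ops that share a texture.

    Returns a list of (texture, [quads]); each entry stands for one draw call.
    """
    return [
        (texture, [quad for _, quad in group])
        for texture, group in groupby(ops, key=lambda op: op[0])
    ]


# Each widget has its own texture: every change of texture breaks the batch.
separate_textures = [
    ("button_tex", (0, 0, 100, 50)),
    ("textfield_tex", (0, 60, 100, 110)),
    ("button_tex", (0, 120, 100, 170)),
]

# All widgets sample from one shared atlas: the ops collapse into one batch.
shared_atlas = [("holo_atlas", quad) for _, quad in separate_textures]
```

With separate textures the three operations need three draw calls; with a single shared texture `merge_draw_ops(shared_atlas)` returns one batch containing all three quads.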

Asynchronous texture uploads
On devices that support OpenGL ES 3.0, Android's rendering pipeline will use Pixel Buffer Objects (or PBOs) to update font caches asynchronously. Such updates are usually performed at the very beginning of a new frame, when the GPU is typically idle. This improves parallelism and saves up to a couple of milliseconds per frame in applications that put a lot of pressure on the font cache (emojis, CJK locales, large fonts, rotated or scaled glyphs, etc.)

Improved GPU state management
Through various optimizations, including state caching and data sharing, the rendering pipeline is now a little better at managing the GPU state. For instance, the font renderer and hardware layers now all share a single Vertex Buffer Object (or VBO) to store indices. These optimizations also cut down on the number of calls to OpenGL drivers.
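State caching in general works by remembering what is currently bound and skipping the driver call when nothing would change. Here is a minimal Python sketch of that pattern; the class and its `driver_bind` callback are hypothetical stand-ins for real OpenGL calls, not the framework's implementation.

```python
class CachedGpuState:
    """Remember the last handle bound to each target and skip redundant binds."""

    def __init__(self, driver_bind):
        # driver_bind(target, handle) stands in for a real driver call
        # such as binding a texture or a buffer.
        self._driver_bind = driver_bind
        self._bound = {}

    def bind(self, target, handle):
        """Bind handle to target; return True only if the driver was called."""
        if target in self._bound and self._bound[target] == handle:
            return False  # Already bound: no driver call needed.
        self._driver_bind(target, handle)
        self._bound[target] = handle
        return True


def count_driver_calls(requests):
    """Replay (target, handle) bind requests and count the calls that reach the driver."""
    calls = []
    state = CachedGpuState(lambda target, handle: calls.append((target, handle)))
    for target, handle in requests:
        state.bind(target, handle)
    return len(calls)
```

For example, binding the same index buffer three times in a row and then a texture once results in two driver calls instead of four.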

Automatic overdraw avoidance
The rendering pipeline is now able to detect simple cases of overdraw and fix them on the fly. The best example is a window background entirely covered by other opaque elements. The extraneous window background will not be rendered at all, thus saving CPU processing time and GPU bandwidth. This optimization is disabled when the overdraw debugging tools are enabled to make sure you will not miss those extra views and/or drawables. This does not mean you shouldn't fix overdraw in your application!
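The window background case can be pictured with a small Python sketch. This is an assumption-laden illustration rather than the pipeline's actual logic: it only drops an operation when one single opaque operation drawn later covers it completely, and the operation names and sizes are invented.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other):
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def drop_covered_ops(ops):
    """Return the names of ops worth drawing.

    ops is a back-to-front list of (name, rect, opaque). An op is dropped
    when a single opaque op drawn after it covers it entirely.
    """
    kept = []
    for i, (name, rect, _) in enumerate(ops):
        covered = any(
            later_opaque and later_rect.contains(rect)
            for _, later_rect, later_opaque in ops[i + 1:]
        )
        if not covered:
            kept.append(name)
    return kept


SCREEN = Rect(0, 0, 1080, 1920)
frame = [
    ("window_background", SCREEN, True),
    ("content_background", SCREEN, True),
    ("button", Rect(100, 100, 300, 200), True),
]
```

Here `drop_covered_ops(frame)` skips `window_background` because the opaque `content_background` hides it completely. If the content background were translucent, both would have to be drawn.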

Software v-sync
+Naseer Ahmed justly reminded me of software v-sync. This new feature makes it possible to offset v-sync events at the framework level from the actual hardware v-sync events. The goal is to reduce latency by getting work done ahead of the hardware. See https://android.googlesource.com/platform/frameworks/native/+/faf77cce9d9ec0238d6999b3bd0d40c71ff403c5 for the gory details.
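The arithmetic behind an offset v-sync event can be sketched in a few lines of Python. This is only my simplified model, assuming a perfectly regular hardware v-sync and a fixed phase offset; the function and parameter names are hypothetical and the linked change is the authority on what the framework really does.

```python
def next_software_vsync(now_ns, hw_vsync_ns, period_ns, phase_offset_ns):
    """Return the first software v-sync time strictly after now_ns.

    Software events are modeled as firing at
    hw_vsync_ns + phase_offset_ns + k * period_ns for integer k.
    All values are in nanoseconds.
    """
    if period_ns <= 0:
        raise ValueError("period_ns must be positive")
    base = hw_vsync_ns + phase_offset_ns
    # Number of whole periods elapsed since base, plus one to move past now_ns.
    k = (now_ns - base) // period_ns + 1
    return base + k * period_ns
```

With a hardware v-sync observed at time 0, a period of 10 and an offset of 3, the events land at 3, 13, 23 and so on, so a caller asking at time 12 is told to wake up at 13.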

OpenGL drivers pre-loading
Android 4.4 will pre-load OpenGL drivers in the zygote process. This feature may not work on all devices and depends on how the OpenGL drivers are implemented. The goal of this optimization is to reduce memory usage. Any application that uses OpenGL – and since applications are hardware-accelerated it applies to any application with a GUI – will benefit from this optimization.