
Deep Learning in a Single File for Smart Devices

Deep learning (DL) systems are complex and often depend on a number of libraries. Porting a DL library to different platforms can be painful, especially for smart devices. One simple solution to this problem is to provide a light interface to the library, complete with all required code in a single file with minimal dependencies. In this document, we explain how to amalgamate all necessary code into a single file, and demonstrate the approach with an example in which we run image recognition on a mobile device.

Amalgamation: Making the Whole System a Single File

We came to the idea of amalgamation following the example of SQLite, which packs all the code needed to run a simple database into a single source file. All that's necessary to create the library is to compile that single file, which greatly simplifies porting to various platforms.

Thanks to Jack Deng, MXNet provides an amalgamation script that compiles all code needed for prediction based on trained DL models into a single .cc file, containing approximately 30K lines of code. This code depends only on the BLAS library. We've also created an even more minimal version with the BLAS dependency removed; that single file can even be compiled to JavaScript with Emscripten.

The compiled library can be used from any other programming language. The .h file contains a light prediction API. Porting to another language with a C foreign function interface requires little effort.
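The pattern behind such a prediction API is a flat `extern "C"` interface: opaque handles plus plain C functions, so any language with a C FFI can bind it without seeing C++ types. Here is a minimal sketch of that style; the function names and the stub "model" are illustrative, not MXNet's actual API.

```cpp
// Sketch of a flat C-style prediction API, the pattern used by
// amalgamated libraries. Names and behavior are illustrative only.
#include <algorithm>
#include <cstddef>
#include <vector>

extern "C" {

// Opaque handle hides the C++ implementation from callers.
typedef void* PredictorHandle;

// Create a predictor. Here it is a stub holding an input buffer.
int PredCreate(std::size_t input_size, PredictorHandle* out) {
  *out = new std::vector<float>(input_size);
  return 0;  // 0 = success, the usual C convention
}

// Copy caller-provided input into the predictor.
int PredSetInput(PredictorHandle h, const float* data, std::size_t size) {
  auto* buf = static_cast<std::vector<float>*>(h);
  if (size != buf->size()) return -1;
  std::copy(data, data + size, buf->begin());
  return 0;
}

// "Run" the model; this stub sums the input as a stand-in for inference.
int PredForward(PredictorHandle h, float* out_sum) {
  auto* buf = static_cast<std::vector<float>*>(h);
  float s = 0.0f;
  for (float v : *buf) s += v;
  *out_sum = s;
  return 0;
}

// Release the handle.
int PredFree(PredictorHandle h) {
  delete static_cast<std::vector<float>*>(h);
  return 0;
}

}  // extern "C"
```

A Java, Swift, or JavaScript binding only needs to declare these few symbols; no C++ objects ever cross the language boundary.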

For examples, see the following on GitHub:

If you plan to amalgamate your system, there are a few guidelines you ought to observe when building the project:

  • Minimize dependence on other libraries.
  • Use namespaces to encapsulate types and operators.
  • Avoid statements such as using namespace xyz at global scope.
  • Avoid cyclic include dependencies.
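The namespace guidelines above look like this in practice (toy code, not from MXNet):

```cpp
// Good: types and operators live inside the project's namespace,
// so they cannot collide with other code in the amalgamated file.
namespace mynet {

struct Tensor {
  int rows, cols;
  int size() const { return rows * cols; }
};

// Operators are also defined inside the namespace.
inline bool operator==(const Tensor& a, const Tensor& b) {
  return a.rows == b.rows && a.cols == b.cols;
}

}  // namespace mynet

// Avoid: `using namespace mynet;` at global scope here would dump every
// name into the global scope of the amalgamated file. Instead, qualify
// names explicitly (or use a narrow alias inside a function body).
int tensor_elems(int r, int c) {
  mynet::Tensor t{r, c};  // explicit qualification
  return t.size();
}
```

When thousands of headers are concatenated into one translation unit, a single global `using namespace` can silently change which overload is picked elsewhere in the file, which is why the guideline matters more for amalgamation than for ordinary builds.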

Image Recognition Demo on Mobile Devices

With amalgamation, deploying the system on smart devices (such as Android or iOS) is simple. But there are two additional considerations:

  • The model should be small enough to fit into the device's memory.
  • The model shouldn't be too expensive to run given the relatively low computational power of these devices.
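A quick back-of-the-envelope check for the first constraint: with 32-bit float weights, a model needs roughly 4 bytes per parameter, before counting any runtime activation or workspace buffers. A small helper to estimate this (the parameter counts used below are made up for illustration, not measured from the models in this document):

```cpp
#include <cstddef>

// Rough estimate of the memory a model's weights occupy:
// one 32-bit float (4 bytes) per parameter. Activations and
// workspace buffers add to this at runtime.
std::size_t model_bytes(std::size_t num_params) {
  return num_params * sizeof(float);
}

std::size_t model_mib(std::size_t num_params) {
  return model_bytes(num_params) / (1024 * 1024);
}
```

By this estimate a hypothetical 10M-parameter network already takes about 38 MiB for weights alone, which is why trimming the network matters so much on a phone.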

Let's use image recognition as an example. We start with the state-of-the-art Inception model and train it on the ImageNet dataset, using multiple servers with GTX 980 cards. The resulting model fits into memory, but it's too expensive to run. Removing some layers makes it cheaper, but then the results are poor.

Finally, thanks to Leliana, we have an Android example that demonstrates how to run the model on-device: https://github.com/Leliana/WhatsThis.

By using amalgamation, we can easily port the prediction library to mobile devices, with nearly no dependencies. After compiling the library for smart platforms, all that remains is to call the C API from the target language (Java/Swift).

Besides this pre-trained Inception-BatchNorm network, we've provided two additional pre-trained models.

We tested our model on a Nexus 5:

| Model | Top-1 Validation on ILSVRC2012 | Time | App Size | Runtime Temp Memory Req |
| --- | --- | --- | --- | --- |
| FastPoorNet | around 52%, similar to 2011 winner | 1s | <10MB | <5MB |
| Sub InceptionBN | around 64%, similar to 2013 winner | 2.7s | <40MB | <10MB |
| InceptionBN | around 70% | 4s-5s | <60MB | 10MB |

These models are for demonstration only. They aren't fine-tuned for mobile devices, and there is definitely room for improvement.
We believe that making a lightweight, portable, and fast deep learning library is fun and interesting, and hope you enjoy using the library.

Source Code

https://github.com/Leliana/WhatsThis

Demo APK Download
