The Push for Machine Learning on Edge Devices, Including Your Smartphone

Machine learning (ML) execution and inference are being pushed into so-called “edge” devices, which include smartphones. Deloitte Global estimates that 300 million smartphones sold in 2017 have on-board ML.[i] The migration of intelligence to the edge is necessary to process data close to its source and to augment the cloud. Reasons for moving computing to the edge include latency, privacy, and connectivity. Pushing ML to edge devices is not the same as training for machine learning; models are trained on high-performance compute platforms, and the resulting model is then pushed to an edge device. Most people understand that an ML model can exist and execute on an intelligent edge device, but inference is a different, though related, concept. Inference is the step in which a trained neural network infers, or makes reasonable assumptions about, new incoming data based on what it learned during training.
Machine Learning vs. Inference

Deep Learning vs. Inference. Inference uses the model created via training with deep learning. (Credit: Nvidia)
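To make the split between training and inference concrete, here is a minimal sketch in plain Python. It is an illustration only: a two-parameter linear model stands in for a neural network, and the names `train`, `infer`, and `training_data` are made up for this example rather than taken from any real framework. Training is the expensive, iterative step; inference is a single cheap pass through the finished model, which is the part that gets pushed to the phone.

```python
# Illustrative only: a tiny linear model stands in for a neural network.
# All names here (train, infer, training_data) are hypothetical.

def train(samples, epochs=2000, lr=0.05):
    """Training: the costly, iterative step, done once on a powerful machine.

    Fits y ~ w*x + b by gradient descent on mean squared error.
    """
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # The trained model is just its parameters; this is what ships to the device.
    return {"w": w, "b": b}


def infer(model, x):
    """Inference: one cheap forward pass on new data, with no learning involved."""
    return model["w"] * x + model["b"]


# "High performance compute platform": learn from examples of y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
model = train(training_data)

# "Edge device": apply the finished model to an input it has never seen.
prediction = infer(model, 10.0)
print(round(prediction, 3))
```

The point of the split is cost: `train` loops over the data thousands of times, while `infer` is one multiply and one add, which is why only the latter has to fit on a battery-powered device.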

Intelligence at the edge is at the forefront of Artificial Intelligence (AI) and the Internet of Things (IoT), but the challenges are great when it comes to shoe-horning an ML model into a resource-constrained edge device. Nevertheless, the smartphone is the most prevalent compute platform; IHS Markit predicts six billion smartphones in use by 2020. Applications that use ML will make smartphones more autonomous, increasing privacy and reliability because they will not have to connect to a cloud as often. The challenge is that AI workloads are very compute-intensive and involve large, complicated neural network models with complex concurrencies. Smartphones (and other IoT devices) tend to be always on and often must produce results in real time. These constrained environments require thermally efficient designs for sleek, ultralight devices that need a long battery life for all-day use and come with storage and memory limitations.

Smartphones and IoT devices share these challenges, which is one reason why any improvement in ML algorithms is a big deal right now. Companies are racing to optimize the space and run-time efficiency of ML on resource-constrained devices. Again, computing at the edge rather than connecting to a cloud removes latency and connectivity issues and increases privacy, because authentication data does not have to travel. Face recognition as a method for authenticating payments from a smartphone, for example, needs to run locally on the smartphone to decrease security vulnerabilities and increase reliability.

Away from the developer and in the hands of consumers, Google has created a crowd-sourced training approach called Federated Learning that “enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on the device, decoupling the ability to do ML from the need to store the data in the cloud.” It’s kind of like creating an automated patch on your smartphone to update the shared model in the cloud. Your phone downloads the current model from the cloud and improves it by letting it learn from data on your phone. The changes made to the model are summarized as a “small, focused update,” which is encrypted and sent to the cloud. This might explain why there’s an increase in crazy suggestions (words I have never seen before) for my Android’s text entries these days.

Google can use data from your smartphone

Google’s research blog states, “Your phone personalizes the model locally, based on your usage (A). Many users’ updates are aggregated (B) to form a consensus change (C) to the shared model, after which the procedure is repeated.” (Source: Google)
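The A, B, C loop in that caption can be sketched in a few lines of plain Python. This is a simplified, federated-averaging-style illustration under loud assumptions, not Google's implementation: the model is a two-parameter line rather than a neural network, the encryption step is left out entirely, every device is weighted equally, and the names `local_update`, `aggregate`, `federated_round`, and `devices` are invented for this example.

```python
# Simplified sketch of one federated learning loop. Assumptions: a tiny
# linear model (w, b), no encryption, equal weighting of devices.
# All function and variable names are hypothetical.

def local_update(shared_model, local_data, lr=0.05, steps=20):
    """(A) Personalize a copy of the shared model on the phone.

    Only the change to the parameters is returned; local_data never leaves
    this function, mirroring data that never leaves the device.
    """
    w, b = shared_model
    n = len(local_data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in local_data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in local_data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w - shared_model[0], b - shared_model[1])


def aggregate(deltas):
    """(B) Average many users' updates into (C) one consensus change."""
    n = len(deltas)
    return (sum(d[0] for d in deltas) / n, sum(d[1] for d in deltas) / n)


def federated_round(shared_model, devices):
    """One round: every device trains locally, the server applies the average."""
    deltas = [local_update(shared_model, data) for data in devices]
    dw, db = aggregate(deltas)
    return (shared_model[0] + dw, shared_model[1] + db)


# Three phones, each holding private samples of the same pattern, y = 2x + 1.
devices = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(1.0, 3.0), (4.0, 9.0)],
]

shared_model = (0.0, 0.0)
for _ in range(50):  # "after which the procedure is repeated"
    shared_model = federated_round(shared_model, devices)

print(tuple(round(p, 3) for p in shared_model))
```

Note what the server sees: only the `deltas`, never the `(x, y)` samples themselves. In this toy setup every phone's data follows the same pattern, so the shared model settles quickly; real devices hold very different data, which is part of what makes the real problem hard.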

[i] Neal, Phil. “The Deloitte Consumer Review Digital Predictions 2017.” Deloitte., Deloitte,