At the International Solid-State Circuits Conference in San Francisco this week, MIT researchers presented a new chip designed specifically to run neural networks on mobile devices. The chip is 10 times as efficient as a mobile GPU, which means mobile devices could run powerful AI algorithms locally.
Vivienne Sze, an assistant professor of electrical engineering at MIT whose group developed the new chip, said that deep learning was useful for many mobile applications, including object recognition, speech and face detection.
“Right now, the networks are pretty complex and are mostly run on high-power GPUs. You can imagine that if you can bring that functionality to your cell phone or embedded devices, you could still operate even if you don’t have a Wi-Fi connection. You might also want to process locally for privacy reasons. Processing it on your phone also avoids any transmission latency, so that you can react much faster for certain applications.”
Dubbed Eyeriss, the new chip could be useful for the Internet of Stuff. AI-armed networked devices could make important decisions locally, entrusting only their conclusions, rather than raw personal data, to the Internet. And, of course, onboard neural networks would be useful to battery-powered autonomous robots.
The chip Sze and her colleagues built has 168 cores, roughly as many as a mobile GPU has.
Eyeriss minimizes the frequency with which cores need to exchange data with distant memory banks, an operation that consumes time and energy. Whereas GPU cores share a single, large memory bank, each Eyeriss core has its own memory. The chip also has a circuit that compresses data before sending it to individual cores.
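The article does not say which compression scheme that circuit uses. As a rough illustration only, the Python sketch below run-length encodes the zeros in a list of values, a common choice when the data contains long runs of zeros. The function names and the sample data are hypothetical and are not taken from the chip's design.

```python
def rle_encode_zeros(values):
    """Encode values as (zero_run_length, nonzero_value) pairs.

    Returns the list of pairs plus the count of trailing zeros,
    which have no nonzero value to attach to.
    """
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, v))
            run = 0
    return pairs, run


def rle_decode_zeros(pairs, trailing_zeros):
    """Rebuild the original list from the encoded form."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * trailing_zeros)
    return out


# Hypothetical sample data with several runs of zeros.
activations = [0, 0, 5, 0, 0, 0, 3, 7, 0, 0]
pairs, tail = rle_encode_zeros(activations)
restored = rle_decode_zeros(pairs, tail)
```

Here ten values shrink to three pairs and a trailing count, and decoding restores the original list exactly. The more zeros the data contains, the less there is to move between memory and the cores.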
Each core can communicate directly with its immediate neighbours, so that if they need to share data, they don’t have to route it through main memory.
The final key to the chip’s efficiency is special-purpose circuitry that allocates tasks across cores. In its local memory, a core needs to store not only the data manipulated by the nodes it’s simulating but data describing the nodes themselves. The allocation circuit can be reconfigured for different types of networks, automatically distributing both types of data across cores in a way that maximizes the amount of work that each of them can do before fetching more data from main memory.
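To see why doing more work per fetch matters, consider a toy model in Python. It is a sketch under simplifying assumptions, not Eyeriss's actual allocation scheme: a hypothetical CountingMemory class stands in for main memory and counts reads, and a small one-dimensional convolution is computed twice, once reading main memory for every multiplication and once after copying the inputs and weights into a core's local storage.

```python
class CountingMemory:
    """Hypothetical stand-in for main memory that counts every read."""

    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def read(self, index):
        self.reads += 1
        return self._data[index]

    def __len__(self):
        return len(self._data)


def conv_no_local(inputs, weights):
    """1-D convolution that goes to main memory for every operand."""
    n_out = len(inputs) - len(weights) + 1
    return [
        sum(inputs.read(o + k) * weights.read(k) for k in range(len(weights)))
        for o in range(n_out)
    ]


def conv_with_local(inputs, weights):
    """Same convolution, but each value is fetched once and then reused."""
    local_w = [weights.read(k) for k in range(len(weights))]
    local_x = [inputs.read(i) for i in range(len(inputs))]
    n_out = len(local_x) - len(local_w) + 1
    return [
        sum(local_x[o + k] * local_w[k] for k in range(len(local_w)))
        for o in range(n_out)
    ]


data, kernel = [1, 2, 3, 4, 5], [1, 0, -1]

x1, w1 = CountingMemory(data), CountingMemory(kernel)
result_no_local = conv_no_local(x1, w1)
reads_no_local = x1.reads + w1.reads

x2, w2 = CountingMemory(data), CountingMemory(kernel)
result_with_local = conv_with_local(x2, w2)
reads_with_local = x2.reads + w2.reads
```

Both versions produce the same result, but the first makes 18 reads from main memory and the second makes 8. The gap widens as the inputs grow, which is the effect the allocation circuit is described as maximizing.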
At the conference, the MIT researchers used Eyeriss to implement a neural network that performs an image-recognition task, the first time that a state-of-the-art neural network has been demonstrated on a custom chip.