Neuromorphic Computing Chips Make Transistors Feel Obsolete
Intel’s upcoming neuromorphic computing platform, the Loihi processor, reflects a broader trend in the AI hardware community, but you wouldn’t know it from looking at the packaging. The Loihi processor packs major changes to the underlying transistor architecture and uses a unique signaling method for computation. Products from Intel and chips from hardware startups are starting to resemble the functionality of the human brain more closely.
These new products reflect a recent trend in the embedded AI community: to bring AI capabilities to the device level and away from the data center. So what exactly makes on-chip AI different, and what new adventures in AI can designers expect to see once these products are commercialized? For now, we can look at some of the claims from Intel to see how these systems will perform under pressure from increasingly demanding AI workloads.
Getting Under the Hood
The image below shows the Loihi chip from Intel. Intel claims the chip provides ~1000x speed increases for computationally intensive tasks like image recognition/segmentation and NLP. Other companies are working on their own neuromorphic products, and I’ve been fortunate enough to review some research papers in this field. It’s certainly an exciting time to work on neuromorphic computing. Perhaps the most important takeaway from the Loihi chip is that it reflects a recent trend among researchers and startups: fundamentally redesigning transistor architectures to support on-device AI workloads.
Intel has taken the same track as some other companies I know of, where AI workloads are distributed across multiple cores to provide a massive speedup of multiply-accumulate (MAC) operations directly on a chip. It’s not clear if Intel has taken the approach of encoding multiple neuron weight values in an AI core, but this is the approach taken by others working on AI-optimized SoCs. Namely, transistors are engineered to provide a range of discrete levels for encoding neuron weights in neural networks, rather than storing a single binary digit (i.e., one of only two weight values). If you’re familiar with digital logic, you can already see how rethinking transistors as mixed-signal devices allows the transistor counts in these products to be greatly reduced: one device holding many levels replaces the several binary cells that would otherwise be needed to store the same weight.
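To make the multi-level idea concrete, here is a minimal sketch that snaps a handful of weights onto a small set of evenly spaced levels and compares a two-level (binary) cell with a 16-level cell. Everything in it is an assumption for illustration: the function names, the weight range of -1 to 1, the choice of 16 levels, and the sample weights are all hypothetical, and none of it describes how Loihi or any specific chip actually stores weights.

```python
import math


def make_levels(num_levels, w_min=-1.0, w_max=1.0):
    """Return num_levels evenly spaced weight values (hypothetical range)."""
    step = (w_max - w_min) / (num_levels - 1)
    return [w_min + i * step for i in range(num_levels)]


def quantize(weight, levels):
    """Snap one weight to the nearest available level."""
    return min(levels, key=lambda level: abs(level - weight))


def max_error(original, quantized):
    """Largest absolute difference between original and quantized weights."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Hypothetical trained weights, chosen only for illustration.
weights = [-0.9, -0.2, 0.05, 0.4, 0.8]

two_level = [quantize(w, make_levels(2)) for w in weights]
sixteen_level = [quantize(w, make_levels(16)) for w in weights]

# Number of binary cells needed to hold the same information as one device.
bits_per_16_level_device = math.log2(16)

print("2-level weights: ", two_level)
print("16-level weights:", [round(w, 3) for w in sixteen_level])
print("max error, 2 levels: ", round(max_error(weights, two_level), 3))
print("max error, 16 levels:", round(max_error(weights, sixteen_level), 3))
print("binary cells replaced by one 16-level device:", bits_per_16_level_device)
```

In this toy setup, a single 16-level device carries the same information as four binary cells, which is the intuition behind the reduced transistor counts described above. Real devices would also have to contend with noise and drift between levels, which this sketch ignores.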
This space is so new that people can rightly argue which metrics are best for comparing AI hardware platforms. Everyone I know in this space, whether they are in academia or industry, uses MAC operations per second to quantify AI compute power. This measures how quickly a processor can perform iterative matrix calculations used in machine learning algorithms.
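As a rough sketch of how this metric gets used, the snippet below counts the MAC operations in a small fully connected network and converts a processor's MAC rate into an inference rate. The layer sizes and the throughput figure are hypothetical placeholders, not specifications for Loihi or any other product, and the count ignores bias additions and activation functions.

```python
def dense_layer_macs(num_inputs, num_outputs):
    """Each output neuron needs one multiply-accumulate per input."""
    return num_inputs * num_outputs


def network_macs(layer_sizes):
    """Total MACs for one forward pass through consecutive dense layers."""
    return sum(
        dense_layer_macs(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def inferences_per_second(macs_per_second, macs_per_inference):
    """Upper bound on inference rate if MACs were the only cost."""
    return macs_per_second / macs_per_inference


# Hypothetical network: 784 inputs, one hidden layer of 128, 10 outputs.
layer_sizes = [784, 128, 10]
total_macs = network_macs(layer_sizes)

# Hypothetical processor throughput of 1 billion MACs per second.
assumed_macs_per_second = 1e9
rate = inferences_per_second(assumed_macs_per_second, total_macs)

print("MACs per inference:", total_macs)
print("inferences per second (upper bound):", round(rate))
```

The result is only an upper bound, since memory access and data movement often dominate in practice, which is part of why people can reasonably argue about whether MACs per second is the right yardstick.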
The software side of all this is incredibly interesting, especially when we start thinking about how to shift away from the current set of rules-based AI systems. The goal in neuromorphic computing is to allow the computer to learn the rules by which decisions are made and to do away with the current IF-THEN-ELSE architecture of AI systems. A side goal of neuromorphic computing is to bring these capabilities onto the end device level and out of the data center.
Get Out of the Data Center With Neuromorphic Computing
Today’s heavy-hitting AI systems do not run on end devices. You may have a phone or laptop that can run some simpler unsupervised learning algorithms at the device level (take Apple as an example). Still, many of the envisioned applications for artificial intelligence simply can’t be run on a small chip in a client device. These applications are meant to provide a meaningful user experience, but I don’t think anyone would call your smartphone truly “smart” when measured against the envisioned applications for AI.
If you want to do AI at the hardware level (i.e., on a chip instead of in a data center), you currently have two options:
- The big, slow, hot option: Use multiple CPUs/GPUs in parallel with big heatsinks and fans. These boards can get pretty large, which may be fine for robotics or automotive, but not for something like wearables or mobile phones. Even at the 5 nm transistor node, you still hit a heat wall that prevents the device from scaling to heavier AI compute workloads. This is why heavy-hitting tasks (e.g., the NLP engine in Amazon Alexa) are performed in a data center, not on the device.
- The small, fast, difficult option: Redesign the transistor architecture so that a few transistors can act as a neuron, rather than using large banks of transistor logic to simulate a neuron. These redesigned transistors behave like mixed-signal devices that encode neuron weights as a range of values, rather than storing weight values as bits in large logic circuits.
The second option is the route being taken by researchers in academia, companies like Intel, and a number of startups. The components and systems being announced are nowhere near the level of human intelligence; it would take ~1000 of Intel’s Nahuku boards (each with 32 Loihi chips, see below) to provide the same number of neurons as a human infant. Still, going the route of redesigning transistors as mixed-signal devices is paying dividends.
Although the Loihi chip and other products don’t create human-level intelligence, they are a major step towards bringing more AI compute power to the device level. They are also a major step forward in terms of power efficiency and scalability. The chips and their cores are designed for parallelization, giving a simple way to scale processing power that matches the current strategy used in large servers. We can envision new systems being designed on a board with a master MCU or FPGA, which then controls the AI block and other sensors required for the system to function.
What Happens at the Board Level?
Designers will need to get their hands on a sample part to see all the specs and interfaces available on these chips. Most likely, you’ll use standard low-speed and high-speed interfaces to connect the chip to a controller and memories, so you’ll still be working with the same signalling standards you’re used to. The difference here is that applications involving computer vision, NLP, or sensor fusion will require multiple sensors to be integrated onto the board to provide an interface with the real world.
In the near future, expect these components to become more commercialized and integrated into data centers for high-speed, high-efficiency training and inference in AI applications. I think there will still be some reliance on the data center for AI applications until some other major breakthrough allows more AI compute power to be placed at the device level.
When you need to design and lay out PCBs for neuromorphic computing systems, the design tools in Altium Designer® contain everything you need to keep up with new technology. Designers and layout engineers will have access to a complete set of tools for designing PCBs for any application. Now you can download a free trial of Altium Designer and learn more about the industry’s best layout, simulation, and production planning tools. Talk to an Altium expert today to learn more.