What if any desktop PC could become an AI inference beast with a single upgrade? And what if that transformed beast still sipped power like it was enjoying a martini?
That’s the idea pitched by Neuchips, a Taiwanese startup founded in 2019 and known for delivering top-class AI efficiency. It came to CES Unveiled 2024—the media pregame show before the main event—with a PCIe add-on card that can upgrade the AI capabilities of a typical desktop computer while adding just 55 watts to the PC’s power budget.
It’s not just a concept. The card was plugged into a desktop computer on the show floor and offered real-time, offline conversation with a chatbot powered by Meta’s popular Llama 2 7B large language model (Neuchips says the card will also run Llama 2 13B).
Neuchips’ card, the Evo PCIe accelerator, is built around the company’s Raptor Gen AI accelerator chip. The Raptor chip delivers “up to 200 tera operations per second,” and the company says it’s optimized for transformer-based models.
The card that Neuchips demonstrated had a single Raptor chip, but that isn’t the card’s final form. Neuchips’ CEO Ken Lau, an Intel veteran of 26 years, says Raptor can be used to design cards with varying numbers of chips onboard.
“The chip is actually scalable,” says Lau. “So we start with one chip. And then we have four chips. And then eight chips.” Each chip provides up to 200 trillion operations per second (TOPS), according to Neuchips’ press release. The card also carries 32 GB of LPDDR5 memory and reaches 1.6 terabytes per second of memory bandwidth. Memory bandwidth is important, because it’s often a limiting factor when handling AI inference on a single PC.
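To see why memory bandwidth matters, here is a rough back-of-the-envelope sketch. It assumes that generating each token requires reading every model weight from memory once (batch size 1, ignoring the key-value cache and any compute limits), that the quoted bandwidth figure is 1.6 terabytes per second, and that the models have a nominal 7 billion and 13 billion parameters. The bytes-per-parameter values are illustrative assumptions, not figures from Neuchips, and the function names are hypothetical.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def max_tokens_per_second(bandwidth_gb_s: float,
                          params_billion: float,
                          bytes_per_param: float) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth_gb_s / weights_gb(params_billion, bytes_per_param)


BANDWIDTH_GB_S = 1600.0   # 1.6 TB/s, as quoted for the card
CARD_MEMORY_GB = 32.0     # LPDDR5 capacity, as quoted for the card

# Nominal parameter counts; the precisions below are assumptions.
for name, params in [("Llama 2 7B", 7.0), ("Llama 2 13B", 13.0)]:
    for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0)]:
        size = weights_gb(params, bytes_per_param)
        fits = "fits" if size <= CARD_MEMORY_GB else "does not fit"
        ceiling = max_tokens_per_second(BANDWIDTH_GB_S, params, bytes_per_param)
        print(f"{name} at {label}: {size:.0f} GB of weights ({fits} in 32 GB), "
              f"at most ~{ceiling:.0f} tokens/s")
```

These are ceilings rather than predictions. Real throughput would be lower, and it depends on details of the card and software that Neuchips has not yet disclosed.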
Neuchips wants to give owners the tools needed to use the card effectively as well, although, with many months until release, the details here remain a bit sparse. A Neuchips representative said the company has compiler software and will provide a driver. The demonstration I saw had a custom interface for interacting with the Llama 2 7B model. The interface was running on Neuchips’ card, but it appeared bare-bones.
A focus on efficiency
There’s already hardware that anyone can plug into a desktop’s PCIe slot to greatly improve AI performance. It’s called a GPU, and Nvidia has a stranglehold on the market. Going toe-to-toe with Nvidia on performance would be difficult. In fact, Nvidia announced new cards with a focus on AI at CES 2024; the RTX 4080 Super, which will retail for US $999 starting on 31 January, quotes AI performance of up to 836 TOPS.
Neuchips, however, sees an opening. “We are focused on power efficiency,” says Lau, “and on handling the many different models that are out there.”
Modern graphics cards are powerful, but also power hungry. The RTX 4080 Super can draw up to 320 W of power and will typically require a computer with a power supply that can deliver at least 750 W. Neuchips’ Evo PCIe accelerator, by contrast, consumes just 55 W of power. It consumes so little…
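The efficiency argument can be made concrete with a quick calculation of peak TOPS per watt, using only the figures quoted in this article. This is a rough sketch, not a benchmark. It assumes the 55 W figure applies to a single-chip card delivering up to 200 TOPS, and it takes each vendor's peak number at face value even though the two may be quoted at different numeric precisions. The function name is hypothetical.

```python
def tops_per_watt(peak_tops: float, watts: float) -> float:
    """Peak operations per second per watt, in TOPS/W."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return peak_tops / watts


# (peak TOPS, watts) as quoted in the article; pairing 200 TOPS with 55 W
# assumes a single-chip Evo card.
devices = {
    "Neuchips Evo (one Raptor chip, assumed)": (200.0, 55.0),
    "Nvidia RTX 4080 Super": (836.0, 320.0),
}

for name, (tops, watts) in devices.items():
    print(f"{name}: {tops_per_watt(tops, watts):.2f} TOPS/W")
```

On these quoted peaks, the Neuchips card comes out ahead per watt while the Nvidia card delivers far more total throughput, which matches the trade-off Lau describes.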
The post “CES 2024: Neuchips Demos Low-Power AI Upgrade for PCs” by Matthew S. Smith was published on 01/09/2024 by spectrum.ieee.org