Researchers upend AI status quo by eliminating matrix multiplication in LLMs

Researchers claim to have developed a new way to run AI language models more efficiently by eliminating matrix multiplication from the process. This fundamentally redesigns neural network operations that are currently accelerated by GPU chips. The findings, detailed in a recent preprint paper from researchers at the University of California Santa Cruz, UC Davis, LuxiTech, and Soochow University, could have deep implications for the environmental impact and operational costs of AI systems.

Matrix multiplication (often abbreviated to “MatMul”) is at the center of most neural network computational tasks today, and GPUs are particularly good at executing the math quickly because they can perform large numbers of multiplication operations in parallel. That ability briefly made Nvidia the most valuable company in the world last week; the company currently holds an estimated 98 percent market share for data center GPUs, which are commonly used to power AI systems like ChatGPT and Google Gemini.

In the new paper, titled “Scalable MatMul-free Language Modeling,” the researchers describe creating a custom 2.7 billion parameter model without using MatMul that features performance similar to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU’s power draw). The implication is that a more efficient FPGA “paves the way for the development of more efficient and hardware-friendly architectures,” they write.
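For a rough sense of scale, the two figures quoted above can be combined into an energy-per-token estimate. The short Python calculation below is a back-of-the-envelope sketch, not a result from the paper: it uses only the reported 13 watts and 23.8 tokens per second, covers the FPGA alone, and ignores the GPU's power draw, which the reported figure excludes.

```python
# Back-of-the-envelope estimate from the two figures quoted in the article.
# Assumption: the ~13 W figure is a steady draw for the FPGA only; the GPU's
# power consumption is NOT included, so this understates total system energy.
fpga_power_watts = 13.0      # reported approximate FPGA power draw
tokens_per_second = 23.8     # reported throughput of the 1.3B-parameter model

# Watts are joules per second, so dividing by tokens per second
# gives joules spent per generated token.
joules_per_token = fpga_power_watts / tokens_per_second

print(f"{joules_per_token:.2f} J per token (FPGA only)")
```

This works out to roughly half a joule per token for the FPGA portion of the system.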

The technique has not yet been peer-reviewed, but the researchers (Rui-Jie Zhu, Yu Zhang, Ethan Sifferman, Tyler Sheaves, Yiqiao Wang, Dustin Richmond, Peng Zhou, and Jason Eshraghian) claim that their work challenges the prevailing paradigm that matrix multiplication operations are indispensable for building high-performing language models. They argue that their approach could make large language models more accessible, efficient, and sustainable, particularly for deployment on resource-constrained hardware like smartphones.

Getting rid of matrix math

In the paper, the researchers mention BitNet (the so-called “1-bit” transformer technique that made the rounds as a preprint in October) as an important precursor to their work. According to the authors, BitNet demonstrated the viability of using binary and ternary weights in language models, successfully scaling up to 3 billion parameters while maintaining competitive performance.
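To see why ternary weights matter here, consider what happens when every weight is restricted to -1, 0, or +1: each multiply in a matrix-vector product collapses into an addition, a subtraction, or nothing at all. The Python sketch below is purely illustrative and is not code from the paper or from BitNet; the function names `ternary_matvec` and `dense_matvec` and the tiny example matrix are made up for this example.

```python
# Illustrative sketch only; not taken from the paper or from BitNet.
# Assumption: weights are already constrained to the ternary set {-1, 0, +1}.

def ternary_matvec(weights, x):
    """Apply a ternary weight matrix to a vector using only add and subtract."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1 weight: add the input
            elif w == -1:
                acc -= xi      # -1 weight: subtract the input
            # a 0 weight contributes nothing, so it is skipped
        out.append(acc)
    return out


def dense_matvec(weights, x):
    """Reference version using conventional multiply-accumulate."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.5]

result = ternary_matvec(W, x)
print(result)                         # [2.0, 0.0]
print(result == dense_matvec(W, x))   # True
```

Both functions return the same values, but the ternary version never multiplies. That is the general idea behind replacing multiplication-heavy layers with accumulation; the architecture described in the paper involves considerably more than this toy example.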

However, they note that BitNet still relied on matrix multiplications in its self-attention mechanism. The limitations of BitNet served as a motivation for the current study, pushing them to develop a completely “MatMul-free” architecture that could maintain performance while eliminating matrix multiplications even in the attention mechanism.
