LLMs for All: LLaMA and llama.cpp Revolutionize Accessibility
Large language models (LLMs) in the class of GPT-3 are no longer exclusive to big tech companies: the release of Facebook’s LLaMA model, together with Georgi Gerganov’s llama.cpp, lets developers run LLMs on their own hardware. This change is reminiscent of the Stable Diffusion moment in August 2022, which kick-started a new wave of interest in generative AI.
LLMs have been developed primarily by private organizations, and they are resource-intensive and expensive to operate. As a result, they have been accessible only through APIs and web interfaces. The release of Facebook’s LLaMA, a collection of foundation language models, has changed that landscape. LLaMA models range from 7B to 65B parameters, and LLaMA-13B is reported to outperform the much larger GPT-3 (175B parameters) on most benchmarks.
Although LLaMA is not fully open and requires users to agree to strict terms for access, the model files have circulated via unofficial BitTorrent links. Georgi Gerganov’s llama.cpp project lets LLaMA run on personal laptops by using 4-bit quantization, which stores each weight in about 4 bits and so reduces model size and hardware requirements. This breakthrough has made GPT-3-class models accessible on consumer hardware.
Here is 4-bit inference of LLaMA-7B using ggml: https://t.co/NSetwCti03
Pure C/C++, runs on the CPU at 20 tokens/sec (M1 Pro)
Generated text looks coherent, but quickly degrades – not sure if I have a bug or something 🤔
Anyway, LLaMA-65B on M1 coming soon!
- Georgi Gerganov (@ggerganov), March 10, 2023
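To make the 4-bit quantization idea concrete, here is a minimal Python sketch. It is not llama.cpp's actual code or file format: the block size of 32, the symmetric rounding scheme, the 32-bit scale stored per block, the 16-bit baseline, and all function names are assumptions chosen for illustration. It shows how a block of floating-point weights could be stored as 4-bit codes plus one scale, and roughly what that does to model size.

```python
import random

BLOCK_SIZE = 32   # assumed number of weights sharing one scale
SCALE_BITS = 32   # assumed: one 32-bit float scale stored per block


def quantize_block(weights):
    """Map a block of floats to signed 4-bit codes plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    # Signed 4-bit integers cover [-8, 7]; this sketch uses only [-7, 7].
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the codes and the block scale."""
    return [scale * c for c in codes]


def pack_nibbles(codes):
    """Pack pairs of 4-bit codes into bytes, two codes per byte."""
    packed = bytearray()
    for lo, hi in zip(codes[0::2], codes[1::2]):
        # Shift codes from [-7, 7] to [1, 15] so each fits in an unsigned nibble.
        packed.append(((lo + 8) & 0x0F) | (((hi + 8) & 0x0F) << 4))
    return bytes(packed)


def estimated_size_gb(n_params, bits_per_weight):
    """Rough weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Demo on one block of made-up weights (not real LLaMA weights).
random.seed(0)
block = [random.gauss(0.0, 0.05) for _ in range(BLOCK_SIZE)]
scale, codes = quantize_block(block)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(block, restored))
print("bytes per block:", len(pack_nibbles(codes)) + SCALE_BITS // 8)
print("max reconstruction error:", max_error)

# Effective bits per weight once the per-block scale is included.
bits_per_weight = 4 + SCALE_BITS / BLOCK_SIZE
print("7B at 16 bits (GB):", estimated_size_gb(7e9, 16))
print("7B at ~4 bits (GB):", estimated_size_gb(7e9, bits_per_weight))
```

Under these assumptions, a 7B-parameter model shrinks from about 14 GB at 16 bits per weight to roughly 4.4 GB, which is the kind of reduction that brings it within reach of a laptop's memory. The price is a small rounding error in every weight, bounded here by half the block's scale.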
Despite the potential negative uses of LLMs, such as spam generation, disinformation, and automated radicalization, there are numerous ways LLMs can be utilized for good. Generative AI tools can enhance productivity and enable users to tackle ambitious projects. The focus should be on exploring and sharing positive applications of this technology.
As the race to release the first fully open language model heats up, LLaMA serves as a proof of concept that LLMs can feasibly run on consumer hardware. The era of LLMs being accessible to everyone is already here, opening up a world of possibilities for innovation and exploration.