By Yuvraj Malik and Katie Paul
(Reuters) – Meta Platforms Inc said on Friday it was releasing to researchers a new large language model, the core software of a new artificial intelligence system, heating up an AI arms race as Big Tech companies rush to integrate the technology into their products and impress investors.
The public battle to dominate the AI technology space kicked off late last year with the launch of Microsoft-backed OpenAI’s ChatGPT and prompted tech heavyweights from Alphabet Inc to China’s Baidu Inc to trumpet their own offerings.
Meta’s LLaMA, short for Large Language Model Meta AI, will be available under a non-commercial license to researchers and entities affiliated with government, civil society, and academia, it said in a blog post.
Large language models mine vast amounts of text in order to summarize information and generate content. They can answer questions, for instance, with sentences that can read as though written by humans.
The model, which Meta said requires “far less” computing power than previous offerings, is trained on 20 languages with a focus on those with Latin and Cyrillic alphabets.
“Meta’s announcement today appears to be a step in testing their generative AI capabilities so they can implement them into their products in the future,” said Gil Luria, senior software analyst at D.A. Davidson.
“Generative AI is a new application of AI that Meta has less experience with, but is clearly important for the future of their business.”
AI has emerged as a bright spot for investments in the tech industry, whose slowing growth has prompted widespread layoffs and a cutback on experimental bets.
Meta said LLaMA could outperform competitors that use more parameters, the variables that the algorithm takes into account.
Specifically, it said a version of LLaMA with 13 billion parameters can outperform GPT-3, a recent predecessor to the model on which ChatGPT is built.
It described its 65-billion-parameter LLaMA model as “competitive” with Google’s Chinchilla-70B and PaLM-540B, which are even larger than the model that Google used to show off its Bard chat-powered search.
A Meta spokeswoman attributed the performance to a larger quantity of “cleaner” data and “architectural improvements” in the model that enhanced training stability.
Meta in May last year released the large language model OPT-175B, also aimed at researchers, which formed the basis of a new iteration of its chatbot BlenderBot.
It later introduced a model called Galactica, which could write scientific articles and solve math problems, but quickly pulled down the demo after it generated authoritative-sounding false responses.
(Reporting by Yuvraj Malik and Eva Mathews in Bengaluru and Katie Paul in New York; Editing by Shailesh Kuber and Grant McCool)