Meta has just released Llama 3.1, its largest open-source AI model yet at 405 billion parameters, which the company claims outperforms both GPT-4o and Claude 3.5 Sonnet.
Last April, Meta revealed that it was working on a first for the AI industry: an open-source model that would rival today's best closed models from companies like OpenAI. Now, that model is here.
Meta is releasing Llama 3.1, its largest open-source AI model to date, which the company claims outperforms OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on several benchmarks.
The company is also making its Llama-based Meta AI assistant available in more countries and languages, and adding a feature that can generate images based on a sample image of a specific person. CEO Mark Zuckerberg now predicts that Meta AI will be the most widely used assistant by the end of the year, surpassing ChatGPT.
Llama 3.1 is significantly more complex than the smaller Llama 3 models released a few months ago. The largest version has 405 billion parameters and was trained with more than 16,000 of Nvidia's super-expensive H100 GPUs. Meta hasn't disclosed how much Llama 3.1 cost to develop, but based on the cost of Nvidia's chips alone, it's safe to assume it cost hundreds of millions of dollars.
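A rough back-of-envelope calculation shows where that figure comes from. The per-GPU prices below are assumptions based on widely reported H100 street prices, not numbers disclosed by Meta or Nvidia:

```python
# Back-of-envelope estimate of the GPU bill alone, assuming widely reported
# H100 street prices (roughly $25,000-$40,000 per card); these unit prices
# are assumptions, not figures from Meta or Nvidia.
NUM_GPUS = 16_000                        # "more than 16,000" H100s, per Meta
PRICE_LOW, PRICE_HIGH = 25_000, 40_000   # assumed USD per H100

low, high = NUM_GPUS * PRICE_LOW, NUM_GPUS * PRICE_HIGH
print(f"Estimated hardware spend: ${low/1e6:.0f}M to ${high/1e6:.0f}M")
# -> Estimated hardware spend: $400M to $640M
```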
So, given the cost, why does Meta keep giving Llama away under a license that only requires companies with hundreds of millions of users to seek its approval? In a letter published on Meta's corporate blog, Zuckerberg argues that open-source AI models will overtake proprietary ones, and are already improving faster, much as Linux became the open-source operating system that powers most phones, servers, and gadgets today.
Zuckerberg compared Meta's investment in open-source AI to the earlier Open Compute Project, which he said saved the company "billions of dollars" by having outside companies like HP help improve and standardize Meta's data center designs as the company built its own capabilities. Looking ahead, the Meta CEO expects a similar dynamic to play out with AI, writing that "I believe the Llama 3.1 release will be a turning point in the industry where most developers start using primarily open source."
To help get Llama 3.1 out into the world, Meta is partnering with more than 20 companies, including Microsoft, Amazon, Google, Nvidia, and Databricks, to help developers deploy their own versions. Meta claims that Llama 3.1 costs half as much as OpenAI's GPT-4o to run in production. Meta is releasing the model's weights so other companies can train it on custom data and tweak it to their liking.
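Because the weights themselves are published, developers can download a checkpoint and run or fine-tune it directly instead of calling a hosted API. Here is a minimal sketch using the Hugging Face transformers library; the hub repo ID and prompt are illustrative, access to the weights is gated behind Meta's license, and the 8B variant is used because the 405B model requires a multi-GPU server:

```python
# Minimal sketch: load an open-weights Llama 3.1 checkpoint locally with
# Hugging Face transformers. The repo ID below is the assumed hub name and
# access to the weights is gated behind Meta's license.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"  # spread across available GPUs
)

messages = [{"role": "user", "content": "What does an open-weights license let a company do?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Fine-tuning on custom data follows the same path, since the full weights, not just an API, are in the developer's hands.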
According to Meta spokesperson Jon Carvill, Gemini was not included in these benchmark comparisons because Meta had difficulty replicating Gemini's previously published results using Google's APIs.
Unsurprisingly, Meta doesn't say much about the data it used to train Llama 3.1. People at AI companies say they don't disclose this information because it's a trade secret, while critics say it's a tactic to fend off potential copyright lawsuits.
Meta does say it used synthetic data, or data generated by a model rather than humans, letting the 405-billion-parameter version of Llama 3.1 improve the smaller 70-billion and 8-billion-parameter versions. Ahmad Al-Dahle, Meta's VP of generative AI, predicts that Llama 3.1 will be popular with developers as "a teacher for smaller models that can then be deployed" in a "more cost-effective way."
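The "teacher" idea is essentially distillation through synthetic data: the big model answers prompts, and its answers become supervised training examples for a smaller model. A sketch of that loop, assuming Hugging Face transformers and illustrative model IDs and prompts (this is not Meta's internal pipeline):

```python
# Sketch of the teacher/student pattern: the 405B model generates answers that
# become fine-tuning data for a smaller model. Model IDs, prompts, and file
# format are assumptions for illustration only.
import json
from transformers import pipeline

teacher = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-405B-Instruct",  # assumed repo ID; needs a large multi-GPU node
    device_map="auto",
)

prompts = [
    "Explain how a hash map handles collisions.",
    "Write a Python function that reverses a linked list.",
]

with open("synthetic_train.jsonl", "w") as f:
    for prompt in prompts:
        answer = teacher(prompt, max_new_tokens=300, return_full_text=False)[0]["generated_text"]
        # Each prompt/answer pair becomes one supervised example for the student.
        f.write(json.dumps({"prompt": prompt, "response": answer}) + "\n")

# An 8B or 70B "student" would then be fine-tuned on synthetic_train.jsonl
# with a standard supervised trainer (for example, TRL's SFTTrainer).
```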
When I asked whether Meta agreed with the growing consensus that the industry is running out of quality training data for models, Al-Dahle suggested that there is a limit coming, though it may be further away than some people think. “We think we have a few more training runs,” he said. “But it’s hard to say.”
For the first time, Meta's red teaming of Llama 3.1 included looking for potential cybersecurity and biochemical use cases. Another reason for testing the model more aggressively is what Meta describes as emerging "agent" behaviors.
For example, Al-Dahle told me that Llama 3.1 has the ability to integrate with search engine APIs to “fetch information from the internet based on complex queries and call multiple engines in succession to complete the task.” Another example he gave was asking the model to plot the number of homes sold in the United States over the past five years. “It can take those web search results for you and generate Python code and execute that code.”
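In application terms, that flow looks roughly like the sketch below: a search call supplies the data, the model turns it into Python, and the host executes the result. Both `search_web()` and `llama_chat()` are hypothetical stubs standing in for a real search API and a hosted Llama 3.1 endpoint, returning canned placeholder values so the example runs end to end:

```python
# Sketch of the agent flow described above: search, generate code, execute it.
# search_web() and llama_chat() are hypothetical stubs; the numbers and the
# returned script are placeholders, not real model output or real statistics.
import subprocess
import sys

def search_web(query: str) -> str:
    # Stand-in for a search engine API call; values are illustrative only.
    return "2019: 5.3M, 2020: 5.6M, 2021: 6.1M, 2022: 5.0M, 2023: 4.1M"

def llama_chat(prompt: str) -> str:
    # Stand-in for a Llama 3.1 endpoint that replies with a runnable script.
    return (
        "data = {2019: 5.3, 2020: 5.6, 2021: 6.1, 2022: 5.0, 2023: 4.1}\n"
        "for year, millions in data.items():\n"
        "    print(year, '#' * int(millions * 10))\n"
    )

question = "Plot the number of homes sold in the US over the past five years."
results = search_web("US home sales by year, last 5 years")
code = llama_chat(f"{question}\n\nSearch results: {results}\n\nReply with Python code only.")

# The host application runs the model-written code in a separate process.
subprocess.run([sys.executable, "-c", code], check=True, timeout=60)
```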
Meta's biggest implementation of Llama is its Meta AI assistant, which is positioned as a general-purpose chatbot like ChatGPT and can be found in almost every part of Instagram, Facebook, and WhatsApp. Starting this week, Llama 3.1 will be accessible first through WhatsApp and the Meta AI website in the US, followed by Instagram and Facebook in the coming weeks. The assistant is also being updated to support new languages, including French, German, Hindi, Italian, and Spanish.