The Allen Institute for AI (AI2), a nonprofit research organization founded by the late Microsoft co-founder Paul Allen, has launched OLMo 2, the latest addition to its “open language model” series. Unlike many so-called open models, OLMo 2 adheres to the Open Source Initiative’s (OSI) strict definition of open-source AI, ensuring transparency and reproducibility at every stage of development.

Fully Transparent AI Development

OLMo 2’s release builds on the foundation laid by its predecessor, the first OLMo model introduced in February. Both models meet the OSI’s recently finalized open-source AI criteria, which prioritize public access to tools, data, and methodologies.

“OLMo 2 [was] developed start-to-finish with open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, intermediate checkpoints, and more,” AI2 noted in a blog post. “By openly sharing our data, recipes, and findings, we hope to provide the open-source community with the resources needed to discover new and innovative approaches.”

Key Features of OLMo 2

The OLMo 2 family consists of two models:

  • OLMo 7B: A 7-billion-parameter model.
  • OLMo 13B: A 13-billion-parameter model.

For context, the number of parameters in a model correlates with its ability to handle complex tasks; higher-parameter models generally perform better. OLMo 2 models can perform a variety of text-based tasks, including answering questions, summarizing documents, and generating code.

The models were trained on a dataset of 5 trillion tokens sourced from curated websites, academic papers, discussion boards, and math workbooks (both synthetic and human-generated). Tokens are the small fragments of raw data that a model processes during training; 1 million tokens is roughly equivalent to 750,000 words.
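To give a sense of scale, the rule of thumb above can be applied to the full training set. The snippet below is a rough sketch, not anything published by AI2: the helper name `tokens_to_words` is hypothetical, and the 0.75 words-per-token ratio is only the approximation quoted in this article (real ratios vary by tokenizer and by text).

```python
# Rule of thumb from the article: 1 million tokens is roughly 750,000 words.
# This ratio is an approximation; actual values depend on the tokenizer.
WORDS_PER_TOKEN = 750_000 / 1_000_000  # 0.75


def tokens_to_words(tokens: int) -> float:
    """Approximate word count for a given token count (hypothetical helper)."""
    return tokens * WORDS_PER_TOKEN


# Training set size reported for OLMo 2: 5 trillion tokens.
training_tokens = 5_000_000_000_000
approx_words = tokens_to_words(training_tokens)

print(f"{approx_words:,.0f} words")  # 3,750,000,000,000 words
```

Under that assumption, 5 trillion tokens works out to roughly 3.75 trillion words.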

Performance and Licensing

According to AI2, the OLMo 2 models outperform Meta’s Llama 3.1 in some benchmarks, showcasing significant improvements over the original OLMo. The smaller OLMo 7B even exceeds the performance of Llama 3.1’s 8B model.

AI2 has made the models available for download under the Apache 2.0 license, allowing for commercial use while maintaining their open-source nature.

Balancing Open Access and Safety

The release of OLMo 2 comes amidst ongoing debates about the risks of open-source AI. Open models like Meta’s Llama have reportedly been used in controversial applications, including defense tools developed by Chinese researchers.

Dirk Groeneveld, an AI2 engineer, acknowledged the potential for misuse but emphasized the broader benefits of open AI. “Yes, it’s possible open models may be used inappropriately or for unintended purposes,” Groeneveld said earlier this year. “[However, this] approach also promotes technical advancements that lead to more ethical models… and reduces a growing concentration of power, creating more equitable access.”

Looking Ahead

OLMo 2 represents a milestone in the push for open-source AI, giving researchers and developers the tools to innovate while challenging the industry’s trend toward proprietary systems. By emphasizing transparency and reproducibility, AI2 is positioning its latest release as a model for ethical and equitable AI development.
