Smaller AI Models Challenge GPT-4, Boost Business Accessibility

Inflection’s recent Pi chatbot upgrade showcases the trend of developing smaller, more cost-effective artificial intelligence (AI) models that make AI more accessible for businesses.

The chatbot has been upgraded to the new Inflection 2.5 model, which approaches the effectiveness of OpenAI’s GPT-4 while using only 40% of the computational resources for its training.

Designed to facilitate natural, empathetic, and safe conversations, Inflection 2.5 boasts improved coding and mathematics capabilities compared to its predecessor. The updated model expands the range of topics Pi users can discuss, showing that smaller large language models (LLMs) can still deliver strong performance efficiently.

“Smaller LLMs offer users more control compared to larger language models like ChatGPT or Anthropic’s Claude, making them more desirable in many instances,” Brian Peterson, co-founder and chief technology officer of Dialpad, a cloud-based, AI-powered platform, told PYMNTS in an interview. “They’re able to filter through a smaller subset of data, making them faster, more affordable, and, if you have your own data, far more customizable and even more accurate.”

Small but Mighty

Pi may be small, but it packs a powerful punch in performance and capabilities. Inflection 2.5 achieves more than 94% of GPT-4’s average performance on benchmarks such as Massive Multitask Language Understanding (MMLU), which evaluates a model’s language understanding capabilities. This feat is accomplished while using just 40% of the training FLOPs (floating-point operations) required by the OpenAI model.

Smaller LLMs, also referred to as small language models (SLMs), typically have between a few hundred million and 10 billion parameters, requiring less energy and computational resources than their larger counterparts. SLMs make advanced AI and high-performance natural language processing (NLP) tasks accessible to a wider range of organizations. The costs associated with SLMs are lower because they can run on more affordable graphics processing units (GPUs) and simpler machine-learning operations (MLOps).

“We are currently experiencing a Cambrian explosion of small and medium-sized language models in the open-source community,” Akshay Sharma, chief AI officer at Lyric, an AI-based payment technology company, told PYMNTS in an interview. 

While GPT-4 and other large models remain popular, both the enterprise and startup sectors are seeing numerous companies release their own SLMs, Sharma said. Examples include Meta’s Llama 2 7B, Mistral AI’s Mistral 7B and Microsoft’s Orca 2.

Benefits of SLMs

One benefit of smaller LLMs is their efficiency. The increasing energy consumption of LLMs is raising concerns among experts and environmentalists. As these AI models become more advanced and widely used, the computational power needed to train and deploy them is leading to a substantial rise in electricity use, contributing to a growing carbon footprint for the industry.

Smaller language models, such as Inflection 2.5, are gaining popularity for their cost efficiency, Enrique Partida, founder of AuraChat.ai, told PYMNTS in an interview.

“One of the primary benefits of smaller LLMs is that they require less computational power, making them more cost-effective and energy-efficient,” he added. “This is because training and deploying large language models can be expensive and time-consuming, especially for companies with limited resources.”

For instance, the 7-billion-parameter Mistral 7B can be run on an affordable GPU costing about $500 per month through a cloud service.
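As a rough illustration (not drawn from the article), a model of Mistral 7B’s size can be loaded on a single GPU with the open-source Hugging Face transformers library. The checkpoint name below is the publicly released instruct variant, chosen here as an assumption; half precision keeps the memory footprint around 14 GB, within reach of a single rented card:

```python
# Minimal sketch: running Mistral 7B on a single GPU via Hugging Face transformers.
# Assumes the `transformers`, `torch`, and `accelerate` packages are installed;
# the checkpoint name is an illustrative choice, not a detail from the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: roughly 14 GB of GPU memory
    device_map="auto",          # place layers on the available GPU automatically
)

prompt = "Summarize why small language models can be cheaper to operate."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```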

“Businesses will have multiple specialized SLMs for different subdomains through fine-tuning, offering flexibility to build scalable solutions,” Sharma said.

Smaller LLMs offer another significant advantage: They can be fine-tuned for specific tasks, potentially increasing their accuracy and efficiency in certain applications, Partida said. For instance, a company aiming to develop a chatbot for customer service may discover that a smaller LLM provides sufficient performance at a considerably lower cost compared to a larger model.
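To make the fine-tuning point concrete, here is a hedged sketch of parameter-efficient (LoRA) fine-tuning using the open-source peft library. The base checkpoint, adapter rank, and target modules are illustrative assumptions, not details reported in the article:

```python
# Hedged sketch: preparing a small model for LoRA fine-tuning on, say,
# customer-service transcripts. All names and hyperparameters are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA trains small adapter matrices instead of all 7 billion weights,
# which is what makes task-specific tuning affordable on a single GPU.
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

Because only the adapter weights are updated, the tuning job can fit on the same single GPU used for inference, which is the cost advantage Partida describes.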

By reducing the computational requirements, smaller LLMs enable companies to deploy AI solutions more rapidly and efficiently. This can lead to faster return on investment (ROI) and improved customer satisfaction.

In the future, we’ll likely see a surge of compact AI models that excel in specific fields, Bob Rogers, CEO of Oii.ai, a data science company specializing in supply chain design, told PYMNTS in an interview. He noted, however, that large, general-purpose models have so far shown a remarkable ability to adapt to new tasks without further training, which calls into question the need for multiple small, task-specific models.

“On the other hand, so far the writing of LLMs has a vanilla, generic feel to it that belies the fact that it is really trained on the lowest common denominator of text in a very large training corpus,” he added. “Models tuned to be aware of domain-specific facts, or to have sharper, domain-appropriate language styles, are likely to separate themselves from the pack for many enterprise applications where ‘better’ writing is needed.”