Mistral launches fine-tuning tools to make customizing its models easier and faster

VB Transform 2024 returns this July! Over 400 enterprise leaders will gather in San Francisco from July 9-11 to dive into the advancement of GenAI strategies and engaging in thought-provoking discussions within the community. Find out how you can attend here.

Fine-tuning is critical to improving large language model (LLM) outputs and customizing them to specific enterprise needs. When done correctly, the process can result in more accurate and useful model responses and allow organizations to derive more value and precision from their generative AI applications.

But fine-tuning isn’t cheap: It can come with a hefty price tag, making it challenging for some enterprises to take advantage of. 

Open source AI model provider Mistral — which, just 14 months after its launch, is set to hit a $6 billion valuation — is getting into the fine-tuning game, offering new customization capabilities on its AI developer platform La Plateforme.

The new tools, the company says, offer highly efficient fine-tuning that can lower training costs and decrease barriers to entry. 

VB Transform 2024 Registration is Open

Join enterprise leaders in San Francisco from July 9 to 11 for our flagship AI event. Connect with peers, explore the opportunities and challenges of Generative AI, and learn how to integrate AI applications into your industry. Register Now

The French company is certainly living up to its name — “mistral” is a strong wind that blows in southern France — as it continues to roll out new innovations and gobble up millions in funding dollars. 

“When tailoring a smaller model to suit specific domains or use cases, it offers a way to match the performance of larger models, reducing deployment costs and improving application speed,” the company writes in a blog post announcing its new offerings. 

Tailoring Mistral models for increased customization

Mistral made a name for itself by releasing several powerful LLMs under open source licenses, meaning they can be taken and adapted at will, free of charge.

However, it also offers paid tools such as its API and its developer platform “la Plateforme,” to make the journey for those looking to develop atop its models easier. Instead of deploying your own version of a Mistral LLM on your servers, you can build an app atop Mistral’s using API calls. Pricing is available here (scroll to bottom of the linked page).

Now, in addition to building atop the stock offerings, customers can also tailor Mistral models on la Plateforme, on the customers’ own infrastructure through open source code provided by Mistral on Github, or via custom training services. 

Also for those developers looking to work on their own infrastructure, Mistral today released the lightweight codebase mistral-finetune. It is based on the LoRA paradigm, which reduces the number of trainable parameters a model requires. 

“With mistral-finetune, you can fine-tune all our open-source models on your infrastructure without sacrificing performance or memory efficiency,” Mistral writes in the blog post. 

For those looking for serverless fine-tuning, meanwhile, Mistral now offers new services using the company’s techniques refined through R&D. LoRA adapters under the hood help prevent models from forgetting base model knowledge while allowing for efficient serving, Mistral says. 

“It’s a new step in our mission to expose advanced science methods to AI application developers,” the company writes in its blog post, noting that the service allows for fast and cost-effective model adaptation. 

Fine-tuning services are compatible with the company’s 7.3B parameter model Mistral 7B and Mistral Small. Current users can immediately use Mistral’s API to customize their models, and the company says it will add new models to its finetuning services in the coming weeks.

Finally, custom training services fine-tune Mistral AI models on a customer’s specific applications using proprietary data. The company will often propose advanced techniques such as continuous pretraining to include proprietary knowledge within model weights.

“This approach enables the creation of highly specialized and optimized models for their particular domain,” according to the Mistral blog post. 

Complementing the launch today, Mistral has kicked off an AI fine-tuning hackathon. The competition will continue through June 30 and will allow developers to experiment with the startup’s new fine-tuning API.

Mistral continues to accelerate innovation, gobble up funding

Mistral has been on an unprecedented meteoric rise since its founding just 14 months ago in April 2023 by former Google DeepMind and Meta employees Arthur Mensch, Guillaume Lample and Timothée Lacroix. 

The company had a record-setting $118 million seed round — reportedly the largest in the history of Europe — and within mere months of its founding, established partnerships with IBM and others. In February, it released Mistral Large through a deal with Microsoft to offer it via Azure cloud. 

Just yesterday, SAP and Cisco announced their backing of Mistral, and the company late last month introduced Codestral, its first-ever code-centric LLM that it claims outperforms all others. The startup is also reportedly closing in on a new $600 million funding round that would put its valuation at $6 billion. 

Mistral Large is a direct competitor to OpenAI as well as Meta’s Llama 3, and per company benchmarks, it is the world’s second most capable commercial language model behind OpenAI’s GPT-4.

Mistral 7B was introduced in September 2023, and the company claims it outperforms Llama on numerous benchmarks and approaches CodeLlama 7B performance on code. 

What will we see out of Mistral next? Undoubtedly we’ll find out very soon.

Source link

About The Author

Scroll to Top