A new open source AI code generation tool, AlphaCodium, was inspired by Google DeepMind’s AlphaCode (and AlphaCode 2, which launched last month powered by Gemini) but has now surpassed it, setting X/Twitter all aflutter this week.
“We are one step closer to having AI generate code better than humans!” posted Santiago Valdarrama. “The results put AlphaCodium as the best approach to generate code we’ve seen. It beats DeepMind’s AlphaCode and their new AlphaCode2 without needing to fine-tune a model!”
And OpenAI’s Andrej Karpathy, who was previously director of AI at Tesla, highlighted the tool’s ‘flow engineering’ method to improve code generation — “moving from a naive prompt:answer paradigm to a ‘flow’ paradigm, where the answer is constructed iteratively.”
To improve the performance of LLMs on code-specific problems, AlphaCodium’s ‘flow engineering’ goes beyond chain-of-thought prompt engineering by bringing back elements of the generative adversarial network (GAN) architecture, introduced by Ian Goodfellow and colleagues in 2014: one model generates code, and an adversarial model enforces code integrity through testing, reflection and spec matching.
The flow begins with inputs, followed by a series of pre-processing steps in which AlphaCodium reflects on the problem and arrives at a first code solution. It then generates additional tests that help refine that solution, iterating until it reaches a final version that actually works.
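The generate, test and refine loop described above can be illustrated with a minimal Python sketch. This is not CodiumAI's implementation: the names `run_tests`, `flow` and `stub_generate` are hypothetical, the "model" is a hard-coded stand-in for an LLM call, and the tests are supplied by hand rather than generated by the AI as AlphaCodium does. It only shows the general idea of feeding test failures back into the next generation attempt.

```python
from typing import Callable, List, Optional, Tuple

# A test case is (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


def run_tests(source: str, func_name: str, tests: List[TestCase]) -> List[str]:
    """Load candidate source and return one message per failing test."""
    namespace: dict = {}
    try:
        # Assumption: fine for a toy demo; real systems sandbox untrusted code.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception as exc:
        return [f"candidate did not load: {exc!r}"]

    failures = []
    for args, expected in tests:
        try:
            actual = func(*args)
        except Exception as exc:
            failures.append(f"{func_name}{args} raised {exc!r}")
            continue
        if actual != expected:
            failures.append(
                f"{func_name}{args} returned {actual!r}, expected {expected!r}"
            )
    return failures


def flow(
    problem: str,
    func_name: str,
    tests: List[TestCase],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Ask the generator for code until every test passes or rounds run out."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = generate(problem, feedback)
        feedback = run_tests(candidate, func_name, tests)
        if not feedback:
            return candidate  # all tests passed
    return None  # no passing solution within the round budget


def stub_generate(problem: str, feedback: List[str]) -> str:
    """Hypothetical stand-in for an LLM: wrong first, fixed after feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


solution = flow(
    "Add two integers.", "add", [((1, 2), 3), ((0, 0), 0)], stub_generate
)
```

In this toy run the first candidate fails one test, the failure message is passed back to the generator, and the second candidate passes. Capping the loop with `max_rounds` is a design choice of the sketch so that a generator that never converges cannot run forever.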
Startup CodiumAI developed AlphaCodium
Tel Aviv-based startup CodiumAI, whose mission, according to its website, is ‘to enable developers to build faster with zero bugs,’ developed AlphaCodium, which was tested on the CodeContests dataset of around 10,000 competitive programming problems. On the CodeContests benchmark, AlphaCodium raised GPT-4’s accuracy from 19% to 44%. According to CodiumAI, “This result is not just a numerical improvement; it’s a leap forward in the capabilities of LLMs in code generation, setting a new benchmark in the field.”
CodiumAI, which was founded in 2022 and raised $10.6 million in March 2023, shared an AlphaCodium GitHub repository and an accompanying paper, ‘Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering.’
Cofounder and CEO Itamar Friedman told VentureBeat in an interview that he has been surprised by the attention AlphaCodium has generated so far, but added that he knew it was a breakthrough that could help the entire developer community. He emphasized that AlphaCodium is not simply a model, but a system and algorithm that enables a ‘flow’ of communication between a code-generating model and a ‘critic’ model.
“That’s the big thing that we’re bringing here. It’s important to see it as a flow, which is why we call it ‘flow engineering,’” he said. That flow, he explained, allows AI to not only generate boilerplate code, but generate code that works and is accurate.
OpenAI and Google DeepMind are biggest competitors
Friedman pointed out that he sees OpenAI (which developed Codex) and Google DeepMind (which developed AlphaCode and AlphaCode 2) as CodiumAI’s biggest competitors.
“We were greatly inspired by DeepMind,” he said, adding that he has also spoken to OpenAI CEO Sam Altman about the importance of code integrity.
“I have a very high alignment with Sam that code integrity is not only super important for the next generation of code building, but also it’s important for AI alignment,” he said. AlphaCodium, he explained, is really about offering the ‘next generation’ of code integrity: “it not only gets my spec, it also gets my culture documents, my beliefs and other guidelines.”
Google DeepMind included aspects of flow engineering in their AlphaGo solution but not in AlphaCode, he said — “I don’t know why.” Perhaps, he suggested, it’s because the idea was not part of the mainstream narrative of simply needing a better large language model.
“The reason AI is not generating working code is not because you need a better LLM,” he said. “It’s because you need a flow.”