
So-called reasoning AI models are becoming easier, and cheaper, to develop.

On Friday, NovaSky, a team of researchers based out of UC Berkeley's Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that competes with an earlier version of OpenAI's o1 on several key benchmarks. Sky-T1 appears to be the first truly open source reasoning model in the sense that it can be replicated from scratch: the team released both the dataset used to train it and the necessary training code.
“Remarkably, Sky-T1-32B-Preview was trained for less than $450,” the team wrote in a blog post, “demonstrating that it is possible to replicate high-level reasoning capabilities affordably and efficiently.”
$450 may not sound that affordable. But it wasn’t long ago that the price tag for training a model with comparable performance often ran into the millions of dollars. Synthetic training data, meaning training data generated by other models, has helped drive costs down. Palmyra X 004, a model recently released by AI company Writer and trained almost entirely on synthetic data, reportedly cost just over $700,000 to develop.
Unlike most AI, reasoning models effectively fact-check themselves, which helps them avoid some of the pitfalls that normally trip up models. Reasoning models do take longer, typically seconds to minutes more, to arrive at a solution than a typical non-reasoning model. On the plus side, they tend to be more reliable in domains such as physics, science, and math.
The NovaSky team says it used another reasoning model, Alibaba’s QwQ-32B-Preview, to generate the initial training data for Sky-T1, then “curated” the data mixture and leveraged OpenAI’s GPT-4o-mini to refactor the data into a more workable format. Training the 32-billion-parameter Sky-T1 took about 19 hours using a rack of 8 Nvidia H100 GPUs. (Parameters roughly correspond to a model’s problem-solving skills.)
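The curation step described above, keeping only teacher-generated reasoning traces whose final answer can be verified before fine-tuning on them, can be sketched in plain Python. This is a minimal illustration, not NovaSky's actual pipeline; `query_teacher` is a hypothetical stub standing in for sampling from a model like QwQ-32B-Preview:

```python
# Minimal sketch of rejection-sampling data curation: sample teacher
# outputs, keep only traces whose final answer matches ground truth,
# and collect the survivors into a fine-tuning dataset.
# `query_teacher` is a stand-in stub, not a real model API.

def query_teacher(problem: str) -> str:
    # Stub: a real pipeline would sample from the teacher model here.
    canned = {
        "2 + 2": "Reasoning: add the operands. Final answer: 4",
        "3 * 5": "Reasoning: multiply. Final answer: 16",  # deliberately wrong
    }
    return canned[problem]

def extract_answer(trace: str) -> str:
    # Pull out whatever follows the last "Final answer:" marker.
    return trace.rsplit("Final answer:", 1)[-1].strip()

def curate(problems: dict) -> list:
    """Keep only traces whose final answer matches the known ground truth."""
    kept = []
    for problem, truth in problems.items():
        trace = query_teacher(problem)
        if extract_answer(trace) == truth:
            # A real pipeline would also rewrite the trace into a cleaner
            # format (NovaSky used GPT-4o-mini for this) before training.
            kept.append({"prompt": problem, "response": trace})
    return kept

dataset = curate({"2 + 2": "4", "3 * 5": "15"})
print(len(dataset))  # → 1: the wrong "3 * 5" trace is rejected
```

The point of the filter is that incorrect teacher outputs never reach the student model, which is one reason a relatively small curated dataset can still teach reliable reasoning behavior.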
According to the NovaSky team, Sky-T1 outperforms an early preview version of o1 on MATH500, a collection of “competition-level” math challenges. The model also beats the o1 preview on a set of difficult problems from LiveCodeBench, a coding evaluation.
However, Sky-T1 falls short of the o1 preview on GPQA-Diamond, which contains physics, biology, and chemistry questions a PhD graduate would be expected to know.
It’s also important to note that OpenAI’s GA release of o1 is a stronger model than the o1 preview version, and that OpenAI is expected to release an even better-performing reasoning model, o3, in the coming weeks.
But the NovaSky team says Sky-T1 is only the start of its journey to develop open source models with advanced reasoning capabilities.
“Moving forward, we will focus on developing more efficient models that maintain robust reasoning performance and explore advanced techniques that further improve the models’ efficiency and accuracy during testing,” the team wrote in the post. “Stay tuned as we make progress on these exciting initiatives.”