Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

This week, Sakana AI, an NVIDA-backed startup that has collected several million dollars from VC companies, made a significant claim. The agency says it has created an AI system, AI Chuda Engineer, which can effectively speed up the training of the AI model specified by a factor up to 100x.
The only problem, the system does not work.
Users At the x Quick discovery The Sakana system is actually the result of the performance of training worse than average. According to one userA 3x recession was created as a result of Sakana’s AI – not a speedup.
What went wrong? A bug in the code, a Post Lucas Baier, a member of the technical staff of the Openai.
“Their source code is incorrect [a] The fine way, “Bayer wrote in X -“. ” They run the benchmarking twice with various results in the wild, they should stop them and think. ”
A PostMortem published Friday, Sakana admits that the system has found a way – as Sakana describes it – “Cheating” and the system’s trend blames to “hack” the reward – ie defects to achieve high metrics by not achieving the desired goals (Make Model Training Faster ) Similar events have been observed AI trained to play chess games thatThe
According to Sakana, the system absorbs the company that is used in the evaluation code, which allows it to bipus legitimacy for appropriateness in other checks. Sakana says it has resolved the matter and wants to correct its demands on updated materials.
“We have since made the evaluation and the RunTime profiling harness to eliminate many of these. [sic] Lufols, ”the company wrote in an X post. “We are in the process of correcting our paper and our results to reflect and discuss the effects […] We are deeply apologizing for our readers to supervise our readers. We will provide amendment of this job soon and discuss our teaching. “
Props to Sakana to be the owner of the mistake. But the episode is a good reminder that if any claim seems to be true, if it looks very good, Especially in the AIThis is probably