
Have researchers discovered a new AI “scaling law”? That’s what some buzz on social media suggests, but experts are skeptical.
AI scaling laws, which are somewhat informal concepts, describe how the performance of AI models improves with the size of the datasets and the computing resources used to train them. Until about a year ago, scaling up “pre-training” (training ever-larger models on ever-larger datasets) was by far the dominant law, at least in the sense that most frontier AI labs embraced it.
Pre-training hasn’t gone away, but two additional scaling laws, post-training scaling and test-time scaling, have emerged to complement it. Post-training scaling essentially tunes a model’s behavior, while test-time scaling entails applying more computing at inference to drive a form of “reasoning” (see models like R1).
Researchers at Google and UC Berkeley recently proposed in a paper what some commentators online described as a possible fourth law: “inference-time search.”
Inference-time search has a model generate many possible answers to a query in parallel and then select the “best” of the bunch. The researchers claim it can boost the performance of a year-old model, Google’s Gemini 1.5 Pro, to a level that surpasses OpenAI’s o1-preview “reasoning” model on science and math benchmarks.
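In pseudocode terms, the idea can be sketched as a simple best-of-N loop. The function names below (`generate_answer`, `score_answer`) are hypothetical stand-ins for model calls, not the paper’s actual API; in the paper, the model itself performs the self-verification step.

```python
import random

def generate_answer(query: str, seed: int) -> str:
    """Stand-in for sampling one candidate answer from a model."""
    rng = random.Random(seed)
    return f"candidate-{rng.randint(0, 9)}"

def score_answer(query: str, answer: str) -> float:
    """Stand-in for a self-verification pass that scores a candidate."""
    return random.Random(answer).random()

def inference_time_search(query: str, n_samples: int = 200) -> str:
    # Sample many candidates (in parallel in practice; sequential here
    # for clarity), then keep the one the verifier scores highest.
    candidates = [generate_answer(query, seed=i) for i in range(n_samples)]
    return max(candidates, key=lambda a: score_answer(query, a))

best = inference_time_search("What is 7 * 8?", n_samples=200)
print(best)
```

The key design point is that generation and verification are decoupled: the quality of the final answer depends as much on the scorer as on any single sample.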
Our paper focuses on this search axis and its scaling trends. For example, by just randomly sampling 200 responses and self-verifying, Gemini 1.5 (an ancient early 2024 model!) beats o1-preview and approaches o1. This is without finetuning, RL, or ground-truth verifiers. pic.twitter.com/hb5fo7ifnh
– Eric Zhao (@ericzhao28) March 17, 2025
“[B]y just randomly sampling 200 responses and self-verifying, Gemini 1.5, an ancient early 2024 model, beats o1-preview and approaches o1,” Eric Zhao, a Google doctorate fellow and one of the paper’s co-authors, wrote in a series of posts on X. “Self-verification naturally becomes easier at scale; you’d expect that picking out a correct solution gets harder as your pool of solutions grows, but the opposite is the case!”
Several experts, however, say the results aren’t surprising, and that inference-time search may not be useful in many scenarios.
Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch that the approach works best when there’s a good “evaluation function,” in other words, when the best answer to a question can be easily ascertained. But most queries aren’t that cut-and-dry.
“[I]f we can’t write code to define what we want, we can’t use [inference-time] search,” he said. “We can’t do this for something like general language interaction. […] It’s generally not a great approach to actually solving most problems.”
Mike Cook, a research fellow at King’s College London specializing in AI, agreed with Guzdial’s assessment, adding that it highlights the gap between “reasoning” in the AI sense of the word and our own thinking processes.
“[Inference-time search] doesn’t ‘elevate the reasoning process’ of the model,” Cook said. “[I]t’s just a way of working around the limitations of a technology prone to making very confidently supported mistakes […] Intuitively if your model makes a mistake 5% of the time, then checking 200 attempts at the same problem should make those mistakes easier to identify.”
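Cook’s intuition is easy to put in back-of-the-envelope terms. The numbers below are illustrative, taken from his hypothetical 5% error rate and 200 attempts, not from the paper:

```python
# Illustrative arithmetic for Cook's intuition: a model that errs 5% of the
# time, sampled independently 200 times on the same problem.
error_rate = 0.05
n_samples = 200

# On average ~10 of the 200 attempts will be wrong, so disagreements
# between samples surface the mistakes.
expected_errors = n_samples * error_rate

# The chance that every single sample is correct is vanishingly small,
# so some errors will almost always be visible for comparison.
p_all_correct = (1 - error_rate) ** n_samples

print(f"expected wrong answers: {expected_errors:.0f}")
print(f"probability every sample is correct: {p_all_correct:.2e}")
```

This assumes independent errors, which is optimistic: if the model fails the same way every time, extra samples reveal nothing.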
That inference-time search may have such limits will be unwelcome news to an AI industry looking to scale up model “reasoning” efficiently. As the paper’s co-authors note, reasoning models today can rack up thousands of dollars of computing on a single math problem.
It seems the search for new scaling techniques will continue.