This Tool Probes Frontier AI Models for Lapses in Intelligence


Executives at artificial intelligence companies may like to tell us that AGI is almost here, but the latest models still need some additional tutoring to be as clever as they can.

Scale AI, a company that helps frontier AI firms build advanced models, has created a platform that can automatically test a model on thousands of benchmarks and tasks, pinpoint weaknesses, and flag additional training data that ought to enhance its skills. Scale, of course, will supply the data required.

Scale rose to prominence by supplying human labor for training and testing advanced AI models. Large language models (LLMs) are trained on text scraped from books, the web, and other sources. Turning these models into helpful, well-behaved chatbots requires additional "post-training," including humans who provide feedback on a model's output.

Scale supplies workers who specialize in probing models for problems and limitations. The new tool, called Scale Evaluation, automates some of that work using Scale's own machine learning algorithms.

"Within the big labs, there are all these haphazard ways of tracking some of the model weaknesses," says Daniel Berrios, head of product for Scale Evaluation. The new tool gives model makers a way "to go through the results and slice and dice them" so they can see where a model is falling short and target their data campaigns accordingly.
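Scale has not published the internals of Scale Evaluation, so the "slice and dice" workflow Berrios describes can only be sketched. The snippet below is an illustrative stand-in, not Scale's API: it groups per-task evaluation records by a tag such as language and surfaces the slices where accuracy is lowest, which is where new training data would be targeted. The record fields and the `weakest_slices` helper are invented for this example.

```python
from collections import defaultdict

def weakest_slices(results, key, n=3):
    """Group evaluation records by a tag (e.g. language or topic)
    and return the n slices with the lowest accuracy."""
    totals = defaultdict(lambda: [0, 0])  # tag -> [correct, attempted]
    for record in results:
        tag = record[key]
        totals[tag][0] += record["correct"]
        totals[tag][1] += 1
    accuracy = {tag: c / t for tag, (c, t) in totals.items()}
    # Lowest-accuracy slices first: these are the weak spots.
    return sorted(accuracy.items(), key=lambda kv: kv[1])[:n]

# Toy records: one row per benchmark task the model attempted.
results = [
    {"language": "en", "correct": 1},
    {"language": "en", "correct": 1},
    {"language": "de", "correct": 0},
    {"language": "de", "correct": 1},
    {"language": "hi", "correct": 0},
]

print(weakest_slices(results, "language"))
# → [('hi', 0.0), ('de', 0.5), ('en', 1.0)]
```

In a real pipeline the records would carry many tags (task type, difficulty, language), and the same aggregation would run over each dimension to build the kind of breakdown Berrios describes.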

Berrios says several frontier AI model companies are already using the tool. Most, he says, are using it to improve the reasoning of their best models. AI reasoning involves a model breaking a problem into its constituent parts in order to solve it more effectively, and the approach relies heavily on feedback from users to determine whether the model solved a problem correctly.

In one instance, Berrios says, Scale Evaluation revealed that a model's reasoning skills fell off when it was fed non-English prompts. "While [the model's] general-purpose reasoning capabilities were pretty good and performed well on benchmarks, they tended to degrade quite a bit when the prompts were not in English," he says. Scale Evaluation highlighted the issue and allowed the company to gather additional training data to address it.

Jonathan Frankle, chief AI scientist at Databricks, a company that builds large AI models, says being able to test one foundation model against another is useful in principle. "Anyone who moves the ball forward in evaluation is helping us to build better AI," Frankle says.

In recent months, Scale has contributed to the development of several new benchmarks designed to push AI models to become smarter, and to scrutinize more carefully how they might misbehave. These include EnigmaEval, MultiChallenge, MASK, and Humanity's Last Exam.

Scale says measuring improvements in AI models is becoming more challenging as they get better at acing existing tests. The company says its new tool offers a more comprehensive picture by combining many different benchmarks, and it can be used to devise custom tests of a model's abilities, such as probing its reasoning in different languages. Scale's own machine learning can take a given problem and generate more examples, allowing for a more thorough test of a model's skills.
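Scale presumably uses its own models to generate those extra examples. As a minimal stand-in for the idea, the sketch below expands a seed arithmetic problem into a battery of variants with different operands and different prompt languages, so that one task becomes a broader probe. The templates, language list, and `expand` helper are all invented for illustration; a real system would translate or paraphrase with a model rather than hard-code strings.

```python
import itertools

# Hypothetical prompt templates for one seed task type.
TEMPLATES = {
    "en": "What is {a} plus {b}?",
    "fr": "Combien font {a} plus {b} ?",
    "de": "Was ist {a} plus {b}?",
}

def expand(seed_pairs, templates=TEMPLATES):
    """Turn seed operand pairs into (language, prompt, answer) test cases."""
    cases = []
    for (a, b), (lang, tmpl) in itertools.product(seed_pairs, templates.items()):
        cases.append({
            "language": lang,
            "prompt": tmpl.format(a=a, b=b),
            "answer": a + b,  # ground truth kept alongside the prompt
        })
    return cases

cases = expand([(2, 3), (17, 25)])
print(len(cases))  # 2 seeds x 3 languages = 6 test cases
```

Scoring each generated case and then slicing the results by language is exactly the kind of check that would surface the non-English degradation Berrios describes.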

The company's new tool may also inform efforts to standardize the testing of AI models for misbehavior. Some researchers say that a lack of standardization means some model jailbreaks go undisclosed.

In February, the US National Institute of Standards and Technology announced that Scale would help it develop methodologies for testing models to ensure they are safe and trustworthy.

What kinds of errors have you spotted in the outputs of generative AI tools? What do you think are models' biggest blind spots? Let us know by emailing hello@wired.com, or comment below.
