Scientists Use A.I. to Mimic the Mind, Warts and All

Companies like OpenAI and Meta are in a race to make something they like to call artificial general intelligence. But for all the money being spent on it, A.G.I. has no settled definition. It’s more of an aspiration to create something indistinguishable from the human mind.

Artificial intelligence today is already doing a lot of things that were once limited to human minds — such as playing championship chess and figuring out the structure of proteins. ChatGPT and other chatbots are crafting language so humanlike that people are falling in love with them.

But for now, artificial intelligence remains very distinguishable from the human kind. Many A.I. systems are good at one thing and one thing only. A grandmaster can drive a car to a chess tournament, but a chess-playing A.I. system is helpless behind the wheel. An A.I. chatbot can sometimes make very simple — and very weird — mistakes, like letting pawns move sideways in chess, an illegal move.

For all these shortcomings, an international team of scientists believes that A.I. systems can help researchers understand how the human mind works. They have created a ChatGPT-like system that can play the part of a human in a psychological experiment and behave as if it has a human mind. Details about the system, known as Centaur, were published on Wednesday in the journal Nature.

In recent decades, cognitive scientists have created sophisticated theories to explain various things that our minds can do: learn, recall memories, make decisions and more. To test these theories, cognitive scientists run experiments to see if human behavior matches a theory’s predictions.

Some theories have fared well on such tests, and can even explain the mind’s quirks. We generally choose certainty over risk, for instance, even if that means forgoing a chance to make big gains. If people are offered $1,000, they will usually take that firm offer rather than make a bet that might, or might not, deliver a much bigger payout.

But each of these theories tackles only one feature of the mind. “Ultimately, we want to understand the human mind as a whole and see how these things are all connected,” said Marcel Binz, a cognitive scientist at Helmholtz Munich, a German research center, and an author of the new study.

Three years ago, Dr. Binz became intrigued by ChatGPT and similar A.I. systems, known as large language models. “They had this very humanlike characteristic that you could ask them about anything, and they would do something sensible,” Dr. Binz said. “It was the first computation system that had a tiny bit of this humanlike generality.”

At first, Dr. Binz could only play with large language models, because their creators kept the code locked away. But in 2023, Meta released LLaMA (Large Language Model Meta AI) under a research license, and scientists could download and modify it for their own work.

(Thirteen authors have sued Meta for copyright infringement, and The New York Times has sued OpenAI, ChatGPT’s creator, and its partner, Microsoft.)

The humanlike generality of LLaMA led Dr. Binz and his colleagues to wonder if they could train it to behave like a human mind — not just in one way but in many ways. For this new lesson, the scientists would present LLaMA with the results of psychological experiments.

The researchers gathered a range of studies to train LLaMA — some that they had carried out themselves, and others conducted by other research groups. In one study, human volunteers played a game in which they steered a spaceship in search of treasure. In another, they memorized lists of words. In yet another, they played a pair of slot machines with different payouts and figured out how to win as much money as possible. All told, 160 experiments were chosen for LLaMA to train on, comprising more than 10 million responses from over 60,000 volunteers.

Dr. Binz and his colleagues then prompted LLaMA to play the part of a volunteer in each experiment. They rewarded the A.I. system whenever it responded the way the human participants had.

“We essentially taught it to mimic the choices that were made by the human participants,” Dr. Binz said.
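In machine-learning terms, that training step amounts to supervised fine-tuning: adjusting the model's weights so that it assigns higher probability to the choices the participants actually made. The sketch below is illustrative only, not the study's code; the tiny stand-in model, the toy slot-machine prompt, and the single-trial update are assumptions made for brevity.

```python
# A minimal sketch (not the study's code) of fine-tuning a language model
# to mimic a human choice in a psychology experiment. The model name,
# prompt text, and save path are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in; the study used a much larger LLaMA model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# One experiment trial rendered as text: the prompt describes the task,
# and the final token records what the human participant chose.
prompt = "You see two slot machines. Machine A paid 7, machine B paid 2. You choose machine"
human_choice = " A"  # the participant's actual response

# Tokenize prompt and choice together; compute the loss only on the
# choice tokens by masking the prompt positions with -100.
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
choice_ids = tokenizer(human_choice, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, choice_ids], dim=1)
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100  # ignore prompt tokens in the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # nudges the model toward the human's choice
optimizer.step()
optimizer.zero_grad()
print(f"training loss on this trial: {loss.item():.3f}")

# Save the tuned weights so they can be reloaded later (path is arbitrary).
model.save_pretrained("centaur-sketch")
tokenizer.save_pretrained("centaur-sketch")
```

Repeated across millions of recorded responses, small updates like this gradually steer the model toward reproducing human choices rather than generic chatbot answers.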

He and his colleagues named the modified model Centaur, in honor of the mythological creature with the upper body of a human and the legs of a horse.

Once they had trained Centaur, the researchers tested how well it mimicked human psychology. In one set of trials, they showed Centaur a portion of a volunteer's responses that it had not seen before, and Centaur did a good job of predicting that volunteer's remaining responses.
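How do you score a prediction like that? One common approach, which may differ from the exact metric the paper used, is to measure how much probability the tuned model assigns to the answer a held-out volunteer actually gave. In the sketch below, the saved-model path and the prompt are the hypothetical placeholders from the earlier sketch.

```python
# Sketch of scoring a held-out response: how likely does the tuned model
# think the volunteer's actual answer was? (Illustrative only; the paper's
# exact evaluation may differ. "centaur-sketch" is the placeholder path
# saved in the training sketch above.)
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("centaur-sketch")
model = AutoModelForCausalLM.from_pretrained("centaur-sketch")
model.eval()

prompt = "You see two slot machines. Machine A paid 7, machine B paid 2. You choose machine"
held_out_choice = " A"  # what the volunteer actually did

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
choice_ids = tokenizer(held_out_choice, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, choice_ids], dim=1)

with torch.no_grad():
    logits = model(input_ids=input_ids).logits

# For each choice token, read off the probability the model assigned to it
# given all preceding tokens. Higher means a more humanlike prediction.
start = prompt_ids.shape[1]
log_probs = F.log_softmax(logits[:, start - 1 : -1, :], dim=-1)
token_ll = log_probs.gather(-1, input_ids[:, start:].unsqueeze(-1)).sum()
print(f"log-likelihood of the volunteer's choice: {token_ll.item():.3f}")
```

A model that mimics people well should assign high probability (a log-likelihood near zero) to the responses that real volunteers produced.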

The researchers also let Centaur play some of the games on its own, such as using a spaceship to find treasure. Centaur developed the same search strategies that human volunteers had figured out.

To see just how humanlike Centaur had become, the scientists then gave it new games to play. For the spaceship experiment, the scientists changed the story of the game so that volunteers now rode a flying carpet. The volunteers simply transferred their spaceship strategy to the new game. When Dr. Binz and his colleagues made the same switch for Centaur, it transferred its spaceship strategy, too.

“There is quite a bit of generalization happening,” Dr. Binz said.

The researchers then had Centaur respond to logical reasoning questions, a challenge that was not part of its original training. Centaur once again produced humanlike answers: it tended to answer correctly the questions that people got right, and to fail on the ones that people found hard.

Another human quirk emerged when Dr. Binz and his colleagues replayed a 2022 experiment that explored how people learn about other people’s behavior. In that study, volunteers observed the moves made by two opposing players in games similar to Rock, Paper, Scissors. The observers figured out the different strategies that people used and could even predict their next moves. But when the scientists instead generated the moves from a statistical equation, the human observers struggled to work out the artificial strategy.

“We found that was exactly the same case for Centaur as well,” Dr. Binz said. “The fact that it actually predicts the human players better than the artificial players really means that it has picked up on some kind of things that are important for human cognition.”

Some experts gave Centaur high marks. “It’s pretty impressive,” said Russ Poldrack, a cognitive scientist at Stanford University who was not involved in the study. “This is really the first model that can do all these types of tasks in a way that’s just like a human subject.”

Ilia Sucholutsky, a computer scientist at New York University, was struck by how well Centaur performed. “Centaur does significantly better than classical cognitive models,” he said.

But other scientists were less impressed. Olivia Guest, a computational cognitive scientist at Radboud University in the Netherlands, argued that because the scientists hadn't built Centaur around a theory of cognition, its predictions had little to reveal about how the mind works.

“Prediction in this context is a red herring,” she said.

Gary Lupyan, a cognitive scientist at the University of Wisconsin–Madison, said that theories that can explain the mind are what he and his fellow cognitive scientists are ultimately chasing. “The goal is not prediction,” he said. “The goal is understanding.”

Dr. Binz readily agreed that the system did not yet point to a new theory of the mind. “Centaur doesn’t really do that yet, at least not out of the box,” he said. But he hopes that the language model can serve as a benchmark for new theories, and can show how well a single model can mimic so many kinds of human behavior.

And Dr. Binz hopes to expand Centaur’s reach. He and his colleagues are in the process of increasing their database of psychological experiments by a factor of five, and they plan on training the system further.

“I would expect with that data set, you can do even more stuff,” he predicted.
