Personal context can make LLMs more sycophantic

The latest large language models (LLMs) are designed to remember information from past conversations or user profiles, enabling them to generate personalized responses.

But researchers from MIT and Penn State University found that, in long conversations, this kind of personal context tends to make an LLM more likely to become overly agreeable or to begin mirroring the user’s views.

This phenomenon, known as sycophancy, can keep the model from telling users when they are wrong, undermining the accuracy of its responses. In addition, LLMs that mirror someone’s political beliefs or worldview can promote misinformation and distort users’ perception of reality.

Unlike many previous studies of sycophancy, which examine the behavior in a lab setting without conversational context, the MIT researchers collected two weeks of interaction data from people who used actual LLMs in their daily lives. They studied two settings: agreement with personal advice and mirroring of users’ beliefs in political explanations.

While conversational context increased agreement in four of the five LLMs they studied, the presence of a summarized user profile in the model’s memory had the greatest effect. Mirroring behavior, on the other hand, only increased when the model could accurately infer the user’s beliefs from the conversation.

The researchers hope these results will spur future work on personalization approaches that are more robust to LLM sycophancy.

“From the user’s point of view, this work highlights how important it is to understand that these models are dynamic and their behavior can change as you interact with them over time. If you talk with a model for a long time and it starts to reflect your own thinking back to you, you can end up inside an echo chamber that is hard to escape. That is a danger users should keep in mind,” says Shomik Jain, a graduate student in the Institute for Data, Systems, and Society (IDSS) and lead author of the paper on this study.

Jain is joined on the paper by Charlotte Park, a graduate student in electrical engineering and computer science (EECS) at MIT; Matt Viana, a graduate student at Penn State University; and the senior authors Ashia Wilson, the Lister Brothers Career Development Professor in EECS and a principal investigator in the Laboratory for Information and Decision Systems (LIDS), and Dana Calacci PhD ’23, an assistant professor at Penn State. The research will be presented at the ACM CHI Conference on Human Factors in Computing Systems.

Extended interactions

Having experienced sycophancy in their own interactions with LLMs, the researchers began thinking about the possible benefits and consequences of an overly agreeable model. But when they searched the literature, they found no studies that tried to understand sycophantic behavior during long-term LLM interactions.

“We use these models in extended interactions, where they have a lot of context and memory. But our evaluation methods are lagging behind. We wanted to test LLMs in the ways people actually use them, to understand how they behave in the wild,” says Calacci.

To fill this gap, the researchers designed a user study to examine two types of sycophancy: agreement sycophancy and perspective sycophancy.

Agreement sycophancy is the LLM’s tendency to be overly agreeable, sometimes to the point of giving incorrect information or refusing to tell the user that they are wrong. Perspective sycophancy occurs when the model reflects the user’s values and political views.

“We know a lot about the effects of interacting with people who are like-minded or different from us. But we don’t yet know the benefits or risks of extended interactions with like-minded AI models,” says Calacci.

The researchers built a user interface around an LLM and recruited 38 participants to chat with the model over a two-week period. Each participant’s conversations took place in the same context window, so all of their interaction data was retained.

Over the two weeks, the researchers collected an average of 90 queries from each user.

They then compared the behavior of five LLMs given this user context with the behavior of the same LLMs given no conversational data.
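
As a rough illustration (not code from the study), the sketch below shows one way such a with-and-without-context comparison could be structured: the same probe is sent to a model twice, once with a participant’s accumulated conversation history and once without, and both responses are scored for agreement. The call_model and agreement_score callables are hypothetical stand-ins for an actual LLM API and an agreement rater.

```python
# A minimal sketch, assuming a hypothetical call_model wrapper around whichever
# LLM API is being tested and a hypothetical agreement_score rater; this is not
# the researchers' code, only an illustration of the comparison's shape.
from typing import Callable, Dict, List

Message = Dict[str, str]  # e.g. {"role": "user", "content": "..."}


def compare_with_and_without_context(
    call_model: Callable[[List[Message]], str],   # wraps an LLM chat API
    history: List[Message],                       # participant's prior conversation
    probe: str,                                   # test prompt, e.g. advice the user proposes
    agreement_score: Callable[[str], float],      # 0 = pushes back, 1 = fully agrees
) -> Dict[str, float]:
    """Score the same probe with and without the participant's conversation history."""
    with_context = call_model(history + [{"role": "user", "content": probe}])
    without_context = call_model([{"role": "user", "content": probe}])
    return {
        "with_context": agreement_score(with_context),
        "without_context": agreement_score(without_context),
    }


# Toy usage with stand-in functions, just to show how the pieces fit together.
if __name__ == "__main__":
    def fake_model(messages: List[Message]) -> str:
        # Pretend the model agrees more readily once it has a long history.
        return "You are absolutely right!" if len(messages) > 1 else "I would push back on that."

    def fake_agreement(response: str) -> float:
        return 1.0 if "right" in response.lower() else 0.0

    history = [{"role": "user", "content": "I think I should quit my job tomorrow."}]
    print(compare_with_and_without_context(fake_model, history, "Should I do it?", fake_agreement))
```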

“We found that context really changes the way these models behave, and I would expect that to hold beyond sycophancy. And while sycophancy tends to increase, it doesn’t always. It really depends on the context itself,” says Wilson.

Context clues

For instance, when an LLM distills information about a user into a summary profile, it leads to an especially large increase in agreement sycophancy. This kind of user-profile feature is increasingly being baked into newer models.

They also found that random text from synthetic conversations increased the likelihood that some models would agree, even though that text contained no user-specific information. This suggests that conversation length can sometimes affect sycophancy more than its content, Jain adds.

But content matters most when it comes to perspective sycophancy. Conversational context only increased perspective sycophancy if it revealed specific information about the user’s political views.

To gain this insight, the researchers prompted the models to infer each user’s beliefs and then asked each participant whether the model’s inferences were correct. Users reported that the LLMs got their political views right about half the time.
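
A toy sketch of this kind of check, assuming a hypothetical ask_model wrapper and simple exact matching of stances (the study’s actual procedure is not described in that detail here), might look like this:

```python
# A hedged sketch of a belief-inference accuracy check; ask_model is a hypothetical
# stand-in for an LLM call, and exact string matching is a simplification.
from typing import Callable, List, Tuple


def belief_inference_accuracy(
    ask_model: Callable[[str], str],        # wraps an LLM call
    transcripts: List[Tuple[str, str]],     # (conversation text, user-confirmed stance)
) -> float:
    """Fraction of conversations where the model's inferred stance matches the user's report."""
    if not transcripts:
        return 0.0
    correct = 0
    for conversation, confirmed_stance in transcripts:
        prompt = (
            "Based on this conversation, what is the user's political stance "
            "on the issue discussed? Answer in one short phrase.\n\n" + conversation
        )
        guess = ask_model(prompt)
        correct += int(guess.strip().lower() == confirmed_stance.strip().lower())
    return correct / len(transcripts)


if __name__ == "__main__":
    # Toy check with a stand-in "model" that always returns the same stance.
    data = [("User: I think the new policy is a mistake ...", "opposes the policy")]
    print(belief_inference_accuracy(lambda prompt: "opposes the policy", data))  # 1.0
```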

“It’s easy to say, in retrospect, that AI companies should do this kind of testing. But it’s difficult, and it takes a lot of time and investment. Keeping people in the testing loop is expensive, but we’ve shown that it can reveal new insights,” says Jain.

Although reducing sycophancy was not the aim of their study, the researchers offer some recommendations.

For instance, to reduce sycophancy, models could be designed to better identify which information in their context and memory is relevant. Models could also be built to detect mirroring behavior and flag responses that show extreme agreement. And model developers could give users the ability to moderate personalization in longer conversations.
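
As a toy illustration of the flagging idea, a simple heuristic could scan responses for strong-agreement phrases and flag those that cross a threshold. The phrase list and threshold below are illustrative assumptions, not a validated detector from the paper.

```python
# A minimal sketch of flagging responses with extreme agreement; the marker
# phrases and threshold are illustrative assumptions, not from the study.
import re
from typing import List

AGREEMENT_MARKERS: List[str] = [
    r"\byou'?re (absolutely|completely|totally) right\b",
    r"\bgreat (idea|point|question)\b",
    r"\bi (completely|totally|fully) agree\b",
    r"\bthat'?s a (brilliant|perfect) plan\b",
]


def flag_extreme_agreement(response: str, threshold: int = 2) -> bool:
    """Return True if the response matches several strong-agreement patterns."""
    hits = sum(bool(re.search(p, response, re.IGNORECASE)) for p in AGREEMENT_MARKERS)
    return hits >= threshold


if __name__ == "__main__":
    reply = "You're absolutely right, that's a brilliant plan, and I completely agree."
    print(flag_extreme_agreement(reply))  # True: candidate for a warning or a second look
```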

“There are many ways to personalize models without making them overly agreeable. The line between personalization and sycophancy is not a bright one, and distinguishing the two is an important area for future work,” says Jain.

“At the end of the day, we need better ways to capture the dynamics and complexity of what goes on during long conversations with LLMs, and how things can go wrong during that long process,” adds Wilson.
