The NLP that reveals human honesty.

Study: Advice from Algorithms Can Influence People to Act Dishonestly

It is safe to say that most of us think we are honest. We try to teach our children to be honest, too. However, there are moments when we falter. An article by Kyle Wiggers shows that our honesty can be manipulated.

A fascinating study published by researchers at the University of Amsterdam, Max Planck Institute, Otto Beisheim School of Management and the University of Cologne aims to discover the degree to which AI-generated advice can lead people to cross moral lines. In a large-scale survey leveraging OpenAI’s GPT-2 language model, the researchers found AI’s advice can “corrupt” people even when they’re aware the source of the advice is AI.

The experiment paired up dice rollers and paid them only when their reported outcomes matched, and its results are disturbing. The coauthors of this latest paper trained GPT-2 to generate “honesty-promoting” and “dishonesty-promoting” advice using a dataset of contributions from around 400 participants. Then they recruited a group of over 1,500 people to read instructions, receive the advice and engage in a task designed to assess honest or dishonest behavior.

People from the group were paired in “dyads” comprising a first and second “mover.” The first mover rolled a die in private and reported the outcome, while the second mover learned about the first mover’s report before rolling a die in private and then reporting the outcome. Only if the first and second mover reported the same outcome were they paid according to the value of the double die roll, with higher rolls corresponding to higher pay. They weren’t paid if they reported different outcomes.
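The payoff rule described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name `dyad_payoff` and the `pay_per_pip` parameter are hypothetical, and the assumption that pay scales linearly with the reported number is ours, since the text only says that higher rolls meant higher pay.

```python
def dyad_payoff(first_report: int, second_report: int, pay_per_pip: float = 1.0) -> float:
    """Payoff for each mover in one round of the dice task.

    Assumption: pay is proportional to the matched report. The article
    only states that higher matching reports earn more.
    """
    for report in (first_report, second_report):
        if report not in range(1, 7):
            raise ValueError("a die report must be an integer from 1 to 6")

    # Different reports: nobody is paid.
    if first_report != second_report:
        return 0.0

    # Matching reports: pay grows with the reported value.
    return first_report * pay_per_pip


# The payoff depends only on the reports, never on the private rolls,
# so a second mover who secretly rolled a 2 still earns the top payoff
# by reporting a 6 after the first mover reported a 6.
honest = dyad_payoff(first_report=6, second_report=2)
dishonest = dyad_payoff(first_report=6, second_report=6)
```

Because the rolls are private and only the reports are checked, the rule rewards matching a partner's high report whether or not it is true, which is what makes the task a measure of honesty.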

Here is a paragraph from the study:

“We conducted a large-scale, financially incentivized, pre-registered experiment. In it, we employ the state-of-the-art NLP algorithm, Generative PreTraining 2 (GPT-2) [22], to produce AI-advice texts. To facilitate reproducibility [23], the training data and code are openly available. We then assess participants’ (dis)honest behaviour in response to such advice. Thus, we extend prior work that has either examined stated preferences towards algorithms in hypothetical scenarios or confronted participants with human-written texts that were labelled as AI-generated. The current experiment is among the first to adopt a machine behaviour approach [1], assessing how actual algorithmic output influences human behavior.”

Before reporting the die roll outcome, people randomly assigned to different treatments read honesty-promoting or dishonesty-promoting advice that was either human-written or AI-generated. They either knew the source of the advice or knew there was a 50-50 chance that it came from either source. Those who didn’t know could earn bonus pay if they correctly guessed the source of the advice.
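The treatments described above combine three factors: the type of advice, its source, and whether the source was disclosed. A minimal sketch of such a design follows. It assumes the three factors are fully crossed and that assignment is uniformly random; the article does not confirm either point, the actual study may include further conditions, and all names here are hypothetical.

```python
import itertools
import random

# The three factors named in the article.
ADVICE_TYPES = ("honesty-promoting", "dishonesty-promoting")
SOURCES = ("human-written", "AI-generated")
DISCLOSURE = ("source known", "source unknown (50-50)")

# Assumption: a full crossing of the factors, giving 2 x 2 x 2 = 8 cells.
TREATMENTS = list(itertools.product(ADVICE_TYPES, SOURCES, DISCLOSURE))


def assign_treatment(rng: random.Random) -> tuple:
    """Pick one treatment cell uniformly at random (an assumed scheme)."""
    return rng.choice(TREATMENTS)


# Example: assign a handful of participants with a fixed seed.
rng = random.Random(0)
assignments = [assign_treatment(rng) for _ in range(5)]
```

Random assignment matters because it lets any difference in reported die rolls be attributed to the advice itself rather than to who chose to read it.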

According to the researchers, the AI-generated advice “corrupted” people, whether the source of the advice was disclosed to them or not. In fact, the statistical effect of AI-generated advice was indistinguishable from that of human-written advice. Even worse, honesty-promoting advice from AI failed to sway behavior.

The researchers say their study illustrates the importance of testing the influence of AI as a step toward maintaining responsible deployment. And they warn that people with malicious intentions could use AI to corrupt others.

“AI could be a force for good if it manages to convince people to act more ethically. Yet our results reveal that AI advice fails to increase honesty. AI advisors can serve as scapegoats to which one can deflect (some of the) moral blame of dishonesty…”

Transparency about algorithmic presence does not mitigate its harm, according to the researchers.

“When AI-generated advice aligns with individuals’ preferences to lie for profit, they gladly follow it, even when they know the source of the advice is an AI. It appears there is a discrepancy between stated preferences and actual behavior, highlighting the necessity to study human behavior in interaction with actual algorithmic outputs.”

The piece above, written by Kyle Wiggers, makes it worth considering whether to take advice from an AI at all.