Algorithm Struggles with Separating Satire from Factual News

If you spend any time on social media, you know how difficult it can be to tell a real story from clickbait. Several studies have shown that different age groups are fooled by fake news posts at different rates. It seems even AI can’t tell what’s real 100% of the time, but it’s getting closer.

A team of researchers set out to teach an AI to tell satire from fake news. According to a recent article, the distinction usually comes down to semantic and linguistic differences, but the nuances can be tough to spot. That’s why researchers at George Washington University, Amazon AWS AI, and startup AdVerifai investigated a machine learning approach to classifying misleading speech. They say the AI model they developed, which outperformed the baseline, lays the groundwork for the study of other linguistic features.

Their work follows that of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), which earlier this year built an AI model that could judge whether a source is accurate or politically biased.

The researchers used a statistical technique called principal component analysis to convert potentially correlated metrics into uncorrelated variables (or principal components). They fed these components into two logistic regression models (functions that model the probability of certain classes), with the fake and satire labels as the dependent variables. Next, they evaluated the models’ performance on a corpus containing 283 fake news stories and 203 satirical stories that had been verified by hand.
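To make the two steps concrete, here is a minimal, dependency-free sketch of principal component analysis followed by logistic regression. It is not the researchers’ code or data: the two correlated features are synthetic stand-ins for linguistic metrics, the labels (0 for fake, 1 for satire) are invented, and the function names pca_2d, fit_logistic and predict are hypothetical. It also trains a single binary classifier rather than the two models described above, and handles only two features so the eigenvectors can be written in closed form.

```python
import math
import random


def pca_2d(rows):
    """Rotate two-feature rows onto their principal components."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    centered = [(x - mean_x, y - mean_y) for x, y in rows]
    # Entries of the symmetric covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    if abs(b) < 1e-12:
        # Features are already uncorrelated; just order the axes by variance.
        axes = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    else:
        # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
        mid = (a + c) / 2
        spread = math.sqrt(((a - c) / 2) ** 2 + b * b)
        axes = []
        for eigenvalue in (mid + spread, mid - spread):
            norm = math.hypot(b, eigenvalue - a)
            axes.append((b / norm, (eigenvalue - a) / norm))
    # Each output row holds the scores on the first and second component.
    return [tuple(x * ax + y * ay for ax, ay in axes) for x, y in centered]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) - label
            grad_w = [g + error * v for g, v in zip(grad_w, row)]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, row):
    """Return 1 (satire) when the modelled probability is at least 0.5."""
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) >= 0.5 else 0


# Synthetic data (an assumption for illustration): two correlated features
# per article, with label 0 standing in for "fake" and 1 for "satire".
rng = random.Random(0)
rows, labels = [], []
for label, centre in ((0, 2.0), (1, 8.0)):
    for _ in range(50):
        first = rng.gauss(centre, 1.0)
        rows.append((first, first + rng.gauss(0.0, 0.5)))
        labels.append(label)

scores = pca_2d(rows)                       # uncorrelated components
weights, bias = fit_logistic(scores, labels)
accuracy = sum(predict(weights, bias, r) == y for r, y in zip(scores, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy printed here is measured on the same synthetic rows used for training, so it says nothing about the study’s results. In practice a library such as scikit-learn would likely replace the hand-written pieces.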

The top-performing algorithm achieved a score of 0.78, where 1 is a perfect score. It also revealed that satirical articles tended to be more sophisticated (and harder to read) than fake news articles.

“Overall, our contributions, with the improved classification accuracy and toward the understanding of nuances between fake news and satire, carry great implications with regard to the delicate balance of fighting misinformation while protecting free speech,” they wrote.

The article explains that fake news sites have begun claiming to be satirical, which makes them harder for the AI to detect. In future work, the researchers plan to study linguistic cues such as absurdity, incongruity and other humor-related features.