Two AI Algorithms Create Two Different Viewpoints, Biased by Global Politics
In a world deeply divided politically on nearly every continent, it should come as no surprise that AI programming carries divisions of its own, rooted in the data it is trained on. Once a program is running, the platforms it serves can give very different answers to the same question. This has researchers concerned.
AI is hardly confined by international borders, as businesses, universities, and governments tap a global pool of ideas, algorithms, and talent. Yet the AI programs that result from this global gold rush can still reflect deep cultural divides.
New research shows how government censorship affects AI algorithms—and can influence the applications built with those algorithms.
Margaret Roberts, a political science professor at UC San Diego, and Eddie Yang, a Ph.D. student, examined AI language algorithms trained on two sources: the Chinese-language version of Wikipedia, which is blocked within China; and Baidu Baike, a similar site operated by China’s dominant search engine, Baidu, that is subject to government censorship. Baidu declined to comment.
The wired.com story by Will Knight shows just how divided AI training data can be, and how that division shapes the results.
The researchers were curious whether AI algorithms could learn the censorship of certain words and phrases and carry it into the software built on them. That could influence the language a chatbot or voice assistant uses, the phrasing a translation program chooses, or the text produced by autocomplete tools.
The UCSD researchers found key differences in the resulting AI algorithms that showed how information is censored in China. The one trained on Chinese Wikipedia placed “democracy” closer to positive words, such as “stability.” The algorithm trained on Baidu Baike placed “democracy” closer to “chaos.”
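The “closeness” being measured here is word-embedding similarity. A toy sketch makes the idea concrete: the hand-made 3-d vectors below stand in for real embeddings (which have hundreds of dimensions learned from each corpus), and all the numbers are invented purely for illustration.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Invented toy "embeddings" mimicking the study's finding: the two
# corpora place "democracy" in different neighborhoods.
wiki_vecs = {
    "democracy": [0.9, 0.1, 0.2],
    "stability": [0.8, 0.2, 0.1],
    "chaos":     [0.1, 0.9, 0.3],
}
baike_vecs = {
    "democracy": [0.2, 0.8, 0.4],
    "stability": [0.9, 0.1, 0.1],
    "chaos":     [0.1, 0.9, 0.3],
}

for name, vecs in [("wikipedia", wiki_vecs), ("baike", baike_vecs)]:
    d = vecs["democracy"]
    print(name,
          "stability:", round(cosine(d, vecs["stability"]), 2),
          "chaos:", round(cosine(d, vecs["chaos"]), 2))
```

With these invented vectors, “democracy” sits nearer “stability” in the Wikipedia-style space and nearer “chaos” in the Baike-style space, mirroring the pattern the researchers report.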
Roberts and Yang then used the algorithms to build two programs to assess the sentiment—the positive versus negative meaning—of news headlines. They found that the one trained on Chinese Wikipedia assigned more positive scores to headlines that mentioned terms including “election,” “freedom,” and “democracy,” while the one trained on Baidu Baike assigned more positive scores to headlines featuring “surveillance,” “social control,” and “CCP.” The study will be presented at the 2021 Conference on Fairness, Accountability, and Transparency (FAccT) in March.
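The paper’s exact sentiment pipeline isn’t described in this story, but a common way to build such a scorer on top of embeddings is to compare a headline’s words against positive and negative seed words. The sketch below uses invented 2-d vectors and the hypothetical seeds “good” and “bad” for illustration only.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Invented 2-d vectors standing in for embeddings learned from one corpus.
vecs = {
    "election":     [0.9, 0.1],
    "freedom":      [0.8, 0.2],
    "surveillance": [0.1, 0.9],
    "good":         [1.0, 0.0],   # positive seed word (hypothetical choice)
    "bad":          [0.0, 1.0],   # negative seed word (hypothetical choice)
}

def headline_score(headline):
    """Mean similarity of the headline's known words to the positive
    seed minus mean similarity to the negative seed; >0 leans positive."""
    words = [w for w in headline.lower().split()
             if w in vecs and w not in ("good", "bad")]
    if not words:
        return 0.0
    pos = sum(cosine(vecs[w], vecs["good"]) for w in words) / len(words)
    neg = sum(cosine(vecs[w], vecs["bad"]) for w in words) / len(words)
    return pos - neg

print(headline_score("freedom election today"))  # positive under these vectors
print(headline_score("surveillance expands"))    # negative under these vectors
```

Swapping in embeddings trained on a different corpus would flip which headlines score positively, which is the study’s central point: the classifier inherits the geometry of its training data.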
Racial Bias in AI Training
Seeflection.com has reported several times on the racial biases still being built into AI programming, whether by passing on systemic racism or by old-fashioned racism deliberately embedded.
In recent years, researchers have highlighted how race and gender biases are baked into many programs. Algorithms trained on text scraped from the web or old books, for instance, will learn to replicate the biases displayed by the human authors of that text. In 2018, researchers at Google demonstrated cultural biases in image recognition algorithms, which may, for example, recognize only Western wedding scenes.
Roberts says researchers and policymakers need to consider how governments in the future might influence how AI systems are trained in order to make censorship more effective or export particular values.
Graeme Hirst, a professor at the University of Toronto who specializes in computational linguistics and natural language processing, argues that the differences the study found may reflect cultural differences rather than censorship. It is also possible that Chinese Wikipedia contains anti-Chinese or overtly pro-democracy content, he says. Hirst adds that it is unclear how the sentiment analysis was done and whether bias may have been introduced there.
Others see this as a chance to study the biases AI develops and to discern whether they are built in through human input or arise from the system’s own processing of its training data.
read more at wired.com