Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) reviews potential geographic biases in medical data. Image is from an HAI conference. (Source: HAI)

Three States Account for Majority of Medical Data Used to Train AI Systems

You are probably aware of some of the problems AI developers have had with racial bias. Twitter recently came under scrutiny for how its algorithm crops photos, with critics pointing out that people from minority groups tend to be cropped out of images on the platform. Now another form of bias has been identified, this one involving the geography of the data used to build AI training tools. Three states appear to supply most of the patient data behind these tools.

This week, an article on scienceblog.com discussed a review showing that medical AI algorithms are being developed and trained in only a limited number of areas of the country. The pattern echoes an older one in medicine: treatments were once tested chiefly on a narrow group of white adults, yet expected to work the same way for everyone across the spectrum of races and ages, until a law passed in the 1990s required testing to include participants across these categories.

The article claims we are now at risk of repeating these same mistakes as we develop new technologies. Researchers from Stanford University examined clinical applications of machine learning and found that most algorithms are trained on datasets from patients in only three geographic areas, and that the majority of states have no represented patients whatsoever.

“AI algorithms should mirror the community,” says Amit Kaushal, an attending physician at VA Palo Alto Hospital and Stanford adjunct professor of bioengineering. “If we’re building AI-based tools for patients across the United States, as a field, we can’t have the data to train these tools all coming from the same handful of places.”

Kaushal, along with Russ Altman, a Stanford professor of bioengineering, genetics, medicine, and biomedical data science, and Curt Langlotz, a professor of radiology and biomedical informatics, examined five years of peer-reviewed studies that trained a deep-learning algorithm for a diagnostic task to assist with patient care. Among U.S. studies whose geographic origin could be characterized, the majority (71%) used patient data from California, Massachusetts, or New York to train the algorithms, and some 60% relied solely on these three locales. Thirty-four states were not represented at all, while the other 13 states contributed only limited data.
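Those percentages come from a straightforward tally over each study's data sources. As a rough sketch of how such a tally works, here is a short Python example; the study lists and the resulting numbers below are made up for illustration and are not the Stanford team's actual data.

```python
# Illustrative sketch only: hypothetical study records, not the Stanford dataset.
# Each entry lists the U.S. states that contributed training data to one study.
studies = [
    {"CA"}, {"CA", "NY"}, {"MA"}, {"NY"}, {"CA", "WA"},
    {"MA", "NY"}, {"TX", "CA"}, {"NY"}, {"CA"}, {"MN"},
]

BIG_THREE = {"CA", "MA", "NY"}
TOTAL_STATES = 50

# Studies that used any data from the three dominant states.
uses_big_three = sum(1 for s in studies if s & BIG_THREE)
# Studies whose training data came only from those three states.
only_big_three = sum(1 for s in studies if s <= BIG_THREE)

# States that appear in at least one study's training data.
represented = set().union(*studies)
unrepresented = TOTAL_STATES - len(represented)

print(f"Studies using CA/MA/NY data: {uses_big_three / len(studies):.0%}")
print(f"Studies relying solely on CA/MA/NY: {only_big_three / len(studies):.0%}")
print(f"States with no represented patients: {unrepresented}")
```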

The research did not expose bad outcomes from AI trained on data from these geographies, but it raised questions about how valid the algorithms are for patients in other areas.

“We need to understand the impact of these biases and whether considerable investments should be made to remove them,” says Altman, associate director of the Stanford Institute for Human-Centered Artificial Intelligence.

One takeaway from the article is this: limited data means limited vision. That is something to be more mindful of at every level of AI development, deployment, and direction.

read more at scienceblog.com