Addressing bias in radiology machine learning systems

September 06, 2022
by John R. Fischer, Senior Reporter
Suboptimal practices in the development of machine learning systems put them at risk of producing biased insights when applied in radiology.

But researchers at Mayo Clinic have come up with several strategies for addressing these developmental problems and reducing the risk of biased results, with the first focusing on the data handling process and the 12 suboptimal practices associated with it.

"If these systematic biases are unrecognized or not accurately quantified, suboptimal results will ensue, limiting the application of AI to real-world scenarios,” said Dr. Bradley Erickson, professor of radiology and director of the AI Lab at the Mayo Clinic, in Rochester, Minnesota, in a statement.

The data handling process consists of four steps: data collection, data investigation, data splitting and data engineering. Each step is susceptible to suboptimal practices that can introduce bias.

The researchers recommend conducting in-depth reviews of the clinical and technical literature and working with data science experts to plan data collection. They also say data should be drawn from multiple institutions across different countries and regions, from different vendors and time periods, or from public data sets to ensure diversity.

"Creating a robust machine learning system requires researchers to do detective work and look for ways in which the data may be fooling you,” said Erickson. "Before you put data into the training module, you must analyze it to ensure it's reflective of your target population. AI won't do it for you."

The second and third reports in the series discuss biases that occur when developing and evaluating the model, and when reporting findings.

The findings were published in Radiology: Artificial Intelligence, a journal of the Radiological Society of North America.