{"subscriber":false,"subscribedOffers":{}} Mitigating Racial And Ethnic Bias And Advancing Health Equity In Clinical Algorithms: A Scoping Review | Health Affairs


Review Article
Health Equity

Mitigating Racial And Ethnic Bias And Advancing Health Equity In Clinical Algorithms: A Scoping Review

Affiliations
  1. Michael P. Cary Jr. ([email protected]), Duke University, Durham, North Carolina.
  2. Anna Zink, University of Chicago, Chicago, Illinois.
  3. Sijia Wei, Northwestern University, Chicago, Illinois.
  4. Andrew Olson, Duke University.
  5. Mengying Yan, Duke University.
  6. Rashaud Senior, Duke University.
  7. Sophia Bessias, Duke University.
  8. Kais Gadhoumi, Duke University.
  9. Genevieve Jean-Pierre, Duke University.
  10. Demy Wang, Duke University.
  11. Leila S. Ledbetter, Duke University.
  12. Nicoleta J. Economou-Zavlanos, Duke University.
  13. Ziad Obermeyer, University of California Berkeley, Berkeley, California.
  14. Michael J. Pencina, Duke University.
PUBLISHED: Open Access. https://doi.org/10.1377/hlthaff.2023.00553

Abstract

In August 2022 the Department of Health and Human Services (HHS) issued a notice of proposed rulemaking prohibiting covered entities, which include health care providers and health plans, from discriminating against individuals when using clinical algorithms in decision making. However, HHS did not provide specific guidelines on how covered entities should prevent discrimination. We conducted a scoping review of literature published during the period 2011–22 to identify health care applications, frameworks, reviews and perspectives, and assessment tools that identify and mitigate bias in clinical algorithms, with a specific focus on racial and ethnic bias. Our scoping review encompassed 109 articles comprising 45 empirical health care applications that included tools tested in health care settings, 16 frameworks, and 48 reviews and perspectives. We identified a wide range of technical, operational, and systemwide bias mitigation strategies for clinical algorithms, but there was no consensus in the literature on a single best practice that covered entities could employ to meet the HHS requirements. Future research should identify optimal bias mitigation methods for various scenarios, depending on factors such as patient population, clinical setting, algorithm design, and types of bias to be addressed.


In August 2022 the Department of Health and Human Services (HHS) issued a notice of proposed rulemaking that would revise the interpretation of Section 1557 of the Patient Protection and Affordable Care Act, which prohibits discrimination on the basis of race, color, national origin, sex, age, or disability.1 This notice includes a new provision stating that covered entities, which include health care providers and health plans,2 must not discriminate against any individual through the use of clinical algorithms in decision making;1 however, it does not specify what measures covered entities should take to ensure this. Instead, it solicits comments on practices to ensure that algorithms are not discriminatory and requests resources and recommendations on identifying and mitigating discrimination that results from the use of clinical algorithms.

In response to concerns about algorithmic bias, other federal agencies and nonprofit organizations have recently taken action to regulate algorithms or support governance and oversight measures aimed at making algorithms safe, fair, and transparent.3,4 These efforts are reflected in the Food and Drug Administration’s final guidance on the principles of software validation;5 the Agency for Healthcare Research and Quality’s Impact of Healthcare Algorithms on Racial and Ethnic Disparities in Health and Healthcare research protocol;6 the National Institute of Standards and Technology’s Artificial Intelligence (AI) Risk Management Framework;7 the Coalition for Health AI’s Blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare, version 1.0;8 and the White House Office of Science and Technology Policy’s Blueprint for an AI Bill of Rights.9

These publications join a growing body of academic and professional literature that defines, describes, and provides ways to measure algorithmic bias, harmonized around principles intended to guide the development of algorithms, such as safety, fairness, and transparency. Although there is broad agreement on the need to remove harmful bias from clinical algorithms, there is little consensus on how to achieve this critical objective. As health care systems develop and implement these technologies, researchers, developers, and clinicians who build and deploy clinical algorithms need concrete strategies, methods, and tools that enable them to identify and mitigate bias.

Anticipating that many covered entities will actively assess whether their clinical algorithms comply with the proposed new Section 1557 provision and attempt to correct any biases they uncover, we identified health care applications, frameworks, and tools for identifying and mitigating bias in clinical algorithms, focusing on racial and ethnic bias. Our study expanded on previous reviews10–13 to capture the full breadth of published resources. We incorporated articles from journals focused on clinical practice, public health, ethics, law, public policy, data science, computer science, machine learning, and AI. This comprehensive, multidisciplinary scope allowed us to summarize a full suite of mitigation approaches that can be applied to the HHS directive to prevent discrimination arising from use of algorithms in clinical decision making.

Study Data And Methods

Design

We followed the JBI scoping review methodology;14 in this article we report our findings in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR).15,16

Information Sources

We searched Medline (PubMed), Embase (Elsevier), Web of Science (Clarivate), ProQuest Computer Science Database, and ProQuest Dissertations and Theses Global.

Search Strategy

The search was developed and conducted by a medical librarian, with input from coauthors. It included a mix of keywords and subject headings, including algorithm, bias, mitigation/assessment, health care, and race/ethnicity. The original searches, conducted on August 24, 2022, produced 18,028 citations. The searches were independently peer reviewed by another medical librarian, using a modified Peer Review of Electronic Search Strategies (PRESS) checklist. The author team also assessed several sources of grey literature through a targeted web search of government entities, including the National Institute of Standards and Technology. Full reproducible search strategies for all included databases are detailed in online appendix exhibit A1.17

Study Selection

After the search, all identified studies were uploaded into Covidence, a software system for managing systematic reviews, which removed 6,381 duplicates, leaving 11,647 citations to be screened at the title and abstract phase.

Eligibility Criteria

We systematically screened studies to include applications, frameworks, reviews and perspectives, and tools that dealt with racial and ethnic bias mitigation in algorithms used either to guide care decisions for individual patients or to inform decisions for achieving population health goals, such as efficiently allocating health care resources; mitigation strategies could be applied at any stage in the algorithm development life cycle, from pre- to postdeployment. We excluded conference abstracts and dissertations.

After studies’ titles and abstracts were screened for relevance, their full texts were reviewed to verify that they met inclusion and exclusion criteria. All aspects of screening were performed by at least two independent reviewers. At each stage, disagreements between reviewers were resolved by a third reviewer. The study selection process is represented in appendix exhibit A2.17

Data Extraction And Synthesis

Before extracting studies, we tested the extraction matrix on four studies—one for each study type: applications, frameworks, reviews and perspectives, and tools. We revised the matrix to reflect consensus among reviewers and developed an extraction manual to ensure consistency. Because there were relatively few tool studies and we extracted the same data elements from the tools and applications studies, we combined applications and tools into a single category. Two reviewers independently extracted data from the full texts of all eligible studies. Conflicts between reviewers were resolved by a third reviewer.

Limitations

Our study had several limitations. First, our review was limited to clinical algorithms used by health care providers, and it excluded rule-based algorithms. Second, our search returned few to no examples of mitigation methods for text-based algorithms and generative AI, so we cannot speak to the best mitigation methods for these types of algorithms. Other limitations included restrictions imposed on our sample (English-language studies only and the 2011–22 study interval). Finally, despite our efforts to implement best practices for a comprehensive and inclusive search strategy, relevant studies may have been missed, and specific mitigation strategies may have been overlooked or misclassified by independent reviewers during data extraction.

Study Results

Of the 11,579 studies that remained after our deduplication process, 11,233 were excluded during title and abstract screening. A total of 346 studies were sought for retrieval; one was unavailable, and the full texts of the remaining 345 studies (plus an additional nine studies identified from other methods via targeted web searches) were assessed for eligibility on the basis of the study criteria. After we excluded 245 studies that did not meet our criteria, a final total of 109 studies were included: 106 studies from the database search plus three studies identified from other sources (a full list of the studies included in our review is in appendix exhibit A3).17

Of the three study types (applications and tools, frameworks, and reviews and perspectives), the most common were reviews and perspectives (n=48 [44 percent]) and health care applications and tools (n=45 [41 percent]); frameworks were the least common (n=16 [15 percent]). Most of the studies we identified (n=101 [93 percent]) were published within the past four years (exhibit 1).

Exhibit 1 Number of studies on bias mitigation strategies in clinical algorithms, by study type, 2011–22

SOURCE Authors’ analysis of studies that identified and mitigated bias in clinical algorithms. NOTES The study types are described in greater detail in the text. A complete list of studies included in the authors’ scoping review is in appendix exhibit A3 (see note 17 in text).

Mitigation Strategies By Study Type

To facilitate interpretation of our findings, we classified mitigation strategies into three strategy categories (exhibit 2): technical (such as data collection, algorithm design, preprocessing, processing, postprocessing, and monitoring postdeployment), operational (governance, design principles, and interdisciplinary, multistakeholder engagement), and systemwide (training and education, collaborative platforms, standards, incentives, and regulation).

Exhibit 2 Number of studies on bias mitigation strategies in clinical algorithms, by strategy category and study type, 2011–22

SOURCE Authors’ analysis of studies that identify and mitigate bias in clinical algorithms. NOTES The strategy categories and study types are described in greater detail in the text. Applications and tools have been combined as explained in the text. A complete list of studies included in the authors’ scoping review is in appendix exhibit A3 (see note 17 in text).

There were notable differences in mitigation strategies discussed across study types. Most applications and tools studies focused on technical solutions to bias mitigation, such as better selection of prediction targets and reweighting sample data to resemble the target population. Frameworks and reviews and perspectives, in contrast, more often included discussions of nontechnical solutions, such as the role of governance and the need for systemwide standards.

Technical Strategies:

Technical strategies were grouped into six categories on the basis of the point in the algorithmic life cycle at which bias was addressed:

  • Data collection: strategies that involved the collection of higher-quality data from representative patient populations.
  • Algorithm design: strategies related to the selection of the outcome, predictor variables, and algorithms used for the prediction problem.
  • Preprocessing: strategies applied before training the algorithm (that is, the process of teaching it to make predictions or decisions based on a data set), including changing the data via weighting18–22 or sampling18,23–25 methods to make it more representative of the population in which it would be used (a minimal weighting sketch follows this list).
  • Processing: strategies that altered the training of the algorithm, including adjusting the algorithm’s objective function to incorporate some aspect of fairness in addition to statistical fit;24–31 for example, a mathematical formula that attempts to minimize errors while ensuring that error rates are similar across race groups.
  • Postprocessing: strategies that updated the results after prediction, through recalibration32 or by varying the cut points or thresholds at which the model output was used to define a category of risk or to recommend that a clinician take action, in order to achieve fairness.20,33–35
  • Monitoring postdeployment: strategies related to tracking the performance of the algorithm after it had been trained, such as checking that its performance was not degrading over time and measuring the impact on treatment allocation and health outcomes.
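As a minimal sketch of the preprocessing (weighting) category, the Python snippet below reweights a synthetic training sample so that its group mix matches an assumed target population before fitting a standard model. The group labels, target shares, and variables are hypothetical, and the studies in our review used a variety of related weighting schemes rather than this exact one.

  import numpy as np
  import pandas as pd
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)
  n = 5000
  df = pd.DataFrame({
      "group": rng.choice(["A", "B"], size=n, p=[0.85, 0.15]),  # group B underrepresented in training data
      "x1": rng.normal(size=n),
      "x2": rng.normal(size=n),
  })
  df["y"] = (df["x1"] + 0.5 * df["x2"] + rng.normal(size=n) > 0).astype(int)

  # Assumed shares of each group in the population where the model will be deployed.
  target_share = {"A": 0.60, "B": 0.40}
  observed_share = df["group"].value_counts(normalize=True)

  # Weight each record by (target share / observed share) for its group so that
  # the weighted training sample resembles the target population.
  weights = df["group"].map(lambda g: target_share[g] / observed_share[g])

  model = LogisticRegression()
  model.fit(df[["x1", "x2"]], df["y"], sample_weight=weights)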

In exhibit 3 we summarize some of the most common technical strategies for mitigating racial and ethnic bias. These include stratifying or calibrating algorithms by race, weighting methods, and adjusting the algorithm’s objective function. These strategies span clinical applications and algorithms. Almost every study reported some success in mitigating bias using these strategies, although this sometimes came at a cost to other statistical performance measures.21
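To make the objective-function adjustment concrete, the sketch below adds a penalty to a standard logistic regression loss so that average prediction error is similar across two groups. The synthetic data, the form of the penalty, and the weight lam are illustrative assumptions rather than the method of any particular study cited here.

  import numpy as np
  from scipy.optimize import minimize

  rng = np.random.default_rng(1)
  n = 2000
  X = rng.normal(size=(n, 2))
  group = rng.integers(0, 2, size=n)  # 0/1 group indicator
  y = (X[:, 0] + 0.3 * group + rng.normal(size=n) > 0).astype(int)

  def sigmoid(z):
      return 1.0 / (1.0 + np.exp(-z))

  def objective(beta, lam=5.0):
      # Standard log loss plus a penalty on the gap in average loss between groups.
      p = sigmoid(beta[0] + X @ beta[1:])
      eps = 1e-9
      per_record = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
      gap = abs(per_record[group == 0].mean() - per_record[group == 1].mean())
      return per_record.mean() + lam * gap

  beta_hat = minimize(objective, x0=np.zeros(3), method="Nelder-Mead").x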

Exhibit 3 Selected technical bias mitigation strategies reported in empirical health care applications, by category, from a review of studies on bias in clinical algorithms, 2011–22

Categories | Strategies reported | Studies
Algorithm design | New outcome variable | Obermeyer, 2019 (26); Landy, 2021 [a]; Pierson, 2021 (40)
Algorithm design | Remove race and ethnicity and social determinants of health from the model | Samorani, 2020 (37); Gama, 2021 (38); Park, 2021 (39); Buckley, 2022 (27); Huang, 2022 (10)
Algorithm design | Add race and ethnicity and social determinants of health to the model | Hammond, 2020 [a]; Weissman, 2021 (18); Segar, 2022 (28)
Algorithm design | Determine when to add or remove sensitive variables | Yan, 2022 (29)
Algorithm design | Use different algorithm | Pierson, 2021 (40); Segar, 2022 (28)
Algorithm design | Stratify models by race | Shores, 2013 [a]; Akbilgic, 2018 (23); Do, 2020 [a]; Borgese, 2021 (24); Afrose, 2022 (33); Puyol-Antón, 2022 (19); Segar, 2022 (28); Foryciarz, 2022 (21)
Preprocessing | Weighting methods | Coston, 2021 (22); Radovanović, 2019 (25); Allen, 2020 (30); Park, 2021 (39); Mosteiro, 2022 (31)
Preprocessing | Sampling methods | Afrose, 2022 (33); Puyol-Antón, 2022 (19); Park, 2022 [a]; Reeves, 2022 (46)
Preprocessing | Data augmentation | Burlina, 2021 (34)
Preprocessing | Disparate impact remover to debias variables | Park, 2022 [a]
Processing | Adjust the algorithm’s objective function [b] | Samorani, 2020 (37); Adeli, 2021 (35); Park, 2021 (39); Pfohl, 2021 (32); Foryciarz, 2022 (21); Mosteiro, 2022 (31); Puyol-Antón, 2022 (19); Perez Alday, 2022 [a]
Processing | Bias correction | Afrose, 2022 (33)
Processing | Adversarial and transfer learning | Radovanović, 2019 (25); Gao, 2020 [a]; Toseef, 2022 [a]
Postprocessing | Varying cut points or thresholds | Radovanović, 2019 (25); Gianattasio, 2020 [a]; Thompson, 2021 (20); Rodolfa, 2021 [a]
Postprocessing | Recalibration | Barda, 2021 [a]

SOURCES Studies included in the authors’ scoping review. Numbers in parentheses refer to endnotes in the text. NOTES These techniques were sourced from a variety of clinical applications with different clinical algorithms. The bias-mitigation strategies that we report excluded data collection suggestions such as using more recent data or collecting more (diverse) data and monitoring postdeployment.

[a] Complete citation in appendix exhibit A3 (see note 17 in text).

[b] There are numerous methods that can be used to adjust an algorithm’s objective function, including regularization and constrained optimization.
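For the postprocessing rows in exhibit 3, a minimal sketch of varying cut points or thresholds is shown below: group-specific decision thresholds are chosen on labeled data so that sensitivity (the true positive rate) is similar across groups. The risk scores, group labels, and target sensitivity are fabricated for illustration and do not correspond to any cited application.

  import numpy as np

  rng = np.random.default_rng(2)
  n = 3000
  group = rng.choice(["A", "B"], size=n)
  y = rng.integers(0, 2, size=n)
  # Simulate risk scores that run systematically lower for group B at the same outcome.
  risk = np.clip(rng.beta(2, 5, size=n) + 0.25 * y - 0.10 * (group == "B"), 0, 1)

  target_sensitivity = 0.80
  thresholds = {}
  for g in ["A", "B"]:
      positives = risk[(group == g) & (y == 1)]
      # Choose the score above which `target_sensitivity` of true positives in this group fall.
      thresholds[g] = np.quantile(positives, 1 - target_sensitivity)

  # Flag each patient using the threshold for their group.
  flag = risk >= np.array([thresholds[g] for g in group])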

Operational Strategies:

Operational strategies were those applied across algorithms deployed within organizations. These included governance of algorithms, incorporation of design principles (including accountability, explainability, interpretability, transparency, and usability) into algorithm development and maintenance, and the engagement of interdisciplinary teams and varied stakeholders (exhibit 2).

Operational strategies reflect the need for technical expertise, as well as institutional knowledge regarding potential sources of bias. For example, in health care settings it is often not feasible to measure the outcome of interest, such as the need for care, so a proxy for that outcome, such as the cost of care or the number of health care visits, must be used. Given well-documented disparities in access and care, proxy variables such as cost reflect these differences and may exacerbate disparities.36 Interdisciplinary teams and multistakeholder engagement can bring necessary perspective and debate to decisions about what outcomes to use and the potential risks of bias. For example, the inclusion of race in clinical algorithms requires careful consideration: Some studies found benefit in removing race and ethnicity from clinical algorithms,8,27,37–39 whereas others concluded that it was preferable to include them.27,37,40 Different algorithms have different objectives, data inputs, and intended applications, all of which can influence the decision to include or exclude race. Governance boards can help monitor these complicated decisions, and design principles such as transparency ensure that all algorithm users know when predictor variables such as race are included in algorithms and the rationale for their inclusion.

Systemwide Strategies:

Studies presenting systemwide strategies were largely authored by researchers, industry leaders, and government officials and were often written collaboratively by authors affiliated with different institutions and oriented toward broader health and social policy issues. Systemwide strategies included updates to training and education about risks for algorithmic bias, collaborative platforms to aid organizations with algorithmic auditing, the creation of algorithm standards, increased incentives for bias mitigation, and regulation of algorithms (exhibit 2).

Gaps In The Literature

Of the sixteen studies that presented frameworks for addressing bias in clinical algorithms, only five identified health equity as a key component, and only three tested frameworks in real-world clinical settings.41–45

Of the forty-five health care applications and tools tested in health care settings, nearly all addressed bias-mitigation strategies used during predeployment, such as sampling and weighting techniques and regularization methods. There was less evidence regarding clinical algorithms that had been deployed (a central component discussed in the frameworks), and only one study26 presented a mitigation strategy applied in clinical practice. We did not identify any prospective studies. Only eighteen of forty-five health care applications included links to the code used to implement the mitigation methods.

Although the studies presented numerous techniques that could be used to mitigate racial and ethnic bias in algorithms, most did not address the specific and complex questions of when and how they should be used. Only seven studies compared the performance of different techniques to mitigate racial and ethnic bias specifically.19,21,31,34,37,39,46 We also found minimal or no information regarding the involvement of nonphysician stakeholders in the design, evaluation, or deployment of or reporting on clinical algorithms.

Discussion

This scoping review included 109 studies describing health care applications, frameworks, reviews and perspectives, and tools related to mitigating racial and ethnic bias in algorithms used to guide health care decisions. Although other related reviews have been conducted,6,10 we believe that this review was the largest, most comprehensive summary of bias-mitigation strategies and methods in the academic and grey literatures to date.

The bias-mitigation approaches we reviewed tended to be either highly specific technical guidance or high-level, nontechnical surveys of strategies. This dichotomy is particularly challenging because bias in clinical algorithms, depending on the mitigation category, requires solutions based on social science expertise, statistical expertise, clinical expertise, or some combination of the three, underscoring the need for a professionally diverse, appropriately trained workforce. For example, social scientists tend to conceive of bias as cognitive dispositions or inclinations in human thinking and reasoning, to be addressed during algorithm design or preprocessing. Statisticians and data scientists, however, often consider bias to be estimation error, to be addressed programmatically during the preprocessing and processing stages. Further downstream, clinicians view bias as a contributing factor to health inequities, which can be exacerbated by disparities in health care access, allocation, and outcomes. The increasing use of clinical algorithms has created a need for new competencies among health professionals,47 enabling them to identify sources of bias across the algorithmic life cycle and to apply a health equity lens to evaluate and inform algorithm use.48

In our review, we identified mitigation techniques that specifically addressed racial and ethnic bias. This required special consideration, because the data used to train algorithms often reflect structural inequities in health care systems arising from racism and its interactions with social determinants of health. Whether to include race and ethnicity as predictors was an important focus: some study authors included race and ethnicity as predictors, whereas others opted to remove them. It is important to acknowledge that race and ethnicity are not biological factors but, rather, social constructs historically used to categorize and differentiate groups. Therefore, the decision to include or exclude race or ethnicity in a clinical model must be considered carefully in every context, and its impact on health equity should be thoroughly examined.49 When feasible, relying on more direct measures, rather than proxies, of the outcomes being studied can prevent biases associated with the use of race and ethnicity as predictors.

Our review highlighted the numerous choices required during the process of mitigating bias in clinical algorithms, encompassing the choice of variables to include as predictors, the selection of fairness metrics used to measure bias, and the choice of mitigation methods. Even quantifying algorithmic bias (often measured using fairness metrics) can be difficult. Fairness metrics involve inherent trade-offs, and static fairness criteria may lead to delayed long-term harm.
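As one concrete way to quantify bias, the sketch below computes two commonly used fairness metrics, the demographic parity difference and the equal-opportunity difference, on hypothetical binary predictions; which metric to prioritize is itself one of the trade-offs noted above, and the data here are fabricated for illustration.

  import numpy as np

  rng = np.random.default_rng(3)
  n = 4000
  group = rng.choice(["A", "B"], size=n)
  y = rng.integers(0, 2, size=n)
  # Hypothetical predictions that flag group A slightly more often than group B.
  pred = (rng.random(size=n) < np.where(group == "A", 0.45, 0.35)).astype(int)

  def positive_rate(mask):
      return pred[mask].mean()

  # Demographic parity difference: gap in overall positive-prediction rates.
  dp_diff = positive_rate(group == "A") - positive_rate(group == "B")

  # Equal-opportunity difference: gap in true positive rates among patients with y == 1.
  eo_diff = positive_rate((group == "A") & (y == 1)) - positive_rate((group == "B") & (y == 1))

  print(f"Demographic parity difference: {dp_diff:.3f}")
  print(f"Equal opportunity difference:  {eo_diff:.3f}")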

Finally, mitigation methods are numerous, and few studies addressed the selection of appropriate methodologies, which depends on such factors as the type of clinical algorithm, the specific clinical or research question being addressed, data availability, and ethical or legal considerations. Continued evaluation of these choices is needed to help determine in what clinical scenarios one strategy may work better than another.

Researchers, algorithm developers, and clinicians should follow procedures, document their results, and adhere to recognized standards for bias mitigation in clinical algorithms. A trustworthy organizational culture encourages clinicians and developers to prevent bias and reduce inequities by reporting discrimination so that root-cause analysis can be performed and identified risks can be removed from the system. This not only helps ensure that algorithms are safe, fair, and transparent but also provides needed evidence on what strategies are effective in practice.

Health equity should be fundamental to designing, evaluating, and deploying clinical algorithms in real-world clinical settings to ensure that their use does not result in discrimination. One promising approach, recently articulated by the Office of the National Coordinator for Health Information Technology, is health equity by design.50,51 Analogous to the concept of quality by design that undergirds the principles of good clinical and regulatory practice,52 health equity by design is a multifaceted approach in which equity is a core feature that can be used to deliberately mitigate discriminatory effects of clinical algorithms across pre- and postprocessing stages. In this scoping review, we found few examples of such principles operationalized in practice, and we hope that this will be a focus for organizations moving forward.

Policy Implications

On the basis of our findings, we offer the following concrete recommendations for bias mitigation in clinical algorithms.

Ensure Professional Diversity:

Developers who build algorithms and the health care organizations that deploy and monitor them should cultivate and sustain professionally diverse, appropriately trained workforces that comprise the different areas of expertise (clinical, social science, technical) needed to identify and mitigate bias.

Require Auditable Clinical Algorithms:

Developers and end users of algorithms must provide clear and accurate information about the intended use and risks of clinical algorithms.

Foster Transparent Organizational Culture:

Developers, health care organizations, and journals should openly disclose limitations as part of communicating algorithmic outputs, acknowledge biases, and report the results of mitigation strategies.

Implement Health Equity By Design:

Developers should incorporate principles of health equity by design throughout the development and deployment of clinical algorithms to mitigate discriminatory effects across pre- and postprocessing stages.

Accelerate Research:

Funding agencies and health care organizations should increase research efforts focused on bias-mitigation methodologies to expand the empirical evidence base informing choices regarding mitigation strategies.

Establish Governance Structures:

Policy makers should encourage covered entities to establish governance structures and evaluation schemes to prevent harms arising from algorithmic bias.

Amplify Patients’ Voices:

Developers must engage diverse patients and local communities in the design and preprocessing of clinical algorithms to inform patient-centered and culturally affirming care.

Moving Forward

In the context of HHS’s proposed rule, these recommendations provide the path forward for researchers, developers, health professionals, and policy makers to identify and mitigate harmful bias and prevent discrimination amid the increasing use of clinical algorithms in health care decision making.

Many of the studies in our review included specific examples of patient harms, however inadvertent, that can result from bias in algorithms designed to support clinical decision making.26 Given the risks posed by the rapid implementation of algorithms in health care, regulatory measures aimed at ensuring that patients do not face discrimination as a result of clinical algorithms are appropriate. However, despite evidence of the potential for harm from biased algorithms, our review demonstrated that bias mitigation remains a nascent field of research: Most of the studies we reviewed were published in the past few years. Furthermore, we observed wide variation among published mitigation strategies meant to be applied at various stages across the entire algorithm development life cycle. Some were as general as recommendations to assemble diverse team members for algorithm development, whereas others were as specific as a weighting methodology that included published code for replication. Given the real risks posed by algorithms being rapidly deployed in health care, we believe that HHS is right to revise Section 1557 of the Patient Protection and Affordable Care Act to ensure that patients do not face discrimination because of clinical algorithms.

Our review demonstrated that researchers, algorithm developers, and health care professionals have a broad range of available methods to identify and mitigate algorithmic bias. At the same time, there is no single or discrete set of approaches that HHS could require covered entities to use to eliminate algorithmic bias. Therefore, more research is needed to determine which bias-mitigation methods are optimal and in what scenarios, depending on factors such as patient population, clinical setting, algorithm design, and types of bias to be addressed. Regulators and policy makers should promote the sharing of resources for identifying and mitigating bias and should support further research to generate empirical evidence to inform the selection of effective mitigation strategies. Policy makers should also encourage covered entities to draw from the principles and approaches presented in the literature we reviewed to establish governance structures and evaluation regimens to ensure that the algorithms they deploy do not harm patients.

Conclusion

Our scoping review serves as a significant response to the HHS notice of proposed rulemaking. It provides stakeholders, including developers, health professionals, and policy makers, with a comprehensive, up-to-date analysis of strategies, resources, and recommendations for mitigating harmful racial and ethnic bias in algorithms used in clinical decision making. By conducting an extensive and wide-ranging review, we gathered valuable insights across various technical, operational, and systemwide strategies. Our review encompassed a wide range of approaches and interventions that address racial and ethnic bias specifically but that can be applied more generally to prevent algorithmic discrimination.

ACKNOWLEDGMENTS

Research reported in this publication was supported by the National Center for Advancing Translational Sciences, National Institutes of Health (Award No. UL1TR002553). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors acknowledge the following institutions for their support: Duke University School of Nursing, Duke University Clinical and Translational Sciences Institute, Northwestern University Feinberg School of Medicine (National Institute on Disability, Independent Living, and Rehabilitation Research; NIDILRR Grant No. 90ARHF0003), University of California Berkeley, and University of Chicago Booth School of Business and Center for Applied Artificial Intelligence. The authors acknowledge the following individuals for their contributions to this project: Jonathan McCall, Donnalee Frega, Judith Hays, Seanna Horan, Shelley Rusincovitch, and Margaret Graton. This is an open access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt, and build upon this work, for commercial use, provided the original work is properly cited. See https://creativecommons.org/licenses/by/4.0/. To access the authors’ disclosures, click on the Details tab of the article online.

NOTES

  • 1 Centers for Medicare and Medicaid Services. Nondiscrimination in health programs and activities. Fed Regist. 2022;87(149):47824–920.
  • 2 Department of Health and Human Services. Covered entities and business associates [Internet]. Washington (DC): HHS; 2015 Nov 23 [last updated 2017 Jun 16; cited 2023 Aug 18]. Available from: https://www.hhs.gov/hipaa/for-professionals/covered-entities/index.html
  • 3 Forrest S. Artificial intelligence/machine learning (AI/ML)-enabled medical devices: tailoring a regulatory framework to encourage responsible innovation in AI/ML [Internet]. Silver Spring (MD): Food and Drug Administration; [cited 2023 Aug 18]. Available from: https://www.fda.gov/media/160125/download
  • 4 Schwartz R, Vassilev A, Greene K, Perine L, Burt A, Hall P. Towards a standard for identifying and managing bias in artificial intelligence [Internet]. Gaithersburg (MD): National Institute of Standards and Technology; 2022 Mar [cited 2023 Aug 18]. (NIST Special Publication No. 1270). Available from: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1270.pdf
  • 5 Food and Drug Administration. General principles of software validation; final guidance for industry and FDA staff [Internet]. Silver Spring (MD): FDA; 2002 Jan 11 [cited 2023 Aug 21]. Available from: https://www.fda.gov/media/73141/download
  • 6 Agency for Healthcare Research and Quality. Evidence-Based Practice Center systematic review protocol, Impact of Healthcare Algorithms on Racial and Ethnic Disparities in Health and Healthcare [Internet]. Rockville (MD): AHRQ; 2022 Jan 25 [cited 2023 Aug 30]. Available from: https://effectivehealthcare.ahrq.gov/sites/default/files/product/pdf/racial-disparities-health-healthcare-protocol.pdf
  • 7 National Institute of Standards and Technology. AI Risk Management Framework [Internet]. Gaithersburg (MD): NIST; 2023 Jan [cited 2023 Aug 18]. Available from: https://www.nist.gov/itl/ai-risk-management-framework
  • 8 Coalition for Health AI. Blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare [Internet]. McLean (VA): Mitre Corporation; 2023 Apr 4 [cited 2023 Aug 18]. Available from: https://coalitionforhealthai.org/papers/blueprint-for-trustworthy-ai_V1.0.pdf
  • 9 White House, Office of Science and Technology Policy. Blueprint for an AI Bill of Rights: making automated systems work for the American people [Internet]. Washington (DC): White House; [cited 2023 Aug 18]. Available from: https://www.whitehouse.gov/ostp/ai-bill-of-rights/
  • 10 Huang J, Galal G, Etemadi M, Vaidyanathan M. Evaluation and mitigation of racial bias in clinical machine learning models: scoping review. JMIR Med Inform. 2022;10(5):e36388.
  • 11 Kaur D, Uslu S, Rittichier KJ, Durresi A. Trustworthy artificial intelligence: a review. ACM Comput Surv. 2022;55(2):1–38.
  • 12 Daneshjou R, Smith MP, Sun MD, Rotemberg V, Zou J. Lack of transparency and potential bias in artificial intelligence data sets and algorithms: a scoping review. JAMA Dermatol. 2021;157(11):1362–9.
  • 13 Bear Don’t Walk OJ 4th, Reyes Nieva H, Lee SS-J, Elhadad N. A scoping review of ethics considerations in clinical natural language processing. JAMIA Open. 2022;5(2):ooac039.
  • 14 Pearson A, Wiechula R, Court A, Lockwood C. The JBI model of evidence-based healthcare. Int J Evid Based Healthc. 2005;3(8):207–15.
  • 15 Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73.
  • 16 Page MJ, Moher D, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. PRISMA 2020 explanation and elaboration: updated guidance and exemplars for reporting systematic reviews. BMJ. 2021;372:n160.
  • 17 To access the appendix, click on the Details tab of the article online.
  • 18 Weissman GE, Teeple S, Eneanya ND, Hubbard RA, Kangovi S. Effects of neighborhood-level data on performance and algorithmic equity of a model that predicts 30-day heart failure readmissions at an urban academic medical center. J Card Fail. 2021;27(9):965–73.
  • 19 Puyol-Antón E, Ruijsink B, Mariscal Harana J, Piechnik SK, Neubauer S, Petersen SE, et al. Fairness in cardiac magnetic resonance imaging: assessing sex and racial bias in deep learning–based segmentation. Front Cardiovasc Med. 2022;9:859310.
  • 20 Thompson HM, Sharma B, Bhalla S, Boley R, McCluskey C, Dligach D, et al. Bias and fairness assessment of a natural language processing opioid misuse classifier: detection and mitigation of electronic health record data disadvantages across racial subgroups. J Am Med Inform Assoc. 2021;28(11):2393–403.
  • 21 Foryciarz A, Pfohl SR, Patel B, Shah N. Evaluating algorithmic fairness in the presence of clinical guidelines: the case of atherosclerotic cardiovascular disease risk estimation. BMJ Health Care Inform. 2022;29(1):e100460.
  • 22 Coston A, Rambachan A, Chouldechova A. Characterizing fairness over the set of good models under selective labels. PMLR. 2021;139:2144–55.
  • 23 Akbilgic O, Langham MR Jr, Davis RL. Race, preoperative risk factors, and death after surgery. Pediatrics. 2018;141(2):e20172221.
  • 24 Borgese M, Joyce C, Anderson EE, Churpek MM, Afshar M. Bias assessment and correction in machine learning algorithms: a use-case in a natural language processing algorithm to identify hospitalized patients with unhealthy alcohol use. AMIA Annu Symp Proc. 2021;2021:247–54.
  • 25 Radovanović S, Petrović A, Delibašić B, Suknović M. Making hospital readmission classifier fair—what is the cost? Paper presented at: Central European Conference on Information and Intelligent Systems; 2019 Oct 2–4; Varazdin, Croatia.
  • 26 Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–53.
  • 27 Buckley A, Sestito S, Ogundipe T, Roig J, Rosenberg HM, Cohen N, et al. Racial and ethnic disparities among women undergoing a trial of labor after cesarean delivery: performance of the VBAC Calculator with and without patients’ race/ethnicity. Reprod Sci. 2022;29(7):2030–8.
  • 28 Segar MW, Hall JL, Jhund PS, Powell-Wiley TM, Morris AA, Kao D, et al. Machine learning–based models incorporating social determinants of health vs traditional models for predicting in-hospital mortality in patients with heart failure. JAMA Cardiol. 2022;7(8):844–54.
  • 29 Yan M, Pencina MJ, Boulware LE, Goldstein BA. Observability and its impact on differential bias for clinical prediction models. J Am Med Inform Assoc. 2022;29(5):937–43.
  • 30 Allen A, Mataraso S, Siefkas A, Burdick H, Braden G, Dellinger RP, et al. A racially unbiased, machine learning approach to prediction of mortality: algorithm development study. JMIR Public Health Surveill. 2020;6(4):e22400.
  • 31 Mosteiro P, Kuiper J, Masthoff J, Scheepers F, Spruit M. Bias discovery in machine learning models for mental health. Information (Basel). 2022;13(5):237.
  • 32 Pfohl SR, Foryciarz A, Shah NH. An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform. 2021;113:103621.
  • 33 Afrose S, Song W, Nemeroff CB, Lu C, Yao DD. Subpopulation-specific machine learning prognosis for underrepresented patients with double prioritized bias correction. Commun Med (Lond). 2022;2:111.
  • 34 Burlina P, Joshi N, Paul W, Pacheco KD, Bressler NM. Addressing artificial intelligence bias in retinal diagnostics. Transl Vis Sci Technol. 2021;10(2):13.
  • 35 Adeli E, Zhao Q, Pfefferbaum A, Sullivan EV, Li F-F, Niebles JC, et al. Representation learning with statistical independence to mitigate bias. IEEE Winter Conf Appl Comput Vis. 2021;2021:2512–22.
  • 36 Mullainathan S, Obermeyer Z. On the inequity of predicting A while hoping for B. AEA Pap Proc. 2021;111:37–42.
  • 37 Samorani M, Blount LG. Machine learning and medical appointment scheduling: creating and perpetuating inequalities in access to health care. Am J Public Health. 2020;110(4):440–1.
  • 38 Gama RM, Clery A, Griffiths K, Heraghty N, Peters AM, Palmer K, et al. Estimated glomerular filtration rate equations in people of self-reported Black ethnicity in the United Kingdom: inappropriate adjustment for ethnicity may lead to reduced access to care. PLoS One. 2021;16(8):e0255869.
  • 39 Park Y, Hu J, Singh M, Sylla I, Dankwa-Mullan I, Koski E, et al. Comparison of methods to reduce bias from clinical prediction models of postpartum depression. JAMA Netw Open. 2021;4(4):e213909.
  • 40 Pierson E, Cutler DM, Leskovec J, Mullainathan S, Obermeyer Z. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nat Med. 2021;27(1):136–40.
  • 41 Dankwa-Mullan I, Scheufele EL, Matheny ME, Quintana Y, Chapman WW, Jackson G, et al. A proposed framework on integrating health equity and racial justice into the artificial intelligence development lifecycle. J Health Care Poor Underserved. 2021;32(2):300–17.
  • 42 Richardson S, Lawrence K, Schoenthaler AM, Mann D. A framework for digital health equity. NPJ Digit Med. 2022;5(1):119.
  • 43 Sikstrom L, Maslej MM, Hui K, Findlay Z, Buchman DZ, Hill SL. Conceptualising fairness: three pillars for medical algorithms and health equity. BMJ Health Care Inform. 2022;29(1):e100459.
  • 44 Uche-Anya E, Anyane-Yeboa A, Berzin TM, Ghassemi M, May FP. Artificial intelligence in gastroenterology and hepatology: how to advance clinical practice while ensuring health equity. Gut. 2022;71(9):1909–15.
  • 45 Nelson ALH, Zanti S. A framework for centering racial equity throughout the administrative data life cycle. Int J Popul Data Sci. 2020;5(1):1367.
  • 46 Reeves M, Bhat HS, Goldman-Mellor S. Resampling to address inequities in predictive modeling of suicide deaths. BMJ Health Care Inform. 2022;29(1):e100456.
  • 47 Lomis K, Jeffries P, Palatta A, Sage M, Sheikh J, Sheperis C, et al. Artificial intelligence for health professions educators. NAM Perspect. 2021 Sep 8. [Epub ahead of print].
  • 48 Raza S (University of Toronto, Toronto, ON). Connecting fairness in machine learning with public health equity. ArXiv [preprint on the Internet]. 2023 Apr 8 [cited 2023 Aug 18]. Available from: https://arxiv.org/abs/2304.04761
  • 49 Vyas DA, Eisenstein LG, Jones DS. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. N Engl J Med. 2020;383(9):874–82.
  • 50 Argentieri R, Mason TA, Hefcart J, Henry J. Embracing health equity by design. Health IT Buzz [serial on the Internet]. 2022 Feb 22 [cited 2023 Aug 18]. Available from: https://www.healthit.gov/buzz-blog/health-it/embracing-health-equity-by-design
  • 51 Wong WF, LaVeist TA, Sharfstein JM. Achieving health equity by design. JAMA. 2015;313(14):1417–8.
  • 52 Clinical Trials Transformation Initiative. Quality by design [Internet]. Durham (NC): CTTI; [cited 2023 Aug 18]. Available from: https://ctti-clinicaltrials.org/our-work/quality/quality-by-design/
   