Wu, E. et al. How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals. Nat. Med. 27, 582–584 (2021).
Kakarmath, S. et al. Best practices for authors of healthcare-related artificial intelligence manuscripts. npj Digital Med. 3, 134 (2020).
Steyerberg, E. W. & Vergouwe, Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur. Heart J. 35, 1925–1931 (2014).
Van Calster, B. et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 17, 230 (2019).
Vickers, A. J., Van Calster, B. & Steyerberg, E. W. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ 352, i6 (2016).
Harrell, F. Multivariable modeling strategies. In: Regression Modeling Strategies. Springer Series in Statistics. (Springer, Cham., 2015).
Steyerberg, E. W. Clinical prediction models (Springer Nature, 2009).
Efron, B. & Tibshirani, R. J. An introduction to the bootstrap (CRC press, 1994).
Futoma, J., Simons, M., Panch, T., Doshi-Velez, F. & Celi, L. A. The myth of generalisability in clinical research and machine learning in health care. Lancet Digital Health 2, e489–e492 (2020).
Wan, B., Caffo, B. & Vedula, S. S. A unified framework on generalizability of clinical prediction models. Front. Artif. Intell. 5, https://doi.org/10.3389/frai.2022.872720 (2022).
de Hond, A. A. H. et al. Predicting readmission or death after discharge from the ICU: external validation and retraining of a machine learning model. Crit. Care Med. 51, 291–300 (2023).
Austin, P. C. et al. Geographic and temporal validity of prediction models: different approaches were useful to examine model performance. J. Clin. Epidemiol. 79, 76–85 (2016).
Steyerberg, E. W., Nieboer, D., Debray, T. P. A. & van Houwelingen, H. C. Assessment of heterogeneity in an individual participant data meta-analysis of prediction models: an overview and illustration. Stat. Med 38, 4290–4309 (2019).
Debray, T. P. et al. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J. Clin. Epidemiol. 68, 279–289 (2015).
Cowley, L. E., Farewell, D. M., Maguire, S. & Kemp, A. M. Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature. Diagnostic Progn. Res. 3, 16 (2019).
Wynants, L. et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ 369, m1328 (2020).
Gulati, G. et al. Generalizability of cardiovascular disease clinical prediction models: 158 independent external validations of 104 unique models. Circ. Cardiovasc. Qual. Outcomes 15, e008487 (2022).
Futoma, J., Simons, M., Panch, T., Doshi-Velez, F. & Celi, L. A. The myth of generalisability in clinical research and machine learning in health care. Lancet Digit Health 2, e489–e492 (2020).
Burns, M. L. & Kheterpal, S. Machine learning comes of age: local impact versus national generalizability. Anesthesiology 132, 939–941 (2020).
de Hond, A. A. H. et al. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review. npj Digital Med. 5, 2 (2022).
Sperrin, M., Riley, R. D., Collins, G. S. & Martin, G. P. Targeted validation: validating clinical prediction models in their intended population and setting. Diagnostic Progn. Res. 6, 24 (2022).
Van Calster, B., Steyerberg, E. W., Wynants, L. & van Smeden, M. There is no such thing as a validated prediction model. BMC Med. 21, 70 (2023).
Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Eur. Urol. 67, 1142–1151 (2015).
More Stories
Most medical college students get worried abortion rules will ‘hinder their foreseeable future care’
Edinburgh to host supercomputer process that may perhaps advance drugs, AI and strength
Drug-microbiota interactions: an emerging priority for precision medicine