Neurosurgeon, MaineHealth, Cape Elizabeth, Maine, United States
Introduction: This study aims to validate a predictive model against a novel dataset. Using a retrospective dataset from a single academic medical center, an artificial neural network (ANN) was previously trained to predict symptomatic cerebral vasospasm (SCV) in patients with aneurysmal subarachnoid hemorrhage (SAH). The model demonstrated improved sensitivity and specificity compared with two multiple logistic regression models. Its initial validation was performed on a small prospective dataset from the same institution. To test the generalizability of this predictive model, an independent dataset was obtained from another academic medical center.
Methods: Eighty-six patients with SAH were identified from a prospectively maintained database at a single academic medical center. Input variables included age, gender, Glasgow Coma Scale score, Hunt and Hess grade, modified Fisher grade, hydrocephalus, aneurysm location, and treatment modality. These variables were used to predict the occurrence of SCV, defined as a neurologic change corresponding to radiographic vessel spasm on either angiography or transcranial Doppler (TCD).
Results: In this independent dataset, the previously trained ANN predicted SCV with an accuracy of 62.8%. The sensitivity of the model was 52.6%, the specificity was 65.7%, the positive predictive value was 30.3%, and the negative predictive value was 83.0%. This underperformance, in contrast to the earlier validation against an internal dataset, indicates that the model did not generalize well to an external population.
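As a consistency check, the reported percentages can be reproduced from a 2×2 confusion matrix. The sketch below is illustrative only: the cell counts (10 true positives, 23 false positives, 9 false negatives, 44 true negatives) are not stated in the abstract. They are back-calculated here as a set of integer counts that sums to 86 patients and matches all five reported figures, and the function name `diagnostic_metrics` is a hypothetical helper.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard diagnostic metrics from 2x2 confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,    # correct predictions / all patients
        "sensitivity": tp / (tp + fn),    # SCV cases correctly flagged
        "specificity": tn / (tn + fp),    # non-SCV cases correctly cleared
        "ppv": tp / (tp + fp),            # flagged patients who had SCV
        "npv": tn / (tn + fn),            # cleared patients who had no SCV
    }


# ASSUMPTION: counts inferred from the reported percentages and n = 86;
# they do not appear in the source abstract.
metrics = diagnostic_metrics(tp=10, fp=23, fn=9, tn=44)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these inferred counts, 19 of 86 patients (about 22%) developed SCV. That low prevalence would help explain why the negative predictive value stays high even though sensitivity is close to chance.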
Conclusion: While machine learning will increasingly aid in diagnosis and outcome prediction, the key finding of the present study is that, despite use of the same input variables, a model can fail to generalize to a different population. As commercial clinical decision support tools are developed and released, the datasets used to train and build such models should be chosen with generalizability across distinct populations in mind.