Over the past decades, the range and performance of machine learning (ML) algorithms have expanded immensely. Biomedical science is one of the most important fields that could benefit from these advancements. To streamline the training and evaluation of binary classifiers, we constructed a universal and flexible ML framework that uses tabular biomedical data as input.
Methods and results
Our framework requires the input data to be provided as a comma-separated values file in which rows correspond to subjects and columns represent features. After reading this file, the framework lets users detect outliers, handle missing values, rescale features, and address class imbalance. Hyperparameter tuning, feature selection, and internal validation are then performed using nested cross-validation. If an additional dataset is available, the framework also provides the option of external validation. Users may also compute SHapley Additive exPlanations (SHAP) values to interpret individual predictions of the model and identify the most important features. Our ML framework was implemented in Python (version 3.9), and its source code is freely available via GitHub. In the second part of this paper, we demonstrate the usage of the framework through a case study from the field of cardiovascular imaging.
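The nested cross-validation step described above can be illustrated with a minimal sketch. It is not the framework's own code: it assumes scikit-learn and NumPy are installed, uses synthetic data in place of a real subjects-by-features file, and picks the imputer, scaler, feature selector, classifier, and hyperparameter grid purely for illustration. The inner loop tunes hyperparameters and selects features, while the outer loop estimates performance on folds that were never used for tuning.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a subjects-by-features table
# (hypothetical data, not taken from the paper).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)

# Remove about 5% of the values at random to mimic missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# All preprocessing lives inside the pipeline so that it is refit on the
# training part of every fold and never sees the held-out subjects.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),        # missing values
    ("scale", StandardScaler()),                         # feature rescaling
    ("select", SelectKBest(score_func=f_classif)),       # feature selection
    ("clf", LogisticRegression(class_weight="balanced",  # class imbalance
                               max_iter=1000)),
])

# Illustrative hyperparameter grid (an assumption, not the framework's grid).
param_grid = {
    "select__k": [3, 5, 10],
    "clf__C": [0.01, 0.1, 1.0, 10.0],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: grid search picks the best settings on each outer training set.
search = GridSearchCV(pipeline, param_grid=param_grid,
                      scoring="roc_auc", cv=inner_cv)

# Outer loop: each outer test fold scores a model tuned without access to it.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print(f"Outer-fold AUC: {outer_auc.mean():.3f} +/- {outer_auc.std():.3f}")
```

Keeping imputation, scaling, and feature selection inside the pipeline is the design choice that matters here: fitting any of them on the full dataset before splitting would leak information from the held-out folds into training.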
The proposed ML framework enables the efficient training and evaluation of binary classifiers on tabular biomedical data. We hope our framework will serve as a useful resource for both learning and research purposes and will promote further innovation.
Right ventricular (RV) ejection fraction (EF) assessed by 3D echocardiography is a powerful measure for detecting RV dysfunction. However, its prognostic value in routine clinical practice has scarcely been explored. Accordingly, we aimed to investigate whether RVEF is associated with 2-year all-cause mortality in patients who underwent diverse cardiovascular procedures and to test whether RVEF can outperform conventional echocardiographic parameters in outcome prediction.
Patients and methods
We retrospectively identified 174 patients who underwent clinically indicated transthoracic echocardiography that included 3D acquisitions. The population consisted of patients with heart failure with reduced ejection fraction (44%), heart transplant recipients (16%), and patients with severe valvular heart disease (39%). Beyond conventional echocardiographic measurements, RVEF was quantified by 3D echocardiography. The primary endpoint of our study was all-cause mortality at two years.
Twenty-four patients (14%) met the primary endpoint. Patients with adverse outcomes had significantly lower RVEF (alive vs. dead: 48 ± 9% vs. 42 ± 9%, P < 0.01), whereas tricuspid annular plane systolic excursion (21 ± 7 vs. 18 ± 4 mm) and RV systolic pressure (36 ± 15 vs. 39 ± 15 mmHg) were similar. In Cox analysis, RVEF was associated with adverse outcomes (HR [95% CI]: 0.945 [0.908–0.984], P < 0.01). In receiver operating characteristic analysis, RVEF had the highest AUC among the RV functional measures (0.679; 95% CI: 0.566–0.791).
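To make the size of the reported hazard ratio easier to read, the short sketch below rescales it to a larger change in RVEF. It rests on an assumption not stated above: that the HR of 0.945 is expressed per 1 percentage-point increase in RVEF. Under the log-linear form of the Cox model, the HR for a change of k units is the per-unit HR raised to the power k, and the confidence limits rescale the same way. The helper name `rescale_hr` is hypothetical.

```python
import math

# Values reported in the Cox analysis above.
hr_per_unit = 0.945
ci_low, ci_high = 0.908, 0.984


def rescale_hr(hr: float, units: float) -> float:
    """Hazard ratio for a change of `units`, given the per-unit HR.

    Assumes the Cox model is linear in the covariate on the log-hazard
    scale, so log(HR) scales proportionally with the covariate change.
    """
    return math.exp(units * math.log(hr))


# ASSUMPTION: the reported HR refers to a 1 percentage-point increase in RVEF.
units = 5  # illustrative 5 percentage-point increase
hr_5 = rescale_hr(hr_per_unit, units)
ci_5 = (rescale_hr(ci_low, units), rescale_hr(ci_high, units))

print(f"HR per {units}-point RVEF increase: {hr_5:.3f} "
      f"(95% CI: {ci_5[0]:.3f} to {ci_5[1]:.3f})")
```

If that assumption holds, a 5-point higher RVEF corresponds to an HR of about 0.75, that is, roughly a 25% lower hazard of death over follow-up.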
Conventional echocardiographic measurements may be inadequate to support granular risk stratification in patients who underwent different cardiac procedures. RVEF may be a robust clinical parameter that is significantly associated with adverse outcomes.