1. Dataset Description
UCI Machine Learning repository – Diabetes 130-US hospitals for years 1999-2008 Data Set
This research uses a publicly available dataset from the Center for Clinical and Translational Research, Virginia Commonwealth University. It consists of over 100,000 records collected across 130 US hospitals and various healthcare providers over 10 years (1999-2008) [1]. It has fifty features describing diabetic patients' information, mainly regarding readmission. From our review of the dataset, the essential features likely to affect our model are:
- Admission source: 21 unique values describing the patient's admission source
- Discharge disposition: 29 values indicating the patient's discharge location
- Medication changes: information about changes to the patient's medication
- Diagnosis information: ICD-9 (International Classification of Diseases, Ninth Revision) codes [2]
- Drug usage: dosage information for 23 different drugs
- Readmission time: whether the patient was readmitted within 30 days, after 30 days, or not at all
The data is initially split into an 80% training set and a 20% test set. In addition, 5-fold cross-validation is to be applied to get the best evaluation parameters for the given model.
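As a rough illustration of this setup, the sketch below builds an 80/20 split and five cross-validation folds over record indices using only the Python standard library. The function names, the fixed seed, and the choice to run cross-validation on the training portion only are assumptions made for the example, not details stated in this proposal.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical example with 1000 records; cross-validation is applied
# to the training portion only (an assumption of this sketch).
train_idx, test_idx = train_test_split_indices(1000)
folds = list(k_fold_indices(train_idx, k=5))
print(len(train_idx), len(test_idx), len(folds))  # 800 200 5
```

Each validation fold here holds 160 of the 800 training indices, and the held-out test set is never touched during cross-validation. A stratified variant may be preferable if the three readmission classes turn out to be imbalanced.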
2. Introduction
Background: A considerable number of problems in the healthcare sector have been solved using machine learning techniques. We plan on researching one such domain. Hospital readmissions not only prove costly but also put the patient's medical condition at risk. Moreover, hospital readmission has been a decisive factor in ranking health center credibility. An increase in hospital visits after discharge is costly and time-consuming for both hospitals and patients [3].
Major studies [4] suggest that an unplanned readmission within 30 days indicates a treatment or diagnosis error that could have been avoided. However, a readmission after 30 days depends on the patient's lifestyle or several other factors [5]. So, early prediction of patient readmission becomes an important task.
Existing models in similar research predict readmission within 30 days of discharge [6]. Our research predicts unplanned readmission in diabetic patients using multiclass classification: it tests whether patients are readmitted within 30 days, after 30 days, or not at all. The primary tasks include data preprocessing steps such as data reduction, data cleaning, and data transformation. Furthermore, a good model requires extracting essential features, so we plan on using various feature selection algorithms to obtain the best features. Using these features, different models such as Random Forest, Support Vector Machine, Logistic Regression, Multilayer Perceptron, Naïve Bayes, and an ensemble model are to be tested and compared to obtain the best evaluation parameters (accuracy, precision, recall, F1-score, AUC).
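To make the evaluation parameters concrete for a three-class problem, here is a minimal sketch that computes accuracy and macro-averaged precision, recall, and F1-score from true and predicted labels. Macro averaging, the function names, the class strings, and the toy predictions are all assumptions chosen for illustration. AUC is left out because it needs predicted scores or probabilities rather than hard labels.

```python
def per_class_counts(y_true, y_pred, label):
    """Count true positives, false positives, and false negatives for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def macro_metrics(y_true, y_pred, labels):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Toy labels (made up for illustration, not results from the dataset).
labels = ["<30", ">30", "NO"]
y_true = ["<30", ">30", "NO", "NO", ">30", "<30"]
y_pred = ["<30", "NO", "NO", "NO", ">30", ">30"]
scores = macro_metrics(y_true, y_pred, labels)
print({name: round(value, 3) for name, value in scores.items()})
```

Macro averaging weights each class equally, which can matter when one outcome (for example, no readmission) dominates the data. A weighted average would be a reasonable alternative.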
3. Methodology/Approaches
The goals of our research are as follows:
Predict if the patient will be:
- Readmitted within 30 days (<30)
- Readmitted after 30 days (>30)
- Not readmitted at all
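The prediction target can be prepared for a multiclass classifier by mapping each readmission outcome to an integer class. The sketch below assumes the raw column stores the strings "<30", ">30", and "NO"; these values, the mapping, and the function name are assumptions that should be checked against the actual data file.

```python
from collections import Counter

# Assumed raw values of the readmission column; verify against the actual data.
READMISSION_CLASSES = {"<30": 0, ">30": 1, "NO": 2}

def encode_readmission(values):
    """Map raw readmission strings to integer class labels."""
    encoded = []
    for value in values:
        key = value.strip().upper()
        if key not in READMISSION_CLASSES:
            # Fail loudly instead of silently assigning a class.
            raise ValueError(f"unexpected readmission value: {value!r}")
        encoded.append(READMISSION_CLASSES[key])
    return encoded

encoded = encode_readmission(["NO", ">30", "<30", "no "])
print(encoded)           # [2, 1, 0, 2]
print(Counter(encoded))  # class counts, useful for spotting imbalance
```

Raising an error on unknown values is a deliberate choice here, so that malformed records surface during data cleaning rather than during training.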