A Data-Mining Model for Predicting Low Birth Weight with a High AUC
Chapter in book
Publication Details
Author list: Hange U, Selvaraj R, Galani M, Letsholo K
Publisher: Springer Verlag (Germany)
Place: BERLIN
Publication year: 2018
Journal: Studies in Computational Intelligence (1860-949X)
Journal acronym: STUD COMPUT INTELL
Volume number: 719
Start page: 109
End page: 121
Number of pages: 13
ISBN: 978-3-319-60169-4
eISBN: 978-3-319-60170-0
ISSN: 1860-949X
Languages: English-Great Britain (EN-GB)
Abstract
Birth weight is a significant determinant of a newborn's probability of survival. Data-mining models are receiving considerable attention for identifying risk factors for low birth weight. However, predicting actual birth weight values from the identified risk factors, which could play a significant role in identifying mothers at risk of delivering low birth weight infants, remains an unsolved problem. This paper presents a study of data-mining models that predict the actual birth weight, with particular emphasis on achieving a higher area under the receiver operating characteristic curve (AUC). The prediction is based on 2006 birth data from the North Carolina State Center for Health Statistics. The steps followed to extract meaningful patterns from the data were data selection, handling of missing values, handling of imbalanced data, model building, feature selection, and model evaluation. Decision trees were used to classify birth weight and were tested on both the original imbalanced dataset and a dataset balanced using the synthetic minority oversampling technique (SMOTE). The results showed that models built with datasets balanced by SMOTE produce a higher AUC than models built with imbalanced datasets. The J48 model built with balanced data outperformed REPTree and Random tree with an AUC of 90.3%, and it was therefore selected as the best model. In conclusion, the feasibility of using J48 for birth weight prediction offers the possibility of reducing obstetric-related complications and thus improving overall obstetric health care.
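The two techniques the abstract relies on, SMOTE oversampling and AUC evaluation, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `smote` and `auc`, the parameter `k`, and the toy rows are all hypothetical, and the sketch assumes purely numeric features and at least two minority rows.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows, each interpolated between a minority
    row and one of its k nearest minority neighbours (illustrative only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append([a + gap * (b - a) for a, b in zip(base, target)])
    return synthetic


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy minority-class rows; not taken from the study's data.
toy_minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
toy_synthetic = smote(toy_minority, n_new=6)
```

Because each synthetic row lies on a line segment between two existing minority rows, oversampling in this way fills in the minority region rather than duplicating records. The pairwise definition of AUC used here is equivalent to the area under the ROC curve, but it is quadratic in the number of cases and is meant only for small illustrations.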
Keywords
Birth weight, Data-mining, Imbalanced dataset, Low birth weight, SMOTE