



JOURNAL

Bioinformatics and Biology Insights

Development and Application of a Genetic Algorithm for Variable Optimization and Predictive Modeling of Five-Year Mortality Using Questionnaire Data



Bioinformatics and Biology Insights 2015:Suppl. 3 31-41

Original Research

Published on 08 Nov 2015

DOI: 10.4137/BBI.S29469






Abstract

The problem of selecting important variables for predictive modeling of a specific outcome of interest using questionnaire data has rarely been addressed in clinical settings. In this study, we implemented a genetic algorithm (GA) technique to select optimal variables from questionnaire data for predicting five-year mortality. We examined 123 questions (variables) answered by 5,444 individuals in the National Health and Nutrition Examination Survey. The GA iterations selected the top 24 variables, including questions related to stroke, emphysema, and general health problems requiring the use of special equipment. These variables were then used for predictive modeling with various parametric and nonparametric machine learning techniques. With the top 24 variables, gradient boosting yielded the nominally highest performance (area under the curve [AUC] = 0.7654), although several other techniques achieved lower AUCs that were not significantly different. This study shows how a GA, in conjunction with various machine learning techniques, can be used to examine questionnaire data to predict a binary outcome.
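The general idea of GA-based variable selection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the GA settings, the fitness function, or the models used during selection, so everything here is an assumption made for illustration. Each candidate solution is a binary mask over the questions, and the hypothetical `fitness` function scores a mask by the AUC of a naive risk score (the count of "yes" answers among the selected questions). The names `auc`, `fitness`, `make_toy_data`, and `run_ga`, the population size, the mutation rate, and the synthetic data are all illustrative.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.5
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fitness(mask, X, y):
    """Assumed fitness: AUC of the count of 'yes' answers among selected questions."""
    if not any(mask):
        return 0.0
    scores = [sum(v for v, m in zip(row, mask) if m) for row in X]
    return auc(scores, y)


def make_toy_data(n_rows=100, n_vars=10, seed=1):
    """Synthetic yes/no answers; only the first three questions drive the outcome."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(n_rows)]
    y = [1 if row[0] + row[1] + row[2] >= 2 else 0 for row in X]
    return X, y


def run_ga(X, y, n_generations=20, pop_size=20, mutation_rate=0.05, seed=0):
    """Evolve binary masks over the variables and return the fittest mask."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(n_generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = ranked[:2]               # elitism: carry over the two best masks
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(m, X, y))


X, y = make_toy_data()
best_mask = run_ga(X, y)
selected = [i for i, bit in enumerate(best_mask) if bit]
print("selected variable indices:", selected)
print("fitness (AUC):", round(fitness(best_mask, X, y), 4))
```

In a realistic setting, the fitness would more plausibly be the cross-validated AUC of a fitted classifier on held-out data rather than the in-sample score used here, and the selected variables would then be passed to the downstream learning techniques for comparison.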






