
Feature Engineering in Health Informatics


Health informatics refers to the process of leveraging information technologies to improve the quality of healthcare delivery. In recent years, the application of data mining technologies to healthcare has attracted considerable interest in both the data mining and medical communities. Deriving and discovering important and effective medical features is a key problem in many health informatics tasks, such as predictive modeling, cohort studies, comparative effectiveness research, and clinical pathway mining. In this tutorial, I will introduce the popular feature engineering technologies that have been applied in health informatics and point out future trends.

  • Introduction (15 min)
    • The current status of healthcare, why transformation is needed
    • Various data in healthcare
    • Where those healthcare data can help
    • What are the challenges if we directly use the raw data
    • What is feature engineering
    • The role of feature engineering in healthcare
  • Feature Construction and Representation for Healthcare Data (25 min)
    • Feature Construction
      • Electronic Health Record (EHR)
        • Structured EHR (Diagnosis, Medication, Lab, Procedure, Demographics ...)
        • Unstructured EHR (Free clinical text)
      • Medical Imaging
        • Raw features: Pixels, Voxels
        • Derived features: Photometric features, Geometric features
      • Drug Data
        • Chemical compounds
        • Protein targets
        • Therapeutic indications
        • Side-effects
      • Genotype Data
        • Gene expression
        • DNA sequence
        • Protein network
    • Feature Representation
      • Sequence/trace types of representation
      • Vector/matrix/tensor based representation
  • Feature Engineering in Health Informatics (70 min)
    • Feature Augmentation
      • Exploring temporal dynamics
        • Point time features
        • Interval/duration features
      • Exploring feature structures
        • Chain structure
        • Graph structure
      • Sparse coding/dictionary learning
      • Deep learning technologies
    • Feature Densification
      • Baseline approaches
      • Statistical approaches
      • Optimization approaches
    • Feature Reduction
      • Feature grouping
      • Feature selection
    • Feature Fusion
      • Flat approaches
      • Hierarchical approaches
  • Conclusions and Future Directions (10 min)