Se Won Oh - Electronics and Telecommunications Research Institute, Data Scientist
Hyuntae Jeong - Electronics and Telecommunications Research Institute, Data Scientist
Seungeun Chung - Electronics and Telecommunications Research Institute, Data Scientist
Jeong Mook Lim - Electronics and Telecommunications Research Institute, Data Scientist
Kyoung Ju Noh - Electronics and Telecommunications Research Institute, Data Scientist
Our team is conducting a research project called HELP (Human Experience Learning and Prediction). We study human behavior in daily life by collecting sensor data and analyzing contextual information from everyday activities. Our research focuses on how people behave and adapt in real-world environments over time. Building on our experience with machine learning-based prediction studies, we joined this competition to contribute to identifying critical factors involved in sepsis prediction.
We believed the key was to develop a machine learning model from a carefully curated training dataset, which meant selecting highly relevant variables from the large feature set and handling missing values appropriately.
From the full set of variables, we initially selected 27 that showed strong associations with sepsis mortality. Next, all non-numeric variables were converted into categorical variables, with "Unknown" or "Other" treated as distinct categories. To handle missing values, we imputed numeric variables using the mean value within each mortality group and categorical variables using the most frequent category.
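A minimal sketch of this preprocessing step in pandas is shown below. The column names, the target label, and the variable lists are hypothetical placeholders for illustration, not the actual 27 selected variables.

```python
import pandas as pd

# Placeholder names; the real selection comprised 27 variables.
NUMERIC_COLS = ["age", "heart_rate", "lactate"]       # hypothetical numeric variables
CATEGORICAL_COLS = ["admission_type", "icu_unit"]     # hypothetical categorical variables
TARGET = "mortality"                                  # hypothetical label column

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()

    # Treat non-numeric variables as categoricals, keeping "Unknown"/"Other"
    # as distinct categories rather than dropping or merging them.
    for col in CATEGORICAL_COLS:
        df[col] = df[col].astype("category")

    # Impute numeric variables with the mean within each mortality group.
    for col in NUMERIC_COLS:
        df[col] = df[col].fillna(df.groupby(TARGET)[col].transform("mean"))

    # Impute categorical variables with the most frequent category.
    for col in CATEGORICAL_COLS:
        mode = df[col].mode(dropna=True)
        if not mode.empty:
            df[col] = df[col].fillna(mode.iloc[0])

    return df
```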
We selected the random forest algorithm for training our model, as it is known for relatively fast computation and strong performance. It also allows us to interpret how individual variables influence the prediction outcome.
We set the number of estimators (n_estimators) to 300 and the maximum depth (max_depth) to 10, while keeping all other hyperparameters at their default values.
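The sketch below illustrates this configuration with scikit-learn's RandomForestClassifier, including the importance-based interpretation mentioned above. The one-hot encoding step, the random seed, and the function interface are assumptions for illustration rather than our exact pipeline.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def train_model(df: pd.DataFrame, target: str = "mortality") -> RandomForestClassifier:
    # One-hot encode categoricals so the forest receives numeric features only
    # (encoding choice is an assumption, not stated in the text).
    X = pd.get_dummies(df.drop(columns=[target]))
    y = df[target]

    clf = RandomForestClassifier(
        n_estimators=300,  # number of estimators, as stated in the text
        max_depth=10,      # maximum depth, as stated in the text
        random_state=0,    # seed is an assumption; not specified in the text
    )
    clf.fit(X, y)

    # Impurity-based importances give a rough view of how individual
    # variables influence the prediction outcome.
    importances = pd.Series(clf.feature_importances_, index=X.columns)
    print(importances.sort_values(ascending=False).head(10))
    return clf
```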
Regarding the final phase leaderboard, we initially understood that only three submission attempts were allowed. As a result, we were very cautious to avoid flagged or errored submissions and conservatively set the prediction threshold to ensure an on-time submission. However, after reviewing other teams' results, we learned that flagged or errored submissions were not counted toward the official submission limit. In retrospect, had we known this earlier, we could have taken a more aggressive approach (e.g., experimenting with a wider range of thresholds and hyperparameters), which might have led to better performance.