Missing data is a well-known problem in data science. This document covers feature engineering methods for handling it.
In statistics, imputation is the process of replacing missing data with substituted values.
Missing data can introduce a substantial amount of bias, make the handling and analysis of the data more arduous, and create reductions in efficiency.
Imputation preserves all cases by replacing missing data with an estimated value based on other available information.
No single imputation approach fits all problems. Instead, we need to choose the right approach for the problem at hand. Below are some useful approaches.
Ensure that the chosen approach does not reduce the generalizability of the models we build.
Before dropping features outright, consider subsetting the part of the dataset for which the value is available and checking the feature's importance when it is used to train a model on this subset. If you discover that the variable is important in the subset where it is defined, consider making an effort to retain it.
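The subset check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, the names always_present, sparse_feature and target are hypothetical, and a plain least-squares fit with a drop-column comparison of R^2 stands in for whatever model and importance measure you actually use.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - residuals.var() / y.var()

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 200
always_present = rng.normal(size=n)
sparse_feature = rng.normal(size=n)
target = always_present + 2.0 * sparse_feature + rng.normal(scale=0.1, size=n)

# Simulate missingness: blank out roughly 60% of sparse_feature.
sparse_feature[rng.random(n) < 0.6] = np.nan

# Keep only the rows where the sparse feature is available.
available = ~np.isnan(sparse_feature)
y_sub = target[available]
X_with = np.column_stack([always_present[available], sparse_feature[available]])
X_without = always_present[available].reshape(-1, 1)

# Drop-column importance: how much R^2 is lost without the feature.
gain = r_squared(X_with, y_sub) - r_squared(X_without, y_sub)
print(f"rows available: {available.sum()} of {n}, R^2 gain: {gain:.3f}")
```

A large gain suggests the feature is worth keeping and imputing rather than dropping. In-sample R^2 can never decrease when a column is added, so a held-out split or cross-validation would give a more trustworthy estimate.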
Mean substitution (see the picture linked below) and regression imputation are a few examples of statistical methods. This document covers several such approaches.
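As a minimal sketch of the two methods named above, the following uses NumPy only. The helper names mean_impute and regression_impute and the toy age/income arrays are hypothetical, and the regression variant assumes a single, fully observed predictor with a straight-line fit.

```python
import numpy as np

def mean_impute(values):
    """Replace NaNs in a 1-D array with the mean of the observed values."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(values)] = np.nanmean(values)
    return filled

def regression_impute(predictor, values):
    """Fill NaNs in `values` from a straight line fitted on the observed rows.

    Assumes `predictor` has no missing entries.
    """
    predictor = np.asarray(predictor, dtype=float)
    values = np.asarray(values, dtype=float)
    observed = ~np.isnan(values)
    slope, intercept = np.polyfit(predictor[observed], values[observed], deg=1)
    filled = values.copy()
    filled[~observed] = slope * predictor[~observed] + intercept
    return filled

# Toy data (hypothetical): income is missing for the second row.
age = np.array([20.0, 30.0, 40.0, 50.0])
income = np.array([30.0, np.nan, 50.0, 60.0])

mean_filled = mean_impute(income)            # gap becomes the mean of 30, 50, 60
reg_filled = regression_impute(age, income)  # gap is predicted from age
print(mean_filled)
print(reg_filled)
```

Mean substitution ignores the relationship between variables and shrinks the variance of the imputed column, while regression imputation uses that relationship but can overstate how strongly the variables are correlated.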
https://en.wikipedia.org/wiki/Imputation_(statistics)
https://www.kaggle.com/residentmario/simple-techniques-for-missing-data-imputation
https://expertseoinfo.com/missing-data-imputation-feature-engineering/
https://images.app.goo.gl/4QtWY4SvKVJuVQqu8
https://images.app.goo.gl/hZvmaMtzY7hzVmv86