A model hyperparameter is a parameter whose value is set before the model starts training. If you are wondering why that matters, this document can help.
Not having a clear definition for these terms is a common struggle for beginners, especially those who come from the fields of statistics or economics.
In machine learning, a hyperparameter is a parameter whose value is used to control the learning process.
A model parameter is a configuration variable that is internal to the model and whose value can be estimated from data.
They are required by the model when making predictions.
Their values define the skill of the model on your problem.
They are estimated or learned from data.
They are often not set manually by the practitioner.
They are often saved as part of the learned model.
In programming, you may pass a parameter to a function. In this case, a parameter is a function argument that could have one of a range of values. In machine learning, the specific model you are using is the function and requires parameters in order to make a prediction on new data.
For example, in the regression example below, the blue line is specified by Y = mX + c. Here m and c are model parameters: they are learned during training and then used for prediction.
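A minimal sketch of this idea, using NumPy on made-up data generated from y = 2x + 1: the slope m and intercept c are not chosen by us, they are estimated from the data by the fitting procedure.

```python
import numpy as np

# Toy data generated from y = 2x + 1 with a little noise
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50)
y = 2.0 * X + 1.0 + rng.normal(scale=0.1, size=X.size)

# np.polyfit estimates the model parameters m (slope) and c (intercept)
m, c = np.polyfit(X, y, deg=1)
print(f"m = {m:.2f}, c = {c:.2f}")  # close to the true values 2 and 1
```

The practitioner never sets m or c by hand; that is exactly what makes them model parameters rather than hyperparameters.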
A model hyperparameter is a parameter whose value is set before the model starts training. Hyperparameters cannot be learned by fitting the model to the data.
They are often used in processes to help estimate model parameters.
They are often specified by the practitioner.
They can often be set using heuristics.
They are often tuned for a given predictive modeling problem.
For neural networks, examples of hyperparameters are the learning rate, epoch count, batch size, etc.
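The split between the two kinds of values can be seen in even the simplest training loop. In this sketch (toy data from y = 3x, illustrative hyperparameter values), learning_rate and epochs are fixed before training, while the weight w is learned from the data:

```python
import numpy as np

# Hyperparameters: chosen by the practitioner before training starts
learning_rate = 0.05
epochs = 200

# Toy data from y = 3x, so the parameter w should converge to 3
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * X

w = 0.0  # model parameter: estimated from the data
for _ in range(epochs):
    grad = 2 * np.mean((w * X - y) * X)  # gradient of the MSE loss w.r.t. w
    w -= learning_rate * grad

print(round(w, 3))  # → 3.0
```

Changing learning_rate or epochs changes how w is found, not what the model ultimately represents; that is the practical difference between the two kinds of values.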
Hyperparameter tuning is needed to optimise a machine learning model (for example, a neural network) for better prediction results. When a machine learning algorithm is tuned for a specific problem, you are tuning the hyperparameters of the model.
We cannot know the best value for a model hyperparameter on a given problem in advance. We may use rules of thumb, copy values used on other problems, or search for the best value by trial and error (Bayesian optimisation, grid search, etc.).
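Trial and error can be as simple as random search. A minimal sketch, where the "validation score" function is a made-up stand-in for training and evaluating a real model (the assumed optimum of lr = 0.01 is purely illustrative):

```python
import random

random.seed(0)

# Stand-in for a real train-and-evaluate step; in practice this score
# would come from validating a trained model. The optimum is assumed.
def validation_score(lr):
    return -(lr - 0.01) ** 2  # best score at lr = 0.01

# Random search: sample candidate values and keep the best one found
best_lr, best_score = None, float("-inf")
for _ in range(100):
    lr = 10 ** random.uniform(-4, 0)  # log-uniform sample in [1e-4, 1]
    score = validation_score(lr)
    if score > best_score:
        best_lr, best_score = lr, score

print(best_lr)
```

Sampling the learning rate on a log scale is a common rule of thumb, since plausible values span several orders of magnitude.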
Hyperparameter values vary based on the dataset (the problem at hand).
The cross-validation technique helps to tune hyperparameters using a cross-validation data set. Refer here for the details.
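A sketch of hyperparameter tuning with k-fold cross-validation, using NumPy and closed-form ridge regression (the data, the candidate regularisation strengths, and the fold count are all illustrative assumptions): each candidate value is scored by its average validation error across the folds, and the one with the lowest error wins.

```python
import numpy as np

# Synthetic regression data (illustrative)
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=60)

def ridge_fit(X_train, y_train, alpha):
    # Closed-form ridge regression: solve (X^T X + alpha*I) w = X^T y
    n_features = X_train.shape[1]
    A = X_train.T @ X_train + alpha * np.eye(n_features)
    return np.linalg.solve(A, X_train.T @ y_train)

def cv_mse(alpha, k=5):
    # k-fold cross-validation: average validation error over the folds
    folds = np.array_split(np.arange(len(X)), k)
    errors = []
    for fold in folds:
        mask = np.ones(len(X), dtype=bool)
        mask[fold] = False  # hold this fold out for validation
        w = ridge_fit(X[mask], y[mask], alpha)
        errors.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return np.mean(errors)

# Pick the regularisation strength with the lowest cross-validated error
alphas = [0.01, 0.1, 1.0, 10.0]
best_alpha = min(alphas, key=cv_mse)
print(best_alpha)
```

Because every fold serves as validation data exactly once, the score is less sensitive to one lucky or unlucky split than a single train/validation partition would be.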
Hyperparameter tuning tools
Grid search performs an exhaustive search: it tries every value in the search space indiscriminately. The result is optimal within the grid, but search time is a concern.
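A minimal sketch of this exhaustive approach using itertools (the search space and the scoring function are hypothetical stand-ins for real training runs):

```python
from itertools import product

# Hypothetical search space for two hyperparameters
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
}

# Stand-in for a real train-and-evaluate step (assumed best at 0.01 / 32)
def evaluate(learning_rate, batch_size):
    return -abs(learning_rate - 0.01) - abs(batch_size - 32) / 100

# Grid search tries every combination in the space, indiscriminately
best_params, best_score = None, float("-inf")
for lr, bs in product(search_space["learning_rate"], search_space["batch_size"]):
    score = evaluate(lr, bs)
    if score > best_score:
        best_params, best_score = {"learning_rate": lr, "batch_size": bs}, score

print(best_params)  # → {'learning_rate': 0.01, 'batch_size': 32}
```

Note the cost: the number of combinations is the product of the sizes of all value lists, which grows quickly as you add hyperparameters.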
Bayesian optimisation is the most popular approach. It reduces the search space by intelligently analysing the results of past evaluations.
Bayesian optimisation in turn takes into account past evaluations when choosing the hyperparameter set to evaluate next.
By choosing its parameter combinations in an informed way, it can focus on those areas of the parameter space that it believes will bring the most promising validation scores. This approach typically requires fewer iterations to reach the optimal set of hyperparameter values, most notably because it disregards those areas of the parameter space that it believes won't bring anything to the table.
Optuna is a Python framework that can be used to perform hyperparameter tuning.
Reference
https://machinelearningmastery.com/difference-between-a-parameter-and-a-hyperparameter/
https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)
https://www.geeksforgeeks.org/difference-between-model-parameters-vs-hyperparameters/
https://medium.com/vantageai/bringing-back-the-time-spent-on-hyperparameter-tuning-with-bayesian-optimisation-2e21a3198afb
https://en.wikipedia.org/wiki/Hyperparameter_optimization
https://optuna.org
https://sites.google.com/site/jbsakabffoi12449ujkn/home/machine-intelligence/role-of-cross-validation-data-in-machine-learning