ML hyperparameters (Gemini)
Categorized by type of model or general application:
General/Optimization Hyperparameters:
Learning Rate: Controls the step size at each iteration while moving towards a minimum of a loss function.
Number of Epochs: The number of complete passes through the entire training dataset during training.
Batch Size: The number of samples processed before the model's internal parameters are updated.
Optimizer: The algorithm used to minimize the loss function (e.g., SGD, Adam, RMSprop).
Regularization Strength (λ or α): Controls the penalty for complexity in the model, preventing overfitting (e.g., L1, L2 regularization).
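To see where these general hyperparameters plug in, here is a minimal sketch of a batch-SGD training loop for linear regression (plain NumPy, no framework); all values (learning rate, epochs, batch size, λ) are illustrative, not recommendations.

```python
import numpy as np

# Synthetic regression data with a known weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

learning_rate = 0.1   # step size of each parameter update
num_epochs = 50       # full passes over the training set
batch_size = 32       # samples per parameter update
l2_lambda = 0.01      # L2 regularization strength

w = np.zeros(3)
for epoch in range(num_epochs):
    idx = rng.permutation(len(X))          # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        # Gradient of (MSE + L2 penalty) with respect to w
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch) + 2 * l2_lambda * w
        w -= learning_rate * grad          # SGD is the optimizer here
```

After training, `w` lands close to `true_w`, slightly shrunk toward zero by the L2 penalty; swapping the plain gradient step for Adam or RMSprop would change only the update line.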
Neural Network Specific Hyperparameters:
Number of Hidden Layers: The depth of the neural network.
Number of Neurons (in each hidden layer): The width of the neural network at each hidden layer.
Activation Function: The non-linear function applied to the output of each neuron (e.g., ReLU, Sigmoid, Tanh).
Dropout Rate: The fraction of neurons to randomly set to zero during training to prevent overfitting.
Weight Initialization: The method used to set the initial values of the network's weights.
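A minimal NumPy forward pass shows where each of these network hyperparameters appears; the layer count, width, dropout rate, and use of He initialization are illustrative choices, not recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)

n_hidden_layers = 2   # depth of the network
n_neurons = 16        # width of each hidden layer
dropout_rate = 0.2    # fraction of activations zeroed during training

def relu(z):
    # Activation function: non-linearity applied to each layer's output
    return np.maximum(0.0, z)

def init_weights(fan_in, fan_out):
    # He initialization, a common weight-initialization scheme for ReLU
    return rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

x = rng.normal(size=(8, 4))   # batch of 8 inputs with 4 features
h = x
for _ in range(n_hidden_layers):
    W = init_weights(h.shape[1], n_neurons)
    h = relu(h @ W)
    # Inverted dropout: zero a random fraction of units, rescale the rest
    mask = rng.random(h.shape) >= dropout_rate
    h = h * mask / (1.0 - dropout_rate)
```

At inference time the dropout mask is simply omitted; the rescaling during training keeps the expected activation magnitude the same in both modes.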
Tree-Based Model Specific Hyperparameters (e.g., Decision Trees, Random Forests, Gradient Boosting):
Max Depth: The maximum depth of individual trees in the ensemble.
Min Samples Split: The minimum number of samples required to split an internal node.
Min Samples Leaf: The minimum number of samples required to be at a leaf node.
Number of Estimators (or Trees): The number of trees in an ensemble (e.g., in Random Forest or Gradient Boosting).
Criterion (for splitting): The function to measure the quality of a split (e.g., Gini impurity, entropy for classification; mean squared error for regression).
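These tree hyperparameters map directly onto scikit-learn's RandomForestClassifier constructor; the sketch below uses illustrative values on a synthetic dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

clf = RandomForestClassifier(
    n_estimators=100,     # number of trees in the ensemble
    max_depth=5,          # cap on each tree's depth
    min_samples_split=4,  # samples needed to split an internal node
    min_samples_leaf=2,   # samples required at each leaf
    criterion="gini",     # split-quality measure for classification
    random_state=0,
)
clf.fit(X, y)
train_acc = clf.score(X, y)
```

The same max_depth / min_samples parameters appear on DecisionTreeClassifier and the gradient-boosting estimators; for regression trees the criterion would be a squared-error measure instead of Gini or entropy.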
Support Vector Machine (SVM) Specific Hyperparameters:
Regularization Parameter (C): Controls the trade-off between fitting the training data closely and keeping a wide, simple decision margin; smaller values of C impose stronger regularization and help prevent overfitting.
Gamma (for RBF/Polynomial Kernels): Defines how far the influence of a single training example reaches. A low gamma means a far-reaching influence (smoother decision boundary); a high gamma means a short-range influence (more complex, tighter-fitting boundary).
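Both SVM hyperparameters are passed straight to scikit-learn's SVC; the C and gamma values below are illustrative starting points, normally chosen by cross-validated search.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic binary-classification data.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = SVC(
    C=1.0,         # regularization: smaller C = stronger regularization
    kernel="rbf",
    gamma=0.5,     # reach of each training example's influence
)
clf.fit(X, y)
acc = clf.score(X, y)
```

With a linear kernel, gamma is ignored and only C matters.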
K-Nearest Neighbors (KNN) Specific Hyperparameters:
N Neighbors (k): The number of nearest neighbors to consider for classification or regression.
P (Power Parameter for Minkowski Metric): Determines the type of distance metric used (e.g., p=1 for Manhattan distance, p=2 for Euclidean distance).
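Both KNN hyperparameters correspond to named arguments of scikit-learn's KNeighborsClassifier; the k and p values below are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = KNeighborsClassifier(
    n_neighbors=5,  # k: number of neighbors consulted per prediction
    p=2,            # Minkowski power: p=2 gives Euclidean distance
)
clf.fit(X, y)
acc = clf.score(X, y)
```

Setting p=1 switches the same estimator to Manhattan distance; odd values of k avoid ties in binary classification.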