Classification
A classification problem is one where we use data to predict which category something falls into. Examples include analyzing an image to determine whether it contains a car or a person, or analyzing medical data to determine whether a person is in a high-risk group for a certain disease. In other words, we are trying to use data to make a prediction over a discrete set of values or categories.
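As a rough illustration of predicting a discrete category, here is a minimal sketch of a nearest-centroid classifier. The data, the feature choice (age and blood pressure), and the function names are all made up for this example; real classifiers are usually far more sophisticated.

```python
# Hypothetical toy data: (age, blood pressure) -> risk group.
# Both the numbers and the labels are invented for illustration.
training_data = [
    ((25, 110), "low"),
    ((30, 115), "low"),
    ((62, 150), "high"),
    ((70, 160), "high"),
]


def centroid(points):
    """Average each feature across a list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit_centroids(data):
    """Group the training points by label and compute one centroid per label."""
    by_label = {}
    for features, label in data:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


centroids = fit_centroids(training_data)
prediction = classify(centroids, (65, 155))
print(prediction)
```

Note that the output is always one of a fixed set of labels ("low" or "high" here), which is what makes this a classification problem.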
Regression
Regression problems, on the other hand, are problems where we try to make a prediction on a continuous scale. Examples include predicting the stock price of a company or predicting tomorrow's temperature based on historical data.
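To contrast with classification, the sketch below fits a straight line to a few invented temperature readings using ordinary least squares and then predicts a continuous value for the next day. The data and the names (fit_line, days, temps) are assumptions made for this example, and a single straight line is only one of many possible regression models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: day number -> temperature in degrees Celsius.
days = [1, 2, 3, 4, 5]
temps = [14.0, 15.0, 16.0, 17.0, 18.0]

slope, intercept = fit_line(days, temps)
forecast = slope * 6 + intercept  # prediction for day 6
print(forecast)
```

Here the prediction can be any real number rather than one of a fixed set of categories.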
An epoch is a term used in machine learning for one complete pass of the learning algorithm over the entire training dataset; the number of epochs is the number of such passes the algorithm has completed. Datasets are usually split into batches (especially when the amount of data is very large), so a single epoch consists of one or more batches.
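The relationship between epochs and batches can be sketched with a bare training loop. No model is actually trained here; the dataset, batch size, and helper name iterate_batches are placeholders chosen for the example.

```python
def iterate_batches(dataset, batch_size):
    """Yield consecutive slices of the dataset, each at most batch_size long."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


dataset = list(range(10))  # stand-in for 10 training samples
batch_size = 4
num_epochs = 3

batches_seen = 0
samples_seen = 0
for epoch in range(num_epochs):
    # One epoch = one full pass over every sample in the dataset.
    for batch in iterate_batches(dataset, batch_size):
        # A real training loop would update the model's weights here.
        batches_seen += 1
        samples_seen += len(batch)

print(batches_seen, samples_seen)
```

With 10 samples and a batch size of 4, each epoch is made up of three batches (of sizes 4, 4, and 2), so three epochs process 9 batches and 30 samples in total.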
Each neuron calculates a weighted sum of its inputs and then adds a bias to it; the activation function is applied to that value to determine whether, and how strongly, the neuron is activated. The activation function's goal is to introduce non-linearity into a neuron's output, so that the network can model relationships that are not simply linear.
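A single neuron can be sketched in a few lines. ReLU and sigmoid are used below as two common examples of activation functions; the inputs, weights, and bias are arbitrary values picked for illustration.

```python
import math


def weighted_sum(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (the neuron's pre-activation)."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def relu(z):
    """Rectified linear unit: passes positive values, clamps negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Arbitrary example values.
z = weighted_sum(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=0.25)

print(z)           # pre-activation value
print(relu(z))     # output if the neuron uses ReLU
print(sigmoid(z))  # output if the neuron uses sigmoid
```

Without an activation function, stacking layers would only ever produce another linear function of the inputs, which is why the non-linear step matters.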
The features are the descriptive attributes, and the label is what you're attempting to predict or forecast.
Neurons in deep learning models are nodes through which data and computations flow.
The input layer of a neural network is composed of artificial input neurons, and brings the initial data into the system for further processing by subsequent layers of artificial neurons.
Hidden layer(s) are the secret sauce of your network. They allow you to model complex data thanks to their nodes/neurons. They are “hidden” because the true values of their nodes are unknown in the training dataset. In fact, we only know the input and output.
The output layer is the final layer in the neural network, where the desired predictions are obtained. A neural network has one output layer, and it has its own set of weights and biases that are applied before the final output is derived.
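To tie the input, hidden, and output layers together, here is a minimal sketch of a forward pass through a tiny network with two input features, one hidden layer of three neurons, and a single output neuron. The weights and biases are hand-picked, hypothetical values rather than trained ones, and the function names are assumptions made for this example.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    """Compute one layer: each row of weights belongs to one neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical, hand-picked parameters (a trained network would learn these).
hidden_weights = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]  # 3 neurons, 2 inputs each
hidden_biases = [0.0, -1.0, 0.5]
output_weights = [[1.0, 2.0, -1.0]]  # 1 neuron, 3 inputs
output_biases = [0.1]


def forward(features):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(features, hidden_weights, hidden_biases, relu)
    return dense(hidden, output_weights, output_biases, identity)


prediction = forward([2.0, 1.0])
print(prediction)
```

The input layer here is simply the list of features passed to forward; the hidden values are intermediate results that never appear in the training data, and only the output layer's result is returned as the prediction.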