5.1 Preprocessing with sklearn
⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺⎺
The main objective of data standardization is to bring the mean to 0 and the variance to 1.
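As a worked formula, standardization of a single feature is usually written as below. The symbols are assumptions introduced here for illustration: x is a raw value, \mu is the feature's mean, \sigma is its standard deviation, and z is the standardized value.

```latex
z = \frac{x - \mu}{\sigma},
\qquad
\operatorname{mean}(z) = 0,
\qquad
\operatorname{var}(z) = 1
```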
To standardize the data, we can use StandardScaler from sklearn.preprocessing.
First, we need to create a transformer (an object) for the scaler.
The transformer then needs to be "fitted" to the data. The fit() function expects a 2D array-like input, so here we use .values to convert the pandas object to a NumPy array.
Finally, we transform the data using transform().
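The three steps above (create, fit, transform) can be sketched as follows. This is a minimal illustration, not the lesson's own code: the DataFrame and its column names are hypothetical toy data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 70.0, 80.0],
})

scaler = StandardScaler()              # 1. create the transformer
scaler.fit(df.values)                  # 2. learn each column's mean and std
scaled = scaler.transform(df.values)   # 3. apply (x - mean) / std per column

# Each column should now have mean ~0 and variance ~1.
col_means = scaled.mean(axis=0)
col_vars = scaled.var(axis=0)
print(col_means, col_vars)
```

Fitting and transforming are separate steps so that the same fitted scaler can later be applied to new data with the statistics learned here.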
If fit() is given a 1D array, sklearn raises an error asking you to reshape the data: use array.reshape(-1, 1) if the data has a single feature, or array.reshape(1, -1) if it contains a single sample.
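The reshape advice can be illustrated with a small sketch. The array values are made up for the example, and the try/except is only there to show that a 1D input is rejected.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

values = np.array([1.0, 2.0, 3.0, 4.0])   # 1D array, shape (4,)

# Passing the 1D array directly is rejected by the scaler.
try:
    StandardScaler().fit(values)
    rejected = False
except ValueError:
    rejected = True

# Single feature, many samples: one column, shape (4, 1).
column = values.reshape(-1, 1)
scaled = StandardScaler().fit_transform(column)

# Single sample, many features: one row, shape (1, 4).
single_sample = values.reshape(1, -1)

print(rejected, column.shape, single_sample.shape)
```

For a single column of measurements, reshape(-1, 1) is normally the one you want, because each value is a separate sample of the same feature.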
If we want to constrain the minimum and maximum values in a data set, we can scale the data to a range instead.
Scaling to a range, unlike standardization, can preserve 0 values that exist in the data: MaxAbsScaler always keeps zeros at zero, while MinMaxScaler with its default range does so only when the feature's minimum is 0.
The structure of the commands is the same as for StandardScaler, with MinMaxScaler or MaxAbsScaler used in its place.
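A minimal sketch of both range scalers on the same made-up column is shown below. The data values are assumptions chosen so that the results are easy to check by hand.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler

# Hypothetical single-feature data containing a zero and a negative value.
X = np.array([[-2.0], [0.0], [2.0], [6.0]])

# MinMaxScaler (default feature_range=(0, 1)): (x - min) / (max - min)
minmax = MinMaxScaler()
X_minmax = minmax.fit_transform(X)   # -2 -> 0, 0 -> 0.25, 2 -> 0.5, 6 -> 1

# MaxAbsScaler: x / max(|x|), so the output lies in [-1, 1]
maxabs = MaxAbsScaler()
X_maxabs = maxabs.fit_transform(X)   # -2 -> -1/3, 0 -> 0, 2 -> 1/3, 6 -> 1

print(X_minmax.ravel())
print(X_maxabs.ravel())
```

In this example the original 0 stays at 0 under MaxAbsScaler but moves to 0.25 under MinMaxScaler, because the feature's minimum is -2 rather than 0.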