In this lesson, we will revisit how to slice and dice the data and, more generally, how to get a subset of a pandas object. Pandas supports two main types of multi-axis indexing: .loc and .iloc. Both are indexers used with square brackets, not methods called with parentheses.
.loc
Pandas provides .loc for purely label-based indexing. When slicing with .loc, both the start and the stop bound are included. Integers are valid labels, but they refer to the label and not the position.
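A minimal sketch of the two points above. The Series s and its labels are made up for illustration; the integer labels are deliberately different from the positions.

```python
import pandas as pd

# Hypothetical Series whose integer labels do not match their positions.
s = pd.Series(["a", "b", "c", "d"], index=[10, 20, 30, 40])

# 20 is treated as a label, not as position 20.
by_label = s.loc[20]

# Label slice: both the start label 10 and the stop label 30 are included.
label_slice = s.loc[10:30]

print(by_label)
print(label_slice)
```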
.loc accepts several kinds of input:
A single scalar label
A list of labels
A slice object
A Boolean array
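The four kinds of input listed above could look like this in practice. The frame df_demo and its labels are assumptions made for the example, not the df used in the exercises.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string row labels and values 0..11.
df_demo = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=["a", "b", "c", "d"],
    columns=["A1", "A2", "A3"],
)

one_row = df_demo.loc["b"]               # single scalar label -> Series
some_rows = df_demo.loc[["a", "d"]]      # list of labels -> DataFrame
row_range = df_demo.loc["b":"d"]         # slice of labels; "d" is included
masked = df_demo.loc[df_demo["A1"] > 3]  # Boolean array, one flag per row

print(one_row, some_rows, row_range, masked, sep="\n\n")
```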
.loc takes up to two selectors (each a single label, a list of labels, or a slice) separated by a comma. The first one selects rows and the second one selects columns.
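As a sketch of the row-then-column form, assuming a small made-up frame df_rc:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string row labels and values 0..11.
df_rc = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=["a", "b", "c", "d"],
    columns=["A1", "A2", "A3"],
)

cell = df_rc.loc["b", "A2"]                 # one row label, one column label
sub = df_rc.loc[["a", "c"], ["A1", "A3"]]   # list of rows, list of columns
col_range = df_rc.loc[:, "A1":"A2"]         # all rows, a slice of columns

print(cell)
print(sub)
print(col_range)
```

A bare colon in the first position keeps every row, which is a convenient way to select columns only.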
There is also a reindex() method that is similar to .loc. reindex() is recommended when some of the labels you specify might not exist: it returns NaN for the missing labels.
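The difference can be sketched as follows. The Series and the missing label "z" are made up, and the KeyError behaviour of .loc shown here assumes a recent pandas release (1.0 or later); older releases only emitted a warning.

```python
import pandas as pd

# Hypothetical Series; the label "z" does not exist in it.
s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

# reindex() tolerates the missing label and fills its value with NaN.
filled = s.reindex(["a", "c", "z"])

# .loc with a list that contains a missing label raises KeyError
# (assumption: pandas 1.0 or later).
try:
    s.loc[["a", "c", "z"]]
    raised = False
except KeyError:
    raised = True

print(filled)
print("KeyError raised by .loc:", raised)
```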
.iloc
Pandas provides .iloc for purely integer-based (positional) indexing. As in Python and NumPy, positions are 0-based and the stop bound of a slice is excluded.
The various access methods are as follows:
An Integer
A list of integers
A range of values
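A short sketch of these positional access methods, again on a made-up frame (df_pos is an assumption for the example):

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the row labels are strings, so only positions are integers.
df_pos = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=["a", "b", "c", "d"],
    columns=["A1", "A2", "A3"],
)

first_row = df_pos.iloc[0]      # a single integer -> first row as a Series
picked = df_pos.iloc[[0, 2]]    # list of integers -> rows at positions 0 and 2
first_two = df_pos.iloc[0:2]    # range; position 2 is excluded
last_cell = df_pos.iloc[-1, 2]  # last row, third column

print(first_row, picked, first_two, last_cell, sep="\n\n")
```

Unlike a .loc slice, the .iloc slice 0:2 stops before position 2, matching ordinary Python slicing.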
Using the provided df, do the following:
In df, get the columns ['A1','A2','A3']. Save the result in a new variable called df1, then print it out.
In df1, get the first 10 rows. Save the result in a variable called df2, then print it out.
In df2, get only the even-numbered rows. Save the result in a variable called df3, then print it out.
In df3, get only the rows where the value of A1 is between -0.5 and 0.5. Save the result in a variable called df4, then print it out.
In df4, get the element in the last row of column 'A3'. Save it in a variable called df5, then print it out.
The .loc indexer can be useful when you need to cross-reference between files. Try to make a DataFrame consisting of two columns: the fruit names (as the index) and the average score each fruit gets. Then, print it out.