Reading Flat-File Formats

Two common flat-file dataset formats are the .csv file and the .txt file. We will see how to read both file types.

So far, we've read data from a .txt file. A common format is to have a header line that documents the purpose of each field, followed by another line for each record in the file.


Reading from a text file

Here, we will use the pandas library to perform the task. To open a .txt file, we can use the read_table() function.
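As a minimal sketch of read_table(), the snippet below reads a small tab-separated text. The sample content and the names sample_txt and table are made up for illustration; with a real file you would pass its path (for example "data.txt") instead of the in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for a .txt file on disk.
sample_txt = "name\tage\nAlice\t30\nBob\t25\n"

# read_table() uses a tab as its default separator and treats the
# first line as the header. A file path can be passed instead of the buffer.
table = pd.read_table(io.StringIO(sample_txt))

print(table)
```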

Try to run this file and compare the original .txt and the output.

Documentation: read_table

Reading CSV delimited format

A CSV file has more structure than a plain text file. Its header line names each of the fields, and the entries in each record are usually separated by commas. Let's use pandas to print all the city names in the file cities.csv. Note that the result will be a DataFrame.
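A rough sketch of reading comma-separated data with read_csv() follows. The real contents of cities.csv are not shown in this lesson, so the columns city and country and the rows here are assumptions; replace the buffer with "cities.csv" to read the actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for cities.csv; the real column names may differ.
cities_csv = "city,country\nJakarta,Indonesia\nTokyo,Japan\nParis,France\n"

# read_csv() splits fields on commas by default and uses the first line as the header.
cities = pd.read_csv(io.StringIO(cities_csv))

print(type(cities))  # the result is a pandas DataFrame
print(cities)
```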

Documentation: read_csv

From a DataFrame, we can select any column as a Series by putting the column name in brackets []. Let's try to select the city names.
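Selecting a column with brackets might look like the sketch below. The column name "city" is an assumption about cities.csv, and the data is a hypothetical stand-in for the file.

```python
import io

import pandas as pd

# Hypothetical stand-in for cities.csv; the real column names may differ.
cities_csv = "city,country\nJakarta,Indonesia\nTokyo,Japan\nParis,France\n"
cities = pd.read_csv(io.StringIO(cities_csv))

# Indexing the DataFrame with a single column name returns a Series.
city_names = cities["city"]

print(type(city_names))
print(city_names)
```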

As we did in the previous lesson, we can convert the Series into an array by using its values attribute. You can also use len() to check its length.
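The conversion can be sketched as follows, again with hypothetical data and an assumed "city" column. Note that values is accessed without parentheses because it is an attribute, not a method.

```python
import io

import pandas as pd

# Hypothetical stand-in for cities.csv; the real column names may differ.
cities_csv = "city,country\nJakarta,Indonesia\nTokyo,Japan\nParis,France\n"
cities = pd.read_csv(io.StringIO(cities_csv))

# .values exposes the Series' underlying data as an array.
city_array = cities["city"].values

print(city_array)
print(len(city_array))  # number of entries in the array
```

The pandas documentation also offers Series.to_numpy() for the same purpose, which may be preferable in newer code.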


Exercise 1.2

The csv file contains the ages of Oscar winners from 1928 until 2016. Print out the ages of all the actresses.
Store the ages in an array called actress_age, then print out the average age of the actresses.
The username.csv file is separated by semicolons. Read the read_csv() documentation to find a way to read it, then print out all the data.
Find a way to read time_table.html using the pandas library, then print out all the data.
