Persistence and Google's Datastore API

Persistent data is:
  • information stored in a database or a file.
  • data that lives even after a program or web session ends.
An example of persistent data is your profile information on Facebook. When you submit a web form, the data you submit is stored in a database on the web sites servers. This data is persistent since it will be there next time you visit the site.

Files
One way to store data persistently with Python is to access files directly. Here's an example of writing to a file:

f = open('test.dat','w')
f.write("note 1\n")
f.write("note 2")
f.close()

In the example, f is a variable of type File. You open it, providing the file name as the first parameter. The second parameter is either "w" to say you are going to write from the file, or "r" to read from it.

You can read from an existing file with code like the following:

f = open("test.dat","r")
while True:
    line = f.readline()
    if line=="":
        break
    print line

Note that readline will return the empty string when it reaches the end of the file.

DBMS and SQL

A database management system (DBMS) provides a higher-level access to files and persistent data.

The idea is that many programs need to save things persistently, so DBMSs are like libraries of code to make that easier.

Most DBMS provide SQL access.

SQL stands for Structured Query Language and is the standard language by which data is stored and retrieved from a relational database.

Data is stored in tables. Each table has fields and each entry in the table is a record.

The following SQL command creates a table of customers (with no records yet).

CREATE TABLE customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)

After such a table is created, a program can add records to it with the SQL Insert statement:

INSERT INTO customer (First_Name, Last_Name, Address,City,Country,Birth_Date)
VALUES ('David', 'Wolber', '1329 Willard Street', 'San Francisco', 'USA', 'Jan-10-1999')

A program can query existing data with the Select statement:

            SELECT Last_Name
            FROM customer
            WHERE City='San Francisco'

Object-Relational Mapping

In some programming languages and environments, the programmer makes SQL statements within the code. For instance JDBC is a library that allows one to call SQL from a Java program.

There are now many programming environments that hide the SQL layer, to a certain extent, from the programmer. The programmer is allowed to work in the world of objects-- regular program variables-- which behave as persistent data.

We call this an object-relational mapping because items in an object-oriented language are 'mapped' to relational database items. The environment provides a way to define a class that represents a persistent entity, a table. The program then creates instances of the class, which are the equivalent to database records.

    class     <==>  table
    object    <==> record

One advantage of this approach is the programmer can generally stay within one language as opposed to talking in both the programming language and SQL.

In most programming environments (e.g., Ruby on Rails, Java's Hibernate) the programmer does have to do some work to set up the database and the object-relational mapping.

Google's Datastore API

Google's Datastore API allows a programmer to work in the world of objects with no database setup. Google supplies the servers and the underlying database. The programmer just creates and uses persistent objects with the Datastore API, and Google takes care of the plumbing. This is pretty incredible stuff and on the cutting edge of what is now called 'cloud computing'.

Here's an example of how to create a class that maps to a table.

class Customer(db.Model):
  last_name = db.StringProperty(required=True)
  first_name= db.StringProperty(required=True)
  city = db.StringProperty(required=True)

Because the class is defined as a db.Model, it is persistent. The fields are defined directly in the class and must be of some db.X type. In this sample, all the fields are strings.

Note that this code only creates the class (table), and not any objects (records).

The following code creates a record in the Customer table:

customer1 = Customer(last_name="Wolber",first_name="David") # object creation, in memory
customer1.city = "San Francisco"
customer1.put() # this stores to database

Note that the object creation statement only creates an object in memory. The call to put causes the object to be stored persistently (in the database).

The DataStore API also allows you to query the database with SQL-based statements. Here are some examples of queries to the Customer table:

customers=db.GqlQuery("SELECT * FROM Customer") # get all customers into the list 'customers'.
customers = db.GqlQuery("SELECT * FROM Customer WHERE last_name = 'Wolber' ")
city = "San Francisco"
customers = db.GqlQuery("SELECT * FROM Customer WHERE city = :1",city)

The first returns a list of all objects of type Customer.

The second returns only those whose last_name field is set to 'Wolber'
. Note that you must use double quotes around the whole query so you can place the single quotes around the string to match.

The third query returns all customers from San Francisco (or whatever is in some variable city). Note the :1 syntax for placing variables into a query-- the value of the variable city replaces the   :1. This allows you to put computed values into a query.

Ifthe queries above were executed in a controller, one could send the list 'customers' in the template values, e.g.,

template_values={'customers',customers}

In the HTML template, be sure to reference parts of each object, not the object as a whole:

{% for customer in customers %}
    {{customer.last_name}}<br/>{{customer.city}}
{% endfor %}

Your Turn

1. Write a Python command-line program that copies one file into another. You should ask for the two file names from the end-user.

2. Download the attached files into dbSample and run the program with dev_appserver (not on google). Play around with some queries.

3. Modify your Mastermind program to record the score of each game.

Attachments (5)

  • app.yaml - on Oct 28, 2009 1:28 PM by David Wolber (version 1)
    1k Download
  • app_controller.py - on Oct 28, 2009 1:28 PM by David Wolber (version 1)
    2k Download
  • index.html - on Oct 28, 2009 1:28 PM by David Wolber (version 1)
    1k Download
  • model.py - on Oct 28, 2009 1:28 PM by David Wolber (version 1)
    1k Download
  • sessions.py - on Oct 28, 2009 1:29 PM by David Wolber (version 1)
    18k Download

Recent site activity