App Engine Config

The app.yaml config file of a Google App Engine application allows you to specify properties and mappings for the application. You can specify:
  • Name and version information.
  • URL-controller mappings: the Python program files that should be invoked to handle each URL.
  • User authorization: which pages require the user to be logged in.
  • Static files: the URLs of static files (HTML, CSS, JavaScript).
Consider the following:

application: templates
version: 1
runtime: python
api_version: 1

handlers:
- url: /stylesheets
  static_dir: stylesheets
- url: /.*
  script: templates_controller.py
  login: required

The application name must match both the name of the directory where the application is stored and the application name registered with Google. The version lets you upload multiple versions of your application to Google; on the Google App Engine site you can specify which version is public. The runtime and api_version specify which runtime environment is used on the server. At the time of writing, only Python is available.

Mapping URLs to Handler Code

The first handler specifies that /stylesheets is a static directory. So if your HTML links to a file in this directory:

<link href="/stylesheets/tstyles.css" media="screen" rel="Stylesheet" type="text/css" />

the server knows not to call controller code but to just send the file over the wire directly.

The second handler specifies that for all other URLs, the server controller code 'templates_controller.py' should be invoked. The system will run the code there and find the particular controller class that should be invoked for the requested URL.

The second handler also specifies that only logged-in users will be given access to the specified URLs. If the user is not logged in, the Google sign-in page is shown before any controller runs. After the user signs in, control returns to the requested controller.
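
The handler rules above can be modeled in a few lines of Python. This is only a simplified sketch of the behavior described in this section, not App Engine's actual implementation: the HANDLERS list and the resolve function are hypothetical names, and the sketch assumes that handlers are tried from top to bottom, that a static_dir URL acts as a path prefix, and that a script pattern must match the whole path.

```python
import re

# Hypothetical in-memory copy of the handlers from the app.yaml above.
HANDLERS = [
    {"url": "/stylesheets", "static_dir": "stylesheets"},
    {"url": "/.*", "script": "templates_controller.py", "login": "required"},
]


def resolve(path, logged_in=False):
    """Describe what this simplified model of the server does with a path."""
    for handler in HANDLERS:  # handlers are tried from top to bottom
        if "static_dir" in handler:
            # Assumption: a static_dir url is treated as a path prefix.
            prefix = handler["url"] + "/"
            if path.startswith(prefix):
                return "static:" + handler["static_dir"] + "/" + path[len(prefix):]
        elif re.fullmatch(handler["url"], path):
            # Assumption: a script pattern must match the entire path.
            if handler.get("login") == "required" and not logged_in:
                return "redirect:sign-in"
            return "script:" + handler["script"]
    return "error:404"


print(resolve("/stylesheets/tstyles.css"))
print(resolve("/home"))
print(resolve("/home", logged_in=True))
```

In this model the stylesheet request is served as a file without running any controller, while /home is sent to sign-in first and reaches templates_controller.py only once the user is logged in.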

Advanced Pattern-Matching for URLs and Handlers

The url-handler mappings use POSIX extended regular expression syntax.

Here are the basics of regular expressions, from Wikipedia:

A regular expression, often called a pattern, is an expression that describes a set of strings. They are usually used to give a concise description of a set, without having to list all elements. For example, the set containing the three strings "Handel", "Händel", and "Haendel" can be described by the pattern H(ä|ae?)ndel (or alternatively, it is said that the pattern matches each of the three strings). In most formalisms, if there is any regex that matches a particular set then there is an infinite number of such expressions. Most formalisms provide the following operations to construct regular expressions.

Alternation
A vertical bar separates alternatives. For example, gray|grey can match "gray" or "grey".
Grouping
Parentheses are used to define the scope and precedence of the operators (among other uses). For example, gray|grey and gr(a|e)y are equivalent patterns which both describe the set of "gray" and "grey".
Quantification
A quantifier after a token (such as a character) or group specifies how often that preceding element is allowed to occur. The most common quantifiers are the question mark ?, the asterisk * (derived from the Kleene star), and the plus sign +.
? The question mark indicates there is zero or one of the preceding element. For example, colou?r matches both "color" and "colour".
* The asterisk indicates there are zero or more of the preceding element. For example, ab*c matches "ac", "abc", "abbc", "abbbc", and so on.
+ The plus sign indicates that there is one or more of the preceding element. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".

These constructions can be combined to form arbitrarily complex expressions, much like one can construct arithmetical expressions from numbers and the operations +, −, ×, and ÷. For example, H(ae?|ä)ndel and H(a|ae|ä)ndel are both valid patterns which match the same strings as the earlier example, H(ä|ae?)ndel.
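
The operations above are easy to try out with Python's re module. This is a small sketch rather than a test of App Engine itself: Python's regex dialect is not POSIX extended syntax, but alternation, grouping, and the ?, * and + quantifiers behave the same way for these patterns. The helper name matching is hypothetical.

```python
import re


def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]


# Alternation and grouping
print(matching(r"gr(a|e)y", ["gray", "grey", "groy"]))       # ['gray', 'grey']

# Quantifiers: ? (zero or one), * (zero or more), + (one or more)
print(matching(r"colou?r", ["color", "colour", "colouur"]))  # ['color', 'colour']
print(matching(r"ab*c", ["ac", "abc", "abbbc"]))             # ['ac', 'abc', 'abbbc']
print(matching(r"ab+c", ["ac", "abc", "abbbc"]))             # ['abc', 'abbbc']

# Combined operations
print(matching(r"H(ä|ae?)ndel", ["Handel", "Händel", "Haendel", "Hendel"]))
# ['Handel', 'Händel', 'Haendel']
```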

There are also 'lazy' quantifiers: you can put a ? after a * or a + to make it match as few characters as possible instead of as many as possible. Here's an example of this from http://www.regular-expressions.info/repeat.html

Suppose you want to use a regex to match an HTML tag. You know that the input will be a valid HTML file, so the regular expression does not need to exclude any invalid use of sharp brackets. If it sits between sharp brackets, it is an HTML tag.

Most people new to regular expressions will attempt to use <.+>. They will be surprised when they test it on a string like This is a <EM>first</EM> test. You might expect the regex to match <EM> and when continuing after that match, </EM>.

But it does not. The regex will match <EM>first</EM>. Obviously not what we wanted. The reason is that the plus is greedy. That is, the plus causes the regex engine to repeat the preceding token as often as possible. Only if that causes the entire regex to fail, will the regex engine backtrack. That is, it will go back to the plus, make it give up the last iteration, and proceed with the remainder of the regex. Making the plus lazy, as in <.+?>, stops at the first closing bracket and so matches <EM>.
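
The difference between the greedy and lazy forms can be checked directly. The following sketch uses Python's re module, which supports the lazy ? suffix, on the sentence from the example; the variable names are only illustrative.

```python
import re

text = "This is a <EM>first</EM> test."

# Greedy: .+ grabs as much as it can, so the match runs to the last ">".
greedy = re.findall(r"<.+>", text)

# Lazy: .+? grabs as little as it can, so each tag is matched on its own.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<EM>first</EM>']
print(lazy)    # ['<EM>', '</EM>']
```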

Consider the following:

handlers:
- url: /profile/(.*?)/(.*)
  script: /employee/\2/\1.py

The above would match, for example, /profile/edit/manager and use edit and manager as the first and second groupings (\1 and \2) in the script line.

Thus, /profile/edit/manager would map to the script at: employee/manager/edit.py.
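
The substitution of groupings into the script path can be imitated with Python's re module. This is a sketch of the idea only, not how App Engine performs the mapping internally: URL_PATTERN, SCRIPT_TEMPLATE, and script_for are hypothetical names, and the template omits the leading slash so that the result matches the path given above.

```python
import re

URL_PATTERN = r"/profile/(.*?)/(.*)"
SCRIPT_TEMPLATE = r"employee/\2/\1.py"


def script_for(path):
    """Return the script path for a URL path, or None if it does not match."""
    match = re.fullmatch(URL_PATTERN, path)
    if match is None:
        return None
    # expand() replaces \1 and \2 with the text captured by each group.
    return match.expand(SCRIPT_TEMPLATE)


print(script_for("/profile/edit/manager"))  # employee/manager/edit.py
print(script_for("/profile/a/b/c"))         # employee/b/c/a.py
print(script_for("/other"))                 # None
```

The second call shows why the first group is lazy: (.*?) stops at the first slash, so any remaining slashes end up in the second group.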

In-Class Assignment:

1. What strings would match the following regular expressions:

    aa+bb
    .*(a | b)
    (0|1|2|3|4|5|6|7|8|9)\.(0|1|2|3|4|5|6|7|8|9)

2. Here's an example list of handlers from Google's documentation. Describe the meaning of each handler mapping and the site behavior as a whole.

handlers:
- url: /
  script: home.py

- url: /index\.html
  script: home.py

- url: /stylesheets
  static_dir: stylesheets

- url: /admin/.*
  script: admin.py
  login: admin

- url: /.*
  script: not_found.py
