In this project, you will create a small application that displays data from Google Fusion Tables. The work you do in this project byte is something you will build on throughout this class. This assignment has the following learning goals:
This project requires you to use Python 2.7 (please note the version number) and some additional libraries that are available for Python. To learn more about Python, you may want to explore www.pythontutor.com. The textbook Introduction to Computing and Programming in Python is an excellent introductory book aimed at non-programmers.
Google Cloud Platform is a development environment that will let you place your code on the web with relative ease. An excellent "Getting Started" tutorial will walk you through the initial creation of a simple application that displays plain text on the web.
The tutorial is quite detailed and helpful. Be sure to follow it all the way through until you can load your website on the web.
Here is what you will have accomplished after you complete the tutorial:
It is a requirement for this project that you use github to manage your source code. This has several benefits, including providing a way to turn in your source code. You can get an account for free at https://github.com/. Note that unless you pay, everything you put on github is public. Once you create your account, you will need to create a repository named [yourbytename]. You can do this by following this tutorial: https://help.github.com/articles/create-a-repo/
You can see, for example, that I have created the repository jmankoff-fusion2017 in my public github account.
You should create a repository for your project. You can then connect it to your app by clicking on the 'create repository' button shown below. It will ask you for a name -- use something like [yourbytename]-github
Once you create the repository, you have to specify what to link it to:
Click on 'automatically mirror from github or bitbucket' and select github from the 'Select hosting service' menu and select the repository you just created.
Once you have done this, you should check out the base code for this assignment, using the following process. In using this tutorial, be sure to type one line at a time and hit enter after each line.
First (this is a one-time-only step) you need to set up git on Google Appspot's cloud terminal. Run the following commands, using your own email address and name:
git config --global push.default simple
git config --global user.email "jmankoff@cs.cmu.edu"
git config --global user.name "Jen Mankoff"
In addition, you need to make sure you are in the correct directory (should be ~/src/[your byte name]).
To change directories, use the command cd
To look at the contents of a directory (such as to find the name of a file), use the command ls
You can google both of these commands for more information. Be aware that you can use a period to refer to the current directory, and double period to refer to the directory above the current one. So
ls .
produces a listing of the current directory (as does a plain ls) and
cd ..
moves you to the directory above the current directory. Once you have cd'd to ~/src/[your byte name] you will need to check out the code from your repository. That will look something like this (you can find the github repository URL by going to your repository on github, clicking on the 'clone or download' button, and copying the URL, which will look something like 'https://github.com/jmankoff/jmankoff-fusion2017.git'):
git clone [your github repository]
... output ...
Cloning into '[your repository]'...
remote: Counting objects: 382, done.
remote: Compressing objects: 100% (276/276), done.
remote: Total 382 (delta 50), reused 382 (delta 50), pack-reused 0
Receiving objects: 100% (382/382), 776.03 KiB | 0 bytes/s, done.
Resolving deltas: 100% (50/50), done.
Checking connectivity... done.
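By default, git clone creates a directory named after the last part of the repository URL, with the .git suffix removed. That is the directory you will cd into in the later steps. As a small illustration, the Python sketch below derives that name from a clone URL. The helper name repo_dir_name is hypothetical and is not part of git or of the base code.

```python
import posixpath

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def repo_dir_name(clone_url):
    """Return the directory name git clone uses by default for clone_url.

    Hypothetical helper for illustration only.
    """
    # Take the path part of the URL and drop any trailing slash.
    path = urlparse(clone_url).path.rstrip('/')
    # The last path component is the repository name.
    name = posixpath.basename(path)
    # git drops a trailing ".git" when naming the directory.
    if name.endswith('.git'):
        name = name[:-len('.git')]
    return name


print(repo_dir_name('https://github.com/jmankoff/jmankoff-fusion2017.git'))
```

For the example URL this prints jmankoff-fusion2017, which is the name you would use in place of [your github project name].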
Next, you will need to check out the base code if you have not yet as part of the tutorial:
gcloud source repos clone python-gae-quickstart --project=[your byte name]
ls
(this will show the name of the directory that was created)
cd python_gae_quickstart[fill in the directory name based on what you saw in ls]
git remote remove origin
Finally, you will need to copy the base code into your repository, which we do by moving into your github project directory and then copying over the files from the python_gae_quickstart directory using the cp command.
cd ..
cd [your github project name]
cp -r ../python_gae_quickstart[fill in the directory name]/* .
The last step is to push the changes back to github so that your github repository mirrors what is on your google cloud drive. This can be done any time you make a major change to your project, and is the first step in making use of the benefits of version control, which we strongly recommend you become familiar with. Note that the words "base code" below can and should be replaced with any string you think is descriptive of a change you make (but always in quotes).
git add *
git commit -m "base code"
git status
Once you run git status, do not be alarmed if you see red text. Red text means that you are still at an intermediate stage (some changes have not been added or committed yet), not that there is an error. The output will look something like this.
... output ...
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working directory clean
git push
You'll need to enter a user name and password for your github account at this point. Once that is done, look at your repository. You should see all your files:
You can now add/commit and push your changes as you work on your assignment, giving you a backup of the assignment, a way to edit code on your local machine (if you prefer) and so on.
For this assignment we will be using the Code Editor, which allows you to develop your applications online in your Cloud Shell instance. To start your Code Editor click on the File icon in The Development tab.
Once the Code Editor starts, locate your project in the src folder in the directory tree.
Before we start, let's make sure we've included the correct set of libraries in our application. For this project we will be using:
Don't forget at this point to add all those files using
git add .
git commit -m "added libraries"
git push
Note that any time you commit, you can use the message of your choice to describe those changes (inside the quotes).
Now, set up your app.yaml with the correct information:
runtime: python27
api_version: 1
threadsafe: yes
# Handlers define how to route requests to your application.
handlers:
- url: /js
  static_dir: js
  application_readable: true
- url: /fonts
  static_dir: fonts
  application_readable: true
- url: /css
  static_dir: css
  application_readable: true
- url: /templates
  static_dir: templates
  application_readable: true
- url: .*
  script: main.app
libraries:
- name: jinja2
  version: latest
- name: webapp2
  version: latest
Once again, remember to commit to github!
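The handlers in app.yaml refer to the directories js, fonts, css and templates. If one of them is missing from your application folder, the matching URLs will not be served. The following is a minimal sketch of a check you could run from your application folder. The script and the helper name missing_static_dirs are hypothetical and are not part of the base code.

```python
import os

# Directories named by static_dir in the app.yaml handlers.
STATIC_DIRS = ['js', 'fonts', 'css', 'templates']


def missing_static_dirs(app_root, names=STATIC_DIRS):
    """Return the names that are not directories under app_root.

    Hypothetical helper for illustration only.
    """
    return [name for name in names
            if not os.path.isdir(os.path.join(app_root, name))]


if __name__ == '__main__':
    # Assumes the script is run from the application folder.
    missing = missing_static_dirs(os.getcwd())
    if missing:
        print('Missing directories: ' + ', '.join(missing))
    else:
        print('All static directories are present.')
```

An empty result means every directory that app.yaml mentions exists.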
To begin serving HTML pages, follow the Jinja2 tutorial. We've already updated the app.yaml file. Now we have to update the main.py file to follow through. Add the following lines at the top of the file:
# Imports
import os
import jinja2
import webapp2
import logging
JINJA_ENVIRONMENT = jinja2.Environment(
    loader=jinja2.FileSystemLoader(os.path.dirname(__file__)),
    extensions=['jinja2.ext.autoescape'],
    autoescape=True)
And change the hello() route as follows:
@app.route('/')
def hello():
    template = JINJA_ENVIRONMENT.get_template('templates/index.html')
    return template.render()
Create a directory inside your application folder named 'templates' and create a file named 'index.html' inside.
The 'index.html' file should contain the following html.
<!DOCTYPE html>
<html>
<head>
<title>Byte 1 Tutorial</title>
</head>
<body> <h1>Data Pipeline Project Byte Example</h1> </body> </html>
Start your test instance in the console the same way you did in the tutorial, and start your application in your Web Preview. The result, when you load it, should look like this:
Now we want to add some bootstrap styling. The sample index.html and about.html files provided with your byte source code are based on a bootstrap theme. You can view more themes on the getting started page at http://getbootstrap.com/getting-started/ and download source code for example themes. Just be aware that you will need to modify these themes to reflect the directory structure of your Google App Engine application. Specifically, you should use 'css/...' to refer to css files, and 'js/...' to refer to javascript. You can also use a program to lay out bootstrap pages such as layitout.com or x-editable.
Remember that you have been working with the development app server. To make your application available to the public you need to deploy it (at this point you would have already created the application in the tutorial):
gcloud app deploy
You can quickly and easily test your scripts as you go using the development server. You can change the logging level using the command line when you start the development app server (e.g., set the logging level to "debug"):
dev_appserver.py --log_level=debug $PWD
You will then be able to see log entries on your command line. You can output your own debugging text there by using the python command logging.info() (or another logging level, such as logging.debug(), depending on your needs).
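As a minimal sketch of what such logging calls look like, the example below logs at two different levels using Python's standard logging module. The function describe_rows and its sample data are hypothetical. The call to logging.basicConfig is an assumption for running the snippet on its own; under the development server the level is controlled by the --log_level flag instead.

```python
import logging

# Show messages at DEBUG level and above. The default level is WARNING,
# which would hide both the debug() and info() calls below.
# (Assumption: only needed when running outside the development server.)
logging.basicConfig(level=logging.DEBUG)


def describe_rows(rows):
    """Hypothetical helper: log what was received and return the row count."""
    # Detailed output, visible only when the log level is "debug".
    logging.debug('rows received: %r', rows)
    # Higher-level progress message, visible at "info" and "debug".
    logging.info('processing %d rows', len(rows))
    return len(rows)


describe_rows([{'name': 'Rex'}, {'name': 'Tom'}])
```

Passing the values as separate arguments, rather than formatting the string yourself, lets the logging module skip the formatting work when a message is below the current level.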
Thus, a very good debugging and editing cycle is [Edit main.py] [reload local web preview] [check results and log to make sure your code is doing what you think it is] [rinse and repeat].
Once you deploy your application you can keep track of your logs using the Google Cloud Logging interface. For more information see the logging documentation page.
You will first need to identify a data set that is of interest to you. Here is some advice from Google Fusion Tables' website on where and how to find interesting data. Here is another source of interesting data:
https://github.com/caesar0301/awesome-public-datasets
In case you use an existing Google Fusion Table, you should be aware that you will not be able to access all of the features of Google Fusion Tables. For this reason, you should make a copy (under the file menu) before proceeding. Click "View Copy" and proceed with the new table for the remainder of this assignment.
Start by following the tutorial for making a map (note that we can skip the first half of the tutorial since the data is "imported" as soon as you make your copy). For example, when I did this with the table of animal outcomes I copied from Louisville Animal Metro Services, I had the following display on my map:
Of course you can and should go further. For example, I customized my map to mark animals that were euthanized with a different marker than those that were not, using the method described in this tutorial. After following the tutorial, my map looked like this (you can find lots of interesting icon types at this change placemark icon tutorial).
Maps are only the beginning -- you can also explore the data using other types of charts:
The visualizations you create can be embedded in the [yourname-explore] application. You will need to configure the fusion table following the tutorial on embedding to be publicly accessible (mine is accessible to only those with the correct url). Once you have that set up, you can paste the iframe code provided with your new map or chart into your index.html file.
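If you would rather build the embed code in Python, for example to pass it to a template later, the sketch below shows one way to assemble an iframe element for a given URL. The helper name iframe_tag, the default sizes and the example URL are all assumptions made for illustration. Use the URL and dimensions from the embed code that Fusion Tables gives you.

```python
from xml.sax.saxutils import quoteattr


def iframe_tag(src, width=500, height=300):
    """Build an iframe element for src (hypothetical helper).

    quoteattr wraps the URL in quotes and escapes characters such as "&"
    so that the result is a valid HTML attribute value.
    """
    return '<iframe width="%d" height="%d" src=%s></iframe>' % (
        width, height, quoteattr(src))


# Placeholder URL, not a real embed link.
print(iframe_tag('https://example.com/embed?a=1&b=2'))
```

Escaping matters here because embed URLs usually contain "&" between query parameters, and an unescaped "&" inside an attribute can be misread by the browser.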
At this point you should have a working version of code something like the reference application http://jmankoff-fusion.appspot.com/. The deeper thinking in this assignment requires that you select a Fusion table and display its contents in a way that corresponds to a question you have designed and answered. Note that the example code does not demonstrate this. This is the first step (and of course will eventually be much more iterative) in any data pipeline: figuring out what data you need to answer your question. You should:
What question does your application help to answer, and how does it let the end user answer that question?
What is the URL for the working version of your assignment?