Docker Setup

Over the course of the semester, many labs will require the use of external software libraries. Rather than figuring out how to install all of the libraries on your own computers, there is a Docker container that you can download. A Docker container is a virtual environment that you can enable and treat like a computer within your computer: it will contain all of the code and libraries that you need for the semester. The container image will be hosted on Docker Hub, so if I update the container (for instance, to update a package version to work with a lab), you will be able to easily pull the newest version from the cloud.

To get started, you will need to visit the Get Started with Docker page to download Docker Community Edition. You will need to log in to or create a free Docker account before you can download the software. After you’ve downloaded the software, follow the instructions to install it.
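Once the installer finishes, a quick check from a terminal can confirm that the docker command is available. The snippet below is a minimal sketch rather than part of the official install instructions; DOCKER_STATUS is a hypothetical variable name used only for illustration.

```shell
# Check whether the docker client is on your PATH.
# DOCKER_STATUS is a hypothetical helper variable, not something Docker defines.
if command -v docker >/dev/null 2>&1; then
    DOCKER_STATUS="found"
    docker --version    # prints the installed client version
else
    DOCKER_STATUS="missing"
    echo "docker not found: finish the installation, then open a new terminal"
fi
echo "docker client: $DOCKER_STATUS"
```

If the client is found, running docker run hello-world (Docker's small official test image) is a common way to confirm that the Docker engine itself is running.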

When your container is running, you will need to mount one of your local folders so that you can read/write files. Since you’ll be pulling git repositories ("repos") to your computer in order to complete the labs, it is a good idea to make a folder on your computer as the parent directory to hold each of the lab repos.
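As a sketch of that setup on Mac OS, Linux, or WSL, the commands below create a parent folder and move into it. The LAB_DIR variable and the Documents/nlp location are assumptions for illustration; any folder you like will work.

```shell
# LAB_DIR is a hypothetical variable; change the default path to wherever
# you want to keep your lab repos.
LAB_DIR="${LAB_DIR:-$HOME/Documents/nlp}"

# -p creates missing parent folders and is harmless if the folder exists.
mkdir -p "$LAB_DIR"
cd "$LAB_DIR"

# Each lab repo you clone from here becomes a subfolder of LAB_DIR.
pwd
```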

There are two different images that you can use: one that just contains the installation for the coding parts of the assignment (harveymudd/cs159-student:spring2021) and one that also contains pandoc (harveymudd/cs159-student-pandoc:spring2021). The instructions below assume you'll want pandoc as well, but you can swap in whichever you prefer.
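Downloading an image, or refreshing it after an update, is done with docker pull. The following is a minimal sketch: IMAGE is a hypothetical variable name, and the echo makes this a dry run that only prints the command.

```shell
# Choose one of the two class images listed above.
IMAGE="harveymudd/cs159-student-pandoc:spring2021"
# IMAGE="harveymudd/cs159-student:spring2021"    # the image without pandoc

# Dry run: remove "echo" to download the image or update your local copy.
echo docker pull "$IMAGE"
```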

On Unix (Mac OS/Linux)

For the example below, my repos are at /home/xanda/Documents/nlp/.

To run Docker, use the following command (as a single line, and replacing /home/xanda/Documents/nlp with your lab directory path):

docker run -it -h docker -v /home/xanda/Documents/nlp:/home/student/nlp harveymudd/cs159-student-pandoc:spring2021 bash
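To make the pieces of this command easier to see, here is the same command built from variables. This is only a sketch: LAB_DIR, IMAGE, and MOUNT are hypothetical names, and the echo makes it a dry run that prints the command instead of starting the container.

```shell
# Hypothetical variable names; only the values come from the command above.
LAB_DIR="${LAB_DIR:-/home/xanda/Documents/nlp}"   # your local lab folder
IMAGE="harveymudd/cs159-student-pandoc:spring2021"
MOUNT="$LAB_DIR:/home/student/nlp"                # local path : path in container

# -it         interactive session with a terminal attached
# -h docker   set the container's hostname to "docker"
# -v ...      mount the local folder inside the container
# bash        the program to start in the container
# Dry run: remove "echo" to actually start the container.
echo docker run -it -h docker -v "$MOUNT" "$IMAGE" bash
```

Files saved under /home/student/nlp inside the container then show up in your local lab folder, and the reverse.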


On Windows

Docker for Windows can be tricky to set up: until recently, it did not support Windows 10 Home, and it still seems to hit some bumps. You can try out the regular Windows instructions here, or try setting it up with the Windows Subsystem for Linux 2 (WSL 2). (Note that updating from WSL 1 to WSL 2 may delete all your Linux-bound files, so back things up!) I currently have this working with Ubuntu for Windows on Windows Enterprise and can try to help debug. If you'd like to avoid the headache, just use your CS login to the knuth server and work on things there; all the same support files and libraries should be present.

For the example below, my repos are at C:\Users\Xanda\My Documents\nlp.

In order to actually run this command, you'll first want to make sure Docker is running (that is, that the Docker Desktop icon appears in your taskbar). Then either open PowerShell as an administrator (you can do this by right-clicking on it in your Start menu and selecting "Run as administrator") or use your WSL distribution, e.g. Ubuntu, to run Docker commands. The regular built-in command line in your editor, e.g. VSCode, may not work because it lacks administrative privileges. My usual approach is to open Ubuntu for Windows, check out code there using git, and then run code . there to open Visual Studio Code on the current Ubuntu directory for editing. I'll then keep using the command-line window to run Docker commands as needed.

Notice that in the command below, even though the directory is called “My Documents”, Windows allows you to just call the folder “Documents” at the command line to avoid problems with spaces. Also notice that Windows allows you to use ‘normal’ Linux forward-slashes to specify the path (as a single line):

docker run -it -h docker -v C:/Users/Xanda/Documents/nlp:/home/student/nlp harveymudd/cs159-student-pandoc:spring2021 bash

You’re welcome to change the location of your top-level folder to anywhere you want on your computer, but leave the part after the colon (/home/student/nlp) the same.

Although the container has both emacs and vi installed, you will probably find a more modern editor installed locally on your machine easier to use. Since you’re editing files that are local to your computer, a local editor (e.g. Atom or Visual Studio Code) is likely a better choice. Both of these also have extensions to support remote pair programming (like Teletype for Atom or Visual Studio Live Share for VS Code). However, you're also welcome to just screen share for pair programming in group work, as long as you're good about regularly switching who is typing.

If your installation does not go smoothly or you have additional questions, please let Prof. Xanda or a grutor know on Discord. If it makes sense to remotely access the knuth server instead, here are some (slightly old) docs to get you started.

Pandoc Troubleshooting

(updated 2/3/2021) For most labs, we encourage you to use a Markdown file to make your writeup, compiled using the pandoc tool to a nice, LaTeX-styled PDF without the pain of writing the LaTeX code:

pandoc analysis.md -o analysis.pdf
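For a sense of what such a file might contain, the sketch below writes a tiny example writeup into a scratch folder and compiles it if pandoc is available. The file contents, the title, and the WORK_DIR variable are all hypothetical; your labs will specify what the real writeup should include.

```shell
# Work in a scratch folder so no real writeup is overwritten.
WORK_DIR="$(mktemp -d)"
cd "$WORK_DIR"

# A hypothetical analysis.md: a metadata block, a heading, and inline math.
cat > analysis.md <<'EOF'
---
title: "Lab 1 Analysis"
author: "Your Name"
---

# Question 1

Bigram probabilities are written $P(w_i \mid w_{i-1})$.
EOF

# Compile only if pandoc is available; it also needs LaTeX to produce a PDF.
if command -v pandoc >/dev/null 2>&1; then
    pandoc analysis.md -o analysis.pdf || echo "pandoc failed: is LaTeX installed?"
else
    echo "pandoc not found; wrote $WORK_DIR/analysis.md only"
fi
```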

Wherever you run this, you'll need both pandoc and LaTeX installed. The easiest thing to do is to use the Docker image that combines the NLP class files and pandoc, harveymudd/cs159-student-pandoc:spring2021. However, if you'd prefer to use the smaller student image, you can also just use a separate Docker image to run pandoc as follows:

  1. Outside of the class Docker image, pull down an image containing pandoc and LaTeX: docker pull pandoc/latex:latest

  2. Navigate to the directory where your .md file is (let's say analysis.md)

  3. Use the following Docker command to run the pandoc command on the file and save the output as a PDF:
    docker run --rm -v "`pwd`":/data pandoc/latex:latest analysis.md -o analysis.pdf

(Note 2/1/2021: PowerShell seems to struggle with the command above on Windows. If you have WSL set up with e.g. Ubuntu, that may make this easier. If you need help navigating how to set this up, reach out to Prof. Xanda.)