LIN350 Analyzing Linguistic Data Spring 2025

Course: LIN350 Analyzing Linguistic Data, unique number 40145

Semester: Spring 2025

Course Canvas page: https://utexas.instructure.com/courses/1406622

Place and time: Tuesday/Thursday 12:30-2, WCP 5.102.

Instructor: Katrin Erk. Office: RLP 4.734, email: katrin.erk@utexas.edu
Office hours: to be announced. Until office hours are determined, please email me to set up a meeting.

Teaching Assistant: Sooji Lee. Contact information on Canvas.
Office hours: to be announced.

Prerequisites: Upper-division standing.

Textbook and readings

Readings will be made available for download from the course website.

Flags: Quantitative Reasoning, Independent Inquiry 

Course Syllabus

Link

Course overview and objectives


Today, huge amounts of text are available in electronic form. We can poke these electronic text collections to answer questions about language, and questions about the people who use it. For example, we can test whether passive constructions are increasingly falling out of favor in English, and we can trace how words change their meaning over time. We can also study a politician's word choices in political debates to find out more about their personality, or we can see how inaugural addresses have changed over time.

This course provides a hands-on introduction to working with text data. This includes an introduction to programming in Python with a focus on text processing and data exploration, along with a "cookbook" of programming examples that will enable you to analyze texts on your own very quickly. Most of the conclusions that we want to draw from text are "risky conclusions": they are trends rather than yes-or-no answers. The course therefore also includes an introduction to statistical techniques for data exploration and for making and assessing "risky conclusions". Finally, the course includes a course project where you can test your text analysis skills on a question of your own choice.

By the end of this course, you will:


Course project

Course project requirements:

In addition, each team member submits a short (half page) document describing their individual contribution and reflecting on what they learned in the project so far.

You will need to prepare slides for this, which you submit to the instructor ahead of time.

It is okay if you don't have all results in place at this point. This does not lead to points being taken away for the presentation.

If you build on previous work, you need to discuss it, and give references.
Published papers (at conferences, in journals) go into the references list at the end of the paper. Links to blog posts and the like go in a footnote. Also, links to websites containing data go in a footnote, not in the references list.

You need to take into account the feedback that you got on the Initial project description and Intermediate report.

In addition, each team member submits a short (half page) document describing their individual contribution and reflecting on what they learned in the project.

Course project ideas

Ideally, you pick a topic of your own that you are curious about. But to give you an idea of possible topics, here are a few pointers:

Please discuss your topic with the instructor to make sure that it is both substantial and feasible.

For your course project, you will need to apply statistical analyses yourself. Google Books n-gram charts, while pretty, do not count.

Useful links

List of software we will use in the class

Python and Python packages:

To test your Python installation, use this Jupyter notebook.
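If you would like a quick check before opening the notebook, a minimal sketch along the following lines reports which packages Python can find. The helper name check_packages and the package list are assumptions for illustration only: nltk is chosen because the course uses the Natural Language Toolkit, and json is part of Python's standard library. Substitute the packages required for the course.

```python
import importlib.util
import sys


def check_packages(names):
    """Return a dict mapping each package name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Hypothetical package list; replace with the packages required for the course.
    packages = ["json", "nltk"]
    print("Python version:", sys.version.split()[0])
    for name, found in check_packages(packages).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A package reported as MISSING is not installed in the Python you are currently running. This sketch only checks that packages can be found; it does not replace the test notebook.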

Using Jupyter notebooks

We'll be using Jupyter Notebooks in class. To use a "Code for download" file, download it to your computer. Your computer will probably complain that it doesn't know how to open the file. This is not a problem; ignore it. You then have multiple options for opening the file: (1) If you have Anaconda on your computer, you can open the file with Jupyter Notebook. (2) If you have Anaconda, or another Python, on your system, you can open a terminal, go to the directory with the notebook, and type the command jupyter notebook, or the command python -m jupyterlab. (3) If you have a Google Colab account, you can open the file online in Colab by selecting "Upload" from the left-hand side menu.
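A note on why your computer may not know how to open the file: a notebook (.ipynb) file is plain JSON text that only becomes readable in Jupyter or Colab. As a rough sketch, the following uses only Python's standard library to peek inside a notebook and count its cells. The function name summarize_notebook and the file name example.ipynb are made up for illustration, and the tiny notebook written here is a stand-in for a real "Code for download" file.

```python
import json
import os
import tempfile
from collections import Counter


def summarize_notebook(path):
    """Count the cells of each type (e.g. 'code', 'markdown') in a notebook file."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# Stand-in notebook with one text cell and one code cell (illustration only).
demo = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Hello"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "execution_count": None, "outputs": []},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.ipynb")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        json.dump(demo, f)
    counts = summarize_notebook(path)

print(dict(counts))
```

To try this on a downloaded notebook, call summarize_notebook with the path to that file instead of the stand-in.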

For info on how to format text and write code in Jupyter notebooks, see this Jupyter notebook.

Learning Python

Learning Python:

General Python pages:

The Natural Language Toolkit:

Fun with statistics