Data Science for Decision Makers

A man's accomplishments in life are the cumulative effect of his attention to detail.

John Foster Dulles

This is a Python-based mathematics course taught at the United States Naval Academy, Annapolis, MD.

Spring 2023 Course Policy Statement

Section 3401 TR9 (0955 - 1110),  CH190

Section 5601 TR10 (1330 - 1445),  CH190

Since we meet on Tuesdays and Thursdays in CH190, we will cover 1.5 of MWF's lessons each day so that we are on par with the other SM208 sections. 

Online Textbook and Assignment Link.  

Online Textbook:  Computational and Inferential Thinking: The Foundations of Data Science

Course Website/Assignment Link:  SM208 Spring 2023

Installing Python:  Python Installation Instructions

Reference Sheet:  Basic Lines of Code

Random Name Generator:  SM208 3401

Random Name Generator:  SM208 5601

Course Content.

We'll be learning how statistics can be used to support decision making. This is in line with recommendations made by the 2016 GAISE College Report, endorsed by the American Statistical Association. It is also a direct response to remarks by our Superintendent, Vice Admiral Sean Buck, who encouraged us to develop a data science curriculum to make midshipmen more effective officers. The material in this course is based on pioneering work in the Data8 program at U.C. Berkeley, and our course number pays homage to U.C. Berkeley's program. We've modified their curriculum to adapt it to the needs of the Naval Academy, the Navy, and the Marine Corps. Our curriculum development efforts have been supported by two generous grants from the Office of Naval Research.

Data science is a modern approach to statistics that blends computation with statistical theory. We'll use Python, the industry-leading programming language for data science. Although Python has broad capabilities, our course will focus on using it for data manipulation, visualization, and statistical computation. Students wishing further instruction in computer programming are encouraged to take SI286: Programming for Everyone. Course content includes data organization and manipulation, data visualization with an emphasis on briefing senior leadership, probabilities and Bayes' Rule for updating probabilities in light of new information, hypothesis testing, confidence intervals via bootstrapping, applications of the Central Limit Theorem and an introduction to distributions, regression and inference for regression, predictive modeling and an introduction to machine learning, an overview of ethics in machine learning, and classes devoted to critical thinking in the context of decision making with data science. This course is mainly taken by midshipmen with majors in the School of Humanities and Social Science, and many examples have been chosen from these areas, as well as from applications of interest to the Navy and the Marine Corps.

Online Textbook.

The main text is Computational and Inferential Thinking: The Foundations of Data Science by Ani Adhikari and John DeNero, with contributions by David Wagner and Henry Milner. It is available for free in an interactive online version at: https://www.inferentialthinking.com/chapters/intro.html. We have supplemented the text with readings created specifically for midshipmen. Also, bring your computer to class each day.

Section Leaders and Assistant Section Leaders.

Section 3401:  TR9  (0955-1110),  CH190

Section 5601:  TR10  (1330-1445),  CH190

Section Leaders and Assistant Section Leaders:  please take daily attendance (the link has been emailed to you). Type:

Midshipmen:  if you are going to be absent, you must email your Section Leader, the two Assistant Section Leaders, and me.  The Section Leader and the two Assistant Section Leaders will keep track of this via our attendance sheet.

EI, MGSP, Math Lab, Academic Center, and Midshipmen Study Groups.

Extra Instruction (Office Hours) and MGSP.

Here is the Math Lab Schedule for Spring 2023.  Boxed are faculty members who can definitely help with python troubleshooting.  I am not certain about the other faculty members but it doesn't hurt to go and ask. 

Course Goals.

There are four goals for this course, in brief: 

These goals address the technical preparation of our midshipmen. As well, we’ll get considerable experience assessing claims using data, which contributes to our ability to think critically. These are both important attributes and capabilities of our Naval Academy graduates.

Reading Quizzes.

Make sure to log in with your usna.edu email address; otherwise, the quiz will record some other email address whose owner we may not be able to identify, and you will not receive credit. You have unlimited attempts per problem on each reading quiz, so if you are not happy with your first attempt, try again; each subsequent attempt overwrites your previous grade. Submit your reading quiz once you are satisfied with your grade.

All resources are allowed on Quizzes and Exams, except help from other midshipmen.

Resources Allowed on Quizzes and Exams.  All resources (notes, online book, all Python files, internet) are allowed on quizzes and exams, except help from other midshipmen.

Labs (Python files).  You may work with one other person, but each person must type their own alpha and submit their own lab to Blackboard. If you worked with another midshipman on a lab, I will pass around a paper and you must write your name and your partner's name on it. You must have a different partner for each lab assignment; otherwise, points will be docked from the assignment.

Lab.  Working with a Partner Spreadsheet

Homework.  You may get assistance from other midshipmen but I will not be tracking their names (so there is no need to type their names on your python file or give their names to me). 

Uploading Homework, Labs, Quizzes, and Exams.  You must upload them to Blackboard and then hit submit before their respective due dates.

Assignment Due Dates.

Reading Quiz Due Dates.  They are due before midnight (by 2359) of the due date. See below for the due dates. 

Quiz Due Dates.  All in-class quizzes must be submitted within the allotted timeframe, before the end of class. 

Lab Due Dates.  You have 7 consecutive days (1 week) to work on the lab. That is, if the lab is assigned on a Thursday, then it is due the following Thursday before midnight (by 2359).

Homework Due Dates.  You have 7 consecutive days (1 week) to work on the homework assignment. That is, if homework is assigned on a Thursday, then it is due the following Thursday before midnight (by 2359).
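As a minimal sketch of the one-week rule above, the deadline for a lab or homework can be computed from the day it was assigned. The function name `due_datetime` is hypothetical (it is not part of any course file), and the example date is Lesson 1 from the schedule on this page.

```python
from datetime import date, datetime, time, timedelta


def due_datetime(assigned: date, days_allowed: int = 7) -> datetime:
    """Return the deadline: 2359 on the day `days_allowed` days after assignment."""
    due_day = assigned + timedelta(days=days_allowed)
    return datetime.combine(due_day, time(23, 59))


# Example: work assigned on Thursday 12 Jan 2023 is due the following Thursday.
assigned = date(2023, 1, 12)
deadline = due_datetime(assigned)
print("Assigned:", assigned.isoformat())
print("Due by:  ", deadline.isoformat(sep=" ", timespec="minutes"))
```

An agreed-upon extension could be modeled by passing a larger `days_allowed`.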

Group Work and Plagiarism.

The military ethic focuses on both individual responsibility and teamwork. In general, you are encouraged to work together outside of class. Looking at someone else’s code is fine, but then you should KEY THE COMMANDS IN ON YOUR OWN and not copy (and paste) directly. 

Typing in your ALPHA on ALL submitted files is equivalent to writing your name on all of your submitted assignments. 

If your own alpha is NOT on an assignment that you submitted, you will automatically receive 0 on that assignment. There is no excuse for failing to type in your own alpha.

Secondly, make sure that you typed your alpha correctly. If your alpha has 5 digits or 7 digits, then it is incorrect. If you "accidentally" type in a friend's alpha, then it is still incorrect. So please double-check your alpha before you submit an assignment to me, because it takes a very long time to track down the person with the wrong alpha.
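A quick self-check before submitting might look like the sketch below. It assumes an alpha is exactly six digits, which is implied by the rule above (5 or 7 digits is incorrect) but is not stated outright; the name `is_valid_alpha` is hypothetical and is not how assignments are actually graded.

```python
import re

# Assumption: a valid alpha is exactly six digits (0-9), nothing else.
ALPHA_PATTERN = re.compile(r"[0-9]{6}")


def is_valid_alpha(alpha: str) -> bool:
    """Return True if `alpha` is exactly six digits, ignoring surrounding spaces."""
    return ALPHA_PATTERN.fullmatch(alpha.strip()) is not None


for candidate in ["123456", "12345", "1234567", "12345a"]:
    print(candidate, "->", "ok" if is_valid_alpha(candidate) else "check your alpha")
```

A format check like this cannot tell whether the six digits are actually yours, so it does not replace double-checking by eye.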

On every Python file, type in ONLY your alpha. When assigned work is not solely your own, or if you received help on a lab assignment, you must indicate this in a comment in your file (insert a new line in Python and add your friend's name); there is no penalty for doing so. For labs, I will also pass around a paper on which you must write your name and your partner's name. You must have a different partner for each lab assignment; otherwise, points will be docked from the assignment. As for homework, you may get assistance from other midshipmen, but I will not be tracking their names.

In addition to recourse through the honor system, I reserve the option of assigning zero credit for any assignment or evaluation in which I suspect plagiarism. Similarly, cheating on a test or the final exam will result in a failing grade for the course.

Late Work.

Though most class information is located on our course website, grades will be published at the instructor's discretion. 

HWs and Labs.  Late work without a previously agreed-upon extension will be docked 25% credit and will not be graded if turned in more than 7 days past the original due date (this includes weekends and Federal holidays).

Reading Quizzes.  Any late reading quizzes will be docked 25% credit.
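The late policy for homework and labs can be sketched as follows. This is only an illustration under two assumptions that the policy does not spell out: "docked 25% credit" is read as losing 25% of the earned score, and no extension has been agreed upon. The function name `late_adjusted_score` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

LATE_PENALTY = 0.25              # fraction of the earned score lost (assumed reading)
MAX_LATE = timedelta(days=7)     # later than this, the work is not graded


def late_adjusted_score(raw_score: float, due: datetime,
                        submitted: datetime) -> Optional[float]:
    """Return the adjusted score, or None if the work is too late to be graded."""
    if submitted <= due:
        return raw_score                      # on time: full credit
    if submitted - due > MAX_LATE:
        return None                           # more than 7 days late: not graded
    return raw_score * (1 - LATE_PENALTY)     # late, but within 7 days


due = datetime(2023, 1, 19, 23, 59)
print(late_adjusted_score(80.0, due, datetime(2023, 1, 19, 21, 0)))   # on time
print(late_adjusted_score(80.0, due, datetime(2023, 1, 20, 21, 0)))   # one day late
print(late_adjusted_score(80.0, due, datetime(2023, 1, 28, 21, 0)))   # too late
```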

Quizzes and Exams.  If you fail to submit in-class quizzes and exams by the end of class (before the time that they are due), you will receive 0 on these assignments. 


Being absent on a quiz day.  If you're absent on a quiz day, you will be excused from that quiz.  The quiz will still be emailed to you, so work through it on your own time when you can.  You can still drop the lowest quiz grade at the end of this semester.


We will cover 3 lessons every 2 days. The first two reading quizzes are due on the first day of the first lecture, and the third reading quiz is due on the second day of the second lecture. 

Six- and twelve-week grades will be computed using the proportions in the grade breakdown, adjusted because of the absence of some items. I understand that you sometimes need to miss class for movement orders. You will need to make up any graded work that you missed, but note that we drop some grades at the end of the course.

I reserve the right to fail anyone with F level performance in the last weeks of the semester, and to assign an A to whoever shows significant improvement as the semester concludes (and aces the final).

Striving for Excellence.

I require everyone (including myself) to strive for excellence in our class. At a minimum this includes: coming to class prepared (having done the reading for the day’s class, the homework from the last class and coming to class with your computer), asking and answering questions in class and taking part in group work. Moreover, on quizzes, tests, and (especially) on the final exam, I need to see both comprehensive preparation and full and sustained effort.

Breakdown of Grades and Extra Credit.

20%   Homework (11, dropping the lowest homework at the end of the course)

10%   Labs (11, dropping the lowest lab at the end of the course)

05%   Reading Quizzes (31, dropping the lowest reading quiz at the end of the course)

10%   Quizzes (5, dropping the lowest quiz at the end of the course)

30%   Tests (2)

25%   Final Exam (1)

BONUS:  As long as you upload your python files to Blackboard before the due date, you will receive several bonus points (see the top of your python file to see how many bonus points you will receive).

How to Check Your Grades.

While you will upload all (python) files to Blackboard, you can view your grades through your individualized Google Spreadsheet (the link to this spreadsheet has been emailed to you). After I grade each assignment, your grades will automatically populate, but I will aim to let you know whenever an assignment has been graded. 

Grading Scale. 

A  (90.0 ≤ x)

A- (87.0 ≤ x < 90.0)

B+ (83.0 ≤ x < 87.0)

B  (80.0 ≤ x < 83.0)

B- (77.0 ≤ x < 80.0)

C+ (73.0 ≤ x < 77.0)

C  (70.0 ≤ x < 73.0)

C- (67.0 ≤ x < 70.0)

D+ (63.0 ≤ x < 67.0)

D  (60.0 ≤ x < 63.0)

F  (x < 60.0)
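The scale can be read as a lookup from highest cutoff to lowest: the first cutoff that your percentage x meets determines the letter. The sketch below is only an illustration; `GRADE_FLOORS` and `letter_grade` are hypothetical names, and rounding of x is not addressed by the scale, so none is applied.

```python
# Lower cutoff for each letter, listed from highest to lowest.
GRADE_FLOORS = [
    (90.0, "A"), (87.0, "A-"),
    (83.0, "B+"), (80.0, "B"), (77.0, "B-"),
    (73.0, "C+"), (70.0, "C"), (67.0, "C-"),
    (63.0, "D+"), (60.0, "D"),
]


def letter_grade(x: float) -> str:
    """Return the letter for course percentage x using the first cutoff it meets."""
    for floor, letter in GRADE_FLOORS:
        if x >= floor:
            return letter
    return "F"  # anything below 60.0


for score in [95.0, 89.9, 83.0, 66.9, 59.9]:
    print(score, "->", letter_grade(score))
```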

Student Responsibilities. 

Professionalism. 

Flexibility.

We all need to be a little bit flexible. Indeed, if necessary, this course policy document may change during the course. If you need an extension on an assignment, I’m generally willing to give one if you ask well in advance of the deadline or if you can show me significant partial progress towards completion of the assignment. I’ll also ask you to be flexible: we may need to reschedule topics depending on how quickly (or slowly) we move through the syllabus. I’ll give you advance notice if due dates need to be adjusted.

Disclaimer.  Instructors reserve the right to modify this course policy. You will be advised in advance of any policy change.

Remark.  I look forward to having you in my class!

Past Lessons.

Lesson 1 (Thurs 12 Jan): 

Lesson 2 (Tues 17 Jan):

Lesson 3 (Thurs 19 Jan):

Lesson 4 (Tues 24 Jan):

Grading (Wed 25 Jan):

Lesson 5 (Thurs 26 Jan):

Grading (Sat 28 Jan):

Lesson 6 (Tues 31 Jan):

Grading (Wed 1 Feb):

Lesson 7 (Thurs 2 Feb):

Grading (Sun 5 Feb):

Lesson 8 (Tues 7 Feb):

Grading (Wed 8 Feb):

Announcement (Wed 8 Feb):

Lesson 9 (Thurs 9 Feb):

Grading (Thurs 9 Feb): 

Grading (Sun 12 Feb):

Grading (Mon 13 Feb): 

Lesson 10 (Tues 14 Feb):    

Grading (Tues 14 Feb): 

Grading (Wed 15 Feb): 

Lesson 11 (Thurs 16 Feb):

Grading (Fri 17 Feb): 

Lesson 12 (Tues 21 Feb):

Grading (Wed 22 Feb):

Lesson 13 (Thurs 23 Feb):

Lesson 14 (Tues 28 Feb):

Grading (Wed 1 Mar):

Lesson 15 (Thurs 2 Mar):

Grading (Thurs 2 Mar):

Lesson 16 (Tues 7 Mar):

Grading (Tues 7 Mar):

Lesson 17 (Thurs 9 Mar):

Grading (Thurs 9 Mar):

Grading (Fri 10 Mar):

Announcement (13 Mar - 17 Mar):

Lesson 18 (Tues 21 Mar):

Lesson 19 (Thurs 23 Mar):

Grading (Mon 27 Mar):

Lesson 20 (Tues 28 Mar):

Grading (Tues 28 Mar):

Lesson 21 (Thurs 30 Mar): 

Grading (Sat 1 Apr):

Lesson 22 (Tues 4 Apr):

Lesson 23 (Thurs 6 Apr):

Grading (Fri 7 Apr):

Grading (Sat 8 Apr):

Grading (Sun 9 Apr):

Lesson 24 (Tues 11 Apr):

Lesson 25 (Thurs 13 Apr):

Grading (Fri 14 Apr):

Lesson 26 (Tues 18 Apr):

Lesson 27 (Thurs 20 Apr):

Grading (Sat 22 Apr):

Grading (Mon 24 Apr):

Lesson 28 (Tues 25 Apr):

Grading (Tues 25 Apr):

Grading (Wed 26 Apr):

Lesson 29 (Thurs 27 Apr):

Grading (Fri 28 Apr):

Announcement (Fri 28 Apr):

Lesson 30 (Tues 2 May):

Grading (Wed 3 May):

Announcement (Wed 3 May):

Final Exam Review Session  (Thurs 4 May, 1000 - 1200):

Announcement (Sat 6 May, 1000 - 1900):

Final Exam  (Tues 9 May, 0755 - 1055):

Announcement  (Tues 9 May):

Alternate Final Exam  (Wed 10 May, 1300 - 1600):

Announcement  (Wed 10 May):

Current Lessons.

Announcement  (Wed 10 May):

On my midshipmen's presentation, on the last slide. 

Flowers are blooming nicely this year. 

Spring 2023 Schedule on Google Spreadsheet.

The spreadsheet below will take you to the files you need for each lesson. 

SM208 Spring AY 2023 Schedule

Announcements and Emails.

Announcement  Wed 3 May 2023 at 1730:

Enjoy the concert by OneRepublic!

Announcement  Wed 3 May 2023 at 1624:

In preparation for your Final Exam, this is a friendly reminder to organize all the files in your Spring 2023-SM208 folder according to the day we covered the material by creating a folder for each Day, and when naming these folders, you should include the main topics covered on that day.  For example, here is how I organized and named my folders.

"DONE" here means we either covered the topic in class or I have graded all the assignments in that folder  (so "DONE" is really a note to myself).

Announcement  Sun 23 Apr 2023 at 1846:

Practice Final Exams  will be posted after Friday 28 April!  Feel free to download the exams and begin working through them. 

Announcement  Mon 13 Feb 2023 at 1030: 

Please make sure to check your grades via Google Sheets periodically.  If there is an error/typo, please let me know during class or via an email.  Also, a midshipman pointed out to me that there is an error in your grades for RQ09-- I will troubleshoot it and get it fixed before submitting your 6-week grades. 

Announcement  Fri 10 Feb 2023 at 1030: 

I strongly suggest that if you lost any (partial) points on any of your assignments or quizzes, go back and correct them  (see the feedback that I have been emailing you for a correct way to code in python). 

Announcement  Thurs 9 Feb 2023 at 0930

From Wednesday 15 Feb to Friday 17 Feb, do NOT email or receive any ipynb, jpg, etc. attachments from others, as it will trigger a notification to ITSD and us.

Announcement  Wed 8 Feb 2023 at 1325

If you are going to be absent on Test 00 day (Thursday 16 Feb), please let me know (via an email) at your earliest convenience. 

Announcement  Tues 7 Feb 2023 at 1830: 

Your Test 00 will not occur during an X period (0655 - 0745) since we need to accommodate everyone in our classes.  Since we have Plebes all the way up to Firsties taking SM208, no X period will work for everyone in our course.  This means you will take Test 00 in your regularly scheduled period.  For our sections, your Test 00 will be on Thursday 16 Feb.

Just for Fun  Sat 21 Jan 2023 at 1900

It's not always all about work: the US Army Corps of Engineers created a 2023 calendar with giant cats on their infrastructure.

Download the calendar from the USACE Digital Library, or download their images from this link.

Announcement  Fri 20 Jan 2023 at 2000

The AI community is moving at a rapid pace but we still have a long way to go. 

"US Marines Use Cardboard Box to Defeat DARPA Robot Trained to Detect Humans"

"DARPA spent a week with a group of Marines at a test site in order to help train an AI robot by attempting to defeat it. The robot was parked 'in the middle of a traffic circle' and the Marines had to approach it undetected. Scharre explains, 'If any Marines could get all the way in and touch this robot without being detected, they would win.'

In a great example of how terrible AI still is, all [eight] of the Marines managed to remain undetected. DARPA trained the AI to detect humans walking, but not much else. So Scharre explains how two of the Marines 'somersaulted for 300 meters,' another two hid under a cardboard box, and another 'stripped a fir tree and walked like a fir tree.' Apparently a lot of giggling was involved, which is another feature of humans the robot hadn't been trained to detect.

Those 'simple tricks,' Scharre said, were 'sufficient to break the algorithm.'"

Announcement  Thurs 19 Jan 2023 at 2319

An application of Artificial Intelligence:  stable diffusion;  it is a text-to-image diffusion model capable of generating realistic images given any text input. 

I typed in "rabbit and drone and puppy playing tag with Pokemon". 


A sample, freely available stable diffusion python code

Announcement  Thurs 19 Jan 2023 at 2231: 

ChatGPT is an Artificial Intelligence bot that engages in almost human-like dialogue based on a prompt. 

I told chatGPT to "write a song about programming in Python with accompanying chords". 

Announcement  Thurs 19 Jan 2023 at 1205

Since you are getting used to opening and executing python files, do the reading (or go through the python file) in the green box on your own before doing the Reading Quiz to the left of the link. 

Announcement  Wed 18 Jan 2023 at 1733

The Day 8 and Day 9 course materials have been swapped (so Day 9 should follow nicely from Day 8), and the website, all folders, and all documents have been updated to reflect this.

Announcement  Tues 17 Jan 2023 at 1625

Yes, you will get  bonus points  for turning in (uploading) your assignments early and on time (such bonus point opportunities are indicated on the top of the python file). 

Announcement  Wed 11 Jan 2023 at 2021

If you are off the yard, you need to VPN in order to link Jupyter and Google Drive. You can VPN using Cisco AnyConnect Secure Mobility Client or https://sslvpn.usna.edu/ . Then use a Command Prompt to link them:  mklink /J "C:\Users\m123456\DFS_Link_Jupyter" "G:\My Drive"
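Because the spacing and quotation marks in that command are easy to get wrong, it can help to build the line programmatically and then paste it into the Command Prompt. The sketch below only assembles the text of the command shown above; it does not run it. The helper name `build_mklink_command` is hypothetical, and `m123456` is the placeholder username from the example, so substitute your own.

```python
from pathlib import PureWindowsPath


def build_mklink_command(username: str, drive_path: str = "G:/My Drive") -> str:
    """Build the mklink /J command that links a Jupyter folder to Google Drive."""
    # PureWindowsPath renders paths with backslashes on any operating system.
    link_path = PureWindowsPath("C:/Users") / username / "DFS_Link_Jupyter"
    target_path = PureWindowsPath(drive_path)
    # Both paths are quoted; the quotes matter because "My Drive" contains a space.
    return f'mklink /J "{link_path}" "{target_path}"'


print(build_mklink_command("m123456"))
```

The printed line should match the command above character for character, including the space after `/J` and the space between the two quoted paths.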

We will use the Python template from each day for interactive lectures and discussions. However, do NOT delete cells, as you may run into errors or introduce bugs when you run your Python template. Also, when I grade your homework, labs, quizzes, exams, etc., my grading code may fail to read, or may misread, your solutions. So if you accidentally delete some cells, the best course of action is to download the (original) Python template again from the Lessons Link.

This is NOT a programming course; this is a data manipulation course.

Create folders for each lesson/day, i.e., Day 01, Day 02, Day 03, Day 04, etc.
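If you would rather not create the folders by hand, a short script can do it. This is a minimal sketch: the function name `create_day_folders` is hypothetical, and the demonstration writes into a temporary directory so that nothing on your computer is changed; to use it for real, pass the path of your own course folder instead.

```python
import tempfile
from pathlib import Path


def create_day_folders(root, number_of_days):
    """Create 'Day 01', 'Day 02', ... under `root` and return the folder paths."""
    root = Path(root)
    created = []
    for day in range(1, number_of_days + 1):
        folder = root / f"Day {day:02d}"             # zero-padded so folders sort in order
        folder.mkdir(parents=True, exist_ok=True)    # safe to re-run; existing folders are kept
        created.append(folder)
    return created


# Demonstration in a throwaway directory; replace `tmp` with your course folder.
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_day_folders(tmp, 4):
        print(folder.name)
```

Topic names can be appended to each folder afterwards by renaming it.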

Practice exams will be available in order to better prepare you for your exams.

Announcement  Mon 9 Jan 2023 at 1438: 

You can also install Google Drive from this link in case you cannot find Google Drive from the Software Center.

Email sent out on  Sun 8 Jan 2023 at 1622: 

Good afternoon, 


BLUF: Before class on Thursday, you need to take ~10 mins to download software utilized throughout the semester.


Intro

Welcome back to the Yard. My name is Professor Im and I'll be your SM208 Instructor this Spring. We will have plenty of time to introduce ourselves and the course on the first day of class. I am going to keep this short since I'm sure you are receiving multiple reform emails, in addition to emails from the rest of your instructors. 


Course Website

The following link is to the course website. The course website contains the entire semester's syllabus, course material, assignments, etc. I will walk you through this on the first day of class. 

https://sites.google.com/usna.edu/ds4dm-spring-aye23/home


Furthermore, since we meet on Tuesdays and Thursdays, we will follow the schedule here on my SM208 course website: Mee Seong Im, Doctor of Philosophy (Ph.D.) in Mathematics - Spring 2023 SM208 (google.com)


Due-Out: Downloading Python + Google Drive

Throughout the semester, we will be using Python, one of the most popular programming languages. In addition to downloading Python onto your computer, you will complete one additional step that allows you to access the files saved in your personal Google Drive. The following PDF contains step-by-step instructions for accomplishing this task. It shouldn't take more than 10 minutes.


Step 1: Follow the instructions to download the "Anaconda Navigator" onto your computer. This will be the interface we use for Python. 


Step 2: Download "Google Drive" (not Google File Stream; ITSD updated this) directly onto your computer. The instructions in the PDF call this "Google Drive File Stream", which has since been shortened to just "Google Drive." When completing "Step 5" of this process, pay close attention to all the spaces in the single line of code you place into the command prompt. For example, there is a space after the "J" and "C", in addition to a space between "My" and "Drive". You will see what I mean in the PDF under Step 5. You will know you did this correctly if it says something about a junction being created.


We will troubleshoot in class, but we do not have time to wait for everyone to download Anaconda as this takes most of the time.


Failure to have this set up properly means you will quickly fall behind in this course. 


Reach out if you have questions, and I will see you in class on Thursday.


Professor Im