Introduction to Probability and Statistics

Announcements


Meeting Time and Location

Monday, Wednesday, and Friday
10:00 - 10:55 am
Baxter Lecture Hall

Instructor Contact Information and Office Hours

205 Baxter Hall
x4218
Office hours: 1:30–3:00, Fridays

TA Contact Information and Office Hours

TA | Office | Office Hours and Location | Recitation Section Time and Location
Hélène Rochais (Head TA) | 1-K Math Building | Friday 5:30 - 7:30 pm, B111 Downs | Section 3 - 10 am, B127 GCL; Section 9 - 9 am, B127 GCL
Tamir Hemo | 1-D Math Building | Friday 4:30 - 5:30 pm, B111 Downs | Section 4 - 10 am, 269 LAU
Yuhui Jin | 1-A Math Building | Thursday 8 - 9 pm, B111 Downs | Section 8 - 2 pm, 142 KCK
Jane Panagaden | 1-H Math Building | Monday 5 - 6 pm, B111 Downs | Section 1 - 9 am, 102 STL
Sunghyuk Park | 1-C Math Building | Thursday 5 - 6 pm, B111 Downs | Section 5 - 10 am, 102 STL
Nathaniel Sagman | 1-B Math Building | Monday 6 - 7 pm, B111 Downs | Section 7 - 1 pm, B122 GCL
Forte Shinko | 1-E Math Building | Saturday 4 - 5 pm, B111 Downs | Section 2 - 9 am, 269 LAU
Jim Tao | 2-M Math Building | Monday 10 - 11 pm, B111 Downs | Section 6 - 1 pm, 155 ARM

Course Description

Introduction to the fundamental ideas and techniques of probability theory and statistical inference.

Probability will be covered in the first half of the term (using Pitman) and statistics (using Larsen and Marx) in the second half (see below for information regarding textbooks). Main topics covered are:

  • Properties of probability
  • Independence, conditional probability, Bayes' Law
  • Random variables, distributions, densities, and expectation
  • Joint distributions, marginals, covariance, correlation
  • The Law of Large Numbers
  • The Central Limit Theorem
  • Order statistics
  • Important distributions
    • Bernoulli, Binomial
    • Uniform
    • Normal (Gaussian)
    • Exponential, Poisson
    • Gamma, Beta, Chi-square
    • Conjugate prior/posterior pairs
  • Introduction to stochastic processes
    • Random walk
    • Markov chains
    • Martingales
  • Estimation of parameters
    • Consistency, unbiasedness
    • Maximum likelihood estimation
    • Confidence intervals
    • Cramér-Rao lower bound
  • Testing statistical hypotheses
    • Significance tests
      • Likelihood ratio tests
      • Monotone Likelihood Ratio Property and the Neyman-Pearson Lemma
      • Type I and Type II errors
      • Power and assurance
      • Critical values
    • Specification tests
      • Kolmogorov-Smirnov
      • chi-square test, Fisher's exact test
    • Linear regression analysis
      • Gauss-Markov Theorem
      • ANOVA
    • Nonparametric tests
      • Wilcoxon, Mann-Whitney, Kruskal-Wallis
      • Spearman rank correlation

Prerequisites

Ma 1abc. In addition, some familiarity with a scientific computing language or program (e.g., Mathematica, Matlab, NumPy, Octave, R) is assumed.


Policies

Late Work:

As a rule, late work is not accepted. This is to protect the TAs, who are talented, hardworking students, just as you are. At the discretion of the Head TA, late homework turned in on the day it is due but after the 4:00 pm deadline will be accepted with a 25% penalty. If there are extenuating circumstances, you must notify the Head TA by midnight the night before the homework is due, and you must get a note from the Dean supporting the extension. As partial compensation, your lowest homework score will be discarded.

Grading:

Your course grade will be based on the weekly homework (40%), the midterm (25% or 35%), and the final (35% or 25%). Whichever of the two exams you do better on will receive the 35% weight, and the other will receive 25%. In computing the homework average, your lowest homework score will be dropped. (Since homework assignments vary by weight, a modified Kazatkin algorithm will be used to determine which score to drop.)
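
As a rough illustration of the exam weighting described above, here is a minimal Python sketch. The function name course_score and the 0-100 percentage scale are assumptions made for this example. The homework average is taken as given, after the lowest score has been dropped, and the conversion from a numerical score to a letter grade is not shown because it is not specified here.

```python
def course_score(homework: float, midterm: float, final: float) -> float:
    """Weighted course score, assuming all inputs are percentages (0-100).

    Homework counts 40%. The better of the two exams counts 35%
    and the other counts 25%.
    """
    better = max(midterm, final)
    worse = min(midterm, final)
    return 0.40 * homework + 0.35 * better + 0.25 * worse


if __name__ == "__main__":
    # The two exam scores can be swapped without changing the result.
    print(round(course_score(90, 80, 70), 2))
    print(round(course_score(90, 70, 80), 2))
```

Because the larger weight always goes to the better exam, a weak midterm can be partly offset by a strong final, and vice versa.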

This year I am continuing the following practice. Each assignment will contain zero or more optional exercises. They are optional in the following sense: Grades will be calculated without taking the optional exercises into account, but the maximum grade will be an A. If you want an A+, you will have to earn an A and also accumulate sufficiently many optional points. No collaboration is allowed on optional exercises.

As this course is for a letter grade, no one will be excused from the final.

Homework:

Homework will typically be due at 4:00 pm on Mondays in the appropriate homework box outside 253 Sloan. (If Monday is a holiday [which happens twice this term], homework will be due on Tuesday. Assignment 0 is a major exception.) Problems (and later solutions) will be posted on this course webpage. You are encouraged to start the homework well in advance of the due date in order not to risk missing the deadline. Homework is turned in to locked boxes, so it can safely be submitted as soon as it is completed.

Collaboration:

Collaboration is allowed on the homework, but your write-up must be in your own words and may not be copied. The exception is that no collaboration is allowed on optional exercises. Collaboration is not allowed on the exams. Please ask for clarification if anything is unclear.

*Information is subject to change*


Textbooks

The required textbooks for the course are:

  • Jim Pitman. 1993. Probability. Springer, New York, Berlin, and Heidelberg. ISBN: 0-387-97974-8.
  • Richard J. Larsen and Morris L. Marx. 2012. An Introduction to Mathematical Statistics and Its Applications, fifth edition. Prentice Hall. ISBN: 0-321-69394-9.

There will be additional readings from time to time, either as handouts or articles available on line.

There are other books that you may find useful for this course or perhaps later in life. Here are, in no particular order, some of my recommendations.

  • Robert V. Hogg, Elliot A. Tanis, and Dale Zimmerman. 2015. Probability and Statistical Inference. Pearson, Boston. ISBN: 978-0-321-92327-1. This is a nicely written introduction that I am evaluating to see if it can replace the two books above.
  • Alex Reinhart. 2015. Statistics Done Wrong: The Woefully Complete Guide. No Starch Press, San Francisco. ISBN: 978-1-59327-620-1. This short (129 pages) book is written for scientists and covers many common misinterpretations of statistical methods and results in the analysis of scientific data.
  • Calvin Dytham. 2011. Choosing and Using Statistics: A Biologist's Guide. Wiley-Blackwell. ISBN: 978-1-4051-9839-4. This is a cookbook and reference geared toward biologists, but is a useful reference for almost everyone.
  • David E. Matthews and Vernon T. Farewell. 2015. Using and Understanding Medical Statistics. Karger, Basel. ISBN: 978-3-318-05458-3.
  • Robert B. Ash. 2008. Basic Probability Theory. Dover, Mineola, New York. Reprint of the 1970 edition published by John Wiley and Sons. ISBN: 0-486-46628-0. This book, being published by Dover, is very affordable. (I think it's still just under $20.) The first chapter, especially sections 1.4 through 1.7, is very good at explaining how to count for combinatorial problems.
  • John B. Walsh. 2012. Knowing the Odds: An Introduction to Probability. American Mathematical Society, Providence, Rhode Island. ISBN: 978-0-8218-8532-1. I almost used this as the textbook for the course, but decided to stay with the status quo.
  • Kai Lai Chung and Farid Ait-Sahlia. 2003. Elementary Probability Theory with Stochastic Processes and an Introduction to Mathematical Finance. Springer-Verlag, New York, Heidelberg, and Berlin. ISBN: 978-0-387-95578-0. This is a very well-written introduction to probability theory. Chapter 3 on counting is especially good.
  • Richard Isaac. 1995. The Pleasures of Probability. Springer-Verlag, New York, Berlin, and Heidelberg. ISBN: 0-387-94415-X. Another good introduction to probability theory, but a bit too eccentric to use as the main text for this course.

Modern statistical practice is computationally intensive. This course is not especially so, but you will have to use computers to do some of the assignments. Many of the people on campus that I have talked to recommend the statistical programming language R (the open source alternative to AT&T's S). Mathematica 9 and later claims to be highly integrated with R, but I haven't tried it yet. Others I have talked to rave about NumPy, an extension of Python that provides much of the functionality of Matlab. Still others continue to use other packages because they have invested a lot of effort in learning to use them. (I myself use Mathematica because I started using it in 1992, so my recommendation of R falls into the category of "do as I say, not as I do.") My son recommends R and this video as an endorsement.

Here are some highly recommended books on R, most of which I have not read, though I find the first two to be useful.

  • Claus Thorn Ekstrom. 2011. R Primer. Chapman & Hall/CRC Press. Available for on-line reading from the Caltech Library.
  • Paul Teetor. 2011. R Cookbook. O'Reilly Media. ISBN: 978-0-596-80915-7.
  • Joseph Adler. 2012. R in a Nutshell, 2nd edition. O'Reilly Media. ISBN: 978-1449312084.
  • Alain F. Zuur, Elena N. Ieno, and Erik H. W. G. Meesters. 2009. A Beginner's Guide to R. Springer Science+Business Media, New York. ISBN: 978-0-387-93836-3.
  • Peter Dalgaard. 2008. Introductory Statistics with R, second edition. Springer Science+Business Media, New York. ISBN: 978-0-387-79053-4.

Assignments

Date Posted | Assignment | Due Date
Jan 3 | HW 0 | Jan 4, 8:00 pm
Jan 8 | HW 1 | Jan 16, 4:00 pm
Jan 16 | HW 2 | Jan 23, 4:00 pm

Exams


Collaboration Policies

 | Homework | Exams

You may consult:
Course textbook (including answers in the back) | yes | yes
Other books | yes | NO
Solution manuals | NO | NO
Internet | NO | NO
Your notes (taken in class) | yes | yes
Class notes of others | yes | NO
Your hand copies of class notes of others | yes | yes
Photocopies of class notes of others | yes | NO
Electronic copies of class notes of others | yes | NO
Course handouts | yes | yes
Your returned homework / exams | yes | yes
Solutions to homework / exams (posted on webpage) | yes | yes
Homework / exams of previous years | NO | NO
Solutions to homework / exams of previous years | NO | NO
Emails from TAs | yes | NO

You may:
Discuss problems with others | yes | NO
Look at communal materials while writing up solutions | yes | NO
Look at individual written work of others | NO | NO
Post about problems online | NO | NO

For computational aids, you may use:
Calculators | yes | yes
Computers | yes | yes

* You may use a computer or calculator as a tool, but you must justify and explain what you asked the computer to do. Simply attaching your computer code and output is not an acceptable justification or explanation.

Kim Border,
Jan 2, 2018, 9:11 PM