Lecture 02

For today you should:

1) Read Chapter 2 of Think Stats 2e on NB.

2) Watch Jake Porway at TEDx.

Today:

1) Chapter 1 discussion

2) Project descriptions

3) Chapter 2 discussion

For next time you should:

1) Read the project descriptions and fill in the survey.  The deadline is Monday at 9am.

2) Read Chapter 3 of Think Stats 2e.

Check out this AMA.

Quiz Tuesday on Chapters 1 through 3.

Project descriptions

Factors you might want to consider:

Chapter 2

Distribution: map from possible values to their probabilities.

"Map" here can mean a Python dictionary, another map type, a function, or any other callable.
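
A minimal sketch of two of these forms, a dictionary and a function, representing the same distribution. The fair six-sided die and the names pmf_dict and pmf_func are made up for illustration; they are not part of thinkstats2.

```python
from fractions import Fraction

# A distribution stored as a dictionary: value -> probability.
# The fair six-sided die is an illustrative assumption.
pmf_dict = {face: Fraction(1, 6) for face in range(1, 7)}


def pmf_func(face):
    """The same distribution expressed as a function (a callable)."""
    return Fraction(1, 6) if face in range(1, 7) else Fraction(0)


# Both forms answer the same question: how probable is this value?
print(pmf_dict[3], pmf_func(3))

# The probabilities over all possible values sum to 1.
print(sum(pmf_dict.values()))
```

The dictionary form can only list a finite set of values explicitly, while the function form can also return 0 for anything outside that set.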

Histogram is a wrapper around a Python dictionary that maps from "values" to frequencies (that is, counts).
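
To make the idea of a map from values to counts concrete, here is a small sketch using collections.Counter from the standard library. It is a stand-in for illustration, not the thinkstats2 class itself, and the sample is made up.

```python
from collections import Counter

# Made-up sample, for illustration only.
sample = [1, 2, 2, 3, 5]

# Counter is a dict subclass that maps each value to its count.
hist = Counter(sample)

# Print the values in sorted order with their counts.
for value in sorted(hist):
    print(value, hist[value])

# Looking up a value that never appears gives a count of 0.
print(hist[4])
```

Dividing each count by the total number of observations would turn this map of counts into a map of probabilities.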

thinkplot is a wrapper around matplotlib that knows about the classes in thinkstats2.

How would you describe these distributions?

Some of the characteristics we might want to report are:

Exercise: What's up with the bumps in the distribution of prglngth? 

1) Print the value_counts for this variable and see if you can identify a pattern.

2) Find the documentation of this variable, and generate possible explanations. 

Summary statistics

Obvious examples: mean and variance.

What's the difference between variance and standard deviation, and why would we prefer one or the other?
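
One way to explore this question is to compute both on a small sample and look at the units. The sketch below uses the standard library's statistics module and made-up values (think of them as weights in pounds); it uses the population forms pvariance and pstdev, which is an assumption about which definition is wanted.

```python
import statistics

# Made-up measurements in pounds, for illustration only.
weights_lb = [2, 4, 4, 4, 5, 5, 7, 9]

mean_lb = statistics.mean(weights_lb)

# Variance: mean squared deviation from the mean, in pounds squared.
variance_lb2 = statistics.pvariance(weights_lb)

# Standard deviation: square root of the variance, back in pounds.
std_lb = statistics.pstdev(weights_lb)

print("mean:", mean_lb, "lb")
print("variance:", variance_lb2, "lb^2")
print("standard deviation:", std_lb, "lb")
```

Because the standard deviation is in the same units as the data, it is usually easier to report; the variance tends to be more convenient in algebra.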

Effect size: a summary statistic intended to communicate the size of an effect: difference in mean, risk ratio, odds ratio.
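
The three effect sizes named above can be sketched in a few lines. All numbers and names below are hypothetical, chosen only so the arithmetic is easy to follow; none of them come from the course data.

```python
import statistics

# Difference in means between two made-up groups.
group_a = [10, 12, 14]
group_b = [9, 11, 13]
mean_diff = statistics.mean(group_a) - statistics.mean(group_b)

# Hypothetical counts of an outcome in two groups of 100 people each.
events_a, total_a = 20, 100
events_b, total_b = 10, 100

# Risk: the fraction of each group that had the outcome.
risk_a = events_a / total_a
risk_b = events_b / total_b

# Risk ratio: ratio of the two risks.
risk_ratio = risk_a / risk_b

# Odds: outcomes divided by non-outcomes; odds ratio: ratio of the two odds.
odds_a = events_a / (total_a - events_a)
odds_b = events_b / (total_b - events_b)
odds_ratio = odds_a / odds_b

print("difference in means:", mean_diff)
print("risk ratio:", risk_ratio)
print("odds ratio:", odds_ratio)
```

With these counts the odds ratio is larger than the risk ratio, which shows that the same data can sound more or less dramatic depending on the statistic chosen.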

Exercises:

1) Based on the results in this chapter, suppose you were asked to summarize what you learned about whether first babies arrive late.

Which summary statistics would you use if you wanted to get a story on the evening news? Which ones would you use if you wanted to reassure an anxious patient?

Finally, imagine that you are Cecil Adams, author of The Straight Dope, and your job is to answer the question, “Do first babies arrive late?” Write a paragraph that uses the results in this chapter to answer the question clearly, precisely, and honestly.

2) In the repository you downloaded, you should find a file named chap02ex.ipynb.  Make a copy called chap02mine.ipynb, and open it.

Some cells are already filled in, and you should execute them. Other cells give you instructions for exercises. Follow the instructions and fill in the answers.

3) Using the variable totalwgt_lb, investigate whether first babies are lighter or heavier than others. Compute Cohen’s d to quantify the difference between the groups. How does it compare to the difference in pregnancy length?
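
As a starting point for exercise 3, here is a minimal sketch of Cohen's d using only the standard library. It uses one common definition of the pooled standard deviation (sample variances weighted by degrees of freedom); the book's own function may weight the variances slightly differently. The function name cohen_d and the two small groups are made up for illustration.

```python
import math
import statistics


def cohen_d(group1, group2):
    """Difference in means divided by a pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance
    var2 = statistics.variance(group2)

    # Pool the variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

    diff = statistics.mean(group1) - statistics.mean(group2)
    return diff / math.sqrt(pooled_var)


# Made-up groups, for illustration only.
group1 = [2, 4, 6]
group2 = [1, 3, 5]
d = cohen_d(group1, group2)
print(d)
```

Because d is expressed in units of standard deviations, values computed for different variables, such as birth weight and pregnancy length, can be compared directly.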