September 10, 2019
For today
Read Think Python 2e Appendix B, just section B.2 (no reading quiz)
Turn Notebook 02 in (see instructions in Lecture 3)
Prepare for a quiz on the workshop and Chapters 1-2
Watch Ned Batchelder's talk "Loop like a native".
Today
In class exercises
Analysis of algorithms, part 1
Annotated bibliography
For next time
Prepare for a quiz on the workshop and Chapters 1-2
Read Think Complexity Chapter 3 and do the reading quiz
Start your annotated bibliography (see instructions below) and turn in the first installment here.
Project proposals are due October 1.
In class exercises
Download this notebook from GitHub and save it in your ThinkComplexity2 repo, in the examples folder.
Launch Jupyter, open that notebook, and run it.
Work on the exercises (mostly the ones that were on the practice quiz).
Analysis of algorithms, part 1
Whether algorithm A is faster than algorithm B (or takes less space) depends on:
Details of computer architecture
Details of the data
The amount of data
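To see how the details and the amount of data both matter, here is a minimal sketch that counts comparisons in insertion sort. Insertion sort is my choice of example, not one from the lecture, and the function name is hypothetical. Counting comparisons instead of measuring time keeps computer architecture out of the picture.

```python
def insertion_sort_comparisons(items):
    """Sort a copy of items with insertion sort; return the number of comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        # Move a[i] left until it meets an element that is not larger.
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons


for n in [10, 100, 1000]:
    already_sorted = range(n)         # friendly input
    reversed_order = range(n, 0, -1)  # unfriendly input
    print(n,
          insertion_sort_comparisons(already_sorted),
          insertion_sort_comparisons(reversed_order))
```

On sorted input the count grows like n; on reversed input it grows like n^2. Same algorithm, same amount of data, very different cost.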
In order to classify algorithms, computer science has made some disciplinary choices:
1) Von Neumann model with uniform memory access time and uniform operation time.
2) Worst case inputs
3) Asymptotic behavior as problem size grows
The nice thing about these assumptions is that they make classification possible. However:
They limit the usefulness of this analysis for practical purposes
Analysis of algorithms has taken on a disproportionate role in the concept of what computer science is, with bad consequences for technology and society
Some of the problems with classical analysis of algorithms:
Assumptions 1 and 3 clash: as problems get big, memory gets non-uniform
Worst case is often irrelevant to practice. Some of the most important algorithms are exponential in the worst case and nearly linear in practice.
The "crossover points" are often in the domain of interest.
Simple algorithms that are fast enough are better than complicated algorithms with better asymptotic behavior
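As a rough illustration of a crossover point, the sketch below compares two made-up cost models: a simple algorithm that takes n^2 steps and a complicated one that takes 100 n log2(n) steps. The constant 100 and both function names are assumptions chosen for illustration, not measurements of any real algorithm.

```python
import math


def simple_cost(n):
    # Hypothetical cost model: quadratic, with a small constant factor.
    return n * n


def fancy_cost(n):
    # Hypothetical cost model: n log n, with a large constant factor.
    return 100 * n * math.log2(n)


def crossover(cost_a, cost_b, max_n=10**6):
    """Return the smallest n >= 2 where cost_a(n) > cost_b(n), or None."""
    for n in range(2, max_n + 1):
        if cost_a(n) > cost_b(n):
            return n
    return None


n_star = crossover(simple_cost, fancy_cost)
print(n_star)
```

Under these assumptions, the asymptotically worse algorithm is cheaper for every problem size below n_star. If your problems are in that range, the simple algorithm wins.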
When you are making software engineering decisions, analysis of algorithms might provide some guidance, sometimes.
A note on notation: O(g) is a set of functions, so even though we write f = O(g), it would be better to think of that equals sign as "is an element of".
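Spelled out, one standard definition of that set is the following, assuming f and g map problem sizes to non-negative costs; the symbols c and n_0 are the usual constant factor and threshold, and are not named in the notes above.

```latex
\[
O(g) = \{\, f : \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ 
0 \le f(n) \le c \, g(n) \ \text{for all}\ n \ge n_0 \,\}
\]
\[
\text{``} f = O(g) \text{''} \quad \text{is shorthand for} \quad f \in O(g)
\]
```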
Exercises:
What is the order of growth of n^3 + n^2? What about 1000000 n^3 + n^2? What about n^3 + 1000000 n^2?
What is the order of growth of (n^2 + n) · (n + 1)? Before you start multiplying, remember that you only need the leading term.
If f is in O(g), for some unspecified function g, what can we say about af+b, where a and b are constants?
If f1 and f2 are in O(g), what can we say about f1 + f2?
If f1 is in O(g) and f2 is in O(h), what can we say about f1 + f2?
If f1 is in O(g) and f2 is in O(h), what can we say about f1 · f2?
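If you want to check a guess numerically, here is a small sketch that divides each function from the first exercise by a candidate, n^3, for larger and larger n. The helper names are hypothetical. A ratio that levels off at a constant is evidence that the candidate is right, but it is not a proof.

```python
def ratio(f, g, n):
    """Return f(n) / g(n) for a single problem size n."""
    return f(n) / g(n)


def cubic(n):
    # Candidate order of growth to compare against.
    return n**3


candidates = {
    "n^3 + n^2": lambda n: n**3 + n**2,
    "1000000 n^3 + n^2": lambda n: 1000000 * n**3 + n**2,
    "n^3 + 1000000 n^2": lambda n: n**3 + 1000000 * n**2,
}

for name, f in candidates.items():
    # Evaluate the ratio at n = 10^3, 10^6, 10^9.
    ratios = [ratio(f, cubic, 10**k) for k in (3, 6, 9)]
    print(name, ratios)
```

Notice how large n has to get before the third ratio settles down; that is the crossover problem again.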
Note: so far, in all examples problem size is described by a single integer, n. With graph algorithms, we will often consider number of nodes, n, and number of edges, m.
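To make the two-parameter idea concrete, here is a sketch of breadth-first search that counts how many nodes it processes and how many edge endpoints it examines. BFS is my choice of example, and the graph representation (a dict that maps each node to a list of neighbors) and the function name are assumptions.

```python
from collections import deque


def bfs_operation_counts(graph, start):
    """Run BFS from start; return (nodes processed, edge endpoints examined).

    graph: dict that maps each node to a list of its neighbors.
    """
    seen = {start}
    queue = deque([start])
    nodes_processed = 0
    edges_examined = 0
    while queue:
        node = queue.popleft()
        nodes_processed += 1
        for neighbor in graph[node]:
            edges_examined += 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return nodes_processed, edges_examined


# Undirected graph with n = 4 nodes and m = 4 edges: 0-1, 1-2, 2-3, 0-2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(bfs_operation_counts(graph, 0))
```

In a connected undirected graph, each node is processed once and each edge is examined from both ends, so the work grows with n + m, not with n alone.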
Annotated bibliography
Over the next 1.5 weeks, you will find and "read" three papers, and write an annotated bibliography.
I put "read" in quotes because you will probably not read every word. You should read enough to answer the following questions and apply the following criteria.
Questions
1) What is the application domain? What is the system of interest?
2) What is the primary experimental question the authors address?
3) What kind of model do they use?
4) What methods do they apply to the model? Analysis? Simulation?
5) What work does the model do? Predict? Explain? Design?
6) What validation do the authors report?
7) Are the conclusions supported by the results?
Criteria
1) Good quality work (based on venue, citations, and your judgment).
2) An experiment that makes sense and answers a question (as opposed to a model without obvious motivation).
3) An experiment you can replicate with about 2 person-weeks of effort.
4) Available data and other supporting material.
5) Potential for extension, either your ideas or theirs.
6) Extensions that answer interesting questions (not just adding bells and whistles).
For Project 1, it should probably involve graphs/networks.
Places to look
2) Papers that cite any of the papers in the book (Erdos and Renyi, Watts and Strogatz, Barabasi and Albert)
3) News search
4) Web search
5) This special issue, in the future
What goes in an annotated bibliography?
1) The title, with a link to the paper, if possible.
2) A complete citation, including the names of the authors, publication venue, date, and page numbers.
3) Your name
4) A paragraph that answers the questions above.
Example:
Collective dynamics of 'small-world' networks
Watts, Duncan J; Strogatz, Steven H., Nature (Jun 4, 1998): 440-2.
Added by Allen Downey
Watts and Strogatz observe that some empirical networks have both high clustering and short path lengths. They call this combination of features the "small world" property. They show that regular graphs and completely random graphs do not have this property; then they propose a way to construct random graphs that do. They suggest that this model will have applications in "biological, social, and man-made systems".
In this example, notice that I did NOT paste in the questions and answer them one at a time, in order.