Algorithms
data structures and complexity
SWEBOK Ch. 13 Section 7
Sorting Algorithms
Paper pieces 23 41 66 20 2 90 9 34 19 40 99
Bubble Sort (two at a time)
Selection Sort (do comparisons to select the lowest number in unsorted sublist and swap it into the end of the sorted sublist) demo
Insertion (get the next number and do comparisons to see where it should be inserted into sorted sublist / beginning to end)
Merge (divide and conquer)
People sort
Folk dance sorting! (intercultural knowledge!)
7. Algorithms and Complexity
Programs are not random pieces of code: they are meticulously written to perform user-expected actions. The guide one uses to compose programs are algorithms, which organize various functions into a series of steps and take into consideration the application domain, the solution strategy, and the data structures being used. An algorithm can be very simple or very complex.
7.1. Overview of Algorithms
Abstractly speaking, algorithms guide the operations of computers and consist of a sequence of actions composed to solve a problem. Alternative definitions include but are not limited to:
• An algorithm is any well-defined computational procedure that takes some value or set of values as input and produces some value or set of values as output.
• An algorithm is a sequence of computational steps that transform the input into the output.
• An algorithm is a tool for solving a wellspecified computation problem.
Of course, different definitions are favored by different people. Though there is no universally accepted definition, some agreement exists that an algorithm needs to be correct, finite (in other words, terminate eventually or one must be able to write it in a finite number of steps), and unambiguous.
7.2. Attributes of Algorithms
The attributes of algorithms are many and often include modularity, correctness, maintainability, functionality, robustness, user-friendliness (i.e. easy to be understood by people), programmer time, simplicity, and extensibility. A commonly emphasized attribute is “performance” or “efficiency” by which we mean both time and resource-usage efficiency while generally emphasizing the time axis. To some degree, efficiency determines if an algorithm is feasible or impractical. For example, an algorithm that takes one hundred years to terminate is virtually useless and is even considered incorrect.
7.3. Algorithmic Analysis
Analysis of algorithms is the theoretical study of computer-program performance and resource usage; to some extent it determines the goodness of an algorithm. Such analysis usually abstracts away the particular details of a specific computer and focuses on the asymptotic, machine-independent analysis.
There are three basic types of analysis. In worst-case analysis, one determines the maximum time or resources required by the algorithm on any input of size n. In average-case analysis, one determines the expected time or resources required by the algorithm over all inputs of size n; in performing average-case analysis, one often needs to make assumptions on the statistical distribution of inputs. The third type of analysis is the best-case analysis, in which one determines the minimum time or resources required by the algorithm on any input of size n. Among the three types of analysis, average-case analysis is the most relevant but also the most difficult to perform.
Besides the basic analysis methods, there are also the amortized analysis, in which one determines the maximum time required by an algorithm over a sequence of operations; and the competitive analysis, in which one determines the relative performance merit of an algorithm against the optimal algorithm (which may not be known) in the same category (for the same operations).
7.4. Algorithmic Design Strategies
The design of algorithms generally follows one of the following strategies: brute force, divide and conquer, dynamic programming, and greedy selection.
The brute force strategy is actually a no-strategy. It exhaustively tries every possible way to tackle a problem. If a problem has a solution, this strategy is guaranteed to find it; however, the time expense may be too high.
The divide and conquer strategy improves on the brute force strategy by dividing a big problem into smaller, homogeneous problems. It solves the big problem by recursively solving the smaller problems and combing the solutions to the smaller problems to form the solution to the big problem. The underlying assumption for divide and conquer is that smaller problems are easier to solve.
The dynamic programming strategy improves on the divide and conquer strategy by recognizing that some of the sub-problems produced by division may be the same and thus avoids solving the same problems again and again. This elimination of redundant subproblems can dramatically improve efficiency.
The greedy selection strategy further improves on dynamic programming by recognizing that not all of the sub-problems contribute to the solution of the big problem. By eliminating all but one sub-problem, the greedy selection strategy achieves the highest efficiency among all algorithm design strategies. Sometimes the use of randomization can improve on the greedy selection strategy by eliminating the complexity in determining the greedy choice through coin flipping or randomization.
7.5. Algorithmic Analysis Strategies
The analysis strategies of algorithms include basic counting analysis, in which one actually counts the number of steps an algorithm takes to complete its task; asymptotic analysis, in which one only considers the order of magnitude of the number of steps an algorithm takes to complete its task; probabilistic analysis, in which one makes use of probabilities in analyzing the average performance of an algorithm; amortized analysis, in which one uses the methods of aggregation, potential, and accounting to analyze the worst performance of an algorithm on a sequence of operations; and competitive analysis, in which one uses methods such as potential and accounting to analyze the relative performance of an algorithm to the optimal algorithm.
For complex problems and algorithms, one may need to use a combination of the aforementioned analysis strategies.