csc231 - Algorithms

Preliminary activities: 
1. Test your Java knowledge at www.adapt2.sis.pitt.edu/kt Your username and passwords have been distributed by now
2. Let's take a look at C++ programming in the attached pdf document on C++ introductory notes. 
or specifically the following topics:
a.  Basics of C++:
        ·         Structure of a program
        ·         Variables. Data Types.
        ·         Constants
        ·         Operators
b. Basic Input/Output
c. Control Structures
d. Compound Data Types:
        ·         Arrays
        ·         Character Sequences
        ·         Pointers
        ·         Dynamic Memory

Introduction
an algorithm is a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state. 

Download and read the attached document (introduction to algorithm.pdf) below to understand better, the concept of  Algorithm

Analysis of Algorithms
The computational complexity and efficient implementation of the algorithm are important in computing, and this depends on suitable data structures.
In Computer Science, it is important to measure the quality of algorithms, especially the specific amount of a certain resource an algorithm needs. Examples of such resources would be time or memory storage. Nowadays, memory storage is almost a non-essential factor when designing algorithms but be aware that several systems still have memory constraints, such as Digital Signal Processors in embedded systems. 
Different algorithms may complete the same task with a different set of instructions in less or more time, space or effort than other. The analysis and study of algorithms is a discipline in Computer Science which has a strong mathematical background. It often relies on theoretical analysis of pseudo-code.
To compare the efficiency of algorithms, we don't rely on abstract measures such as the time difference in running speed, since it too heavily relies on the processor power and other tasks running in parallel. The most common way of qualifying an algorithm is the Asymptotic Notation, also called Big O.

Big O notation is used in Computer Science to describe the performance or complexity of an algorithm. Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm.

Anyone who’s read Programming Pearls or any other Computer Science books and doesn’t have a grounding in Mathematics will have hit a wall when they reached chapters that mention O(N log N) or other seemingly crazy syntax. Hopefully this article will help you gain an understanding of the basics of Big O and Logarithms.

As a programmer first and a mathematician second (or maybe third or fourth) I found the best way to understand Big O thoroughly was to produce some examples in code. So, below are some common orders of growth along with descriptions and examples where possible.

O(1)

O(1) describes an algorithm that will always execute in the same time (or space) regardless of the size of the input data set.

bool IsFirstElementNull(String[] strings)
{
	if(strings[0] == null)
	{
		return true;
	}
	return false;
}

O(N)

O(N) describes an algorithm whose performance will grow linearly and in direct proportion to the size of the input data set. The example below also demonstrates how Big O favours the worst-case performance scenario; a matching string could be found during any iteration of the for loop and the function would return early, but Big O notation will always assume the upper limit where the algorithm will perform the maximum number of iterations.

bool ContainsValue(String[] strings, String value)
{
	for(int i = 0; i < strings.Length; i++)
	{
		if(strings[i] == value)
		{
			return true;
		}
	}
	return false;
}

O(N2)

O(N2) represents an algorithm whose performance is directly proportional to the square of the size of the input data set. This is common with algorithms that involve nested iterations over the data set. Deeper nested iterations will result in O(N3), O(N4) etc.

bool ContainsDuplicates(String[] strings)
{
	for(int i = 0; i < strings.Length; i++)
	{
		for(int j = 0; j < strings.Length; j++)
		{
			if(i == j) // Don't compare with self
			{
				continue;
			}

			if(strings[i] == strings[j])
			{
				return true;
			}
		}
	}
	return false;
}

O(2N)

O(2N) denotes an algorithm whose growth will double with each additional element in the input data set. The execution time of an O(2N) function will quickly become very large.

Logarithms

Logarithms are slightly trickier to explain. We shall illustrate with the following common example

Binary search is a technique used to search sorted data sets. It works by selecting the middle element of the data set, essentially the median, and compares it against a target value. If the values match it will return success. If the target value is higher than the value of the probe element it will take the upper half of the data set and perform the same operation against it. Likewise, if the target value is lower than the value of the probe element it will perform the operation against the lower half. It will continue to halve the data set with each iteration until the value has been found or until it can no longer split the data set.

This type of algorithm is described as O(log N). The iterative halving of data sets described in the binary search example produces a growth curve that peaks at the beginning and slowly flattens out as the size of the data sets increase e.g. an input data set containing 10 items takes one second to complete, a data set containing 100 items takes two seconds, and a data set containing 1000 items will take three seconds. Doubling the size of the input data set has little effect on its growth as after a single iteration of the algorithm the data set will be halved and therefore on a par with an input data set half the size. This makes algorithms like binary search extremely efficient when dealing with large data sets.    For a more in-depth explanation, click on this link 

Sorting algorithm

From Wikipedia, the free encyclopedia

In computer science and mathematics, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical 

order and lexicographical order. Efficient sorting is important to optimizing the use of other algorithms (such as search and merge algorithms) that require 

sorted lists to work correctly; it is also often useful forcanonicalizing data and for producing human-readable output. More formally, the output must satisfy 

two conditions:

The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order);

The output is a permutation, or reordering, of the input.

Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite 

its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting

 algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science 

classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, 

divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds.


Read on from Wikipedia

http://en.wikipedia.org/wiki/Sorting_algorithm


1. Bubble Sort

Read the following pages on bubble Sort:

http://en.wikipedia.org/wiki/Sorting_algorithm#Bubble_sort

http://www.algolist.net/Algorithms/Sorting/Bubble_sort


2. Selection Sort

Read the following pages on Selection sort

http://www.algolist.net/Algorithms/Sorting/Selection_sort


3. Insertion Sort

Read:

http://www.algolist.net/Algorithms/Sorting/Insertion_sort

http://en.wikipedia.org/wiki/Sorting_algorithm#Insertion_sort


4. Quick Sort

Read:

http://www.algolist.net/Algorithms/Sorting/Quicksort

http://en.wikipedia.org/wiki/Sorting_algorithm#Quicksort


Assignment

Implement the following algorithms on a large data set containing English word in an unsorted manner

1. Bubble Sort and Insertion Sort using C++ codes

2. Selection sort and Quick sort using Java

On each implementation, state whether your algorithm is efficient or not, using valuable justifications


Searching Algorithm

In computer science, a search algorithm, broadly speaking, is an algorithm for finding an item with specified properties among a collection of items.

The items may be stored individually as records in a database; or may be elements of a search space defined by a mathematical formula or 

procedure, such as the roots of an equation with integer variables; or a combination of the two, such as a graph.


Graph

Read: http://www.algolist.net/Data_structures/Graph


Binary Search Tree

Read: http://www.algolist.net/Data_structures/Binary_search_tree


Depth First Search

Depth-first search always expands the deepest node in the current fringe of the search tree.

 The search proceeds immediately

to the deepest level of the search tree, where the nodes have no successors. As those nodes

are expanded, they are dropped from the fringe, so then the search "backs up" to the next

shallowest node that still has unexplored successors.

This strategy can be implemented by TREE-SEARCH with a last-in-first-out (LIFO)

queue, also known as a stack. As an alternative to the TREE-SEARCH implementation, it is

common to implement depth-first search with a recursive function that calls itself oln each of

its children in turn. 

Depth-first search has very modest memory requirements. It needs to store only a single

path from the root to a leaf node, along with the remaining unexpanded sibling nodes for each

node on the path. Once a node has been expanded, it can be removed from memory as soon

as all its descendants have been fully explored.  For a state space with

branching factor b and maximum depth m, depth-first search requires storage of only bm + 1

nodes. 

Additional readings on Depth first search:

http://www.algolist.net/Algorithms/Graph/Undirected/Depth-first_search


Further readings on search:

Read the attached PDF document file named search problems. Especially uniform Cost, Depth first and Depth Limited search



Ċ
Yetunde Folajimi,
Sep 1, 2010, 1:57 AM
Ċ
Yetunde Folajimi,
Aug 24, 2010, 3:41 AM
Ċ
Yetunde Folajimi,
Sep 1, 2010, 2:23 AM
Comments