This page gives a brief overview of the CS-6601 Artificial Intelligence course taught at Georgia Tech during Spring 2012. It is intended for A.I. enthusiasts in general, and specifically for graduate students interested in the field of Artificial Intelligence. I hope you find the information on this page useful and that it helps you decide whether this course matches your interests and fits your plans!
Foundation: Reading and Assignments
In the course, we completed 8 assignments on the foundations of AI, after reading the relevant material in the textbook. The textbook 'Artificial Intelligence: A Modern Approach', Third Edition, by Russell and Norvig was the primary reading material. Additional reference papers, video lectures, and notes covered various topics discussed in class.
1) Search Algorithms: The reading material provided a good introduction to various tree and graph search algorithms such as breadth-first, depth-first, bidirectional, iterative deepening, and A* search, as well as problem-solving agents, heuristic and uninformed search strategies, optimization problems, partially observable environments, and online search algorithms. The assignment focused on developing a good grasp of detailed formulations for search problems and on selecting good admissible heuristic functions.
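To give a flavor of these problems (this is my own minimal sketch, not from the assignment; the toy graph and heuristic values are invented for the example), here is A* graph search in Python:

```python
import heapq

def astar(start, goal, neighbors, heuristic):
    """A* graph search: returns a lowest-cost path from start to goal,
    assuming the heuristic is admissible (never overestimates)."""
    frontier = [(heuristic(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")

# Toy weighted graph and admissible estimates to D (hypothetical example).
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
h = {"A": 2, "B": 1, "C": 1, "D": 0}
path, cost = astar("A", "D", lambda n: graph[n], lambda n: h[n])
```

The priority queue orders nodes by f = g + h, which is exactly what makes an admissible heuristic yield optimal paths.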
2) Adversarial Search (Game Playing) and Constraint Satisfaction Problems (CSPs): The reading material on adversarial search covered interesting strategies for optimal decisions in games, such as the minimax algorithm and alpha-beta pruning, and also introduced strategies for stochastic and partially observable games. The reading material on CSPs covered constraint propagation, backtracking search, and local search for CSPs. The assignment focused on the fundamentals of formulating game trees and related concepts such as minimax and alpha-beta pruning. It also included a problem applying backtracking with forward checking, together with the minimum-remaining-values (MRV) and least-constraining-value heuristics, to a cryptarithmetic problem.
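To make the game-tree idea concrete, here is a small sketch of minimax with alpha-beta pruning (my own helper names; the three-branch tree is the familiar textbook example):

```python
def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning: prunes branches that cannot
    influence the final decision at the root."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:  # beta cut-off: MIN will never allow this line
                break
        return value
    else:
        value = float("inf")
        for child in kids:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:  # alpha cut-off
                break
        return value

# Tiny game tree encoded as nested lists; leaves are utility values.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
kids = lambda s: s if isinstance(s, list) else []
val = lambda s: s
best = alphabeta(tree, 2, float("-inf"), float("inf"), True, kids, val)
```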
3) Logic: The reading material covered logical agents and introduced propositional and first-order logic, inference and proofs using resolution, Conjunctive Normal Form (CNF), and forward and backward chaining. The assignment problems dealt with proofs using resolution, representing satisfiability (SAT) problems as CSPs, formulating axioms for logical reasoning systems, and using the forward chaining algorithm for inference in first-order logic.
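The forward chaining idea is easy to sketch for the propositional (Horn-clause) case; the two-rule knowledge base below is made up for illustration (the assignment used first-order logic):

```python
def forward_chaining(rules, facts, query):
    """Forward chaining over propositional Horn clauses.
    rules: list of (premises, conclusion); facts: set of known atoms.
    Repeatedly fires any rule whose premises are all inferred."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in inferred and all(p in inferred for p in premises):
                inferred.add(conclusion)
                changed = True
    return query in inferred

# Hypothetical knowledge base: P ∧ Q ⇒ R,  R ⇒ S.
rules = [(["P", "Q"], "R"), (["R"], "S")]
entailed = forward_chaining(rules, {"P", "Q"}, "S")
```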
4) Bayes Nets: The reading material covered the basics of probability and Bayes' rule and then introduced probabilistic reasoning for representing knowledge in uncertain domains. It included conditional independence, Bayesian networks, exact inference in Bayesian networks using enumeration and the variable elimination algorithm, approximate inference using sampling methods, Markov chain simulation, and relational and first-order probability models. The assignment focused on creating a Bayesian network for a real-world problem and computing conditional probabilities associated with different variables in the network, which provided a sound basis in this topic.
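The kind of computation involved can be sketched with a two-node network Rain → WetGrass (the numbers are illustrative, not from the assignment's real-world problem):

```python
# Conditional probability tables for a two-node network Rain -> WetGrass.
p_rain = 0.3
p_wet_given_rain = 0.9
p_wet_given_dry = 0.2

# Enumeration: P(Wet) = sum over Rain of P(Rain) * P(Wet | Rain).
p_wet = p_rain * p_wet_given_rain + (1 - p_rain) * p_wet_given_dry

# Bayes' rule: P(Rain | Wet) = P(Wet | Rain) P(Rain) / P(Wet).
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet
```

Exact inference by enumeration is exactly this pattern, summing out every hidden variable; variable elimination just reorders and caches the sums.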
5) Hidden Markov Models: The reading material covered different aspects of probabilistic reasoning over time and inference in temporal models, such as filtering, prediction, smoothing, and computing the most likely explanation. It covered Hidden Markov Models, Kalman filters, and Dynamic Bayesian Networks, and introduced the particle filtering algorithm as an effective approximation method. The assignment required formulating the Viterbi lattice and finding the probability of an HMM generating a particular sequence.
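A compact sketch of the Viterbi lattice, using a hypothetical two-state weather HMM rather than the assignment's model:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Viterbi lattice: most likely hidden-state sequence and its probability."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            # Best predecessor for state s at time t.
            prob, path = max(
                (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                 V[t - 1][prev][1] + [s]) for prev in states)
            V[t][s] = (prob, path)
    return max(V[-1].values())

states = ("Rainy", "Sunny")
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
prob, path = viterbi(["walk", "shop", "clean"], states, start, trans, emit)
```

Replacing `max` with a sum over predecessors turns this into the forward algorithm, which gives the total probability of the HMM generating the sequence.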
6) Natural Language Processing (NLP): The reading material covered various language models, text classification schemes, information retrieval and extraction methods, syntactic analysis, and semantic interpretation. The assignment dealt with selecting a large monolingual corpus and characterizing it, generating texts based on various language models, and generating two parse trees for a probabilistic context-free grammar (PCFG) as well as computing the conditional probabilities of the two trees given the sentence.
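The text-generation part can be sketched with a bigram model over a toy corpus (the assignment used a large monolingual corpus):

```python
import random
from collections import defaultdict

def build_bigrams(tokens):
    """Bigram language model: map each word to its observed successors."""
    model = defaultdict(list)
    for w1, w2 in zip(tokens, tokens[1:]):
        model[w1].append(w2)
    return model

def generate(model, start, length, seed=0):
    """Sample a word sequence by repeatedly picking a random successor;
    duplicates in the successor lists give the empirical bigram probabilities."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return out

corpus = "the dog saw the cat and the cat saw the dog".split()
model = build_bigrams(corpus)
text = generate(model, "the", 6)
```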
7) Learning and Complex Decisions: The reading material covered Markov Decision Processes (MDPs), value and policy iteration, partially observable MDPs, game theory, supervised learning methods, decision trees, dimensionality reduction techniques, regression analysis, artificial neural networks, nearest neighbor models, and support vector machines. The assignment problems consisted of computing the probabilities of reaching different squares in a 4 × 3 world (with rewards and penalties) from a start square via a specific action sequence, and implementing a decision tree learning algorithm for different input attributes of a dataset using the concepts of information gain and entropy.
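The core computation of the decision-tree part can be sketched as entropy and information gain (the tiny dataset below is invented, not the assignment's):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, label="label"):
    """Expected reduction in entropy from splitting on one attribute."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[label] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

# Tiny hypothetical dataset: outlook perfectly predicts the label.
data = [
    {"outlook": "sunny", "label": "no"},
    {"outlook": "sunny", "label": "no"},
    {"outlook": "rain",  "label": "yes"},
    {"outlook": "rain",  "label": "yes"},
]
gain = information_gain(data, "outlook")
```

The tree-learning algorithm repeatedly picks the attribute with the highest gain and recurses on each subset.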
8) Markov Logic Networks: The reading material provided an introduction to Markov Logic Networks as a unifying framework for statistical relational learning. It also covered other approaches such as Probabilistic Relational Models and Relational Markov Models, and discussed aspects of classification, modeling, recognition, learning, and inference. The assignment focused on the formulation and evaluation of grounded Markov networks as well as a factor graph representation of the network.
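As a toy illustration of grounding (the assignment's network was larger), the unnormalized weight of a possible world in a Markov logic network is exp(weight × number of satisfied ground formulas); here a single weighted rule Smokes(x) ⇒ Cancer(x) is grounded over two hypothetical constants:

```python
import math

def world_weight(world, weight):
    """Unnormalized weight of a possible world under one weighted rule,
    Smokes(x) => Cancer(x), grounded over the people in `world`."""
    satisfied = sum(1 for person, (smokes, cancer) in world.items()
                    if (not smokes) or cancer)  # implication truth table
    return math.exp(weight * satisfied)

# Truth values are (Smokes, Cancer) for each constant.
w1 = {"anna": (True, True), "bob": (False, False)}   # both groundings satisfied
w2 = {"anna": (True, False), "bob": (False, False)}  # anna violates the rule
weight_ratio = world_weight(w1, 1.5) / world_weight(w2, 1.5)
```

Each violated grounding multiplies a world's weight by e^(-1.5), so worlds that break the rule become less probable rather than impossible, which is the key difference from pure first-order logic.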
What I learned and My Favorite Assignment:
All of these assignments deepened my understanding of the material, because working through a topic first-hand makes it far easier to grasp than just reading about it. Working on the assignments exposed intricate details that would never become evident from reading the book or listening to lectures alone. Of all these assignments, my favorite was the 2nd one. I especially liked solving the cryptarithmetic problem using backtracking with forward checking, together with the minimum-remaining-values (MRV) and least-constraining-value heuristics. It made the whole idea crystal clear to me, and I enjoyed solving it step by step. The figure below shows the problem (taken from the textbook).
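For the textbook's TWO + TWO = FOUR puzzle, a brute-force checker is easy to write; the assignment's point, of course, was that backtracking with forward checking and the MRV and least-constraining-value heuristics explores far fewer assignments than this exhaustive sketch:

```python
from itertools import permutations

def solve_two_two_four():
    """Brute-force solver for TWO + TWO = FOUR, each letter a distinct digit.
    Tries every assignment of digits to the six letters."""
    letters = "TWOFUR"
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))
        if a["T"] == 0 or a["F"] == 0:  # leading digits cannot be zero
            continue
        two = 100 * a["T"] + 10 * a["W"] + a["O"]
        four = 1000 * a["F"] + 100 * a["O"] + 10 * a["U"] + a["R"]
        if 2 * two == four:
            yield a

solution = next(solve_two_two_four())
```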
There were three mini-projects in which I chose to research a problem that was supposed to be relevant to my future career. For each of these three projects, I proposed a solution, implemented it, and described it in a mini-conference paper.

1) First Project: Evaluation of Genetic Algorithm and Simulated Annealing for N-Queens Problem
Project Partner: Ashley Edwards
Synopsis: In chess, a queen can move horizontally, vertically, or diagonally across the board. The N-Queens problem attempts to arrange N queens on an N x N board in such a way that no queen can attack another, i.e., no two queens share a row, column, or diagonal of the board. Solving this problem is NP-hard, and as N increases, the problem becomes intractable. However, addressing this problem is highly important and significant. Along with being representative of general constraint satisfaction problems, solutions to the N-Queens problem have a wide-ranging impact in the industrial domain, such as the design and testing of VLSI circuits and other communication systems, and computer resource management problems such as memory allocation schemes.
As a part of this project, we compared the performance of two common heuristic search algorithms that have been used to solve the problem: Genetic Algorithms and Simulated Annealing. We also modified the standard implementation of Genetic Algorithms by changing the otherwise random computation of the crossover and mutation steps. We analyzed the effect on performance of 2 evaluation functions, 5 crossover operators (a) Single Point, b) Double Point, c) Cut and Crossfill, d) Gene Conversion, and e) PMX Crossover), and 3 mutation functions (a) Swap Mutator, b) Inversion Mutator, and c) Frame-Shift Mutator). We also compared the performance of the modified algorithm with that of the standard Genetic Algorithm and the Simulated Annealing algorithm in the N-Queens domain. Our tests also include results for different values of N, to analyze the effect of problem size on the comparative performance of the algorithms presented in this study. The results show that an algorithm's performance depends on various factors, and that optimization at every step is crucial for the best performance. Please refer to the paper for a detailed discussion of the results. The paper can be found here.
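To give a flavor of the simulated-annealing side (the parameters and neighbor move here are illustrative, not the ones evaluated in our paper), a minimal sketch:

```python
import math
import random

def conflicts(board):
    """Attacking queen pairs; board[c] is the row of the queen in column c."""
    n = len(board)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if board[i] == board[j] or abs(board[i] - board[j]) == j - i)

def simulated_annealing(n, t0=5.0, cooling=0.999, seed=1):
    """Anneal a random board: move one queen at a time, always accepting
    improvements and accepting worse moves with probability e^(-delta/T)."""
    random.seed(seed)
    board = [random.randrange(n) for _ in range(n)]
    t = t0
    while conflicts(board) > 0 and t > 1e-6:
        col, row = random.randrange(n), random.randrange(n)
        candidate = board[:]
        candidate[col] = row
        delta = conflicts(candidate) - conflicts(board)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            board = candidate
        t *= cooling  # geometric cooling schedule
    return board

solution = simulated_annealing(8)
```

A single anneal is not guaranteed to end with zero conflicts, so in practice one restarts until `conflicts(solution) == 0`.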
2) Second Project: Design and Analysis of a Modified RRT Algorithm for L-Shaped Peg-In-Hole Robotic Assembly Problem
Project Partner: Rowland O'Flaherty
Synopsis: Robotic assembly and manipulation tasks have become an important part of our day-to-day life, with widespread applications from manufacturing in industrial domains to elder-care in household environments. The peg-in-hole assembly problem is widely considered a representative problem for robotic assembly and manipulation tasks. Our goal was to focus on this peg-in-hole assembly problem and plan a path in the joint space of a 3-DOF redundant planar robot arm such that the robot can successfully insert an L-shaped peg into a G-shaped hole. This problem differs from traditional approaches because it deals with a more complicated geometry (an L-shaped peg in a G-shaped hole) than the conventional square or round pegs and holes. By solving this problem we can address the general issue of planning in highly constrained spaces.
In this study, we addressed this task based on the kinematics, dynamics, and geometric constraints of the robot. To address the peg-in-hole problem, we utilized rapidly-exploring random trees (RRTs), because RRTs are well suited to the large, continuous, and non-linear state space seen in the peg-in-hole problem. To the best of our knowledge, this is a novel approach to this standard problem. Our problem deals with planning in cluttered environments, which has widespread societal applications such as factory automation, automatic car-parking, robotics, virtual prototyping, pharmaceutical drug design, and computer animation. For a detailed discussion of the results, please refer to the paper here.
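The core RRT loop is simple enough to sketch; our project planned in the 3-DOF joint space of the arm, whereas this toy version (with an invented rectangular obstacle and a small goal bias) just grows a tree in the plane:

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, iters=4000, seed=2):
    """Minimal 2-D RRT: extend the nearest tree node a small step toward
    a random sample, occasionally sampling the goal itself (goal bias)."""
    random.seed(seed)
    nodes = [start]
    parent = {start: None}
    for _ in range(iters):
        sample = goal if random.random() < 0.1 else (
            random.uniform(0, 10), random.uniform(0, 10))
        near = min(nodes, key=lambda n: math.dist(n, sample))
        d = math.dist(near, sample)
        if d == 0:
            continue
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):
            continue  # collision: discard this extension
        nodes.append(new)
        parent[new] = near
        if math.dist(new, goal) < goal_tol:
            path = [new]  # walk back up the tree to recover the path
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None

# Hypothetical scene: a rectangular wall the tree must grow around.
free = lambda p: not (4 < p[0] < 6 and 0 < p[1] < 8)
path = rrt((1.0, 1.0), (9.0, 1.0), free)
```

The random sampling is what gives RRTs their bias toward unexplored regions, which is why they scale to the large continuous spaces mentioned above.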
3) My Favorite Project (Third Project): Haptic Classification of Surface Properties using Supervised Machine Learning Techniques
Project Partner: Natesh Srinivasan
This third project on 'Haptic Classification of Surface Properties using Supervised Machine Learning Techniques' is my favorite project. So, let me explain this project in detail.
Problem Addressed: In this study, our objective is to model the traversability of an object on a surface for sliding motions. The traversability of an object depends on the interaction dynamics between the object and the surface on which it moves.
Related Work: Material property classification is perhaps the work most closely related to our approach. These studies focus on estimating surface properties such as texture, friction, and other surface characteristics of the objects themselves, while neglecting the surfaces on which the objects are placed. Our work, on the other hand, focuses on estimating surface characteristics using interactions with objects placed on those surfaces, though it derives inspiration from the data-driven approaches in those previous studies. All the above studies rely on specific exploratory procedures for identifying material properties by directly interacting with the material, which is hardly relevant in real manipulation situations. We, instead, use an experimental paradigm that reflects “incidental contact” (not specific exploratory procedures) without directly touching the surface, which is more representative of the general object-environment dynamic interaction behavior in real-world situations.
Approach: The objective of our work is to haptically classify surfaces into different categories without explicitly touching or probing them with specific exploratory procedures. This is possible only by modeling the dynamics of the force interactions between the surfaces and the objects placed on them. Such dynamic effects arise from the relative motion between the objects and the surfaces and are captured in the sensed force signals.
We modeled the time-trends of the forces for each of these categories using Hidden Markov Models, training one HMM each for the smooth, moderate, and rough categories. Our observations are the forces sensed from the interactions between the robot arm and the environmental objects. We modeled the observations in the continuous domain using Gaussian distributions to account for the differences across all the data. Each hidden state has a non-zero probability of staying in the same state and of transitioning to the next state. Note that we used a left-right HMM, which means there are no backward transitions, because our HMM models the force trends as they progress forward in time. We varied the number of hidden states to check their effect on the performance of the classification scheme. We also compared the performance of our algorithm with that of other common supervised learning approaches such as k-NN and SVM classifiers. For implementing k-NN, we reduced the dimensionality of the data using PCA. We evaluated the performance of the classifiers using a 5-fold cross-validation scheme across all the trials for all the object-surface interactions. Our objective was to classify the platform surfaces into three categories: Smooth, Moderate, and Rough. We also extended our classification algorithm to individual surface recognition.
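The classification step can be sketched as a forward-algorithm likelihood under each category's left-right HMM, picking the category with the highest likelihood. The two-state models, their parameters, and the force trace below are invented for illustration (our actual models were trained on the recorded data, and we used three categories, not two):

```python
import math

def gaussian(x, mean, var):
    """Gaussian emission density for a 1-D observation."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def forward_likelihood(obs, means, variances, stay=0.8):
    """Likelihood of a 1-D force trace under a left-right HMM: each state
    stays put with probability `stay` or moves to the next state, and the
    last state is absorbing. Starts in state 0."""
    n = len(means)
    alpha = [gaussian(obs[0], means[0], variances[0])] + [0.0] * (n - 1)
    for x in obs[1:]:
        new = [0.0] * n
        for s in range(n):
            self_p = 1.0 if s == n - 1 else stay  # final state is absorbing
            total = alpha[s] * self_p
            if s > 0:  # left-right structure: no backward transitions
                total += alpha[s - 1] * (1 - stay)
            new[s] = total * gaussian(x, means[s], variances[s])
        alpha = new
    return sum(alpha)

# Hypothetical per-category models: per-state (means, variances) of the force.
models = {"smooth": ([1.0, 1.2], [0.1, 0.1]),
          "rough":  ([3.0, 4.0], [0.1, 0.1])}
trace = [2.9, 3.1, 3.8, 4.1]
label = max(models, key=lambda m: forward_likelihood(trace, *models[m]))
```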
Results and Discussion: The results of the classification scheme using HMM, k-NN and SVM are shown in the figure below. The figure also contains the results of individual surface recognition accuracy.
The total classification accuracy using HMM is 93.33%, using k-NN is 100%, and using SVM is 94%. The individual surface recognition accuracy is 97.33%. We also noticed that the performance of the HMM models varies across different categories and different numbers of states. As the number of states increases, the total accuracy remains almost consistently high up to 10 states. As the number of states is increased further, the total classification accuracy deteriorates consistently. This is because, with more states, the models capture the minute variations in the data across different categories and are also influenced by the inherent noise in the data. This issue is analogous to over-fitting in other machine learning domains. It should be noted that the classification accuracy for the smooth and rough categories is consistently 100%, but the accuracy for the moderate category worsens with parameter changes. This makes sense because the data in the moderate category lies between (and sometimes overlaps) the boundaries of the other categories, which makes it difficult for the HMM to classify correctly when minute variations are modeled. With a small number of states, the model captures the required variations between the different categories but neglects the extraneous details. This result is not generalizable and depends completely on the nature of the data being modeled. For experimental data with many unmodeled uncertainties and dynamic effects, it might be necessary to model the minute, high-frequency variations among surface categories, and more states would certainly help.
Also, k-NN gives high accuracy because it does not model the time variation of the original data; instead, it works on a reduced-dimension representation of it. The reduced-dimension data discards the inherent noise and extraneous details of the original high-dimensional data. However, there is no loss of information, as the low-dimensional data spans 100% of the variation in the data, as seen from our principal component analysis results. The SVM gives lower accuracy than the k-NN classifier because it operates directly on the high-dimensional data and hence is affected by the minute variances in the data as well as the inherent noise.
What I learned and Why this is my favorite project:
I liked the third project the most because the problem domain was closely related to my current research on haptic classification schemes. In fact, for this project I worked on some problems that I had not covered in my research, so the two studies were complementary, which greatly enhanced my intuition for the problem domain. I enjoyed working on it and learned a lot from the entire process. In addition, all three mini-projects were an introduction to the world of A.I. research. I really enjoyed them because they not only gave me an opportunity to tackle some simple research problems but also gave me hands-on experience implementing A.I. algorithms on real-world problems, which in turn had me working with some interesting A.I. and machine learning tools. The process, in general, was also very helpful. At first, the search for interesting problems in my area made me think deeply and survey the relevant literature in my field, which was very valuable. Writing a proposal with the evaluation criteria in mind builds a habit that will help immensely in the future when writing big grant proposals. Submitting a paper with the standard sections at the end is good practice for writing actual conference or journal papers. The last step in the process, the review panel, taught me how difficult it is to judge a work in limited time and to coordinate with a whole board of people with different opinions. From a proposal writer's perspective, it taught me the importance of writing an effective paper that is not verbose but easily captures the reviewers' attention.
I think I did pretty well in the assignments, projects, and proposals. Two of my three projects (the 1st and 3rd) were selected as the best in the class, and my assignment grades were also among the top ones. I personally think the second project was also good, but the paper-writing part made it lose points because it was too verbose and did not highlight the important points. The reviewers' feedback made us realize the importance of a paper's presentation style. Regarding the proposals, the most important challenge for me was explaining the problem clearly in limited space while portraying its significance to reviewers from different backgrounds. Sometimes I feel that writing a proposal is more difficult than implementing a project: once we know what we are supposed to do, we have a direction, and implementing it is a different challenge; but establishing a direction is not an easy task, and that is what a proposal is all about. Writing a proposal requires knowing the latest developments in the field and some of its interesting studies, and comparing against well-known, established methods requires reading the relevant literature in the domain. Therefore, I feel that writing proposals made me more knowledgeable about the developments in a field in general. Writing an effective, coherent proposal that sticks to the rules and attracts reviewers is not easy, so the practice of writing three proposals was beneficial and helped mold my thinking and writing process in general.
The course as a whole had a big impact on my academics. It served as an extensive introduction to the world of Artificial Intelligence. The course was largely a self-study of the topics covered in the book, which made it more challenging, as the classes were meant to cover topics beyond what is given in the book. This was my first A.I. class, so the learning curve was very steep. The breadth of the course is large, and it does a good job of introducing various interesting domains of A.I. I believe that this class, coupled with dedicated self-study, is the perfect recipe for anyone who wishes to get a good grasp of Artificial Intelligence. In the end, I would like to thank the instructors, T.A.s, and fellow students of this class, without whom this would not have been as exciting an experience as it was for me.