https://github.com/thundergolfer
http://www.norvig.com/Lisp-retro.html
http://simpleai.readthedocs.org/en/latest/
https://github.com/simpleai-team/simpleai
http://www.game-theory-class.org/
http://www.openculture.com/freeonlinecourses
https://news.ycombinator.com/item?id=11897766 AI class Berkeley
Markov Decision Process
http://www.cs.brown.edu/research/ai/pomdp/tutorial/index.html
http://www.scholarpedia.org/article/Category:Machine_Learning
AI class
http://www.reddit.com/r/aiclass/
http://www.reddit.com/r/aiclass/comments/l8c89/anyone_up_for_implementing_the_algorithms_in/
http://larvecode.tumblr.com/tagged/ai-class
http://therohanaurora.com/stanford-ai-open-class-resources/
http://www.reddit.com/r/aiclass/comments/lf568/alternative_free_online_courses_in_ai/
http://inst.eecs.berkeley.edu/~cs188/sp09/lectures.html AI Berkeley Class
http://www.stanford.edu/class/cs221/
http://aima.cs.berkeley.edu/code.html
http://code.google.com/p/aima-python/
TRANSCRIPT http://www.wonderwhy-er.com/ai-class/
explaining-away effect
d-separation
http://www.reddit.com/r/aiclass/comments/lpw21/a_challenge_to_the_over_achievers_who_really/
BAYES
http://allendowney.blogspot.com/2011/10/all-your-bayes-are-belong-to-us.html
http://en.wikipedia.org/wiki/Bayes%27_theorem
http://en.wikipedia.org/wiki/Bayesian_inference
1. P(A|B) = P(A and B)/P(B) - does conditioning introduce time, i.e. A happened after B? No: conditional probability implies no temporal order.
2. P(B|A) = P(A and B)/P(A)
--------------------------------
P(A|B)/P(B|A) = P(A)/P(B) -> obtained by dividing 1. by 2.
P(A|B) = [P(B|A)P(A)]/P(B) Bayes' theorem
P(A,B) = P(A|B)P(B) product rule
P(A) = P(A,B) + P(A, not B) total probability : let B' = not B
P(A) = P(A|B)P(B) + P(A|B')P(B') total probability
P(A1,A2,A3,A4) = P(A1|A2,A3,A4)P(A2|A3,A4)P(A3|A4)P(A4) chain rule - can be extended to n variables
When two features are conditionally independent, their co-occurrence factors into a simple product: P(A,B|C) = P(A|C)*P(B|C)
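
A quick Python sketch of Bayes' theorem with P(B) expanded by total probability, as in the formulas above (the numbers are made up for illustration):

def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) via Bayes' theorem, expanding P(B) by total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)  # P(B) = P(B|A)P(A) + P(B|A')P(A')
    return p_b_given_a * p_a / p_b

# Example: P(B|A) = 0.8, P(A) = 0.5, P(B|not A) = 0.1
print(posterior(0.8, 0.5, 0.1))  # 0.888... = 8/9
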
Bayes Networks: how many parameters
A independent of B given C -> does A independent of B follow? NO
A independent of B -> does A independent of B given C follow? NO (cf. the explaining-away effect)
Expectation Maximization vs K-Means Clustering
Laplace Smoothing
http://en.wikipedia.org/wiki/Additive_smoothing
applied to http://en.wikipedia.org/wiki/Multinomial_distribution
Additive smoothing is a type of shrinkage estimator: the resulting estimate lies between the empirical estimate x(i)/N and the uniform probability 1/d.
theta(i) = (x(i) + k)/(N+k*d)
x(i) - number of observed success events of type i (the vector x has size d)
N - total number of events
k - smoothing param
d - number of parameters (the size of the theta and x vectors); in other words, d is the number of values the variable x can take on (for spam/nospam, d = 2)
From the lecture's spam example: 12 = total # of unique words (the spam+ham vocabulary); 9 = # of words in the spam messages; 15 = # of words in the ham messages.
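
A minimal Python sketch of the smoothing formula theta(i) = (x(i)+k)/(N+k*d); the word count below is illustrative, not the lecture's exact example:

def laplace_smooth(count, total, k, d):
    """Smoothed estimate: (x_i + k) / (N + k*d)."""
    return (count + k) / (total + k * d)

d = 12   # unique words in the combined spam+ham vocabulary
k = 1    # smoothing parameter
# e.g. a word that occurs twice among the 9 spam words:
print(laplace_smooth(2, 9, k, d))   # (2+1)/(9+12) = 1/7
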
http://171.64.93.201/ClassX/system/users/web/pg/view_subject.php?subject=CS229_FALL_2011_2012
Propositional Logic / First Order Logic
Logical implication p -> q is false in exactly one case:
the first operand (p) is true and the second (q) is false.
p → q is equivalent to ¬p ∨ q
Truth table (columns: p, q, p→q, p∧q, p∨q):
p  q  p→q  p∧q  p∨q
0  0   1    0    0
0  1   1    0    1
1  0   0    0    1
1  1   1    1    1
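
A quick Python check that reproduces the table above and the equivalence p -> q == (not p) or q:

for p in (0, 1):
    for q in (0, 1):
        implication = int((not p) or q)   # p -> q, via the equivalence above
        print(p, q, implication, p and q, p or q)
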
http://www.oursland.net/aima/propositionApplet.html
http://www.inf.unibz.it/~franconi/teaching/propcalc/
http://www-users.cselabs.umn.edu/classes/Fall-2008/csci5511/PredCalculusPractice.pdf
Absorption laws:
1) A ∨ (A ∧ B) = A
2) A ∧ (A ∨ B) = A
Situation calculus
Poss(a,s): a special binary predicate that denotes the executability of action a in situation s
Markov Decision Process
http://en.wikipedia.org/wiki/Markov_decision_process
http://en.wikipedia.org/wiki/Bellman_equation
http://www.karenkopecky.net/Teaching/eco613614/Notes_ValueFunctionIteration.pdf
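
A minimal sketch of value iteration (the topic of the notes above) for the Bellman equation V(s) = max_a [R(s) + gamma * sum_s' P(s'|s,a)*V(s')]; the tiny 2-state MDP below is made up for illustration:

gamma = 0.9
states = ["s0", "s1"]
reward = {"s0": 0.0, "s1": 100.0}
# transitions[s][a] = list of (next_state, probability)
transitions = {
    "s0": {"stay": [("s0", 1.0)], "go": [("s1", 0.8), ("s0", 0.2)]},
    "s1": {"stay": [("s1", 1.0)], "go": [("s0", 1.0)]},
}

V = {s: 0.0 for s in states}
for _ in range(100):  # repeat the Bellman backup until (nearly) converged
    V = {
        s: max(
            reward[s] + gamma * sum(p * V[s2] for s2, p in succ)
            for succ in transitions[s].values()
        )
        for s in states
    }
print(V)
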
Berkeley AI Class
http://inst.eecs.berkeley.edu/~cs188/fa11/lectures.html
REINFORCEMENT LEARNING
http://en.wikipedia.org/wiki/Reinforcement_learning
http://en.wikipedia.org/wiki/Q-learning
http://people.revoledu.com/kardi/tutorial/ReinforcementLearning
http://en.wikipedia.org/wiki/Temporal_difference_learning
http://www.scholarpedia.org/article/Temporal_difference_learning
HW 5.1
http://www.reddit.com/r/aiclass/comments/m80hq/home_work_51_could_someone_please_clarifying/
http://www.reddit.com/r/aiclass/comments/m73wm/problem_with_sarsa_formula_in_hw_51/
http://www.aiqus.com/questions/9802/q-learning-in-a-stochastic-environment
Q(s,a) = Q(s,a) + alpha ( R(s) + gamma*Q(s',a') - Q(s,a))
alpha=0.5
gamma=0.9
R(s)=100 ??
We chose North but ended up going East (stochastic environment)! Initial state: Q(33, any action) = 0
Q(33, North) = 0 + alpha ( R(s) + gamma*Q(s',a') - Q(s,a)) = 0.5 *(100 +0.9*100) = 95
Q(33, North) = 0 + alpha ( R(s) + gamma*Q(s',a') - Q(s,a)) = 0.5 *(0 +0.9*100) = 45
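
The two updates above, reproduced with a small Python helper (alpha = 0.5, gamma = 0.9 as given):

alpha, gamma = 0.5, 0.9

def sarsa_update(q_sa, r, q_next):
    """Q(s,a) <- Q(s,a) + alpha * (R(s) + gamma * Q(s',a') - Q(s,a))"""
    return q_sa + alpha * (r + gamma * q_next - q_sa)

print(sarsa_update(0.0, 100, 100))  # 0.5 * (100 + 0.9*100) = 95.0
print(sarsa_update(0.0, 0, 100))    # 0.5 * (0 + 0.9*100)   = 45.0
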
Q-matrix
STATE / ACTION : NORTH WEST EAST SOUTH
(3,3) 0 0 0 0
R-matrix
STATE / ACTION : NORTH WEST EAST SOUTH
(3,4) 100 100 100 100
other states: 0 0 0 0
Assume we are at (3,4)
Q=0
FOR each episode
    select a random initial state
    DO WHILE goal state not reached
        select one among all possible actions
        using this action, consider going to the next state
        MAX = max[ Q(next state, all actions) ]
        Q(s,a) = R(s,a) + gamma * MAX
        set the next state as the current state
    ENDDO
ENDFOR
Q(33 -> 34) = R(33 -> 34) + gamma * MAX{ Q(34, all actions) } = 0 + 0.9*100 = 90
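
A runnable sketch of the loop above, assuming a made-up 1-D corridor world (4 states, goal at state 3 with reward 100) instead of the grid:

import random

N_STATES, GOAL, GAMMA = 4, 3, 0.9
ACTIONS = (-1, +1)  # move left / move right

def step(s, a):
    """Deterministic transition, clamped to the corridor ends."""
    return min(max(s + a, 0), N_STATES - 1)

# R(s,a): 100 for any action that enters the goal state, else 0
R = {(s, a): (100 if step(s, a) == GOAL else 0)
     for s in range(N_STATES) for a in ACTIONS}
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(200):                          # FOR each episode
    s = random.randrange(N_STATES)            # select a random initial state
    while s != GOAL:                          # DO WHILE goal not reached
        a = random.choice(ACTIONS)            # select one possible action
        s2 = step(s, a)                       # consider going to next state
        max_q = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] = R[(s, a)] + GAMMA * max_q # Q(s,a) = R(s,a) + gamma*MAX
        s = s2                                # next state becomes current
print(Q)  # Q along the corridor decays toward the goal: 100, 90, 81, ...
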
HW 5.2
http://www.reddit.com/r/aiclass/comments/m92qr/rewrite_of_hw_52/
HW 5.3
http://www.reddit.com/r/aiclass/comments/m75ui/important_the_clarification_of_hw_53_at_the/
http://www.reddit.com/r/aiclass/comments/m63wf/hw_53/
http://www.aiqus.com/tags/hw5-3/
http://mostafaabedi.blogspot.com/2011/11/general-tips-for-homework-5-of-ai-class.html
Particle Filter
http://en.wikipedia.org/wiki/Particle_filter
A* algorithm
An admissible heuristic never overestimates the cost of reaching the goal.
A*: f(s) = g(s) + h(s), where g = distance from the root to the current state and h = heuristic distance from the current state to the goal.
We are minimizing the total cost f(s).
A search algorithm is admissible if it is guaranteed to find a minimal path to a solution whenever such a solution exists. Breadth-first search is admissible, because it looks at every state at level n before considering any state at level n+1. Breadth-first search is an A* algorithm in which f(n) = g(n) + 0.
Eight-puzzle example: the heuristic "number of tiles out of place" is certainly less than the actual number of moves to the goal state, so this heuristic (combined with best-first search) gives an admissible algorithm. So is the heuristic "sum of the distances of out-of-place tiles", because it too underestimates the actual number of moves required to reach a goal state.
If we overestimate this distance, however, it is not guaranteed to give us the shortest path. In such cases, we have what is called an "inadmissible heuristic."
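
A minimal A* sketch in Python with the (admissible) Manhattan-distance heuristic; the toy grid and its walls are made up for illustration:

import heapq

GRID = ["....",
        ".##.",
        "...."]
START, GOAL = (0, 0), (2, 3)

def h(cell):
    """Admissible heuristic: Manhattan distance to the goal."""
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])

def neighbors(cell):
    r, c = cell
    for r2, c2 in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
        if 0 <= r2 < len(GRID) and 0 <= c2 < len(GRID[0]) and GRID[r2][c2] != "#":
            yield (r2, c2)

def astar(start, goal):
    frontier = [(h(start), 0, start)]     # entries: (f = g + h, g, state)
    best_g = {start: 0}
    while frontier:
        f, g, s = heapq.heappop(frontier)
        if s == goal:
            return g                      # cost of a minimal path
        for s2 in neighbors(s):
            if g + 1 < best_g.get(s2, float("inf")):
                best_g[s2] = g + 1
                heapq.heappush(frontier, (g + 1 + h(s2), g + 1, s2))
    return None

print(astar(START, GOAL))  # 5
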
HW 6
Problem 3: HMM
P(A0 | X0) = P(X0|A0)*P(A0)/P(X0) = 0.1 * 0.5 /(0.1*0.5 + 0.8 *0.5) = 1/9 = 0.111
P(A1|x0) = P(A0|x0)*P(A->A) + P(B0|x0)*P(B->A) = (1/9)*0.5 + (8/9)*0.5 = 0.5
P(A1|x0,x1) = P(x1|A1)*P(A1|x0) / P(x1|x0)
P(x0,x1) = sum over the four state paths:
  P(A0)*P(x|A)*P(A->A)*P(x|A)   ( 0.5*0.1*0.5*0.1 )
+ P(A0)*P(x|A)*P(A->B)*P(x|B)   ( 0.5*0.1*0.5*0.8 )
+ P(B0)*P(x|B)*P(B->B)*P(x|B)   ( 0.5*0.8*0.5*0.8 )
+ P(B0)*P(x|B)*P(B->A)*P(x|A)   ( 0.5*0.8*0.5*0.1 )
= 0.25*0.1*0.9 + 0.25*0.8*0.9 = 0.25*0.9*0.9 = 0.2025
P(x1|x0) = P(x0,x1)/P(x0) = 0.2025/0.45 = 0.45
=> P(A1|x0,x1) = (0.1*0.5)/0.45 = 1/9 ≈ 0.111
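
The same forward computation in Python, using the problem's numbers (P(A0) = P(B0) = 0.5, all transitions 0.5, P(x|A) = 0.1, P(x|B) = 0.8):

prior = {"A": 0.5, "B": 0.5}
trans = {("A", "A"): 0.5, ("A", "B"): 0.5, ("B", "A"): 0.5, ("B", "B"): 0.5}
emit_x = {"A": 0.1, "B": 0.8}

# forward step 0: alpha0(s) = P(s0) * P(x0|s0)
alpha = {s: prior[s] * emit_x[s] for s in "AB"}
print(sum(alpha.values()))                      # P(x0) = 0.45
print(alpha["A"] / sum(alpha.values()))         # P(A0|x0) = 1/9

# forward step 1: alpha1(s) = P(x1|s) * sum_s' alpha0(s') * P(s'->s)
alpha1 = {s: emit_x[s] * sum(alpha[s0] * trans[(s0, s)] for s0 in "AB")
          for s in "AB"}
print(sum(alpha1.values()))                     # P(x0, x1) = 0.2025
print(alpha1["A"] / sum(alpha1.values()))       # P(A1|x0, x1) = 1/9
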
GAME THEORY
Dominant strategy (does not depend on the opponent's move)
Pareto optimality
Every finite game has an equilibrium point at which no player wants to unilaterally change strategy (Nash).
However, Nash equilibrium does not necessarily mean the best payoff for all the players involved; in many cases, all the players might improve their payoffs if they could somehow agree on strategies different from the Nash equilibrium: e.g., competing businesses forming a cartel in order to increase their profits.
2x2 game
4p-3(1-p) = -5p +5(1-p) --> 7p-3=5-10p --> 17p=8 --> p = 8/17
4q - 5(1-q) = -3q+5(1-q) --> 9q-5 = -8q +5 --> 17q=10 -> q=10/17
Utility = 4*p*q - 5*p*(1-q) - 3*(1-p)*q + 5*(1-p)*(1-q) =
12pq - 5p - 3q + 5*(9/17)*(7/17) =
12*(8/17)*(10/17) - 5*(8/17) - 3*(10/17) + 5*(9/17)*(7/17) =
(12*8*10 - 5*8*17 - 3*10*17 + 5*9*7) / (17*17) =
(960 - 680 - 510 + 315)/289 = 85/289 = 5/17 ≈ 0.29
(sanity check: the row player's indifference value 4q - 5(1-q) at q = 10/17 is 40/17 - 35/17 = 5/17)
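
Verifying the corrected utility in Python with exact fractions (payoff matrix taken from the coefficients 4, -5, -3, 5 above):

from fractions import Fraction

p = Fraction(8, 17)    # row player's mixed strategy from the indifference equation
q = Fraction(10, 17)   # column player's mixed strategy

utility = 4*p*q - 5*p*(1 - q) - 3*(1 - p)*q + 5*(1 - p)*(1 - q)
print(utility)          # 5/17
print(float(utility))   # 0.294...
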
http://alchemy.cs.washington.edu/ Open Source AI