compsci2(fall2008)
Wed 09/24/08
Friday: exam 1
chap 1,2,3 and 4.1
Labs 1,2,3,4 homework from 5
subset of a regular language is not necessarily regular
Σ* is regular
.. then everything would be regular
but a finite language is always regular
lab 5 .. last question clarification?
do something different
.. i can draw using rgular expressions .. can feed that into a drawing program
uurr .. up up right right
(uurr)*
so if there's a *, draw a couple and ...
.. one of them is not the same
4 .. if you can't find a sloution for 3e .. pikc one of the either ones
any of the algs are fair game but not the whole things
not a lot of proofs
Closure properties of regulr language
a set is closed under operation op iff a,b in S then a opb in S
(a in S then op(a) in S)
addition and integers: a, b are ints
then a+b is also an int
but divisoion in integers is not closed
17/3 is not an integer
Theorum:
Let L[1] and L[2] be regular languages
Then
a) L[1] U L[2] is regular ie, the st of regular languages is closed under union
b) L[1] n L[2] is regular
c) the compliment is regular (DFA)
d) L[1]L[2] is regular .. can do that with NFA or regexp or grammar
e) L^R is regular .. closed under reversal .. did that with NFA
f) L[1]* is regular .. nfa, regexp, grammar
some of these are really nice
compliment: for the not aa or aaa .. show the compliment!
intersection:
L[1] is some language by dfa M[1]
L[2] represented by M[2]
w = α1α2 ... αn ... w is accepted
how many states tdo we go through ... n+1 states in l1
in l2 .. the same
.. the length of the computation is always the same
so we need to simulate processing a1 in both machines
-------
Mon 09/29/08
Chapter 4
------
- closure properties .. a set is closed under a operation if beginngin with elements of the set , the rsult of applying the operation is also in the set.
ints
a+b is an int
so the itngers are closed under addition .. and multiplication
but a / b may or may not be an int
that's 4.1
4.2 .. decideable problems on regular languages
4.3 .. how to show not regular
theorum
Let L[1] L[2] be regular languages
then
a) L1 U L2 is regular
b) L1 n L2 is regular
c) L1L2 is regular
L1-compliment is regular
L1* is regular
L1^R is regular
Union .. can do with reegexp
concatination .. same deal .. L1 = L(r1) ..
complement .. dfa
.. L(m2) = L1
complment with grammar or regexp is a royal pain
L(r[1]*)
we did reverse with an nfa .. start state become the final and the final became the start.
intersection .. prove that the intersec of regular langaages is regular
Proof 1.
demorgan's laws
compliment of the intersection is equal to the compliment of the unions
L1 regular tell sme that l1 comp is regular
l2 regular implies that l2-comp is regular
l1 comp u l2 comp is regular because regular languages are closed under union
therefore (l1-comp u l2-comp)-comp is regular because we're closed under complimentation
how would I have a machine for this?
different proof by construction
Let l1 = l(m1), DFA M1 = (Q1, Σ, δ[1], q0, f[1])
let l[2] = l[m20 = M2 (q[2], Σ, δ2, p0, F2)
as an aside
let's let L1 = {abw | w in gS*) and let L2 = w| the number of b's i w is even
see atached.
so we have to do like nfa construction ..
only accept if both machines sit in a final state.
need to model the path
the cool thing : with a dfa each of them have but one path
start iwh the start state q0p0
attached
possible size of new machine .. # of startes in first machine times # of states in 2nd machine .. no bigger than.
no q1p1 .. becaasee we'll never get there
no q0p1 either .. these two are unreachable.
construct m3 = (Q1 cross q2, Σ, δ3, q0p0, f1 cross f2)
δj( .. = (δ19qi, α), δ2(qi, α)
showw in L3 iff w in l1 n l2
.. δ*3(qi, α) in F1 cross f2 iff δ*1(qi,α) in F1 and δ*2(qi, α) in F2
how could I altar this construction to get L1 U L2
f3 = (Q1 x f2) U (f1 x Q2).
.. big Q.
how do I use that to show things are regular
problem:
let L be a reguar lagnuage
show L1 = {w | w in L and length of w < 5} is regular
Let l2 = { w | |w| <5}
λ, a, b, aa, ab, ba, bb
2^0 + 2^1 + 2^2 +2^3 + 2^4 ..
who cares .. L2 is finite!
therefore L2 is regular
since any finite languages is regular
now L1 = L n L2
and L anddL2 iare regular => L n L2 is regular .. therefore L1 is regular
another one:
suppose L is regular.
Let L1 = {w in L st |w| %3 = 0}
show L[1] is regular
Let l2 = w in Σ* st |w| % 3 is 0
we can make a machine for that.
once again = l n l2 ..
regular
therefore l1 is regular
can construct regular languages from other eregular lagnuages
so that's section 4.1
4.2 - decidable properties
algorithms for
theorum : let L be a regular language
(a) For any w in Σ* ., there is an algorithm that deecides whether or not w is in L
(b) there is an algorithm that determines whether or not L is empty
(c) thre is an algorithm that determines whether L is finite or infinite
whether or not is very important .. that is what makes them decideable
there is no "it might be"
in context free we can't answer all of those.
describe algorithms
proof for a)
L is regular
there is DFA M st L(M) = L.
process w via M, δ*(q0,w) = qj
if qj in F, then w in L
if qj not in F, w not in L
done .. my algorithm is my dfa
(b) whether or not empty.
L is regular
is regular expression the empty set?
if yes, L = empty
if not, L is not empty
or
L = L(M), M is a dfa
is a final state reachable from q0?
alg .. enumerate all paths .. that was lab last week.
if yes, L is not empty
if no, L is empty
Last one .. we also did in lab!
L is regular
L = L(M) M is a dfa
algo .. determine all reachable (q0 .. qj) .. that I can get tjere
2. determine all useful .. reachable and reaches a final state
3. is there a useful state on a loop
.. for that we did a dfs on useful has a backedge
.. then infinite if yes
and if no .. then finite
nice to know whether an language is infiite, empty
part d.
L1 and L2 are regular
there exists an algorithm that decides whether or not L1 = L2
one thought .. take the dfas M1 and M2 , minimalize both of them and then look for graph isomorphism with labels .. that would work. .. but very costly!!
cheap and easy way: week 1
s1 = s2 iff (s1 n s2-comp) u (s-comp n s2) = empty
we proved this already
here
L1 = L2 iff (L1 n l2-comp) u (l1-comp n l2) = empty set
is taat regular?
compliment regular
closed under union and interset
so that whole mess is regular
thereffore there exists an alg to decide if L is empty -- use it!
that goes away for context free languages
how do I use some of this stuff
a couple little problems
question:/ probleem:
let L be a regular language
Does L contain an infinite number of even length strings?
L1 = {w in L | |w| is even}
is L1 infinte?
L1 = L n {w in Σ* st w is even}
L is regular
{} is regular .. 'cuz we 've done it .. dfa
therefore the intersection is regular
there fore exists alg to decide of infinite or not.
done!
more interesting problem:
Let L be a regular language.
Is there w in L st w^r is also in L
try each will never terminate if the answer is "no"
algorithms must terminate!!
w in L and w^r in L
same as w in L and w in L^r
L regular implies L reverse is regular
so L n L^r is regular .. determine if empty using algorithm
if yes .. no such w
if no , then there is a w
4.3 Languages that are not regular
a^nb^n .. all the a's before bs and equal number
w in {a,b}* where the number of a's = number of b's
{ w in {a,b}* where w = w^r}
ww^r
ww
wcw^r
waw
not regular!!!
and here's why:
a^nb^n
my problem is I've gotta count the number of a's .. and I need states to do that.
as soon as we loop, we lose a count.
and remember def of dfa .. Q is a FINITE set of states
that language is clearly infinite so needs a looop
and can't count if I have a loop
.. so looping is going to tell us that something is not regular
.. proof by contradiction.
prove that there is not finite # of states.
-------
Wed 10/01/08
Cahpter 4 , sections 1 & 2
homework on the lab
homomorphism
the set of regular languages are closed under homomorphism
ie, L is regular and h is a homomorphism ==> h(L) is regular
homomorphism . def.
h: Σ→ Γ*
Σ = {a,b}
Γ = {a, 0, b 1}
h(a) -> ab
h(b) -> 01
(h: Σ* → Γ*)
when i apply h to a string, I apply it to each of the characters individually
h(L) = h(w) where {w in L}
if L is regular then h(L) is regular .. because if you take a regular expressio for L, you can map it into a reg exp that's good for h(L) .. we proved this in lab
mapped r into T st h(T) = L(T)
created a regular expression for this particular language
and since T is a regexp, L is regular
but the reverse isn't true
L = {a^nb^n | n≥0} .. not regular
h(a) = 1
h(b) = 1
h(L) = {1^2n | n≥0} .. even strings of 1 .. i know that we can do that
h(L) regular and h homomorphism DOES NOT imply that L is regular
homework .. if you do it right it should be pretty short and sweet
"show there's an alg" .. ely on the fact that there is one for λ , empty , aad infinite
Identifying non-regular languages
L = {a^nb^n | n≥0} .. can't do this for any length n
show not regular
something else
Theorum . A pumping lemma.
Let L be a regular infinite language. Then there exists a positive integer m st any w in L |w|≥m can be decomposed as w = xyz
with |xy| ≤ m and |y| ≥1
st x y^i z is in l for i = 0,1,2,3 ...
this sis talking abouu loooping
m is the number of states
if I have a w that's longer than the number of sttes
then I can process it as x (the first part of the string .. then y .. some kind of loop .. and hten I can get to z .. and the final state
by the time I finish y, i have to have visited no more than m states .. if I have less than |w| states, I have to start repeating them.
if i = -, xz in language
i2 .. xyyz
etc etc.
so I"m loopoing
we'll assume regular and apply the pumpoing lemma
goal: find a w and an I that doesn't work
example:
show L is not regular
1. show L is infinite
{a^nb^n | n≥0} = {λ, ab, aabb ..}
not much to do - just make sure you're dealing with an infinite lang
2. assume L is regular
3. apply the pumping lemma m ≥0 is given .. i don't know what it is but it's there .. there is this special number m
choose w in L, |w|≥m
choose w = a^mb^m
w in L and |w| = 2m , which is ≥ m
I should be able to cut this guy up so that this is tru .. the y looping thing.
4. identify x and y .. mostly identify y .. what are the possible values for y
since w = xyz
|xy| ≤ m
xy is some place along the a's .. quits before the first b
what is y .. ≥ 1
y = .. at least 1 a but it could be even more than that .. nothing says that x can't be empty .. so this is the pita .. taking about all possible cases for y
y = a, aa, aaa .. at most m a's
so x could be empty ..
5. find i
goal xy^iz is not in L
if I choose i =0,
xy^0z = a^n-kb^m is not in L, since n-k does not =m since k ≥ 1
contradiction of the pumping lemma therefore, L is not regular by the pumping lemma
but another w works
w = a^(m/2)b^(m/2) , assuming m is even else choose m/2 +1
|w| = m, which is ≥ m
but now it's gross what comes next
w = xyz
where |xy| ≤m
|y| ≥1
what can y look like
case 1.
y = some a^k where k ≥ 1 but ≤ m/2
case 2.
a^kb^j
case 3.
b^k 1≤k≤m/2
gotta find an i that contradicts each on eof the
for 1. if i=0, ... not in l
same thing with the thrid case
a^m/2b^m/2-k which is not in L
but I can't choose zero for case 2 .. = a^m/2-k b^m/2-j
.. is that in L, .. well it might be .. if k=j
so that 's aproblem .. can't be sure k isn't j
but there still is an i that works
if I chooose i =2
.. a^m/2 .. b^j a^k b^m/2
which is not in the language. .. all the a's must preceed all the b's
judicious choice of w makes finding i easier
you can pick w's that will not disprove ..
if the language is u u where w in Σ*
.. string that's compied
if I choose w = a^ma^m .. this will not give ma a contradiction.
.. certainly is a string that's in the language and long enough but the problem is if I take out some a's or add some a's .. if I take out an even number of a's or add an even numbr o a's, this guy is still in my language
so w = a^mba^mb will . becaue I've destingushed the front part from the back part.
-------
Mon 10/06/08
h
extended to string
extended to language
h(L) = h(w) fro w in L
goal was to show that
h(l1) u h(l2) = h(l1 u l2)
h(l1) n h(l2) ≠ h(l1 n l2)
H(l1)H(l2) = h(l1l2)
in labb
input regular expression r for st L = l(r)
output was a regular expression h(r) = r-hat
l(r-hat) = h(L)
so regexp
h(r1) u h(r2) = h(r1 + r2)
.. that's what the program proves
gotta make a cas4e a little not a lot
for intersect
h(a) = ...
h(b) = ...
...
technically not a proof but not far from one
4.3 Showing a language is nt regular
a pumping lemma
Let L be an infitie regular language
there exists m> 0 for w in L w at least m
w can be deompsed as
w = xyz
where
|y| ≤ m and y ≥1
given that xy[i]z in L for i = 1,2,3 ..
Show l = {uu | u in Σ*} is not regular
abab
bbbba bbbba
proof is by contradiction:
1. assume L is regular
2. clearly L is infinite
that says I can apply th pumping lemma
m ≥ 0 is given
m is related to the number of states
choose w in L where |w| ≥ m
w = a^mba^b
u = a^m b
|w| = 2m +2 ≥ m
5. identify all possible y st w = xyz, |xy| ≤ m
and |y| ≥1
remember w =
aaaa .... aabaaaa ... aab
so the max length xy can be is m
and we can cut y whereever we want
so by choosing this sepcial string, we can automaticaally say thatt y is some number of a's
.. smallest being 1 , most being m
y = some a^k at least 1 and ≤ m
gotta know all m characters so we only have to describe y in one way
6. xy^iz = a^m+i-1k b a ^m b
if i is 1, we better be back to xyz
for i = 0, a^m-k b a^m b
remove y .. and that's not in L
since u = a^m-k b .. not equal to a^mb
remember k is at least 1
contradiction to # 3 . pumping lema does not hold
either not regular or not infinite ..
so not regular
lema must be true for all w .. so all we have to do is find one that doesn't work.
when picking the w .. the most important work is this choice .. you wanna know the first m characters
and 2 .. need to know where to split .. so putting that b at the end .. had to be separate
pattern: comparison of infinite
and patterns that can be of any length
.. palendromes
comparing two unbounded values
one more!
show L = {a^l b^N| were n≥l is not regular
proof is by contradiction
1. Assume L is regular
2. show L is infinite √
3. apply the pumping lemma
m > 0 is given
choose w
w = a^m b^m
.. lets make him marginally in there
|xy| ≤ m andd |y| ≥ 1
y = a^k , k ≤m k≥1
6. i=0 xy&0z = a^m-k b^m in L
sorry.
i = 2 xy^2z = a^m+k b^m, not in L because m+k > m, so the number of a's exceeds the number of b's .. not in my language
therefore, the pumping lemma does not hold,
therefore, L is not regular
so we have some not-regular languages
context free - a^nb^n ..
uu is not even context free. he's worse .. he's a decideable language
we know how to get regular languges .. need to id a context-free language
just because it's not regular does n't mean it's context free
btwe, reuglar => context free ,, since its a subset
chapter 5 - context free grammars
chapter 6 - normal forms of context free grammars
chapter 7 - push down automata .. stack machines .. where a dfa has one pice of memory .. push down has two .. stack and state
5 and 7 simotaneously
definition:
a grammar G = (V, Σ, S,P) is a context free grammar if all rules are of the form
A -> X
where A is a V and x is nsome number of variables or symbols .. pssibily none
more than one variable on the right side of the rule
L = {a^nb^n | n ≥ 0}
G:
S --> aSb | λ
not all grammars are context free
consider:
aA --> Aa .. context- sensative
L = a^l b^n where n ≥ l
S -> aSb | Sb | λ
ambiguous grammar!
parse trees
- internal nodes are variables
external nodes are terminals (leaves)
S -> aSb | Sb |λ
S -> Sb -> aSbb -> aSbbb -> abbb
can have multiple trees:
s
s b
a s b
s b
λ
and
S -> aSb _. aSbbb -> aSbbb -> abbbb
s
a s b
s b
s b
λ
for the string abbb which is in L, we have two parse trees, which impolis that my grammar is ambiguous
we don't like it .. but we don't hate it here in theory
non -amb
S -> aSb | B
B -> Bb |λ
trying to prove a context free grammar is non ambigulous is an undecidable problem
the following is also a context free grammar that does not generate the same langaage
S -> AB
A -> aA |λ
B -> bB | λ
no way of comparing the number of a's and b's
L(G) = a^m b^n n ≥0, m ≥0
.. and we know that that's retgular
L = w st # of a's in w is the same as the # of b's in w
this is also not a regular language
S -> aSb| bSa | SS | λ
highly ambiguous
a gazilion ways to derive λ
abab .. sure!
"proof"
after one step, I either have one of each or 0 and 0
.
so after any one step, i have an equal number of a's and b's .. by induction .. i'm equal any step along the line
each sentential form will again hae an equal number of a's and b's
context free grammars give a little more power .. to count and compare
.. things on either side of the variable
palendromes
w = w^r
show not regular .. a^mba^m
abba
have to match up
S -> aSa | bSb | a | b | λ .
that was 5.1 and part of 5.2.
pushdown automata - chapter 7.
Cahpter 2, remember was Deterministic finite automatata
nondterminisitic finite autamata
dfa's eq nfa's
in this chapter , Push down automata
1. non deterministic pushDown automata
NPDA
deterministic pda - DPDA
they are not! equivalent
context free go with NPDA
def. 7.1
an npda M = (Q, Σ, Γ, δ, q0, z, F)
Q - finite set of states
Σ - input alphabet .. also finite
Γ - stack alphabet .. also finite
q0 = initial state
F is a subset of Q .. final states (accepting states)
z in Γ start stack symbol .. not all pda's use this guy .. he's what prevents us from poping from an empty stack
.. so there's a z at the bottom of the stack
δ
input a[1]a[2] ...
stack
state
δ: Q cross Σ u {λ} cross Γ (top of stack) -> finite subst of Q x Γ*
remember, gama is my set of stuff on the stack
given the state, and read or not read symbol
and what's on stop of stack
going to pop that stack , push Γ and go to the next state
δ(q2, a,c) = {(q3, cd), (q4, λ)}
input: aba
state: q2
stack:
c
x
.
..
.
.
z
option 1, move into state q3,
read the a, so input is now ba
and my stack says whatever was there up to the x is still there, now push cd
c
d
x
..
.
.
.
or the second waa
process a, now in state 4 .. push lambda so my new stack top is the x
always a pop -- something comes off the stack
define δ .. now a 3d table
and I have choices npda
need to know state, input symbol and top of stack
so push pop and top are all i need
a^nb^n
qo, a, z = q1, a
δ q0,
δ (q1, b, a) = ...
done when if in q1 and thee's a z, then push the z and go to final
aren't going to be listing δ explicitly
-------
Wed 10/08/08
lab homework from section 4.3
Last time:
pumping lemma stuff for non-regular languages
- started chapter 5 - context free grammars
- chpater 7 .. push down automata
L = {a^nb^mc^m | n ≥0, m≥0}
show L is not regular
last week in lab(chap4)
L is regular, h is a homomorphism
then h(L) is also regular
Assume that L[1] is regular
h(a) = λ
h(b) = a
h(c) = b
h(l1) = {a^m b^m | m ≥0}
which is not regular
therefore L[1] is not regular
using the pumping lemma
L[1] { a^n b6M c^m | at least 10 a's
w = a^10b^m c^m
|xy|≤m
xy comes out of front and we have special cases for y again
but if we
assume L[1] is regular
then L[1]^R is regular
which is
c^mb^ma^n
w = c^m b^m a^10
xy ≤ m
y ≥ 1
y is some number of c's
xy^0 z = is .. which is not in L because m-k ≠ m
so L[1]^r is not regular
so L1 is not regular
suppose L2 = {a^nb^l where n≠l
show L2 is not regular
NOT the compliment of a^n b^n
1. L2 is clearly infinite
2. suppose L2 is regular
3. Apply the pumpiing lemma m>0 is given
chooose w
w = a^(m+1) b^m
which is in L and length is 2m+1 , which is > m
show that for every decomposition of w= xyz
where xy ≤ m and y ≥ 1
that there's some i st xy^i z is not in L
w = a^m+1 b ^m
I know that
y= a^k 1 ≤ k ≤ m
xy^0z = a^m+1-k b^m
but this guy is in L as long as M+1-k ≠ m
.. as long as K ≠ 1
xy^iz = a^m+1+(i-1k) b^M
and those exp don't equal m for i ≥2 and k ≥1
we need
1+(i-1)k = 0 and it's always not zero
.. I always am in the language, our string doesn't work
otta grab a string that always allows me to make the two parts equal
so this is a trick
choose w= a^m! b^(m+1)!
which is in L
|w| > m .. by a lot!
|xy| ≤ m and 1≤ |y| ≤ m
y = a^kzy^i z
a^(m! + (i-1)K b^ (m+1)!
Find i so that m! + (i-1)k = (m+1)!
(i-1)k = (m+1)! - m!
i-1 = ((m+1)! - m!)/k
i = ((m+1)! - m!)/k +1
((m+1)! - m!)/k +1
(m!(m+1 - 1)) /k
= m! (m/k) +1
but k is less than m!!! so we know that the factorial has a factor
so the k's cancel and I have an integer!
this is hard!
not on test :-D
dealing with not regular languages .. context free languages
chapters five and 7
context free grammar
G = V, T, S, P
V = finite set of variables
T = finite set of Terminals .. ie, alphabet
S in v - start symbol
P {a -> x| x can be anything at all!
derivation tree
ambiguous grammar (for some w in L w has 2 or more distinct parse trees)
L1 = {a^nb^m st n≠m}
l2 = (a^n b^m c^m | n = m or m ≥ k}
l3 = [a^nb^mc^m | K = n+m}
l1 = {a^n b^m | n ≤ m +3}
write context free grammars for each of them
l4 is trickly to understand
ab, aaa, aaab are all in my languages
L(g1) = K1
no a's some b's
no b's some a's
n >m , n<m
G1
S -> aSb | aA | Bb
A -> aA | λ
B -> Bb | λ
L3.
# of c^s = # of a's + # of b's
a^5c^5, b^5c^5 or λ
aaabbccccc
S → aAc | λ | bBc
A → aAc | λ | bBc
B → bBc | λ // can't produce more a's . you mess up order
L2
a^3b^3c^2 or a^4b^5c^5 or a^4b^5c^3
λ
all c's
S→ AC | BD
A→ aAb | λ
C→ λ | cC
B→ aA | λ
D→ bDc | λ | bD
l4 -
S → aSb | λ | A |B
A → a | aa | aaa
B → bB | λ
-------
S→ λ | aAb | Ab | Aab | aaaAb | aaaAb
a → aAb | Ab | λ
L5 = { ucv | u,v in {a,b}* c≠a, c≠b
# of a's in u > # of b's in v}
baabbcab
what do I care abbout b's
just a's in first half more than a's in second half
push a's until a c is read
after the c, for each a, pop from the stack
if empty and need to pop I'm in trouble -- not in L
else it is in L
writing a grammar for this guy is hard .. messy .. need to allow all possible patterns
NPDA M = Q Σ , Γ, ...
from a state, we read a symbol (or not), pop from the stack and then I change states perhaps, and I push a string .. which could be λ
attached.
how we describe computations
def. NPDA M = (q, Σ, Δ, S1, q0, q ....
L(M) = {win δ*, | ... (q0, w, z proves
start from q0, w, and z
start state, input, stack
after some number of setps I"ve read all of the input .. in a final state
and the stack contents are irrelevant
HI TAY,OR HOW ARE YOU .. THIS IS A VERY INTERESTING SYSTEM.
-------
Fri 10/10/08
npda m = q, Σ, Γ, δ, q0, F
L(m) = { all strings from sigma store gts to some final configuration where qf is a final state and u in Γ*
example from last time.
computation for a particular string.
instantaneous description
state, input, stack context
so when I satar any computation i start in q0
w = bacbab .. that's my ininital input
start starts off wwth a z
(q0, bacbab, z)
next step
(q0, acbab, z) .. according to the ddagram .. read b, pop z, push z.
-> q1, cbabab, az
-------
Mon 10/13/08
npdfa's
wcwR
.. all are context free and not regular
lab due wednesday
w = a^m b^j c^2m
j ≤ n ≤ 2j
a^mb^m
a^2mb^m
compliment of u = ww^r
2a - give examples for l1 l2 where their union is regular
.. use the complimente
Chapter 5, 7
7 - npda
5 - context free grammars
.. they give me context-free languages
if i only deal with npdfa .. they're harder to write .. and don't give me everything .. they're less!!
wed -- to and from npdfa
6 -- conext freee in forms that we like
parsiig -- startt from the start symbol ..
produce some sentencial form .. and hopefully eventually i get w
S -> x1 --> x2 --> ... w in L
someimtes it's really hard to know how to get to the w
osometimes do exhaustive search parsing
S -> SS | aSb | bSa | λ
how do I get there .. we'd like to have an lag that lest me say yes or no after some finite number of steps
so this is not such an alg but it can be adapted
.. leftmost -- we always expand the leftmost
round 1: S-> SS
or S -> aSb
S -> bSa
S -> λ
those possibliilites for sentiential forms
with babaa as my goal, we toss out
have two possibilities for a first step that might get me going
Round 2 = S => SS
aSbS
or SSS
or bSaS
or S
other possililites
bSa
→ bSSa
→ baSba
→ bbSaa
→ ba
so we can trhow out :
ba, bbSaa, baSba, aSbS
and we'll toss S becaase we've already been there
.. that's a lot of book keeping
thank goodness we can throw away
worst case
round 1 at most P.
round 2 p^2
round 3 p^3
....
sum = 1 + a + a^2
multiply by a
aS = a a^2 ..
- -aS
.. .. you remember this
so 1, we don't know how far to go .. .. the problem is the lamda .. allows me to shrink back
λ makes it hard to count
.. if I"m always adding a variable, the stuff on the right get's longer .. only gets smaller with λ rule
so I dont know max # of times
1- p^(k+1) / 1-|p| -1
O(|P|^k+1) .. and we don't know what K is
so rules on the grammar to tell me that
Theorum: Let G = (v,t,s,p) be a context free grammar st there are no rules of forms A -> λ or A→B
.. don't want to just exachange vone variable for another
if that's the case, then and only then, exhaustive search parsing can be made into an alg that decides whether or not w is in L(G).
.. produces a parsing
and it can do so .. O(|p|^(2|w|+1))
S→ .. if I don't have any rules that derive λ then i know that ht elngth oo the right is at least 1 .. it hasjn't gotten any smaller
since there are no rules of the type A→B,
(S→AB or S→a are allow)
so one of two things haps to happen
if the length remains the same, if at some point .. x1 → x2 → x3 ..
it's possible that the lengths are the same |x2| = |x3|
if so, x3 has one less variable than x2
or similarly x3 has one more terminal than x2
and that's the worse possible case
.. this happens at most |w| times
because if w has 5 terminals, at most I could have created 5 terminals
.. so worst case, at most I make 1 terminal at a time .. so that would take |w| tsteps in which the legnth of the sentential forms dont' change
if each variable gives me 1 terminal, |w| variables
S → AA
→ AAA
.. at most |w| times
so the length of the sentential form either grows by at least 1 or I replace one variable with a terminal.
so under the privisions of the theorum .. my k is 2|w|
i should have it after 10 for a string of length 5 (?) and if not I get to quit
S→ SS | aSb | bSa | ab | aa | bb | ba
w = babbaa
S→ SS
→ aSb
→ bSa
→ bb
ba
ab
ba
through out all but 2
round 2:
....
but this guy is parseable
shorter becuase I always produce two terminals
exponential wiith respect to the size of the produciions .. totally unacceptabble .. so in chaptr 6 we use an n^3 alg
Ambiguous grammar G
there exists 2 distinct parse trees (or more) for some w in L
A language L is called inherantly ambiguous if every grammar G st L = L(G) is unambiguous
.. can't prove
a^nb^mc^k = st n=m or m =k
I can't count the number of a's and b's and make sure everythng matches as c's
so have to have two routes
so
a^2 b^2 c^c .. has two ways to go
we can't put terminals on the left in context-free .. so this is the only way to go..
inherently ambiguous .. haven't proved it .. just given an plausible argument
-------
Wed 10/15/08
lab - chp 6
today 7.2
Def;
let G = (v, T, S,P) be a context free grammar
then L(G) = {w in t* st s can dervie in 0,1, or many stes w ..
then that language is a contest free language .. so in this case we use the grammar for the definition like dfa for reg language
theorum - easy contruction
for any context free language L
there exists npda M st L(M) = L
theorum painful construction
- for any Npda M, exists a context free grammar G st L(M) = L(g) .. lnaguage accepted by any npda is also a context free language
so to show context free .. give grammar or npda
Def:
a cfg G = (V,T, S,P)
is in Grebach normal form if all rules look like
A → ax, where a in T and x in V*
(2) A cfg G = (V,T,S,P) is in chompsky normal form (CNF)
if all rules are of the form / look like
A→a or A→BC where a is a terminal , A, B, and C are variables
Theorum: if λ is not in L(G) then the cfg G can be written in GNF and CNF, in other words, for any oontet free language that doesn't include λ then i can write the grammar in either greback or chomsky normal form
we can always handle special case if needed
so tday need to assume greback
gnf is really close to regular grammar .. terminal followed by perhaps many variables
so I know now if I'm generating a tring , I'm generating theelft mot symbols first
example 1:
a^n b^n n≥1
any cfg .. can write this guy:
S→ aSb | ab
but this is not in greback
GNF
add B→b
S→aSB | aB
leftmost derivation:
S→aSB
→aaSSBB
→aaaBBB
→aaabBB
→aaabbB
→aaabbb
so that's a full derivation of a^3 b^3 .. it's a little longer than in top but that's ok ..
need to take a grammar and convert it into an npda
to npda
terminals are what i've read variables are stack
see paper.
any context free grammar can be constructed as an npda of three states
Proof of for any cfll L there exists npda M st L(M) = L
G : (V, T, S, P) in GNF
M: (Q ...
({q0,q1,qf}, T, {z} u V, δ, q0, qf}
where δ(q0,λ, z) = {(q1, Sz)}
δ(q1, λ, z) = {(qf, z)}
for each A →αx st αin t, x in v*
(q1,x) in δ(q1, α, A)
in q1, read alpha, a on stack .. stay in q1 and push x.
if λ in L(g) then δ(q0, λ, z) = {(q1,sz), (qf, z))
show that L(M) subst L(G)
Show L(g) subset L(M)
time out for another example
example 2:
L = { w|number of a's = # of b's}
S→ aSb | bSa | SS| λ
asside:
show L(G) = L
Proof is by induction on the length of the derivation
-> equal number of a's, and b's in any sentential form
Base Case: S, since 0=0 (ie, after zero steps of the derivation
otherwise consider
S →aSb
or S→bSa
1 = 1
0 = 0
0=0
so they all help.
my IH would say: at steps 0,1,2 .. k, sentiential form has an equal number
IS: one more time
Any expansion of S
S→ λ or S→SS .. number of as and bs unchanged .. so still equal
S→aSb or S→bSa , number of a's increased by 1 , number of b's increased by 1, so since equal to start with, still equal (ih said they were equal up to that point)
but anyway, example 2 has to go into GNF
a couple of problmes
step 1
S→ aSB | bSA
B→b
A→a
SS is a problem
S→aSB | bSA | aSBS | bSAS
// substitute in with the rules allowed
B→b
A→a
S→λ
npda attached
always have to read all the input
Let w in L(G) ie, S ==>*w
in G
S→a1x1→a1a2x2→a1a2a3x3→ ... → a1a2 .. an
S→ a1A1A2..Aj →a1a2B1B2..BnA2..Aj
a[i] are terminals
in grebach it takes |w| steps to produce w .. becuase every step produces exacty one terminal
in M
(q0,w,z) |-- (q1,w,Sz) (1) S→a1 x1 in G in M (q1,x1) is in δ(q1,a1,s)
|-- (q1, a2 .. an, xz)
x1 = some A[1]u[1] where u[1] in V*
A[1] →a[2]
(1) S→a1A1A2 .. Aj . in G
we put (q1,
read A1, it's gone , poped the S and so all these guys are on my stack
|-- (q1,q2
because step 2 from above, in M
(q1,B1 .. Bk) in δ(q1, a2, A1)
|-- (q1, a3..an, B1..BkA2 ..Aj,z)
if at any point here
am .. an V1V2 ..Vt then I've got a rule that says, I'll read the first part, pop the first variable and there is a rule that allows me to continue
.. by the rules of G give δ in M
|-- (q1, λ, z)
|-- (qf, λ, z)
nice construction .. fairly easy to do and undersand if you look at stack and input vs the derivation
if I give you one, you ougt to be able to draw the other even without the machine
the other direction is pretty gross
given NPDA M produce dfg G
Construction depends on the ollowing conditions on the npda:
1. Single final state and it is only entered when the stack is empty .. much mor restrictive but equivalent form.
λ, α, z to my new state .. whole job to read nothing and pop whatever is there .. pop everything from the stack
and then λ, z, λ over to our new qf.
second criteria is about how we can affect the stack
only gooing to be allowed to increase or decrease the size of the stack by 1
if δ(qi, α, A) = {ci, cj, .. ck}
or
Then Ci = (qi, λ) .. decreased the stack by 1
ci = (qj, BC)
...... increased the size of the stack by 1 .. that's all i'm allowed to do
so now what do we do with rules where .. se paper.
.. need a lot of one time use states
so we also aave a problme with states that push many things.
see paper
so we've shown that these two restrictions are reasonablee.. they don't stop us .. but we wouldn't want to have to write machines like that.
-------
Fri 10/17/08
Section 7.2
Theorum. For any npda M - L(M) is a context-free language
ie, for any npda M, exists a context free grammar g st L(M) = L(G)
need to construct a grammar from the machine
Asummptions about M
- 1 final state, entered iff the stack is empty
- all moves change the stack size by 1 (+/-)
q[j], λ in δ(qi, a, λ)
or δ(q1, a, A)
idea varrables of g: q[i]Aq[j] where qiqj in Q and A in Γ
that just a var\/
rules of G (qiAqj) in some number osf stepst derives V
true iff (qi, vu, Ax)
|--- (qj, u, x)
iff from qi we read V, removing A from the stack and ending in state qj
(q0, z, qf) ==>* w
^^^^^^^^^^ start symbol
iff (q0, w, z) |--* (qf, λ,λ)
this is the goal of how I make the rules in my grammar
we know that a grammar ..
construct G from M
M = (Q, Σ, Γ, δ, q0, z, F)
G= (V,T,S,P)
T = Σ
V = qiAqj st qi, qj in Q and A in Γ .. |v| ≤ Q^2 * |Γ|
so this could be really big realy fast -- going to try to avoid that
S = q[0]zq[f] .. that's my start symbol becuase when I start, I've got a z on the stack and need to get to q[f]
P = production rules have to come from the transitions in the machine
we have two kinds of rules in the machine
for all rules like (qj, λ) in (qi, a, A), add the grammar rule
taken A off the tack and read a from in put
(qi, A, qj)=>a
those are the easy ones
horrible is this guy
for a (qj, BC) in (δ, qj, a, A) ADD (qi, a, qk) -> a(qiBql) (qlCqk)
a on the sstack .. need to dervie the a but the variables represent moving from qi to qjk and ending up in qk ..
for all combinations of l and k
|Q|^2 of these guys for every one of these I just added |Q|^2 rules
that's the construction .. tough to prove that it works -- see handout
end up with a grammar with 21 rules
lots of variables - possibly 18 - Q^2 * Γ
how many variables on the left size?
so we can throw out a bunch
not responsible for this derviation .. must know that it exists
know where the variables came from , but no questions on it .. maybe a homework problem.
that concludes chapter 7.
7.3 and 7.4 .. deterministics context-free languages.
.. so there are things I can do with an npdfa that I can't do with a dnpda
not going to go there
done with chapter 5 / 7
chapter 6 - simplification of conext free grammars
and it has normal forms
simplification - doesn't mean smaller .. often enlarges the # of rules .. goals are to be able to apply algorithms that won't infinitely loop
simple substitution
eliminate usless variables and productions
eliminate λ productions (A -> λ)
eliminate unit productions .. don't make much progress .. only change state.
A→B
substitution rule.
Theorum 6.1
Let G = (V, t, S, P) be a cfg
suppose A→x1Bx2 in P
and B -> y1Cy2 is in P
....
and there are no
Then G6hat - V, s, t, p-hat)
where p-hat = rules of P - A→x1Bx2
and adding A→x1y1x2 or x1y2x2 or ... x1ymx2 }
L(g) = L(G-hat)
the process of substitution we'll be doing to eliminate these other guys
example
------
S→aBb| aS | ab
B→aSb|Sb
substitute for B in the rule S→aBb
G-hat
S→ aS| ab | aaSbb | aSbb
B → aSb| Sb
so i've made a substitution
in this grammar we are now doen with B, but in larger grammars might not be, so we keep the B production around.
"proof"
S→aBb→AASbb → aaabbb in G
G-hat:
S→aaSbb→aaabbb
.. a valid way of transforming a grammar
Proof of theoum . show that L(g) = L(G^)
w in L(g) means that S derives in some number o fstemps using only rules of G w
if in this derviation
A→x1Bx2 is not used, then S→ in G^ W .. if we never use the rule, all the other rules are still there .. we only took one rule out of G to make G^
If it is, let's lookat the first occurance
S → using rules of G .. I got to this lovely little critter u1Au2 ..
use the rule u1Bx1x2
and then ....
then in G^ since that was the first occurance, up until in this point, all this is sitting in G^ as well
and now I"m at the piint where I used the rule A→x1Bx2 .. and I dont have that anymore
but we do have thre rule
A→ .. use another rule we made in teeorum
repeat using next occurance until no occurances remain
.. i know there are fininte number of occurances since aderivation is finite
so we know we finish
therefor S dervices, in G^, w
therefore, L(g) is a subst of L(gg^)
gotta go the other way, but you use the same stuff
useless
Definition: G = (V, T, S, P) is a cfg.
a variable A in V is useful iff there eexists w in L st S dervices in some number of steps I get
xAy from which I get w
.. so I reach it and I use it
it's involved in a good derivation
else it's useless
two ways then:
1. a variable is useless if it never terminates
2. A variable is useless if it's not reachable
First, idea of terminates
ALg for determining which variables terminate
1. Let V[1] = empty set
2. Repeat until no new additions to V[1]:
for all A in V, if A→x1x2..xn where xi in V[1] u T, then add A to V[1]
G^ + (vtsp)
P^ = rules of P that use only symbols of V[1] u T .. if the variable didn't make it into V[1] , don't use it
S→ AB | aBB
A →abB | bC
B→bB|b
C→CD
D→ dD | λ
V[1] = {B, D} first time
V[1] = {B, D, S, A} 2nd pass
V1 = {B, D, S, A} .. so I'm done!
get to throw out anythng with a C
S→ AB | aBB
A →abB
B→bB|b
D→ dD | λ
then the next part .. how do I kknow who I reach from S
1. Draw the dependancy graph on variables using all rules
attached.
2. depth first search or breadth first search from S eliminating any variable not visited.
S→ AB | aBB
A →abB
B→bB|b
eliminate becaase reachable and not terminate causes me grief
-------
Mon 10/20/08
Friday, Oct 31 - Exam 2
covers 4, 5, 6, 7, parts of 8
Last time:
Chapter 6 - simplify grammars
normal forms
substitution rule:
in G if S→ x1Ax2 and
B → y1y2 .. yn
and this is all the frules for B
then in G^
S → x1y1x2 | x1y2x2 ....
and keep all the B' rules
so G^ is whatever G was plus those rules
we proved that L(G) = L(G^)
Theorum G = (V, T< S, P) is a cfg
then there exists cfg G^ = V^, T, S, P^
st G^ has not useless variables and L(G) = L(G^)
algorithm to convert G to G^ was
1) determine all terminating varaibles and removie non-terminating rules from the grammar
2) determine all reachable variables and remove non-reachable variables and associated rules, resulting in G^
useless: non-terminating and not reachable
in this order because if we do reachable before non-terminating, we could end up with somebody not reachable
S→bB | a
B→ bBA
A → a
reachable.
everbody's reachable
then when I do terminating, I never add B
S→a
A→a
.. now I still have a not reachable
if A uses B to terminate, then they would both terminate but then if B is not reachable then A would not be reachable
proof: L(G) = L(G^)
if w in L(g) then S dervies in G , w
S dervies W in G, since G^ is subset in some loose terms of G
and S dervies in G could not have used anything useless.
.. if it's useless, it's not involved i the derivation of a string
Terminating, reachable, nullable
Def. Any rule of cfg of the form A→λ is called a lambda production or a λ rule
any variable for which I can derive λ in 0,1, or many steps is called a nullable variable
we don' like nullable variables because they lengthen a derivation
Theorum - for any G+ (V, T, SP) st λ not in L(G)
there exists G^ = (V, T, S, P^) st
1) L(g) = L(G^)
2) G^ has no λ productions
Proof of the theorum is by construction of G^
!) V[n] = empty
2) for all A st A→λ, add A to V[n]
3. Repeat until no new additions:
IF A→ x22x2 .. xn for all Xi in Vn, add to Vn .. if each guy can generate a λ, then we can produce nothing
4. Remove all A→λ from P
For all A st A→ y1y2 .. yn in P, add rules that remove all nullable variables from the right handside in every possible combination.
Do not add any A→ λ
Example:
S→aABC
A → BB | aA | C
B → λ | bB
C → CB | c
first pass,
V[n] = {B}
--- check if using left variable on right??
second pass:
V[n] = {B, A}
.. all combinations ->:
S→aABC | aBC | aAC | aC
A → BB | aA | C | B | a // don't add any λ back!
B → bB | b
C → CB | c | C
so this is G^ .. not smaller! .. but elimainates the problem of a sentential form getting smaller 'cuz no one can → λ
less steps.
the proof:
what rules did I add .. see book for proof
If i didn't use a λ rule, it's just fine
if I did, some x1Ax2 and then I got down to some x1x2 in G .. how did I get there .. have to back up and talk about the rules I added to take care of that problem.
unit rules .. don't like them
A→B etc.
we want to make some sort of progress and there's a way to replace those
when I eliminated λ , I introduced a unit
Def.
Any rule of a cfg of the form A→B where A and B in V is called a unit rule
Theorum:
For any G = (V, T, S, P) st λ not in L
exists G^ = (V, T, S, P^) st
L(G) = L(G^) (G has no λ productions)
and G^ has no unit rules
idea
A → B |aA | AB
B→ aa|bB
can't we just do substitution
A→aa|bB|aA|AB
B→aa |bB
problem: when we have larger grammar rules
to construct G^
1. Build a dependancy graph using only unit rules
2. Identify all of pairs of variables st A →* B
3. Remove all unit rules from P
4. For each pair A→B, add rules A→x1x2 ...xn
where B derived x1x2 ..xn
throw away A→B put all that in.
key:for each pair!
example
S→ Aa | B
B→ A | bb
A→ a | ba | B
step 1 .. dependancy graph based on unit rules
list all pairs
S→B
A→B
B→A
S→* A
so I have four guys to deal with
so G^
copy non- unit rules
for each pair, add
S→ Aa | bb | a | ba
B→ bb | a | ba
A→ a | ba | bb
adding things based on the grammar I had right after I threw evrything out.
.. otherwise, ciclic
now, B's unreachable, so the order is important
first eliminate λ
then unit rules
then useless
Theorum :
Let G = V, T, S, P be a cfg st λ not in L(g)
Then there exists G^ st L(g) = L(g^)
where G^ is a cfg with no unit , no λ, no useless
Proof:
1. Eliminate λ
2. Eliminate unit
3. Eliminate useless
6.2 - two normal forms
A cfg is in Greback normal form if all rules are of the form
A→ ax, a in T, x is some number of variables
.. no λ , no unit, and hopefully nothing useless
A cfg is in Chompsky normal form if all rules are of the forms:
A→BC
A→a
where a in T and A, B, C in V
again, no unit, no λ, and hopefully no useless
greyback nice for taking a cfg into an npda
the strength of Chompsky ..
and order w^3 for determining whether or not w in L(G)
so exhaustive search parsing was exponsential
.. this is polynomial in termins of the length rather than exponential .. that's huge
CYK algorithm
S→AB
A→BB | a
B→AB | b
this guy is in CNF .. two variables or one terminal
is w = abbbb in L(g)?
define V[ij] = { A st A derives αi .. αj}
so
abbb = α1α2α3α4
V[1][3]
v[2][4] = all variables sut that A can derive bbb
V[1][4] all A such that Asdervies my while string
w in L(G) iff S in V[1][4]
if S can generate abbb, then it's in my language, else can't and not in my language
start at top and work our way down, using the idea of concatination and there's where the CNF is essential
V[1][1] = {A}
V{2][2] = {B}
V{3][3] = {B}
V{4][4] = {B}
we do this for a w
did all substrings of length 1
now all of legnth 2
V[1][2] = V[1][1]V[2][2] = S, B
// need everybody that produces the substring 1 to 2
V{2]{3] .. who can give a BB
= {A}
....
V[1][3] = V[1][1] V[2][3] or V[1][2] V[3][3]
= {A}
V24 = {S, B}
.....
V[ ..
V{1][4] = {S, B}
.. S in V[1][4] so w in L(G)
-------
Wed 10/22/08
w = aab
S→AB
A→BB | a
B→ AB | b
v11{A}
v22 = {A}
v33 = {B}
V12 = empty
V23 = {S,B}
V13 = {S, B}
.. yes, the string is in my language because S can dervice it
dynamic programming
slove subproblem and merge the solutions of the sub problems into a solution for the bigger problem
dynamic programming array -- need everbody to the left and to the right
.. from code, probably
using the other alg
S A B
a b
ab bb ab
aab, bab aab
bbb bbb
aaab
abbb
...
what's important is the S column - everyone derivable from S
dynamic programming? well it's much more linear
.. each row depends on the previous row being complete
Chapter 8: Properties of CFL's
8.1 Two pumping lemmas
-one
8.2 closure properties and decision algorithms for CFL
regular languages are closed under union, concatination, intersect, complimentation, star closure, reverse, difference, symmetric dif, nor
for regular languages, we have algorithms for:
λ in L?
L == empty?
L is finite?
L[1] = L[2]
L[1] subset of L[2]
w in L?
Set of Context free languages are closed under:
concatination
star closure
reverse
union
not intersection -- we only have one stack for an npda
because of logic rules => no compliment
no difference -- L1 - L2 = L1 n L2-comp
Theorum - the set of CFL is closed under
1. union
2. concatination
3. star closure
4. reversal
Proof:
by construction of appropriate grammars
L1 is CFL therefore = L(G1)
= V1, T1, S1, P1
L2 is a CFL = L(G2) = V2, T, S2, P2, a CFG
V1 n V2 is empty
union - construct G3 st L(G3) = L1 U L2
Let G3 =
(V1 u V2 u S, T, S, P1 U P2 u S→ S1 | S2
Show L(G3) = L1 U L2
w in L1 u L2
without LOG, let w be in L1 .. either in L1 or L2
that says that S1 dervis using rules of G1 in 1 or many steps w
but then, S → S1 →* w and both in G3 since p1 subset of P3 and S→S1 in P3
similarly
S→(g3) w , w in L2
therefore L1 u L2 sbuset of L(G3)
Let w in L(G3)
(Show w in L1 u L2)
w in L(g3) S→* (g3) w
So i can use the rules of g1 or g2 to get to w
S→S1→ w
S→S2→w
w in L(g1) = L1
w in L(g2) = l2
therefore, w in L1 u L2
------
concatination
G4 = (v1 u v2 u S4, T, S4, P1 u P2 u S4→S1S1)
Show L(G4) = L1L2
show L1L2 subset of G4
show L(g4) subset of L1L2
w in L(g4_ iff S4 → * w
iff S4→ (g4) S1S2 derives uv = w
...
star closure
L5 = L1*
G5 = (V1, T1, S1, P1 u {S1 →S1S1|λ})
proof
----
w in L5, w = u1u2..un st ui in L1
S1 → S1S1 → S1S1S1 → ... S1 .. Sn .. n-1 applications of S1→S1S1
→* u1S1 .. S1
→* u1u2 S1 .. S1
→* u1u2 .. un in L4 (G5)
special case, w = λ
Show that w in L(G5) → w in L1*
reversal:
L1^R = L6
G6 = (V, T, S, P6)
where A → x in P1 =>
A → x^R in P6
proof .. not now.
fact. {a^nb^nc^n | n≥0} is not a context free lang
.. can count a's and b's but not c's
{a^nb^nc^m | n≥0, m≥0} = L1 - is context free
{a^kb^nc^n | n≥0, k≥0} = L1
intersection = {a^nb^nc^n | n ≥ 0} which is not a cfg
so an example of not closed under intersection
Theorum:
Let L1 be a cfl
Let L2 be a regular language
Then L1 n L2 is a cfl
Decision algorithms:
Given a CFL L
(1) L empty?
(2) λ in L?
(3) is L finite?
(4) w in L?
(5) L1 = L2? --> can't do that.
(1) L is empty iff S does not terminate! and we have an algo that determines which variables terminate
(2) λ in L iff S is nullable!
and we have an algorithm for that , too
(4) We just did CYK or Top down from cohen, exhaustive parse
(3) .. what do we need for infinite
reachable variable A
which A =>* Axy
A→A .. not enough to make the language ihnfinite .. need to produce something
and A needs to terminate
and |xy| ≥ 1
can we without L O G, assume our grammar G has no λ, no unit, and no useless productions / variables .. sure! because we know how to get rid of them.
then we need A
A→Axy for xy in V u T
.. so just looking for that loop
algorithm.
1 - eliminate λ rules, unit rules useless rules and variables
.. get rid of the crap
2. construct the dependancy graph on remaining variables.
3. If there is a loop (cycle) in the graph, L(G) is infinite, else its finite.
-------
Fri 10/24/08
exam 2
4.1, 4.2, 4.3
5.1 5.2
61. 6.2 6.3 - converting and normal forms and cyk
chapter 7 - npdas 7.1 .5(7.2)
8.2
closure and decidable for regl langauges
cfg . chapter 5
transofrmationssof grammars .. cnf gnf membership .. chapter 6
npda . chap 7.
really everything abot being context free
tuesday - problem session
old exam 2 - posted
write cfg
ignore a
regulr => context free
lab: handout given is the solution to problme 15
do example 7.9 in text.
(1) CFG into G^ in CNF
algoirthm by which I do this
1. remove all λ rules from G giving G1 .. in the usual way -- transformation
2. remove all unit rules from G1 giving grammar G2
3. remove all useless variables and rules from G2 giving G3
4. a) Introduce new variables and new rules X[a] -> a for all terminals
b. Given those, for all rules A→x, where |x| ≥ 2,
- replace all terminals in x with appropriate X[α]
c) for all rules A→y, where |y| >2, introduce new variables and new rules
= y1y2..yn, n≥3, we know that they're vriables
then introduce the followig
A→y1C1
C1→y2C2
C2 → y3C3
...
C[n-1] → y[n-1]y[n]
.. one -time use variables.
C[i]'s are all unique and don't appear anywhere else.
so we haven't made this grammar smaller .. the only one of these steps that does that is when we remove useless
S→aA|Bb|ab | BB
S→ aAb | a | ABB
B → bB | aA | BB |b
step 1.
Xa→a
Xb→b
S→XaABXb | X[a]X[b] | BB
A→XaAXb | a| ABB
B → XbB | XaA | BB |b
Xa→a
Xb→b
S→ X[a]X[b] | BB | XaC1
C1→AC2
C2→BXb
A→XaC3 | a| AC4
C3→AXb
C4→BB
B→ b | XbC5 | AC6
C5 → bXa
C6 → BB
//makes the derivation longer, but allows me to do the cyk alg
skipping going from CNF to greyback
Theorum: Let L[1] be a context freel language,
L[2] be a regular language.
Then L[1] n L[2] is a CFL
application of this:
L1 is any CFL
L2 = { w | |w| is even}, which is regular .. ((a+b)(a+b))*
by the theorum
L1 n L2 is CFL
= {w| w in L2 and |w| is even}
are there any even length strings in L1?
Is L1 n L2 = empty? if yes, then no, L1 does not contain .....
If no, then yes L1 does contains .....
we have an algorithm for this!
so we need to prove the theorum
to prove context free
need either grammar or npda
proof is by construction
NPDA M1 st L1 = L(m1)
DFA M2 st L2 = L(M2)
construct NPDA M[3] st L(M3) = L1 n L2
therefore, L1 n L2 is a cfl
can do this because M2 doesn not need the stack!
M1 = (Q, Σ, Γ, δ1, q0, z, F1)
M2 = (P, Σ, δ2, p0, F2)
M3 = (Q x P, Σ, Γ, δ3, q0p0, z, F1 x F2)
state push ..
((qk, pl), x) in δ3((qi, pj), a,b)
iff (1) (qk,x) in δ1 (qi,a,b)
pl = δ2(pj, a) for a≠λ
or (2)
a=λ
(qk, x) in δ1(qi, λ, b) and pl = pj .. I don't move
L1 = {ww^R | w in {a,b}*}
see attached
-------
Mon 10/27/08
Exam 2 - Friday!
4.1., 2,3
....
Today: 8.1
Tomorrow: problem session - watch email
Wednesday: chapter 9
Fri - exam
lab: #6
linz 6.1 # 21
Suppose
S→aXaY |XZ
X→zXz|Y
Y→Yb | λ
Z
the complexity (g) = Σ some of the right hand sizes
A→ v in P where G = (V, T, S, P)
S→ aXaY) .. 1+5 .. variable on the left plus the length of the right
next rule 1 + 2
λ .. 1 + 0
question: show that the removal of useless productions always reduces the complexity
if I am removing useless prrduciions from my grammar what am I doing .. never adding anything .. at most I'm throwing stuff out. .. decreasing the complexity.
what can you say about hhe removal of λ .. if I throw away the λ rule, I eliminate a rulem but I add more rules
.. so we get to throw away a 1+0, but add a lot more.
. but we can alos throw one away .. so we can't say anything about the complexity
7.2 example
q0Aq3→a
q0Aq1→b
q1zq2 →λ
q0zq0 →a
q0aq3→ (
.. page 193.
S = q0zq2
w = ab
q0zq2 → a(q0aq1)(q1zq2)
→ ab(q1zq2)
→ abλ .. ab
this is the derivation ..
take the npda and show the computations
npda attached
original does not fit the rules for must grow or shrink by one
so altar npda to change rule that doesn't grow
show computatin
(q0, ab, z) |--
(q0, b, Az)
current state, remaining input, stac contents
.. "instantaneous description"
q1Aq2
.. from state q1 , whith a A on the stack, any string produced from this
qiAqj → x ------
qi, A on stack .. end up in qj, there's now a y on the stack (ie, A is gone, and I produced string x
q0zqf→w
iff (q0, w, z) |--* (qf, λ, λ)
(1) show {a^nb^n | n is not a multiple of 5} is a cfl
(2) Let L be a context free language.. Show there is an algorithm that determines whether or not there are any strings of length ≤n in L
(3) Give a CFL L st L-bar is not CFL
can decide if empty, finite -> looking for a loop
λ .. is S nullable
closure properties
union, * closure, concat
can take a CFL n REG → CFL.
L1 = a^nb^n | n≥0} .. we know it's CFG by S→aSb |λ
L2 = {w | # of a's is not multiple of five} is regular
L1 n L2 needs to give us a^nb^n where n is not a multiple of five
see attached nfa.
proves that L2 is regular
so that's how we use the closure
#2 -
L is my given language CFL
L2 = {w | |w| ≤n} .. is regular since finite
L1 = L n L2 is a cfl
if L1 is empty, then the answer is no, L does not contain strings
else yes L contains strings of length ≤ n
L1 is empty iff S1 is useless
Pumping Lemma for CFL
L infinite and CFL
then exists m>0 , for any w in L |w| ≥m
w = uvxyz
|vxy| ≤m
|xy| ≥ 1
u vi x yi z in L for all i
so I have five cases to solve for
a^nb^nc^n
L CFL st L-bar is not CFL
L = {a^nb^mc^k | n=m or m≠k}
S→ AB | CD
A -> aAb |λ
B -> cB |λ
D → bDc | E | F
E→bE | b
F→ cF | c
L-bar .. w st does not fit the pattern a*b*c*
but also the not of the coditiin
a&n b^m c^k .. n≠m and m=k.
and the and part is the killer
you need to pop the b's 'cuz i pushed the 'as .. but then therey're got .. can't compare it anymore
.. so I can't do this .. not a CFL
-------
Tue 10/28/08
λ
eliminate useless,
convert to cnf
proof of the substitution rule
S→SA |λ
A→λ
.....
add λ back at end
S→λ
A→xy
A→By
B→x
G an G-hat
if S→ w using only the rules og G .. show that it derives in G-hat as well
S dervices αAα --> α1yα2 ... derives w
if that's the first occurance
α1Aα2 .. use two steps
and then any other occurances of using his rule, I have to replace in the same way
S→AA |a
A→SA |b
in an npda, i never have to read anyway .. so do it exactly as before .. and then read nothing, see S , replace with AA
etc
7c
n > 2m
aaBb | a| aB | ab | aAb
for n to be smaller
aaAb | Ab | b
L = { w in {a,b}* : #as ≠ #bs}
S→ A | B
A→ aAb | bAa | AA | Aa | a
B → aBb | bBa | BB | bB | b
.
8b
S→ AB | CD
B→cB |λ
A→ aAb | λ
C→ aC |λ
D→ bDc | Ec | bF
E→ Ec | λ
F→ bF|λ
no regular language language is inherently ambiguous.
every regular language has a regular grammar that is not ambiguous becaase you just steal it off of the DFA
npda - write down what you mean that state to do .. so you don't get lost
cfl closure
concat, union, star closure, reversal
λ in L
w in L
L = empty
L infinite?
regular closure
union, intersection, concatination, reversal, complementation, star closure, difference, xor, nor
λ in L?
....
CFL n regular => CFL
old exam 2
L = {a^n ..
show not regular
assume it is . clearly infinite apply the pumping lemma
choose w = a^m b^m c^2m
|w| = 4m > m
M+M ≤ 2m
w = xyz ....
y = a^k for some k
i=0 does't work
but i=2 does
give a subset that is not regular
L1 = {b^j c^k | j≤ k, is a subset of L but it is also not regular
L2 .. infinite subset which is regular
L2 = c^k | k ≥0 .. in that case, n=j=0
given
L = a^n b^j c^k ..
can't compare the length of the first half with the legnth of the second half
a^n a^k st n≤ k .. really just some number of a's
so subsets tell me nothing.
subsets of a finite regular language of course are regular
not responsible for pumping lemma for 8.1
L1
a^n b^m where 10≤ n≤m≤1 .. it's finite!
n+m > 10, n+m < k
.. cant count
-------
Wed 10/29/08
write cfg
closure prooperites for regular language .. and for cfl
apply things
answering questionsa bout the alg to remove tuseless, unit, etc
membership alg .. since an entire lab was devoted to it.
npda .. must read all the input
don't have to go to a qf
but probably do on equality
#a's < #'bs
a, b, λ
b, a, λ
a, z, λ
b, z, λ
a, a, aa
b, b, bb
at soee point wwen i finished all the input, i better have a b left on the stack .. so I could go to a final state like that.
chapter 5.
L= {a^n b^m | 2n ≤m ≤ 3n}
n = 2,
4≤m ≤ 6
S→aSbb | aSbbb | λ
8e .. look at it
#a's + #b's < #c's
for each a, I need a c
For each b, I need a c
For each c, 0, 1 a; 0,1 b
at some point I must produce a C without an a or a b
S→ AcA
A→ aAc | cAa | bAc | cAb |AA | cA | λ
for >
S→ BaB | BbB
B→ aBc | cBa | bBc | cBb | BB | aB | bB |λ
material that's not on the exam:
so far
regular languages
DFA, NFA, reggular expressions, gregular grammars
closed under lots!
and algorithms for lots!
(1 variable state machine) .. where the variable held the current state
context free languages -
- NPDA and CFG's
.. we lost some flexibility .. PDA's .. deterministics .. do not give all context free languages
they're a sbbset . so that's different than regular languatgs
closed nder union, star clsure, concatination,
algorithms for membership, empty, finite , λ
... less than what we had before
for memory: variable for tee state and used a stack for memory in this state machine
so we had as mucted but access only to the guy on the top
very restrictive machine but we got a whole new set of languages
but plenty of things we cantt do
ww, for example
a^n b^m c^o | wwhere we consider all three
so reasons to consider sets outside of here
so far our computational models and what we've been using are dfa and nfa, and npda ...
simply accept or reject
.. we've been using htese guys as acceptors
what about computing f(x) = x+1 .. if input is x, output should be x+1
or what about f(w) maybe I want to prodce ww^r
I acutaaly want to perform a computation
and have a result .. an output .. we can do that yet
but we're about to
chapter 9 - turing machine.
developed by Alan Turing . 1912-1954 .. died of self-inflicted cianide posening
very well respected scientist for what he coold do but arrested for his homosexuality
so he killed himself
phd from prinston in LOgic , algebra and number theory
british .. spent most of his time in england
during wwii, he was responsible for tee decription of german codes . did this using a computational model in which he was able to break the codes f he german u boats
1939-1942 .. this was his project
1950 - "computing machinery and intelligence" . computing machinery .. the major professional org is called the association for computing machinery .. so it fits
presented what he called the turing test
.. how smart can we make a computer?
judge , wall and another person with a computer
judge is having a conversation between both other person and computer
can the judge determine which is human and which is a computer .. if not, passes the turing test.
Turing thesis - Church's Thesis
any function known/believed to be computable is computable by a turing machine.
this little machine can ffo everything a computer can.
Control unit
|
memory
two -way infinite sequence of memory cells
connections to memory -- read / write head aka cursor
know my state and position on tape ..
qi is the state, arrow is position in memory
so single→stack→variable for state and infinite memory
input is on the tape and so is the output
.. don't always need output
if we think of memory as a huge sequence of memory cells, that is what we have
we can add on here .. painful but i can do it
search for string patterns
makes sense that it's very much like what we do .. it is not random access -- can only move one step left and one step right
Standard Turing Machine
- - - - - - ---- - -
there are many variations but they're all just as powerful
TM M = (Q, Σ, Γ, δ, q0, [], F)
Q = finite set of states
Σ = finite input alphabet
Γ = finite tape alphabet
[] = member of Γ .. is the blank .. tells us empty
Σ subset Γ - { [] }
and input must pbe aprt of the tape alphabet
q0 in Q is the start state
F subst of Q are my final states
δ : Q X Γ --> Q x Γ x Move) where MOve : {L, R}
don't get to stay put
δ is a partial function
.. if it's defined, there's only one possibility
δ(qi, α) is either undefined or some qj, β, (L,r)
partial because the entire domain is not of use
state q0
tape
[] b b b a []
δ(q2, a) -> q4, x, left
state q4
[] b b a x []
^
before :
bbaq2ab
q2 .. cursor goes right beofre the letter it sits under
|-- bbq4axb
left of cursor, state, evertyhing from cursor to right
state q1
[] b b a x y b []
^
δ(q1, x) = q3, x, r
after:
state q3
b b a x y b []
^
bbaxq3yb
can read a blank
can go w/o changing the state
always reading something and writing something
δ(current state, current cell content) = (next state, write to the cell, direction I move)
δ(q0, a) = (q1, a, R)
δ(q0, b) = (q0, x, R)
δ(q1, a) = (q2, y, L)
δ(q1, b) = (q1, x, R)
F = {q2}
see drawing
so you've read at laeast two a's to get to q2
L(TM) = w | # of a's ≥2
.. so as soon as I see two a's .. I don't have to read any further
def.
Lt TM M = (Q, Σ,Γ, δ, q0, [], F)
L(M) = { w in Σ* | q0 w |--* uq1 v
qf in F u, v in Γ*
I dont care what u and v are .. i just care that i got to a final state
if input
q0 abbaba |-- aq1bbaba
|---a x q1 baba
|-- axx q1 aba
|-- axq2 xyba
.. and i'm in a final state
.. we could write a dfa for this .. so overkill
but process the string L to R .. ecause aayy regular language I have to process the string left to right.
let's do one more:
exhibit -- do the |-- stuff
just draw , don't write out δ
L = { w | number of a's is odd }
see drawing
model way mor intuned to how you look at things.
regular languages, if i always continue moving right, I should be able to answer the question
L = { a^n b^n | n ≥1}
how would we do it
cross them off .. ie, change to x's and y's
9.1
aaaaaaaaabbbbbbbbbbb
solve it "recursively"
cross off first a, go to then end and get rid of b
keep reducing the problem down to a smaller problem
-------
Mon 11/03/08
no λ rules
cant have
A→AA→A
.. but if I only loop on λ, i'm not adding to the language
useless -- they introduce a loop that doesn't get me to the end
---------------
Chapter 9
--------
Turing machine TM:
= (Q, Σ, Γ, δ, q0, [], F)
Q states, F subset of Q final states
q0 initial states
[] in Γ, the tape alphabet
Σ is the input alphabet
Σ sbuset of Γ-[]
δ: QxΓ → QxΓx{L|R}
.. this is a partial funciin -- all of the domain is not used
L(M) = { w in Σ* | q0, w |--* (x1,qf, y .. qf in F, a,y in Γ*
so I process and end up in a final state
abqicbca .. instantaneous dscripction .. state, where thh cursor is, and the entie contents of the tape
Turing machines as acceptors.
L = { w in {a,b}* st w contains substring aba }
regular, context free, and [turing] decidable
see attached
always possible to process left to right for a regular lagnuage
-- we're not really writing anything .. just processing
now back to the example from last time
L2 = {a^nb^n | n > 1}
turing thesis says that everything the computer can do , i can do
make a plan.
wann get to some state
[] a a a b b b []
write a blank, run right to blank, step back and find a b, write a [], run left.
so you're really solving the same problem over and over again
see diagram.
test a few strings
ab, aabb, aab, aba
q0ab |-- q[]ab |-- q1ab |-- qr b |-- bqr |-- q2b
|-- qL
|-- q1
!-- qf
q0aab |-- q[]abb |-- q1aab
|-- qr ab
|-- a qr b
|-- abqr
|-- a q2 b
|-- ql a
!-- ql[] a
|-- qone a
|-- qr
|-- q2
.. looking for a b .. now i'm stuck.
abb .. get's stuck at q1 .. becuasse it doesn't have an a
abab .. also dies at q1 .. get's stuck on the middle b
make a plan!
L3 = { w | number of a's is the number of b's}
attached.
remember stack -- could do that plan
can't use our last tape plan
problem is: .. can't write blans .. need to know when i'm done with the input
.. but we can use x's and y's .. or even just x's
so the plan: first char if a, write [], move right until b
at that point, write x, move left to blank and then move right.
if it's a b...
b, [], move righh until a
write, x, move left until blank , move right
if x, write blank, move right.
if blank, accept
see diagram.
not talking efficient here .. could be more efficient on the stack.
where do we die if it's aba .. die looking at the last a .. we make the a a blank and we look for a b .. if we find a blank, we're stuck.
Turing machine as Transducer -
input w ---> output f(w)
..now it matters what's on the tape
q0w |--* qf f(w)
i'm actually computing sumething
example:
input w in {0,1}* output is 2*w in binary
input 101
want output to be 1010
if input is 11, want output to be 110
.. append a zero!
plan .. run right to blank, write 0, run left to blank .. step right .. qf
sometimes our output is in unary ... 11 means 2!
11 .. 11x ... a1x1 .. aax11
then make the a's to 1's
make the x into a 1 and then delete one of the one's
making that little mark is going to be so helpful
rotate
alu .. can add and compare
f(x,y) = {1 x≤y / 0 not ..
can make it a bool
x#y
compare
f(w) = ww^R
can manipulate strings
so this is a transducer
-------
Fri 11/07/08
Theorum:
Lwr