CMPS 260 Spring 2026
HW #4: CFG/CFL; Top-Down Parsing; CYK Algorithm
Due: Friday, April 24

1. Let G be this q-grammar:

S → cKd  |  aM (1) (2)
K → bM  |  aS  |  c (3) (4) (5)
M → bSca  |  λ (6) (7)

(a) Show the grammar's parse table.

`
top of
stack
next input symbol
  a b c d $
S
    
    
    
    
    
K
M

(b) Show the "stack movie" illustrating the steps taken during a parse —guided by the parse table from (a)— of the string abccdca

(c) Let Gc be the grammar resulting from augmenting G by adding production (8): S → λ. Show that Gc is not a q-grammar. Do so by showing a pair of "successful" stack movies, in one of which either production (1) or (2) is applied and in the other of which production (8) is applied in the "same circumstance" (i.e., with S at the top of the stack and either c or a being the next input symbol).

(d) Let Gd be the grammar resulting from augmenting G by adding production (8): K → λ. Is Gd a q-grammar? If it is, explain how the parse table from (a) would be updated to account for this new production. If it is not, explain why not.

(e) (more difficult) Let Ge be the grammar resulting from replacing production (1) in G by the production (1') S → cKb. Show that Ge is not a q-grammar.


2. Consider the following context-free grammar (which produces a nonsense language). As you can infer from its description, its start symbol is S, its terminal alphabet Σ is {a,b,c,d,e}, and its nonterminal set V is {S, H, K, M}.

SHdK  |  a   (1) (2)
HMHd  |  K   (3) (4)
KeSa  |  λ   (5) (6)
Mc  |  bcd   (7) (8)

Nonterminal
Symbol
  FIRST( )     FOLLOW( )  
S    
H
K
M
(a) For each nonterminal A, compute both FIRST(A) and FOLLOW(A). Convey that information by filling in the table to the right. Recall that

FIRST'(A) = { t ∈ Σ | A ⟹+ tα for some α }
FOLLOW(A) = { t ∈ Σ ∪ {$} | S$ ⟹+ αAtβ for some α, β }

Now, if A is nullable (i.e., A ⟹+ λ), then FIRST(A) = FIRST'(A) ∪ {λ}. Otherwise, FIRST(A) = FIRST'(A).

(b) Using the information provided by FIRST( ) and FOLLOW( ), fill in the table below to define a one-symbol-lookahead stack machine parser for the grammar. Each table entry should identify which production (if any) to apply based upon the next input symbol (i.e., the first symbol of the input string that has yet to be consumed) and the nonterminal symbol on top of the stack. (If a terminal symbol is on top of the stack, then of course it gets popped when it is matched with the next input symbol.)

`
top of
stack
next input symbol
  a b c d e $
S
    
    
    
    
    
    
H
K
M

(c) Implement a Recursive Descent parser for the grammar and submit it to the designated Brightspace dropbox. As a model, you can use the Java class RecDescentParser, which provides a recursive descent parser for the LL(1) grammar discussed in a course web page.


3. Consider the following Chomsky Normal Form grammar G3:

S MK  |  b    (1) (2)
K KM  |  MK  |  a    (3) (4) (5)
M KS  |  c    (6) (7)

Use the CYK algorithm to determine whether the string abcac is a member of L(G3).

Specifically, fill in the cells of the matrix pictured below so that the cell in row i and column j contains Vi,j = { X ∈ {S,K,M}  |  X ⟹+ wi,j }, where wi,j is the substring of w (i.e., the string of interest) beginning with its i-th symbol and ending with its j-th symbol (inclusive). (Your answers are expected to include not only the answer to the question "Is w ∈ L(G)?" but also the correctly completed table.)

Recall that, for i satisfying 1≤i≤|w|, X ∈ Vi,i iff X → wi,i is a production in the grammar. Meanwhile, for i and j satisfying 1≤i<j≤|w|, X ∈ Vi,j iff there exists k, where i≤k<j, and nonterminals Y and Z such that Y ∈ Vi,k, Z ∈ Vk+1,j, and X → YZ is a production in the grammar.

   1      2      3      4      5
+------+------+------+------+------+
|      |      |      |      |      |
|      |      |      |      |      | 1
|      |      |      |      |      |
+------+------+------+------+------+
       |      |      |      |      |
       |      |      |      |      | 2
       |      |      |      |      |
       +------+------+------+------+
              |      |      |      |
              |      |      |      | 3
              |      |      |      |
              +------+------+------+
                     |      |      |
                     |      |      | 4
                     |      |      |
                     +------+------+
                            |      |
                            |      | 5
                            |      |
                            +------+

If you prefer it, you can turn the matrix upside down so that the row numbers decrease as you go down the page.

Note that some descriptions of the CYK algorithm are based upon Vi,j being defined to be the set containing those nonterminals that generate the substring of w of length i beginning at position j. A table based upon this definition would be such that each of its rows corresponds to one of the diagonals in the table as described above (and as covered in lecture and the relevant web page). If you find this variation to be easier to understand, feel free to use it, but make sure to state it explicitly in your answer.


4. Use the standard construction (as described in web page) to obtain a Chomsky Normal Form grammar that is equivalent to (i.e., generates the same language as) the context-free grammar shown below. Present the resultant CNF grammar.

S aHHb  |  a
H SHb | abc


5. A linear grammar is a CFG in which every production's right-hand side contains at most one nonterminal symbol. A right-linear grammar is a linear grammar in which any such occurrence of a nonterminal symbol must be the last (i.e., rightmost) symbol of that right-hand side.

(a) Describe a construction that, given a right-linear grammar GR and a linear grammar GLin, produces a linear grammar G such that L(G) = L(GR) · L(GLin).

(b) Apply your construction to these particular grammars and show the resulting linear grammar.

SRbaK | cb
H ⟶acK | λ
K ⟶baSR | aH
SLincaAd | Abc
A ⟶dcSLind | ab
GR GLin