Lecture 3
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages

Chart parsing

Main points

  CKY algorithm
  Earley parsing
  General chart parsing methods

Directional or non-directional

  Directional methods:
    top-down
    bottom-up
  Non-directional methods:
    top-down – first described by Unger
    bottom-up – by Cocke, Younger and Kasami (CYK, also CKY)
  → Non-directional methods access the input in a seemingly arbitrary order, so they require the entire input to be in memory before parsing can start.

Non-directional top-down methods by Unger

  Capable of working with the entire class of CFGs.
  Expects as input a sentence and a CFG.
  Works by searching for partitionings of the input that match the right-hand side (RHS) of production rules.

Non-directional top-down methods by Unger

  Let G denote a CF grammar and w an input sentence.
  Principle: if the input sentence w belongs to the language L(G), it must be derivable from the start symbol S of the grammar G.
  Let S have a production S → S_1 S_2 … S_k.
  The input sentence w must then be obtainable from the sequence of symbols S_1 S_2 … S_k in such a way that S_1 derives a first part of the input, S_2 a second part, and so on:

    S_1:  w_1 … w_{p_1}
    S_2:  w_{p_1+1} … w_{p_2}
    …
    S_k:  w_{p_{k−1}+1} … w_{p_k}

Non-directional bottom-up methods: CYK

  CYK is an example of chart parsing, discovered independently by Cocke, Younger and Kasami.
  It considers which non-terminals can be used to derive substrings of the input, beginning with shorter strings and moving up to longer strings:
  1 Start with strings of length one, matching each single symbol of the input string against productions of the form A → a.
  2 Then consider all substrings of length two, looking for productions whose right-hand side elements match the two symbols of the substring.
  3 Continue with progressively longer substrings, up to the whole input.

Non-directional bottom-up methods: CYK

  CYK example 2
  Two example sentences and their potential analyses:
    He [gave [the young cat] [to Bill]].
    He [gave [the young cat] [some milk]].
  The corresponding grammar rules:
    VP -> Vditrans NP PPto
    VP -> Vditrans NP NP
  Regardless of the final sentence analysis, the ditransitive verb (gave) and its first object NP (the young cat) will have the same analysis -> no need to analyze it twice.
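The length-by-length procedure described above (first substrings of length one, then progressively longer ones) can be sketched as a small recognizer. This is only an illustrative sketch: the grammar below is an assumed toy grammar in Chomsky normal form that generates non-empty palindromes over a and b, and the names `TERMINAL_RULES`, `BINARY_RULES`, `cky_table` and `recognizes` are hypothetical helpers, not part of any library.

```python
from itertools import product

# Assumed toy grammar in Chomsky normal form (chosen for illustration):
#   S -> AA | BB | AX | BY | a | b,  X -> SA,  Y -> SB,  A -> a,  B -> b
TERMINAL_RULES = {          # rules A -> a, indexed by the terminal a
    "a": {"S", "A"},
    "b": {"S", "B"},
}
BINARY_RULES = {            # rules A -> B C, indexed by the pair (B, C)
    ("A", "A"): {"S"},
    ("B", "B"): {"S"},
    ("A", "X"): {"S"},
    ("B", "Y"): {"S"},
    ("S", "A"): {"X"},
    ("S", "B"): {"Y"},
}


def cky_table(word):
    """Return table[(p, q)]: the non-terminals deriving the substring
    of length q that starts at 1-based position p."""
    n = len(word)
    table = {}
    # Step 1: substrings of length one, matched against rules A -> a.
    for p in range(1, n + 1):
        table[(p, 1)] = set(TERMINAL_RULES.get(word[p - 1], set()))
    # Steps 2 and 3: longer substrings, each split into two shorter parts.
    for q in range(2, n + 1):
        for p in range(1, n - q + 2):
            cell = set()
            for k in range(1, q):            # k = length of the left part
                left = table[(p, k)]
                right = table[(p + k, q - k)]
                for b, c in product(left, right):
                    cell |= BINARY_RULES.get((b, c), set())
            table[(p, q)] = cell
    return table


def recognizes(word, start="S"):
    """True if the start symbol derives the whole (non-empty) word."""
    return bool(word) and start in cky_table(word)[(1, len(word))]
```

Each table cell is computed once and then only looked up, which is the sense in which no substring is analyzed twice. Calling `recognizes("abaaba")` checks whether the start symbol appears in the cell covering the whole input.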
Non-directional bottom-up methods: CYK

  Solutions: chart parsing
  1 Store analyzed constituents: well-formed substring table or (passive) chart
  2 Partial and complete analyses: (active) chart
  In other words, instead of recalculating that "the young cat" is an NP, we store that information.
  Dynamic programming: never go backwards

CKY algorithm

  program CKY Parser;
  begin
    for p := 1 to n do
      V[p, 1] := {A | A → a_p ∈ P};
    for q := 2 to n do
      for p := 1 to n − q + 1 do
        V[p, q] := ∅;
        for k := 1 to q − 1 do
          V[p, q] := V[p, q] ∪ {A | A → BC ∈ P, B ∈ V[p, k], C ∈ V[p + k, q − k]};
        od
      od
    od
  end

  V[p, q] is the set of non-terminals that derive the substring of length q starting at position p; the input is accepted if S ∈ V[1, n].
  Complexity of CKY is O(n^3) in the input length n.

CKY example

  Input grammar:
    S → AA | BB | AX | BY | a | b
    X → SA
    Y → SB
    A → a
    B → b
  Input string: w = abaaba

CKY example – solution

  Input: a b a a b a, with the grammar above.
  p – position, q – length
  The cells are filled row by row, for increasing q. The completed table:

    q \ p   1      2      3      4      5      6
    1       S, A   S, B   S, A   S, A   S, B   S, A
    2       Y      X      S, X   Y      X
    3       S      ∅      Y      S
    4       X      S      ∅
    5       ∅      X
    6       S

  S ∈ V[1, 6], so w is accepted.

CKY online demo

  http://www.diotavelli.net/people/void/demos/cky.html

DCG

  DCG = Definite Clause Grammars
  Syntactic shorthand for producing parsers with Prolog clauses: Prolog-based parsing
  The input is represented with difference lists: two lists, the first containing the input to parse (a suffix of the entire input string) and the second containing the string remaining after a successful parse.
  These two lists correspond to the input and output variables of the clauses.
  Each clause corresponds to a non-terminal in the grammar.

Earley parser

  Jay Earley, 1968
  Strong resemblance to LR parsing, but more dynamic
  Works with what are called Earley items
  An Earley item is a production augmented with a marker inserted at some point in the production's right-hand side and a number indicating where in the input the matching of the production began.
  Earley item sets are constructed by applying three operations to the current list of Earley item sets: scanner, predictor, completer

Earley algorithm

  Repeat until no new item can be added:
  1 Prediction: for every state in the agenda of the form (X → α • Y β, j), add (Y → • γ, k) to the agenda for every production in the grammar with Y on the left-hand side (Y → γ), where k is the current input position.
  2 Scanning: if a is the next symbol in the input stream, for every state in the agenda of the form (X → α • a β, j), add (X → α a • β, j) to the agenda.
  3 Completion: for every state in the agenda of the form (X → γ •, j), find states in the item set of position j of the form (Y → α • X β, i) and add (Y → α X • β, i) to the agenda.

Earley algorithm

  Earley's example
  A dotted rule (marker) is a production augmented with a dot. The dot indicates the current state of application of the rule.
  [Figure: the sentence "The girl speaks" with positions 1 The 2 girl 3 speaks 4; dotted rules S → • NP VP, S → NP • VP, NP → • NP NPP, NP → NP • NPP; lexical rules DET → the •, N → girl •, V → speaks •]

Earley algorithm

  [Figure: items by position reached for "The girl speaks" (columns 1 The, 2 girl, 3 speaks)]
    4:  S → NP • VP;  V → speaks •
    3:  S → NP • VP, NP → NP • NPP;  N → girl •
    2:  DET → the •, NP → DET • N

Chart parsing

  The Earley parser can be modified to work bottom-up or head-corner
  ⇒ a variety of chart parsing algorithms (Kay, 1980)

Chart parsing

  Three basic approaches:
    top-down
    bottom-up
    head-driven
  No constraints on the CF grammar
  Chart parsers usually contain two data structures, a chart and an agenda, both of which contain edges.
  An edge is a triple [A → α•β, i, j], where:
    i, j ∈ N, 0 ≤ i ≤ j ≤ n for n input words
    A → αβ is a grammar rule
  Example: for the input 0 a 1 b 2 a 3 a 4 b 5 a 6, the edge [A → BC • DE, 0, 3] states that BC has been recognised between positions 0 and 3.

General chart parser

  program Chart Parser;
  begin
    initialize (CHART);
    initialize (AGENDA);
    while (AGENDA not empty) do
      E := take edge from AGENDA;
      for each (edge F, which can be created by the edge E and another edge from CHART) do
        if ((F is not in AGENDA) and (F is not in CHART) and (F is different from E)) then
          add F to AGENDA;
        fi;
      od;
      add E to CHART;
    od;
  end;

Top-down approach

  Initialization:
    ∀ p ∈ P | p = S → α: add edge [S → •α, 0, 0] to the agenda.
    The initial chart is empty.
  Iteration – take an edge E from the agenda and then:
  a) (fundamental rule) if E is of the form [A → α•, j, k], then for each edge [B → γ•Aβ, i, j] in the chart, create an edge [B → γA•β, i, k].
  b) (closed edges) if E is of the form [B → γ•Aβ, i, j], then for each edge [A → α•, j, k] in the chart, create an edge [B → γA•β, i, k].
  c) (read terminal) if E is of the form [A → α•a_{j+1}β, i, j], create an edge [A → αa_{j+1}•β, i, j+1].
  d) (prediction) if E is of the form [A → α•Bβ, i, j], then for each grammar rule B → γ ∈ P, create an edge [B → •γ, j, j].

Example – chart parsing

  Grammar:
    S → CLAUSE
    CLAUSE → V OPTPREP N
    OPTPREP → ϵ
    OPTPREP → PREP
    V → jel
    PREP → kolem
    N → domu
    N → kolem
  Sentence: "jel kolem domu" (a_1 = jel, a_2 = kolem, a_3 = domu); Czech, roughly "he rode around the house".
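The agenda loop and the four top-down rules above can be tried out on the example grammar with a short sketch. It is a minimal illustration under stated assumptions, not a reference implementation: edges are encoded as tuples `(lhs, rhs, dot, i, j)`, predicted edges are placed at [j, j] (the position where the expected non-terminal has to start), and the names `GRAMMAR`, `top_down_chart_parse` and `accepts` are hypothetical.

```python
from collections import deque

# Grammar of the example above; the empty tuple encodes OPTPREP → ϵ.
GRAMMAR = {
    "S": [("CLAUSE",)],
    "CLAUSE": [("V", "OPTPREP", "N")],
    "OPTPREP": [(), ("PREP",)],
    "V": [("jel",)],
    "PREP": [("kolem",)],
    "N": [("domu",), ("kolem",)],
}


def top_down_chart_parse(words, grammar=GRAMMAR, start="S"):
    """Return the final chart as a set of edges (lhs, rhs, dot, i, j)."""
    agenda = deque((start, rhs, 0, 0, 0) for rhs in grammar[start])
    seen = set(agenda)      # every edge that was ever put on the agenda
    chart = set()
    while agenda:
        edge = agenda.popleft()
        lhs, rhs, dot, i, j = edge
        created = []
        if dot == len(rhs):
            # a) fundamental rule: the closed edge [lhs → rhs•, i, j] extends
            #    active chart edges that end at i and expect lhs.
            for b, b_rhs, b_dot, h, m in chart:
                if m == i and b_dot < len(b_rhs) and b_rhs[b_dot] == lhs:
                    created.append((b, b_rhs, b_dot + 1, h, j))
        elif rhs[dot] in grammar:
            expected = rhs[dot]
            # b) closed chart edges for the expected non-terminal starting at j.
            for a, a_rhs, a_dot, m, k in chart:
                if a == expected and a_dot == len(a_rhs) and m == j:
                    created.append((lhs, rhs, dot + 1, i, k))
            # d) prediction: start every rule of the expected non-terminal at j.
            for gamma in grammar[expected]:
                created.append((expected, gamma, 0, j, j))
        elif j < len(words) and words[j] == rhs[dot]:
            # c) read terminal: the next input word matches the expected one.
            created.append((lhs, rhs, dot + 1, i, j + 1))
        for new_edge in created:
            if new_edge not in seen:    # not in agenda, not in chart, not E
                seen.add(new_edge)
                agenda.append(new_edge)
        chart.add(edge)
    return chart


def accepts(words, grammar=GRAMMAR, start="S"):
    """True if the chart contains a closed start edge spanning the input."""
    n = len(words)
    return any(
        lhs == start and dot == len(rhs) and (i, j) == (0, n)
        for lhs, rhs, dot, i, j in top_down_chart_parse(words, grammar, start)
    )
```

The single `seen` set stands in for the three membership tests of the general chart parser, since every edge is either still on the agenda, already in the chart, or the edge currently being processed. For example, `accepts(["jel", "kolem", "domu"])` runs the parser on the example sentence.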
Example – chart after top-down analysis

  Positions: 0 jel 1 kolem 2 domu 3
  [Figure: the slides add the edges one at a time, starting with S → •CLAUSE, CLAUSE → •V OPTPREP N, V → •jel, V → jel•, CLAUSE → V • OPTPREP N. Edges shown in the final chart:]
    [S → •CLAUSE, 0, 0]
    [CLAUSE → •V OPTPREP N, 0, 0]
    [V → jel•, 0, 1]
    [CLAUSE → V • OPTPREP N, 0, 1]
    [OPTPREP → •, 1, 1]
    [OPTPREP → •PREP, 1, 1]
    [CLAUSE → V OPTPREP • N, 0, 1]
    [PREP → kolem•, 1, 2]
    [N → kolem•, 1, 2]
    [OPTPREP → PREP•, 1, 2]
    [CLAUSE → V OPTPREP • N, 0, 2]
    [CLAUSE → V OPTPREP N•, 0, 2]
    [S → CLAUSE•, 0, 2]
    [N → domu•, 2, 3]
    [CLAUSE → V OPTPREP N•, 0, 3]
    [S → CLAUSE•, 0, 3]

Bottom-up approach

  Initialization:
    ∀ p ∈ P | p = A → ϵ: add edges [A → •, 0, 0], [A → •, 1, 1], ..., [A → •, n, n] to the agenda.
    ∀ p ∈ P | p = A → a_i α: add edge [A → •a_i α, i−1, i−1] to the agenda.
    The initial chart is empty.
  Iteration – take an edge E from the agenda and then:
  a) (fundamental rule) if E is of the form [A → α•, j, k], then for each edge [B → γ•Aβ, i, j] in the chart, create an edge [B → γA•β, i, k].
  b) (closed edges) if E is of the form [B → γ•Aβ, i, j], then for each edge [A → α•, j, k] in the chart, create an edge [B → γA•β, i, k].
  c) (read terminal) if E is of the form [A → α•a_{j+1}β, i, j], then create an edge [A → αa_{j+1}•β, i, j+1].
  d) (prediction) if E is of the form [A → α•, i, j], then for each grammar rule B → Aγ create an edge [B → •Aγ, i, i].

Head-driven chart parsing

  Rule head – any particular right-hand side non-terminal
  E.g. in the rule CLAUSE → V OPTPREP N the head can be V, OPTPREP or N.
  An edge is a triple [A → α•β•γ, i, j], where i, j ∈ N, 0 ≤ i ≤ j ≤ n for n input words, A → αβγ is a grammar rule and the head is in β.
  The algorithm (bottom-up approach) is very similar to the previous, simpler one.
  The analysis does not go left to right, but begins at the head of each rule instead.

Head-driven chart parsing

  Initialization:
    ∀ p ∈ P | p = A → ϵ: add edges [A → ••, 0, 0], [A → ••, 1, 1], ..., [A → ••, n, n] to the agenda.
    ∀ p ∈ P | p = A → αa_iβ (a_i is the rule head): add edge [A → α•a_i•β, i−1, i] to the agenda.
    The initial chart is empty.

Head-driven chart parsing

  Iteration – take an edge E from the agenda and then:
  a1) if E is of the form [A → •α•, j, k], then for each edge [B → β•γ•Aδ, i, j] in the chart, create edge [B → β•γA•δ, i, k].
  a2) if E is of the form [A → •α•, j, k], then for each edge [B → βA•γ•δ, k, l] in the chart, create edge [B → β•Aγ•δ, j, l].
  b1) if E is of the form [B → β•γ•Aδ, i, j], then for each edge [A → •α•, j, k] in the chart, create edge [B → β•γA•δ, i, k].
  b2) if E is of the form [B → βA•γ•δ, k, l], then for each edge [A → •α•, j, k] in the chart, create edge [B → β•Aγ•δ, j, l].
  c1) if E is of the form [A → βa_i•γ•δ, i, j], then create edge [A → β•a_iγ•δ, i−1, j].
  c2) if E is of the form [A → β•γ•a_{j+1}δ, i, j], then create edge [A → β•γa_{j+1}•δ, i, j+1].
  d) if E is of the form [A → •α•, i, j], then for each grammar rule B → βAγ create edge [B → β•A•γ, i, j] (A is the rule head).

Generalized LR method by Tomita

  Tomita's algorithm extends the standard LR parsing algorithm:
    LR parsing is very efficient, but can only handle a small subset of CFGs
    Tomita's algorithm can handle arbitrary CFGs
    LR efficiency is preserved
  In order to keep a record of the parse state, we maintain a stack consisting of symbol/state pairs.

Generalized LR method by Tomita

  generalized LR parser (GLR)
  Masaru Tomita: Efficient parsing for natural language, 1986
  uses a standard LR table, which may contain conflicts
  the stack is represented as a DAG
  reductions are performed before the read action

Tree ranking

  all chart parsing methods: parallelization as a means of fighting ambiguity
  key concept: a polynomial-size data structure holding up to exponentially many parse trees
  efficient algorithms to retrieve the n best trees according to some ranking
  this enables taking into account a probabilistic notion of a sentence

PCFG = Probabilistic CFG

  each rule r ∈ R has a probability P(r) assigned
  the probability of a tree t ∈ T is usually computed as P(t) = ∏_{r ∈ t} P(r)
  ⇒ t_best = argmax_t P(t)

Statistical parsing

  CFG → PCFG → learned grammar → statistical parsing
  → how to obtain the probabilities (= how to train the parser)?
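As an illustration of the PCFG tree probability P(t) = ∏ P(r) and of choosing the best tree by argmax, the following sketch scores two competing trees for the same string. Everything in it is assumed for the example: the rule probabilities are invented (in practice they would have to be estimated, which is the training question raised above), the grammar is a toy prepositional-phrase attachment grammar over the pre-terminals V, N and P, and the names `RULE_PROB`, `rules_used`, `tree_probability` and `best_tree` are hypothetical.

```python
from math import prod

# Invented rule probabilities (assumption); they sum to 1 per left-hand side.
RULE_PROB = {
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V", "NP", "PP")): 0.4,
    ("NP", ("N",)): 0.7,
    ("NP", ("N", "PP")): 0.3,
    ("PP", ("P", "N")): 1.0,
}

# A tree is (label, children); a child is either a tree or a leaf string.
# Both trees cover the same string "V N P N".
PP_ON_VERB = ("VP", ["V", ("NP", ["N"]), ("PP", ["P", "N"])])
PP_ON_NOUN = ("VP", ["V", ("NP", ["N", ("PP", ["P", "N"])])])


def rules_used(tree):
    """Yield one (lhs, rhs) pair for every internal node of the tree."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules_used(child)


def tree_probability(tree, rule_prob=RULE_PROB):
    """P(t): the product of P(r) over all rule applications r in t."""
    return prod(rule_prob[rule] for rule in rules_used(tree))


def best_tree(trees, rule_prob=RULE_PROB):
    """t_best: the tree with the highest probability among the candidates."""
    return max(trees, key=lambda t: tree_probability(t, rule_prob))
```

With these assumed numbers, `best_tree([PP_ON_VERB, PP_ON_NOUN])` compares 0.4 · 0.7 · 1.0 against 0.6 · 0.3 · 1.0. A real parser would not enumerate the trees explicitly but would compute the maximum over the packed chart.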
Statistical NLP

  In the 1990s: a change of paradigm in (computational) linguistics, from rationalism to empiricism (corpus-based evidence)
  Simultaneously in NLP: a big development of language modelling and statistical methods based on machine learning (both supervised and unsupervised).
  → statistical parsing
  vs. Chomsky: "It must be recognised that the notion of a 'probability of a sentence' is an entirely useless one, under any interpretation of this term" (Chomsky, 1969) [taken from Chapter 1 of Young and Bloothooft, eds, Corpus-Based Methods in Language and Speech Processing]

Summary

  (Probabilistic) context-free grammars are used in parsing natural language
  Chart parsing methods: CKY, Earley, head-driven chart parsing

References

  H. Bunt, M. Tomita: Recent Advances in Parsing Technology. Kluwer, 1996.
  H. Bunt, P. Merlo, J. Nivre (eds.): Trends in Parsing Technology: Dependency Parsing, Domain Adaptation, and Deep Parsing. Springer, Dordrecht/Heidelberg/London/New York, 2010.
  D. Grune, C. J. H. Jacobs: Parsing Techniques: A Practical Guide. Springer, 2008.
  J. Earley: An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.
  M. Kay: Algorithm schemata and data structures in syntactic processing. In Readings in Natural Language Processing, pages 35–70. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1986.
  M.-J. Nederhof: Generalized left-corner parsing. In Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics, pages 305–314, Morristown, NJ, USA, 1993. Association for Computational Linguistics.