Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 1 / 476 CZ.1.07/2.2.00/28.0041 Centrum interaktivních a multimediálních studijních opor pro inovaci výuky a efektivní učení (Centre of interactive and multimedia study materials for teaching innovation and effective learning) IA161 Syntactic Formalisms for Parsing Natural Languages 2 / 476 Introducing Course objective Introducing the theoretical background of parsing: parsing methods focused on syntax, practical implementation methods, possible applications and evaluations IA161 Syntactic Formalisms for Parsing Natural Languages 3 / 476 Introducing Course syllabus PART I: Theoretical background Historical overview State-of-the-art parsing methods and trends Advanced syntactic formalisms PART II: Practical applications Applications & Use Cases Practical Implementations Parsing Evaluation IA161 Syntactic Formalisms for Parsing Natural Languages 4 / 476 Introducing Course format Weekly lectures (2 hours) Final written exam Two homework assignments Grading Final exam: 60 points Each homework: 20 points For each homework, the top-scoring 10 % of individuals receive 5 bonus points Points required for colloquium: 60 points IA161 Syntactic Formalisms for Parsing Natural Languages 5 / 476 Lecture 1 Introductory and Historical Overview of Natural Language Parsing IA161 Syntactic Formalisms for Parsing Natural Languages 6 / 476 Lecture 1 Main points Introduction to Natural Language Processing Issues in Syntax What is parsing? Overview of parsing methods and trends IA161 Syntactic Formalisms for Parsing Natural Languages 7 / 476 Lecture 1 Why natural language processing? Huge amounts of data from the Internet and intranets Applications for processing large amounts of text need NLP expertise Classify text into categories Index and search large texts Automatic translation Speech recognition Information extraction Automatic summarization Question answering Knowledge acquisition Text generation/dialogues IA161 Syntactic Formalisms for Parsing Natural Languages 8 / 476 Lecture 1 History of Natural Language Processing 1948 – the 1st NLP application? a dictionary look-up system by Andrew Booth, developed for machine translation purposes at Birkbeck College, London University IA161 Syntactic Formalisms for Parsing Natural Languages 9 / 476 • So far, it turns out, they have not considered at all the problem of multiple meaning (!), and have been concerned only with the mechanics of looking up words in a dictionary. First, you sense the first letter of a word, and then have the machine see whether or not the memory contains precisely the word in question. If so, the machine simply produces the translation (…) of this word. If this exact word is not contained in the memory, then the machine discards the last letter of the word, and tries over. If this fails, it discards another letter, and tries again. After it has found the largest initial combination of letters which is in the dictionary, it “looks up” the whole discarded portion in a special “grammatical annex” of the dictionary. Thus confronted by “running,” it might find “run” and then find out what the ending (n)ing does to “run.” (Warren Weaver on Booth’s machine) • This first application shows how closely NLP stands to the origins of computer science. • Booth was formerly (during WWII) doing research on X-ray crystallography of explosives.
This involved a lot of arithmetic, hence after WWII he tried to develop electronic computers; the first was an Automatic Relay Calculator (ARC) – 1946. • In the same year he was funded by the Rockefeller Foundation (RF) to visit US researchers, and reported that only von Neumann gave him any time. He fell in love with (and later married) von Neumann’s research assistant Kathleen and redesigned the ARC according to von Neumann’s architecture. • In 1947 he visited RF’s Natural Sciences Division Director Warren Weaver, who refused to fund a computer for mathematical calculations, but suggested funding a computer for machine translation of natural languages (!). • Booth developed techniques for parsing text and also for building dictionaries. On November 11, 1955 Booth gave an early public demonstration of natural language machine translation (shown in a figure on the slide). • Later on, Booth was very successful in building computers; his wife Kathleen programmed them and wrote one of the first books on programming. • In 1958 Kathleen did research on simulating neural networks to investigate ways in which animals recognise patterns, and in 1959 built a neural network for character recognition. Lecture 1 History of Natural Language Processing IA161 Syntactic Formalisms for Parsing Natural Languages 10 / 476 Lecture 1 History of Natural Language Processing 1949 – Warren Weaver Natural Sciences Division Director at the Rockefeller Foundation Mathematician, science advocate WWII code breaker He viewed Russian as English in code – the “Translation” memorandum: Also knowing nothing official about, but having guessed and inferred considerable about powerful new mechanized methods in cryptography – methods which I believe succeed even when one does not know what language has been coded – one naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say “This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.” IA161 Syntactic Formalisms for Parsing Natural Languages 11 / 476 • Weaver was one of the Machine Translation pioneers and one of the most important science managers of the time. All the time he was meeting scientists, putting them together, organizing funding, and investigating potential research areas, while being a top scientist – in 1949 he co-authored The Mathematical Theory of Communication with Claude Shannon. Lecture 1 History of Natural Language Processing 1966 – Over-promised, under-delivered Machine Translation worked only word by word NLP provoked the first hostility of research funding agencies NLP gave AI a bad name before AI had a name. All funding of NLP came to a grinding halt due to the infamous ALPAC report. The public had spent 20 million dollars with very limited outcomes. 1966–1976 – “A lost decade” Revival in the 1980s Martin Kay: The Proper Place of Men and Machines in Language Translation IA161 Syntactic Formalisms for Parsing Natural Languages 12 / 476 • ALPAC (Automatic Language Processing Advisory Committee) was a committee of seven scientists led by John R. Pierce, established in 1964 by the U. S. Government in order to evaluate the progress in computational linguistics in general and machine translation in particular. Its report, issued in 1966, gained notoriety for being very skeptical of research done in machine translation so far, and for emphasizing the need for basic research in computational linguistics; this eventually caused the U. S.
Government to reduce its funding of the topic dramatically. • ALPAC’s final recommendations were that research should be supported on: 1. practical methods for evaluation of translations; 2. means for speeding up the human translation process; 3. evaluation of quality and cost of various sources of translations; 4. investigation of the utilization of translations, to guard against production of translations that are never read; 5. study of delays in the over-all translation process, and means for eliminating them, both in journals and in individual items; 6. evaluation of the relative speed and cost of various sorts of machine-aided translation; 7. adaptation of existing mechanized editing and production processes in translation; 8. the over-all translation process; and 9. production of adequate reference works for the translator, including the adaptation of glossaries that now exist primarily for automatic dictionary look-up in machine translation. • Kay’s counterargument: “The goal of MT should not be the fully automatic high quality translation (FAHQT) that can replace human translators. Instead, MT should adopt less ambitious goals, e.g. more cost-effective human-machine interaction and aim at enhancement of human translation productivity.” Lecture 1 NLP looked to Linguistics Linguistics is language described, not prescribed. Linguistics had few applicable theories for Machine Translation 1957 – Noam Chomsky’s Syntactic Structures revolutionized Linguistics as it applies to Machine Translation. A rule-based system of syntactic structures. Believed there are features common to all languages that enable people to speak creatively and freely. Hypothesized that all children go through the same stages of language development regardless of the language they are learning – the concept of an innate Universal Grammar (never proven) One of the most prominent figures of NLP in the 20th century, though very controversial. IA161 Syntactic Formalisms for Parsing Natural Languages 13 / 476 • Avram Noam Chomsky (born December 7, 1928) is an American linguist, philosopher, cognitive scientist, logician, and political commentator and activist. Working for most of his life at the Massachusetts Institute of Technology (MIT), where he is currently Professor Emeritus, he has authored over 100 books on various subjects. • He is credited as the creator or co-creator of the Chomsky hierarchy, the universal grammar theory, and the Chomsky–Schützenberger theorem. Chomsky is also well known as a political activist, and a leading critic of U.S. foreign policy, state capitalism, and the mainstream news media. Ideologically, he aligns himself with anarcho-syndicalism and libertarian socialism. • Highly influential: between 1980 and 1992, Chomsky was cited within the field of Arts and Humanities more often than any other living scholar, and eighth overall within the Arts and Humanities Citation Index during the same period. He has been described as a prominent cultural figure, and was voted the “world’s top public intellectual” in a 2005 poll. Lecture 1 NLP looked to Linguistics 1958 – Bar-Hillel report Concluded that Fully-Automatic High-Quality Translation (FAHQT) could not be accomplished without human knowledge. 1968 – Case Grammar (Fillmore) “The case for case” paper Later evolved into Frame Semantics 1970 – Augmented Transition Networks (Woods) Procedural Semantics – a theory of the “meaning” of a sentence.
Augmented Transition Network (ATN) parser IA161 Syntactic Formalisms for Parsing Natural Languages 14 / 476 • Yehoshua Bar-Hillel was charged by the US Office of Naval Research to make a critical assessment of current MT activity in the United States and Great Britain. Agencies of the US government were funding research on a large and increasing scale. • Bar-Hillel had been the first person appointed full-time to work on MT at the Massachusetts Institute of Technology in 1951. • His basic contention was that MT research – now a “multimillion dollar affair”, as he pointed out – was, with few exceptions, set on a mistaken and unattainable goal, namely, fully automatic translation of a quality equal to that of a good human translator. This he held to be utterly unrealistic, and in his view resources were being wasted which could more fruitfully be devoted to the development of less ambitious and more practical computer aids for translators. • Woods: 2010 ACL Lifetime Achievement Award • all within the LUNAR project of the NASA Apollo 11 Program • The ATN parser was the first properly formalized hand-built parser that could produce Chomsky-style deep structure analyses for a large, fluent subset of natural English. Lecture 1 Successful NLP Systems 1970 – SHRDLU (Terry Winograd) Human: Pick up a big red block. Computer: OK. Human: Grasp the pyramid. Computer: I don’t understand which pyramid you mean. Human (changing their mind): Find a block which is taller than the one you are holding and put it into the box. Computer: By “it”, I assume you mean the block which is taller than the one I am holding. Computer: OK. Human: What does the box contain? Computer: The blue pyramid and the blue block. Human: What is the pyramid supported by? Computer: The box. Human: How many blocks are not in the box? Computer: Four of them. Human: Is at least one of them narrower than the one which I told you to pick up? Computer: Yes, the red cube. IA161 Syntactic Formalisms for Parsing Natural Languages 15 / 476 • SHRDLU was an early natural language understanding computer program, developed by Terry Winograd at MIT in 1968–1970. In it, the user carries on a conversation with the computer, moving objects, naming collections and querying the state of a simplified “blocks world”, essentially a virtual box filled with different blocks. SHRDLU was written in the Micro Planner and Lisp programming languages on the DEC PDP-6 computer and a DEC graphics terminal. Later additions were made at the computer graphics labs at the University of Utah, adding a full 3D rendering of SHRDLU’s “world”. • The name SHRDLU was derived from ETAOIN SHRDLU, the arrangement of the alpha keys on a Linotype machine, arranged in descending order of usage frequency in English. Lecture 1 Successful NLP Systems II 1973 – LUNAR question answering system (Woods) WHAT IS THE AVERAGE CONCENTRATION OF ALUMINUM IN HIGH ALKALI ROCKS? WHAT SAMPLES CONTAIN P200? GIVE ME THE MODAL ANALYSES OF P200 IN THOSE SAMPLES GIVE ME EU DETERMINATIONS IN SAMPLES WHICH CONTAIN ILM IA161 Syntactic Formalisms for Parsing Natural Languages 16 / 476 • LUNAR is an experimental natural-language information retrieval system. It was designed to help geologists access, compare, and evaluate chemical-analysis data on moon rock and soil composition obtained from the Apollo-11 mission.
The primary goal of the designers was research on the problems involved in building a man-machine interface that would allow communication in ordinary English. A “real world” application was chosen for two reasons: first, it tends to focus effort on the problems really in need of solution (sometimes this is implicitly avoided in “toy” problems), and second, the possibility of producing a system capable of performing a worthwhile task. • The LUNAR system operates by translating a question entered in English into an expression in a formal query language (Codd, 1974). The translation is done with an augmented transition network (ATN) parser coupled with a rule-driven semantic interpretation procedure, which guides the analysis of the question. • The “query” that results from this analysis is then applied to the database to produce the answer to the request. The query language is a generalization of the predicate calculus. Its central feature is a quantifier function that is able to express, in a simple manner, the restrictions placed on a database-retrieval request by the user. The function is used in concert with special enumeration functions for classes of database objects, freeing the quantifier function from explicit dependence on the structure of the database. LUNAR also served as a foundation for the early work on speech understanding at BBN. • The formal query language used by the LUNAR system contains three types of objects: designators, which name classes of objects in the database (including functionally defined objects); propositions, which are formed from predicates with designators as arguments; and commands, which initiate actions. • Request: (DO MY SAMPLES HAVE GREATER THAN 13 PERCENT ALUMINIUM) Query language translation (after parsing): (TEST (FOR SOME X1 / (SEQ SAMPLES) : T ; (CONTAIN X1 (NPR* X2 / ’AL203) (GREATERTHAN 13 PCT)))) Response: YES • LUNAR processes these requests in the following three steps: 1. Syntactic analysis using an augmented transition network parser and heuristic information (including semantics) to produce the most likely derivation tree for the request; 2. Semantic interpretation to produce a representation of the meaning of the request in a formal query language; 3. Execution of the query language expression on the database to produce the answer to the request. • LUNAR’s language processor contains an ATN grammar for a large subset of English, the semantic rules for interpreting database requests, and a dictionary of approximately 3,500 words. As an indication of the capabilities of the processor, it is able to deal with tense and modality, some anaphoric references and comparatives, restrictive relative clauses, certain adjective modifiers and embedded complement constructions. Lecture 1 Successful NLP Systems III 1976 – TAUM-METEO (University of Montreal) prototype MT system for translating weather forecasts between English and French 1985 – METEO (John Chandioux) successor of TAUM-METEO in operational use at Environnement Canada, translating forecasts until 30 September 2001 1970 – SYSTRAN provided translations for the US Air Force’s Foreign Technology Division adopted by XEROX (1978) still developed, present in a wide range of systems Google language tools Microsoft spell check IA161 Syntactic Formalisms for Parsing Natural Languages 17 / 476 • The METEO System is a Very High Quality Machine Translation system for weather bulletins that was in operational use at Environnement Canada from 1982 to 2001.
It stems from a prototype developed in 1975–76 by the TAUM Group, known as TAUM-METEO. As many authors confuse the prototype with the actual system, a bit of history is in order. • The initial motivation to develop that prototype was that a junior translator came to TAUM to ask for help with the extremely boring (and at the same time difficult) job of translating weather bulletins at Environment Canada that he had to do at the time. • Indeed, since all official communications emanating from the Canadian government must be available in French and English because of the official bilingual services act of 1968, and weather bulletins represent a large amount of translation in real time, junior translators had to spend several months of purgatory producing first-draft translations, which were then revised by seniors. That was in fact a quite difficult job, because of the specificities of the English and French sublanguages used, and not very motivating, as the lifetime of a bulletin is only 4 hours. Lecture 1 Major Issues in NLP Ambiguity in Language: Syntactic (structural) Semantic (word sense) Referential IA161 Syntactic Formalisms for Parsing Natural Languages 18 / 476 Lecture 1 Ambiguity Makes NLP Difficult Structural/syntactic ambiguity I saw the Grand Canyon flying to New York. I saw the sheep grazing in the field. Word sense ambiguity The man went to the bank to get some cash. The man went to the bank and jumped in the river. Referential ambiguity Steve hated Paul. He hit him. He = Steve? or He = Paul? IA161 Syntactic Formalisms for Parsing Natural Languages 19 / 476 Lecture 1 Linguistic levels of analysis Speech Written language Phonetics Phonology Morphology Syntax Semantics Beyond: pragmatics, cognition, logic … Each level has an input and output representation; the output from one level is the input to the next; sometimes levels might be skipped (merged) or split. IA161 Syntactic Formalisms for Parsing Natural Languages 20 / 476 Lecture 1 Issues in syntax Propagation of errors from lower levels – mainly morphology; the need to correctly identify the part of speech (POS) “The man did his homework” Who did what? man=noun; did=verb; his=genitive; homework=noun Identify collocations mother-in-law, hot dog, … IA161 Syntactic Formalisms for Parsing Natural Languages 21 / 476 Lecture 1 More issues in syntax Anaphora resolution “The son of my professor entered my class. He scared me.” Preposition attachment “I saw the man in the park with a telescope.” IA161 Syntactic Formalisms for Parsing Natural Languages 22 / 476 Lecture 1 Syntax input and output Input: a sequence of pairs (lemma, (morphological) tag) Output: sentence structure (tree) with annotated nodes (lemmas, morpho-syntactic tags, functions) of various forms Deals with: the relation between lemmas & morphological categories and the sentence structure, using syntactic categories such as subject, verb, object, … IA161 Syntactic Formalisms for Parsing Natural Languages 23 / 476 Lecture 1 Syntactic representation Tree structure Two main ideas for the tree Phrase structure (derivation tree) Using bracketed grouping Brackets annotated by phrase type Heads (often) explicitly marked Dependency structure Basic relation: head (governor) – dependent Links annotated by syntactic functions Phrase structure implicitly present IA161 Syntactic Formalisms for Parsing Natural Languages 24 / 476 Lecture 1 Dependency Tree vs.
PS Tree IA161 Syntactic Formalisms for Parsing Natural Languages 25 / 476 Lecture 1 Shallow parsing “the man chased the bear” → “the man” (Subject) + “chased the bear” (Predicate) Identify basic structures: NP-[the man] VP-[chased the bear] IA161 Syntactic Formalisms for Parsing Natural Languages 26 / 476 Lecture 1 Full parsing “John loves Mary” [Figure: parse tree with semantic attachments – S(Loves(John, Mary)) over NP(John) / Name(John) / John and VP(∃x Loves(x, Mary)) over Verb(∃y ∃x Loves(x, y)) / loves and NP(Mary) / Name(Mary) / Mary] Helps to automatically figure out questions like who did what, and when IA161 Syntactic Formalisms for Parsing Natural Languages 27 / 476 Lecture 1 What is natural language parsing? One of the most commonly researched tasks in Natural Language Processing (NLP) Parsing, in the traditional sense, is what happens when a student takes the words of a sentence one by one, assigns each to a part of speech, specifies its grammatical categories, and lists the grammatical relations between words (identifying the subject and various types of object for a verb, specifying the word with which some other word agrees, and so on). IA161 Syntactic Formalisms for Parsing Natural Languages 28 / 476 Lecture 1 Characteristics of parsing Much of the history of parsing until a few decades ago can be understood as the direct consequence of the history of theories of grammar: Parsing is done by human beings, rather than by physical machines or abstract machines What is parsed is a bit of natural language, rather than a bit of some language-like symbolic system Parsing is heuristic rather than algorithmic IA161 Syntactic Formalisms for Parsing Natural Languages 29 / 476 Lecture 1 New notions of parsing In the second half of the 20th century, parsing came to be extended to a large collection of operations in relation to theoretical linguistics, formal language theory, computer science, artificial intelligence and psycholinguistics: Parsing is the syntactic analysis of languages. The objective of Natural Language Parsing is to determine parts of sentences (such as verbs, noun phrases, or relative clauses), and the relationships between them (such as subject or object). Unlike parsing of formally defined artificial languages (such as Java or predicate logic), parsing of natural languages presents problems due to ambiguity, and the productive and creative use of language. IA161 Syntactic Formalisms for Parsing Natural Languages 30 / 476 Lecture 1 Parsing The grammar for natural language is ambiguous, and typical sentences have multiple possible analyses (syntactically and semantically). Parsing tools (grammatical, morphological, syntactic, statistical, probabilistic, heuristic, …) help to find the most plausible parse tree of a given sentence. IA161 Syntactic Formalisms for Parsing Natural Languages 31 / 476 Lecture 1 Practical functions of parsing Parsing can tell us whether a sentence is in a language defined by a grammar Parsing makes information extraction possible by identifying relations between words or phrases in sentences.
IA161 Syntactic Formalisms for Parsing Natural Languages 32 / 476 Lecture 1 Practical functions of parsing Parsers are being used in a number of disciplines: In computer science Compiler construction, database interfaces, self-describing databases, artificial intelligence … In linguistics Text analysis, corpora analysis, machine translation … In document preparation and conversion In typesetting chemical formulae In chromosome recognition IA161 Syntactic Formalisms for Parsing Natural Languages 33 / 476 Lecture 1 Practical functions of parsing However: Many different possible syntactic formalisms: Regular expressions, Context-free grammars, Context-sensitive grammars, … Many different ways of representing the results of parsing: Parse tree, Chart, Graph, … IA161 Syntactic Formalisms for Parsing Natural Languages 34 / 476 Lecture 1 Historical overview of parsing methods Basically two ways to parse a sentence: Top-down vs. Bottom-up We can characterize the search strategy of parsing algorithms in terms of the direction in which a structure is built: from the words upwards (bottom-up) or from the root node downwards (top-down) IA161 Syntactic Formalisms for Parsing Natural Languages 35 / 476 Lecture 1 Historical overview of parsing methods Directionality in these two ways: Directional vs. Non-directional Non-directional top-down methods – S. Unger (1968) Non-directional bottom-up methods – CYK Directional top-down methods: the predict/match automaton, depth-first search (backtracking), breadth-first search (Greibach), recursive descent, Definite Clause Grammars Directional bottom-up methods: the shift/reduce automaton, depth-first search (backtracking), breadth-first search, restricted by Earley (1970) IA161 Syntactic Formalisms for Parsing Natural Languages 36 / 476 Lecture 1 Historical overview of parsing methods Methods originating in the parsing of formal languages Linear directional top-down methods: LL(k) Linear directional bottom-up methods: precedence, bounded-context, LR(k), LALR(1), SLR(1) Methods specifically devised for parsing of natural languages Generalized LR (Masaru Tomita) Chart parsing (Martin Kay) IA161 Syntactic Formalisms for Parsing Natural Languages 37 / 476 Lecture 1 Summary Natural language parsing as one of the NLP domains Extended notion of parsing in relation to different fields Ambiguity of language What is it to “parse”? Overview of basic parsing methods IA161 Syntactic Formalisms for Parsing Natural Languages 38 / 476 Lecture 1 References I H. Bunt, J. Carroll & G. Satta (eds.): New Developments in Parsing Technology, Kluwer, Dordrecht/Boston/London 2004 H. Bunt, P. Merlo & J. Nivre (eds.): Trends in Parsing Technology: Dependency Parsing, Domain Adaptation, and Deep Parsing, Springer, Dordrecht/Heidelberg/London/New York 2010 H. Bunt, M. Tomita (eds.): Recent Advances in Parsing Technology, Kluwer, Boston, 1996 D. Grune, C. J. H. Jacobs: Parsing Techniques: A Practical Guide, Springer, 2008 Roger G. Johnson: Andrew D. Booth – Britain’s Other “Fourth Man”. In: History of Computing. Learning from the Past, Springer Berlin Heidelberg, 2010. J. Hutchins: From First Conception to First Demonstration: the Nascent Years of Machine Translation, 1947–1954. A Chronology. In: Machine Translation, Volume 12, Issue 3, Kluwer, 1997. IA161 Syntactic Formalisms for Parsing Natural Languages 39 / 476 Lecture 1 References II J. Hutchins: Milestones no. 6: Bar-Hillel and the nonfeasibility of FAHQT. In: International Journal of Language and Documentation no. 1, 1999. M.
Kay: The proper place of men and machines in language translation. In: Machine Translation, Volume 12, Issue 1–2, Kluwer 1997 (reprint of 1980). More on the history of MT: http://www.hutchinsweb.me.uk/history.htm IA161 Syntactic Formalisms for Parsing Natural Languages 40 / 476 Lecture 2 Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 41 / 476 Lecture 2 Basic parsing methods IA161 Syntactic Formalisms for Parsing Natural Languages 42 / 476 Lecture 2 Main points Context-free grammar Parsing methods Top-down or bottom-up Directional or non-directional Basic parsing algorithms Unger CKY (or CYK) Left-corner parsing Earley IA161 Syntactic Formalisms for Parsing Natural Languages 43 / 476 Lecture 2 Ambiguity in Natural Language Notion of ambiguity Essential ambiguity: same syntactic structure but the semantics differ Spurious ambiguity: different syntactic structure but no change in semantics There are no unambiguous natural languages! An input may have exponentially many parses We should identify the “correct” parse IA161 Syntactic Formalisms for Parsing Natural Languages 44 / 476 Lecture 2 Ambiguity in Natural Language Main idea of parsing Parsing (syntactic structure) Input: sequence of tokens John ate an apple Output: parse tree [S [NP [NAME John]] [VP [VERB ate] [NP [ART an] [NOUN apple]]]] IA161 Syntactic Formalisms for Parsing Natural Languages 45 / 476 Lecture 2 Ambiguity in Natural Language The basic connection between a sentence and the grammar it derives from is the “parse tree”, which describes how the grammar was used to produce the sentence. For the reconstruction of this connection we need “parsing techniques”. IA161 Syntactic Formalisms for Parsing Natural Languages 46 / 476 Lecture 2 Ambiguity in Natural Language Word categories: traditional parts of speech Noun: names of things (boy, cat, truth) Verb: action or state (become, hit) Pronoun: used for a noun (I, you, we) Adverb: modifies V, Adj, Adv (sadly, very) Adjective: modifies a noun (happy, clever) Conjunction: joins things (and, but, while) Preposition: relation of N (to, from, into) Interjection: an outcry (ouch, oh, alas, psst) IA161 Syntactic Formalisms for Parsing Natural Languages 47 / 476 Lecture 2 Formal language A formal language is a set of symbol strings; it serves as a mathematical tool for recognizing and generating languages, including infinite ones. The topic of formal language theory: describing infinite languages by finite means, using rewriting systems. Three basic components of a formal language description: a finite symbol set, a finite string set, and a finite set of formal rules IA161 Syntactic Formalisms for Parsing Natural Languages 48 / 476 Lecture 2 Constituency Sentences have parts, some of which appear to have subparts. These groupings of words that go together we will call constituents. (How do we know they go together?)
I hit the man with a cleaver I hit [the man with a cleaver] I hit [the man] with a cleaver You could not go to her party You [could not] go to her party You could [not go] to her party IA161 Syntactic Formalisms for Parsing Natural Languages 49 / 476 Lecture 2 The Chomsky hierarchy Type 0 Languages / Grammars (LRE: recursively enumerable) Rewrite rules α → β where α and β are any strings of terminals and non-terminals Type 1 Context-sensitive Languages / Grammars (LCS) Rewrite rules αXβ → αγβ where X is a non-terminal, and α, γ, β are any strings of terminals and non-terminals (γ must be non-empty, but the strings α and β can be empty). Type 2 Context-free Languages / Grammars (LCF) Rewrite rules X → γ where X is a non-terminal and γ is any string of terminals and non-terminals Type 3 Regular Languages / Grammars (LREG) Rewrite rules X → αY where X, Y are single non-terminals and α is a string of terminals; Y might be missing. IA161 Syntactic Formalisms for Parsing Natural Languages 50 / 476 Lecture 2 The Chomsky hierarchy Type 0 > 1 > 2 > 3 according to generative power A superior (more powerful) type can generate every language of an inferior type, but parsing with a superior grammar class is less efficient and slower than with an inferior one. IA161 Syntactic Formalisms for Parsing Natural Languages 51 / 476 Lecture 2 The Chomsky hierarchy [Figure: the Chomsky hierarchy, drawn as nested language classes] IA161 Syntactic Formalisms for Parsing Natural Languages 52 / 476 Lecture 2 Context-free grammar (Type 2) The most common way of modeling constituency. The idea of basing a grammar on constituent structure dates back to Wilhelm Wundt (1890), but was not formalized until Chomsky (1956), and, independently, Backus (1959). CFG = Context-Free Grammar = Phrase Structure Grammar = BNF = Backus-Naur Form IA161 Syntactic Formalisms for Parsing Natural Languages 53 / 476 Lecture 2 Context-free grammar (Type 2) CFG rewriting rule X → γ where X is a non-terminal symbol and γ is a string consisting of terminals/non-terminals. The term “context-free” expresses the fact that the non-terminal X can always be replaced by γ, regardless of the context in which it occurs. IA161 Syntactic Formalisms for Parsing Natural Languages 54 / 476 Lecture 2 Context-free grammar (Type 2) G = ⟨T, N, S, R⟩ T is the set of terminals (lexicon) N is the set of non-terminals (written in capital letters) S is the start symbol (one of the non-terminals) R is the set of rules/productions of the form X → γ, where X is a non-terminal and γ is a sequence of terminals and non-terminals (may be empty).
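To make the 4-tuple definition concrete, here is a minimal Python sketch (ours, not part of the original slides) encoding a CFG as G = ⟨T, N, S, R⟩; the class and helper names (Grammar, is_terminal) are illustrative assumptions, and the tiny grammar fragment matches the “John speaks” example used on the following slides.

from dataclasses import dataclass

@dataclass
class Grammar:
    terminals: set        # T – the lexicon
    nonterminals: set     # N – written in capital letters
    start: str            # S – one of the non-terminals
    rules: list           # R – pairs (X, [Y1, ..., Yk]); the RHS may be empty

    def is_terminal(self, symbol: str) -> bool:
        return symbol in self.terminals

# A tiny fragment of the example grammar GD used on the next slides:
G = Grammar(
    terminals={"John", "speaks"},
    nonterminals={"S", "NP", "VP", "PN", "V"},
    start="S",
    rules=[("S", ["NP", "VP"]), ("NP", ["PN"]), ("VP", ["V"]),
           ("PN", ["John"]), ("V", ["speaks"])],
)
print(G.is_terminal("John"))   # True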
A grammar G generates a language L. IA161 Syntactic Formalisms for Parsing Natural Languages 55 / 476 Lecture 2 Example 1 of a Context-Free Grammar G = ⟨T, N, S, R⟩ T = { that, this, a, the, man, book, flight, meal, include, read, does } N = { S, NP, NOM, VP, DET, N, V, AUX } S = S R = { S → NP VP, DET → that | this | a | the, S → AUX NP VP, N → book | flight | meal | man, S → VP, V → book | include | read, NP → DET NOM, AUX → does, NP → N, VP → V, VP → V NP } IA161 Syntactic Formalisms for Parsing Natural Languages 56 / 476 Lecture 2 Example 2 of a Context-Free Grammar R1: S -> NP VP R13: DET -> his|her R2: NP -> DET N R14: DET -> the R3: NP -> NP PNP R15: V -> eat|serve R4: NP -> PN R16: V -> give R5: VP -> V R17: V -> speak|speaks R6: VP -> V NP R18: V -> discuss R7: VP -> V PNP R19: PN -> John|Mark R8: VP -> V NP PNP R20: PN -> Mary|Juliette R9: VP -> V PNP PNP R21: N -> daughter|mother R10: PNP -> PP NP R22: N -> son|boy R11: PP -> to|from|of R23: N -> salad|soup|meat R12: DET -> an|a R24: N -> dessert|cheese|bread R25: ADJ -> small|kind Simplified example of a CFG = GD IA161 Syntactic Formalisms for Parsing Natural Languages 57 / 476 Lecture 2 Example 2 of a Context-Free Grammar Using the presented grammar, we make a first derivation for the sentence “John speaks” (⇒GD denotes one derivation step in GD): S ⇒GD NP VP (by R1) ⇒GD PN VP (by R4) ⇒GD John VP (by R19) ⇒GD John V (by R5) ⇒GD John speaks (by R17) IA161 Syntactic Formalisms for Parsing Natural Languages 58 / 476 Lecture 2 Example 2 of a Context-Free Grammar Another derivation of “John speaks” from GD, using rule 5 before rule 4: S ⇒GD NP VP ⇒GD NP V ⇒GD NP speaks ⇒GD PN speaks ⇒GD John speaks IA161 Syntactic Formalisms for Parsing Natural Languages 59 / 476 Lecture 2 Production Rule 3 NP -> NP PNP Because it contains the same symbol on both its left-hand and right-hand sides, we say that a production with this property is recursive. IA161 Syntactic Formalisms for Parsing Natural Languages 60 / 476 Lecture 2 Production Rule 3 This property of R3 implies that the language generated by the grammar GD is infinite, because we can create sentences of arbitrary length by iterative application of R3. Test: NP ⇒GD NP PNP ⇒GD NP PNP PNP ⇒GD NP PNP PNP PNP … The son of John speaks The son of the mother of John speaks The son of the daughter of the daughter … of John speaks. IA161 Syntactic Formalisms for Parsing Natural Languages 61 / 476 Lecture 2 Production Rule 3 A last remark concerning this grammar (GD): it can generate sentences which are ambiguous. “John speaks to the daughter of Mark” Example 1 A conversation between John and the daughter of Mark (R7) 2 A conversation about Mark between John and the daughter (R9) IA161 Syntactic Formalisms for Parsing Natural Languages 62 / 476 Lecture 2 Production Rule 3 [Figure: the two VP trees for “speaks to the daughter of Mark” – one with a single PNP “to [the daughter of Mark]” (reading 1), one with two PNPs “to the daughter” and “of Mark” (reading 2)] IA161 Syntactic Formalisms for Parsing Natural Languages 63 / 476 Lecture 2 Commonly used non-terminal abbreviations S sentence NP noun phrase PP prepositional phrase VP verb phrase XP X phrase N noun PREP preposition V verb DET/ART determiner / article ADJ adjective ADV adverb AUX auxiliary verb PN proper noun IA161 Syntactic Formalisms for Parsing Natural Languages 64 / 476 Lecture 2 Parsing methods Classification of parsing methods Top-down parsing vs. Bottom-up parsing Directional vs.
non-directional parsing IA161 Syntactic Formalisms for Parsing Natural Languages 65 / 476 Lecture 2 Top-down or bottom-up Top-down parsing Derives the sentence from the start symbol; the production tree is reconstructed from the top downwards Identifies the production rules in prefix order Never explores a tree that cannot result in an S, BUT wastes time generating trees inconsistent with the input Bottom-up parsing Reduces the sentence back to the start symbol Identifies the production rules in postfix order Never generates trees that are not grounded in the input, BUT wastes time generating trees that do not lead to an S IA161 Syntactic Formalisms for Parsing Natural Languages 66 / 476 Lecture 2 Top-down parsing Top-down parsing is goal-directed. A top-down parser starts with a list of constituents to be built. It rewrites the goals in the goal list by matching one against the LHS of the grammar rules and expanding it with the RHS, attempting to match the sentence to be derived. If a goal can be rewritten in several ways, then there is a choice of which rule to apply (a search problem) Can use depth-first or breadth-first search, and goal ordering. IA161 Syntactic Formalisms for Parsing Natural Languages 67 / 476 Lecture 2 Top-down parsing Simulation of the operation of a top-down parser on “The son speaks”: 1. S 2. NP VP 3. DET N VP 4. a N VP – fail: the input begins with “the”; we return to DET N VP 5. the N VP 6. the daughter VP – new fail; we return to α = the N VP … 7. the son VP 8. the son V 9. the son speaks. … IA161 Syntactic Formalisms for Parsing Natural Languages 68 / 476 Lecture 2 Top-down parsing Top-down parsing example S → NP VP → NAME VP → “John” VP → “John” VERB NP → “John” “ate” NP → “John” “ate” DET NOUN → “John” “ate” “an” NOUN → “John” “ate” “an” “apple” IA161 Syntactic Formalisms for Parsing Natural Languages 69 / 476 Lecture 2 Top-down parsing [Figure: eight snapshots (1)–(8) of the top-down construction of the parse tree for “John ate an apple”, growing from S via NP VP, NAME, VERB, ART and NOUN down to the words] IA161 Syntactic Formalisms for Parsing Natural Languages 70 / 476 Lecture 2 Top-down parsing Algorithm for top-down left-to-right parsing (α is the current sentential form, u the input to be recognized; tdlrp is the main function): tdlrp(α, u) begin if (α = u) then return(true) fi; let α = u1…uk A γ (u1…uk terminals, A the leftmost non-terminal); while (∃ rule A → β with β = uk+1…uk+l δ, where δ = ε or δ begins with a non-terminal) do if (tdlrp(u1…uk+l δ γ, u) = true) then return(true) fi od; return(false) end IA161 Syntactic Formalisms for Parsing Natural Languages 71 / 476 Lecture 2 Top-down parsing Problems in top-down parsing Left-recursive rules, e.g. NP → NP PP, lead to infinite recursion Will do badly if there are many different rules for the same LHS. Consider if there are 600 rules for S, 599 of which start with NP, but one of which starts with a V, and the sentence starts with a V. Top-down parsers do well if there is useful grammar-driven control: search is directed by the grammar. Top-down is hopeless for rewriting parts of speech (preterminals) with words (terminals). IA161 Syntactic Formalisms for Parsing Natural Languages 72 / 476 Lecture 2 Bottom-up parsing Bottom-up parsing is data-directed. The initial goal list of a bottom-up parser is the string to be parsed. If a sequence in the goal list matches the RHS of a rule, then this sequence may be replaced by the LHS of the rule. Parsing is finished when the goal list contains just the start symbol.
If the RHS of several rules match the goal list, then there is a choice of which rule to apply (a search problem) Can use depth-first or breadth-first search, and goal ordering. IA161 Syntactic Formalisms for Parsing Natural Languages 73 / 476 Lecture 2 Bottom-up parsing Let us suppose that we have the sentence “the son eats his soup” in the grammar GD. Question: how can we verify that the sentence belongs to the language generated by the grammar GD and, if the answer is positive, assign it a tree? → A first idea is given by the following algorithm: IA161 Syntactic Formalisms for Parsing Natural Languages 74 / 476 Lecture 2 Bottom-up parsing Bottom-up parsing example “John” “ate” “an” “apple” → NAME “ate” “an” “apple” → NAME VERB “an” “apple” → NAME VERB DET “apple” → NAME VERB DET NOUN → NP VERB DET NOUN → NP VERB NP → NP VP → S IA161 Syntactic Formalisms for Parsing Natural Languages 75 / 476 Lecture 2 Bottom-up parsing [Figure: eight snapshots (1)–(8) of the bottom-up construction of the parse tree for “John ate an apple”, growing from the words via NAME, VERB, ART, NOUN, NP and VP up to S] IA161 Syntactic Formalisms for Parsing Natural Languages 76 / 476 Lecture 2 Bottom-up parsing Problems with bottom-up parsing Unable to deal with empty categories: a termination problem, unless rewriting empties as constituents is somehow restricted (but then it is generally incomplete) Inefficient when there is great lexical ambiguity (grammar-driven control might help here). Conversely, it is data-directed: it attempts to parse the words that are there. Both top-down (LL) and bottom-up (LR) parsers can (and frequently do) do work exponential in the sentence length on NLP problems. IA161 Syntactic Formalisms for Parsing Natural Languages 77 / 476 Lecture 2 Left-corner parsing Left-corner parsing Bottom-up with top-down filtering: combine top-down processing with bottom-up processing in order to avoid going wrong in the ways that we are prone to go wrong with pure top-down and pure bottom-up techniques IA161 Syntactic Formalisms for Parsing Natural Languages 78 / 476 Lecture 2 Left-corner parsing Going wrong with top-down parsing S -> NP VP NP -> DET N NP -> PN VP -> IV DET -> the N -> robber PN -> Vincent IV -> died Vincent died. IA161 Syntactic Formalisms for Parsing Natural Languages 79 / 476 Lecture 2 Left-corner parsing Going wrong with bottom-up parsing S -> NP VP NP -> DET N VP -> IV VP -> TV NP TV -> plant IV -> died DET -> the N -> plant The plant died. 1. DET plant died 2. DET TV IV – fail 3. DET N IV – OK 4. NP VP – OK 5. S IA161 Syntactic Formalisms for Parsing Natural Languages 80 / 476 Lecture 2 Left-corner parsing Combining Top-down and Bottom-up Information S -> NP VP NP -> DET N NP -> PN VP -> IV DET -> the N -> robber PN -> Vincent IV -> died Vincent died. IA161 Syntactic Formalisms for Parsing Natural Languages 81 / 476 Lecture 2 Left-corner parsing Now, let us look at how a left-corner recognizer would proceed to recognize “Vincent died.” 1. Input: Vincent died. Recognize an S. (Top-down prediction.) [tree: S predicted above “vincent died”] 2. The category of the first word of the input is PN. (Bottom-up step using a lexical rule.) [tree: S; PN above “vincent”] IA161 Syntactic Formalisms for Parsing Natural Languages 82 / 476 Lecture 2 Left-corner parsing 3. Select a rule that has PN at its left corner: NP -> PN.
(Bottom-up step using a phrase structure rule.) [tree: S; NP over PN over “vincent”] 4. Select a rule that has NP at its left corner: S -> NP VP. (Bottom-up step.) 5. Match! The left-hand side of the rule matches with S, the category we are trying to recognize. [tree: S over NP (PN “vincent”) and a predicted VP] IA161 Syntactic Formalisms for Parsing Natural Languages 83 / 476 Lecture 2 Left-corner parsing 6. Input: died. Recognize a VP. (Top-down prediction.) 7. The category of the first word of the input is IV. (Bottom-up step.) [tree: S over NP (PN “vincent”) and VP; IV above “died”] 8. Select a rule that has IV at its left corner: VP -> IV. (Bottom-up step.) 9. Match! The left-hand side of the rule matches with VP, the category we are trying to recognize. [tree: complete S over NP (PN “vincent”) and VP (IV “died”)] IA161 Syntactic Formalisms for Parsing Natural Languages 84 / 476 Lecture 2 Left-corner parsing What is the left corner of a rule? The first symbol on the right-hand side. For example, NP is the left corner of the rule S → NP VP, and IV is the left corner of the rule VP → IV. Similarly, we can say that Vincent is the left corner of the lexical rule PN → Vincent. IA161 Syntactic Formalisms for Parsing Natural Languages 85 / 476 Lecture 2 Left-corner parsing A left-corner parser is a “predictive” parser: it uses grammatical knowledge to predict what should come next, given what it has found already. 4 operations creating new items from old: “Shift”, “Predict”, “Match” and “Reduce” IA161 Syntactic Formalisms for Parsing Natural Languages 86 / 476 Lecture 2 Left-corner parsing Definition (Corner relation) The relation ∠ between non-terminals A and B such that B ∠ A if and only if there is a rule A → Bα, where α denotes some sequence of grammar symbols Definition (Left-corner relation) The transitive and reflexive closure of ∠ is denoted by ∠∗ and is called the left-corner relation IA161 Syntactic Formalisms for Parsing Natural Languages 87 / 476 Lecture 2 Left-corner parsing Left-corner table Non-terminal – left-corners: S – S, NP, time, an, VorN, files NP – NP, time, an, VorN, files VP – VP, VorN, files, VorP, like PP – PP, VorP, like VorN – VorN, files VorP – VorP, like Grammar S → NP VP S → S PP NP → time NP → an arrow NP → VorN VP → VorN VP → VorP NP PP → VorP NP VorN → files VorP → like IA161 Syntactic Formalisms for Parsing Natural Languages 88 / 476 Lecture 2 How to deal with ambiguity? Backtracking Try all variants subsequently. Determinism Just choose one variant and keep it (i.e. greedy). Parallelism Try all variants in parallel. Underspecification Do not disambiguate, keep the ambiguity. IA161 Syntactic Formalisms for Parsing Natural Languages 89 / 476 Lecture 2 Summary One view of parsing: parsing as a phrase-structure formal grammar recognition task Parsing approaches: top-down, bottom-up, left-corner IA161 Syntactic Formalisms for Parsing Natural Languages 90 / 476 Lecture 3 Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 91 / 476 Lecture 3
Chart parsing IA161 Syntactic Formalisms for Parsing Natural Languages 92 / 476 Lecture 3 Main points CKY algorithm Earley parsing General chart parsing methods IA161 Syntactic Formalisms for Parsing Natural Languages 93 / 476 Lecture 3 Directional or non-directional Directional: top-down, bottom-up. Non-directional: the top-down method – first by Unger; the bottom-up method – by Cocke, Younger and Kasami (CYK, also CKY) → Non-directional methods access the input in a seemingly arbitrary order, so they require the entire input to be in memory before parsing can start IA161 Syntactic Formalisms for Parsing Natural Languages 94 / 476 Lecture 3 Non-directional top-down method by Unger Capable of working with the entire class of CFGs Expects as input a sentence and a CFG It works by searching for partitionings of the input which match the right-hand sides (RHS) of production rules. IA161 Syntactic Formalisms for Parsing Natural Languages 95 / 476 Lecture 3 Non-directional top-down method by Unger Let G denote a CF grammar and w an input sentence. Principle: if the input sentence w belongs to the language L(G), it must be derivable from the start symbol S of the grammar G. Let S be defined as: S → S1 S2 … Sk. The input sentence w must then be obtainable from the sequence of symbols S1 S2 … Sk in such a way that S1 derives a first part of the input, S2 a second part, and so on: S1 ⇒∗ w1…wp1, S2 ⇒∗ wp1+1…wp2, …, Sk ⇒∗ wpk−1+1…wpk IA161 Syntactic Formalisms for Parsing Natural Languages 96 / 476 Lecture 3 Non-directional bottom-up method: CYK CYK is an example of chart parsing, discovered independently by Cocke, Younger and Kasami Consider which non-terminals can be used to derive substrings of the input, beginning with shorter strings and moving up to longer strings 1 Start with strings of length one, matching the single character of the input string against unit productions in the grammar 2 Then consider all substrings of length two, looking for productions with right-hand-side elements that match the two characters of the substring 3 Continue up to longer strings IA161 Syntactic Formalisms for Parsing Natural Languages 97 / 476 Lecture 3 Non-directional bottom-up method: CYK CYK example Two example sentences and their potential analyses: He [gave [the young cat] [to Bill]]. He [gave [the young cat] [some milk]]. The corresponding grammar rules: VP -> Vditrans NP PPto VP -> Vditrans NP NP Regardless of the final sentence analysis, the ditransitive verb (gave) and its first object NP (the young cat) will have the same analysis -> No need to analyze it twice.
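Before the chart-parsing slides that follow, here is a small illustrative Python sketch (ours, not part of the original slides) of exactly this reuse idea: a recognizer that memoizes which spans each non-terminal covers, so that “the young cat” is analyzed only once however many rules ask for it. The rule encoding and the helper name covers are assumptions.

from functools import lru_cache

# Rules: LHS -> (RHS1, RHS2) for binary rules, or a single terminal string.
RULES = [
    ("NP", ("DET", "N2")), ("N2", ("ADJ", "N")),
    ("DET", "the"), ("ADJ", "young"), ("N", "cat"),
]
WORDS = ("the", "young", "cat")

@lru_cache(maxsize=None)        # the memo table acts as a (passive) chart
def covers(symbol, i, j):
    """True iff `symbol` derives the span WORDS[i:j]."""
    for lhs, rhs in RULES:
        if lhs != symbol:
            continue
        if isinstance(rhs, str):                        # lexical rule
            if j == i + 1 and WORDS[i] == rhs:
                return True
        else:                                           # binary rule
            b, c = rhs
            if any(covers(b, i, k) and covers(c, k, j)
                   for k in range(i + 1, j)):
                return True
    return False

print(covers("NP", 0, 3))   # True – each (symbol, span) is computed only once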
IA161 Syntactic Formalisms for Parsing Natural Languages 98 / 476 Lecture 3 Non-directional bottom-up method: CYK Solutions: chart parsing 1 Store analyzed constituents: a well-formed substring table or (passive) chart 2 Partial and complete analyses: (active) chart In other words, instead of recalculating that “the young cat” is an NP, we will store that information Dynamic programming: never go backwards IA161 Syntactic Formalisms for Parsing Natural Languages 99 / 476 Lecture 3 CKY algorithm program CKY Parser; begin for p := 1 to n do V[p, 1] := {A | A → ap ∈ P}; for q := 2 to n do for p := 1 to n − q + 1 do V[p, q] := ∅; for k := 1 to q − 1 do V[p, q] := V[p, q] ∪ {A | A → BC ∈ P, B ∈ V[p, k], C ∈ V[p + k, q − k]}; od od od end The complexity of CKY is O(n3) IA161 Syntactic Formalisms for Parsing Natural Languages 100 / 476 A (triangular) matrix V is computed: • columns = positions in the input sentence • rows = lengths of (sub)strings of the input sentence • elements = sets of non-terminals that cover the corresponding part of the input sentence. The first loop fills the first row of the matrix. In the inner loop, B and C are always selected from already finished cells of the matrix (shorter strings) – i.e. from the 2nd row on we no longer work with the input sentence at all, only with the previous rows. Unknown terminals in the input are ignored. Lecture 3 CKY example input grammar: Definition S → AA|BB|AX|BY|a|b X → SA Y → SB A → a B → b input string w = abaaba. IA161 Syntactic Formalisms for Parsing Natural Languages 101 / 476 Lecture 3 CKY example – solution input: a b a a b a Definition S → AA|BB|AX|BY|a|b X → SA Y → SB A → a B → b p – position, q – length: q=1: S,A | S,B | S,A | S,A | S,B | S,A q=2: Y | X | S,X | Y | X q=3: S | ∅ | Y | S q=4: X | S | ∅ q=5: ∅ | X q=6: S IA161 Syntactic Formalisms for Parsing Natural Languages 102 / 476 Let the students compute this at the blackboard – the cells in the first rows go quickly; write the empty matrix V on the board and fill it in. Procedure: e.g. in the 2nd row, cell [1,2] arises from [1,1] and [2,1] – 4 combinations SS, SB, AS, AB → only SB is in the grammar, i.e. Y. Combinations are always computed from two cells which move in a “V” shape above the cell being computed. ∅ in a cell means that the corresponding substring cannot be generated from any rule of the grammar. The complexity of CKY is always O(n3), unlike the other methods, where it is only Ω(n3). The result = true/false according to whether the root is in the bottom cell. To generate trees from the CKY table, we would have to remember in each cell from which cells each non-terminal arose. Lecture 3 CKY online demo http://www.diotavelli.net/people/void/demos/cky.html IA161 Syntactic Formalisms for Parsing Natural Languages 103 / 476 Lecture 3 DCG DCG = Definite Clause Grammars A syntactic shorthand for producing parsers with Prolog clauses: Prolog-based parsing Represent the input with difference lists: two lists, the first containing the input to parse (a suffix of the entire input string) and the second containing the string remaining after a successful parse. These two lists correspond to the input and output variables of the clauses. Each clause corresponds to a non-terminal in the grammar. IA161 Syntactic Formalisms for Parsing Natural Languages 104 / 476 Lecture 3 Earley parser Jay Earley, 1968 Strong resemblance to LR parsing, but more dynamic Works with what are called Earley items An Earley item is a production augmented with a marker inserted at some point in the production’s right-hand side and a number to indicate where in the input matching of the production began.
Earley item sets are constructed by applying three operations to the current list of Earley item sets: scanner, predictor, completer IA161 Syntactic Formalisms for Parsing Natural Languages 105 / 476 Lecture 3 Earley algorithm Repeat until no new item can be added: 1 Prediction For every state in the agenda of the form (X → α • Y β, j), add (Y → • γ, k) to the agenda for every production in the grammar with Y on the left-hand side (Y → γ). 2 Scanning If a is the next symbol in the input stream, for every state in the agenda of the form (X → α • a β, j), add (X → α a • β, j) to the agenda. 3 Completion For every state in the agenda of the form (X → γ •, j), find states in the agenda of the form (Y → α • X β, i) and add (Y → α X • β, i) to the agenda. IA161 Syntactic Formalisms for Parsing Natural Languages 106 / 476 Lecture 3 Earley algorithm Earley’s example A pointed rule (with a marker) is a production augmented with a point •; the point indicates the current state of application of the rule. For “The girl speaks”, pointed rules include S → •NP VP, S → NP•VP, NP → •NP PNP, NP → NP•PNP, together with the lexical rules DET → the, N → girl, V → speaks. IA161 Syntactic Formalisms for Parsing Natural Languages 107 / 476 Lecture 3 Earley algorithm [Figure: item sets for “The girl speaks” (positions 1–3) – after “The”: DET → the•, NP → DET•N; after “girl”: N → girl•, S → NP•VP, NP → NP•PNP; after “speaks”: V → speaks•, S → NP VP•] IA161 Syntactic Formalisms for Parsing Natural Languages 108 / 476 Lecture 3 Chart parsing The Earley parser can be modified to work bottom-up or head-corner ⇒ a variety of chart parsing algorithms (Kay, 1980) IA161 Syntactic Formalisms for Parsing Natural Languages 109 / 476 Lecture 3 Chart parsing Three basic approaches: top-down bottom-up head-driven No constraints on the CF grammar Chart parsers usually contain two data structures, a chart and an agenda, both of which contain edges. An edge is a triple [A → α•β, i, j], where: i, j ∈ N, 0 ≤ i ≤ j ≤ n for n input words A → αβ is a grammar rule Example: with input positions 0 a 1 b 2 a 3 a 4 b 5 a 6, the edge [A → BC • DE, 0, 3] spans positions 0 to 3. IA161 Syntactic Formalisms for Parsing Natural Languages 110 / 476 Lecture 3 General chart parser program Chart Parser; begin initialize(CHART); initialize(AGENDA); while (AGENDA not empty) do E := take edge from AGENDA; for each (edge F which can be created from the edge E and another edge from CHART) do if ((F is not in AGENDA) and (F is not in CHART) and (F is different from E)) then add F to AGENDA; fi; od; add E to CHART; od; end; IA161 Syntactic Formalisms for Parsing Natural Languages 111 / 476 This program structure is common to all types of chart parsers. They differ from one another in: 1. how the structures are initialized, 2. how F is selected. It can also be done differently (without an agenda), but this method is the most common. Why it does not cycle: 1. there is a finite number of edges, 2. each edge is processed at most once. Lecture 3 Top-down approach Initialization: ∀ p ∈ P | p = S → α add the edge [S → •α, 0, 0] to the agenda. The startup chart is empty. Iteration – take an edge E from the agenda and then: a) (fundamental rule) if E is in the form [A → α•, j, k], then for each edge [B → γ•Aβ, i, j] in the chart, create an edge [B → γA•β, i, k]. b) (closed edges) if E is in the form [B → γ•Aβ, i, j], then for each edge [A → α•, j, k] in the chart, create an edge [B → γA•β, i, k]. c) (read terminal) if E is in the form [A → α•aj+1β, i, j], create an edge [A → α aj+1•β, i, j+1]. d) (prediction) if E is in the form [A → α•Bβ, i, j], then for each grammar rule B → γ ∈ P, create an edge [B → •γ, j, j].
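As a companion to the pseudocode and the iteration rules above, here is a compact runnable Python sketch (ours, not part of the original slides) of the agenda-driven top-down chart recognizer, applied to the “jel kolem domu” grammar of the example that follows; the edge encoding (A, rhs, dot, i, j) and all helper names are our own choices.

GRAMMAR = {
    "S": [["CLAUSE"]],
    "CLAUSE": [["V", "OPTPREP", "N"]],
    "OPTPREP": [[], ["PREP"]],          # [] is the epsilon rule
    "V": [["jel"]], "PREP": [["kolem"]], "N": [["domu"], ["kolem"]],
}

def recognize(words):
    n = len(words)
    chart, agenda = set(), []
    for rhs in GRAMMAR["S"]:                        # initialization
        agenda.append(("S", tuple(rhs), 0, 0, 0))   # edge [S -> .alpha, 0, 0]
    while agenda:
        E = agenda.pop()
        A, rhs, dot, i, j = E
        new = []
        if dot == len(rhs):                         # E is closed: [A -> alpha., i, j]
            for (B, brhs, bdot, bi, bj) in chart:   # a) fundamental rule
                if bj == i and bdot < len(brhs) and brhs[bdot] == A:
                    new.append((B, brhs, bdot + 1, bi, j))
        else:
            sym = rhs[dot]
            if sym in GRAMMAR:                      # next symbol is a non-terminal
                for (B, brhs, bdot, bi, bj) in chart:   # b) closed edges
                    if B == sym and bdot == len(brhs) and bi == j:
                        new.append((A, rhs, dot + 1, i, bj))
                for g in GRAMMAR[sym]:              # d) prediction
                    new.append((sym, tuple(g), 0, j, j))
            elif j < n and words[j] == sym:         # c) read terminal
                new.append((A, rhs, dot + 1, i, j + 1))
        for F in new:
            if F not in chart and F not in agenda and F != E:
                agenda.append(F)
        chart.add(E)
    return any(A == "S" and dot == len(rhs) and (i, j) == (0, n)
               for (A, rhs, dot, i, j) in chart)

print(recognize(["jel", "kolem", "domu"]))   # True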
IA161 Syntactic Formalisms for Parsing Natural Languages 112 / 476 Lecture 3 Example – chart parsing Grammar: S → CLAUSE CLAUSE → V OPTPREP N OPTPREP → ϵ OPTPREP → PREP V → jel PREP → kolem N → domu N → kolem Sentence: “jel kolem domu” (a1=jel, a2=kolem, a3=domu). IA161 Syntactic Formalisms for Parsing Natural Languages 113 / 476 Lecture 3 Example – chart after top-down analysis [Figure: the chart over “jel kolem domu” with positions 0–3 – lexical edges V → jel•, PREP → kolem•, N → kolem•, N → domu•; and the edges S → •CLAUSE, CLAUSE → •V OPTPREP N, CLAUSE → V •OPTPREP N, OPTPREP → •, OPTPREP → •PREP, OPTPREP → PREP•, CLAUSE → V OPTPREP •N, CLAUSE → V OPTPREP N•, S → CLAUSE•] IA161 Syntactic Formalisms for Parsing Natural Languages 114 / 476 1. initialization 2. prediction – application of d) 3. prediction – application of d) 4. terminal – application of c) 5. closed edge – application of a) 6. … 7. in the last figure the ϵ-edges are omitted (they would not fit). Complexity – we take the number of rules to be a constant → then for input length n there are n2 possible edges in total, and in each step up to n edges are processed → O(n3). Lecture 3 Bottom-up approach Initialization: ∀ p ∈ P | p = A → ϵ add the edges [A → •, 0, 0], [A → •, 1, 1], ..., [A → •, n, n] to the agenda. ∀ p ∈ P | p = A → aiα add the edge [A → •aiα, i-1, i-1] to the agenda. The startup chart is empty. Iteration – take an edge E from the agenda and then: a) (fundamental rule) if E is in the form [A → α•, j, k], then for each edge [B → γ•Aβ, i, j] in the chart, create an edge [B → γA•β, i, k]. b) (closed edges) if E is in the form [B → γ•Aβ, i, j], then for each edge [A → α•, j, k] in the chart, create an edge [B → γA•β, i, k]. c) (read terminal) if E is in the form [A → α•aj+1β, i, j], then create an edge [A → α aj+1•β, i, j+1]. d) (prediction) if E is in the form [A → α•, i, j], then for each grammar rule B → Aγ create an edge [B → •Aγ, i, i]. IA161 Syntactic Formalisms for Parsing Natural Languages 115 / 476 Rules a), b) and c) are the same as in the top-down approach; the methods differ only in d). The bottom-up approach usually creates more superfluous edges. Lecture 3 Head-driven chart parsing Rule head – any particular right-hand-side non-terminal. E.g. in the rule CLAUSE → V OPTPREP N, the head can be V, OPTPREP, or N. An edge is a triple [A → α•β•γ, i, j], where i, j ∈ N, 0 ≤ i ≤ j ≤ n for n input words, A → αβγ is a grammar rule and the head is in β. The algorithm (bottom-up approach) is very similar to the previous simpler one. The analysis does not go left to right, but begins at the head of each rule instead. IA161 Syntactic Formalisms for Parsing Natural Languages 116 / 476 Lecture 3 Head-driven chart parsing Initialization ∀ p ∈ P | p = A → ϵ add the edges [A → ••, 0, 0], [A → ••, 1, 1], ..., [A → ••, n, n] to the agenda. ∀ p ∈ P | p = A → αaiβ (ai is the rule head) add the edge [A → α•ai•β, i-1, i] to the agenda. The startup chart is empty. IA161 Syntactic Formalisms for Parsing Natural Languages 117 / 476 Lecture 3 Head-driven chart parsing Iteration – take an edge E from the agenda and then: a1) if E is in the form [A → •α•, j, k], then for each edge [B → β•γ•Aδ, i, j] in the chart, create an edge [B → β•γA•δ, i, k]. a2) if E is in the form [A → •α•, j, k], then for each edge [B → βA•γ•δ, k, l] in the chart, create an edge [B → β•Aγ•δ, j, l]. b1) if E is in the form [B → β•γ•Aδ, i, j], then for each edge [A → •α•, j, k] in the chart, create an edge [B → β•γA•δ, i, k]. b2) if E is in the form [B → βA•γ•δ, k, l], then for each edge [A → •α•, j, k] in the chart, create an edge [B → β•Aγ•δ, j, l]. c1) if E is in the form [A → βai•γ•δ, i, j], then create an edge [A → β•aiγ•δ, i-1, j]. c2) if E is in the form [A → β•γ•aj+1δ, i, j], then create an edge [A → β•γaj+1•δ, i, j+1].
d) if E is in the form of [A → •α•, i, j], then for each grammar rule B → β A γ create edge [B → β•A•γ, i, j] (A is the rule head). IA161 Syntactic Formalisms for Parsing Natural Languages 118 / 476

Lecture 3 Generalized LR method by Tomita
Tomita's algorithm extends the standard LR parsing algorithm: LR parsing is very efficient, but can only handle a small subset of CFGs; the generalized algorithm can handle an arbitrary CFG while the LR efficiency is preserved. In order to keep a record of the parse state, we maintain a stack consisting of symbol/state pairs. IA161 Syntactic Formalisms for Parsing Natural Languages 119 / 476

Lecture 3 Generalized LR method by Tomita
generalized LR parser (GLR); Masaru Tomita: Efficient parsing for natural language, 1986; uses a standard LR table which may contain conflicts; the stack is represented as a DAG; reduction is performed before the reading action. IA161 Syntactic Formalisms for Parsing Natural Languages 120 / 476

Lecture 3 Tree ranking
all chart parsing methods: parallelization as a means of fighting the ambiguity; key concept: a polynomial data structure holding up to exponentially many parse trees; efficient algorithms to retrieve the n-best trees according to some ranking; this enables taking into account a probabilistic notion of a sentence. IA161 Syntactic Formalisms for Parsing Natural Languages 121 / 476

Lecture 3 PCFG = Probabilistic CFG
each rule r ∈ R has a probability P(r) assigned; the probability of a tree t ∈ T is usually computed as P(t) = ∏r∈t P(r) ⇒ tbest = argmaxt P(t) (see the scoring sketch after the references below). IA161 Syntactic Formalisms for Parsing Natural Languages 122 / 476

Lecture 3 Statistical parsing
CFG → PCFG → learned grammar → statistical parsing → how to obtain the probabilities (= how to train the parser?) IA161 Syntactic Formalisms for Parsing Natural Languages 123 / 476

Lecture 3 Statistical NLP
In the 90's: a change of paradigm in (computational) linguistics from rationalism to empiricism (corpus-based evidence). Simultaneously in NLP: big development of language modelling and statistical methods based on machine learning (both supervised and unsupervised). → statistical parsing
vs. Chomsky: It must be recognised that the notion of a ‘probability of a sentence’ is an entirely useless one, under any interpretation of this term (Chomsky, 1969) [taken from Chapter 1 of Young and Bloothooft, eds, Corpus-Based Methods in Language and Speech Processing] IA161 Syntactic Formalisms for Parsing Natural Languages 124 / 476

Lecture 3 Summary
(Probabilistic) context-free grammars used in parsing natural language; chart parsing methods: CKY, Earley, head-driven chart parsing. IA161 Syntactic Formalisms for Parsing Natural Languages 125 / 476

Lecture 3 References
H. Bunt, M. Tomita: Recent Advances in Parsing Technology, Kluwer, 1996
H. Bunt, P. Merlo, J. Nivre (eds.): Trends in Parsing Technology: Dependency Parsing, Domain Adaptation, and Deep Parsing, Springer, Dordrecht/Heidelberg/London/New York, 2010
D. Grune, C.J.H. Jacobs: Parsing Techniques: A Practical Guide, Springer, 2008
J. Earley: An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970
M. Kay: Algorithm schemata and data structures in syntactic processing. In Readings in Natural Language Processing, pages 35–70, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1986
M.-J. Nederhof: Generalized left-corner parsing. In Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics, pages 305–314, Morristown, NJ, USA, 1993. Association for Computational Linguistics.
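As promised above, here is a tiny illustration of PCFG scoring: P(t) is the product of the probabilities of all rules used in the tree t, and the best tree is the argmax over candidate parses. The grammar, probabilities and encoding are our own toy example, not from the references.

from math import prod

RULE_PROB = {                             # a toy PCFG; per LHS, probabilities sum to 1
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DET", "N")): 0.6,
    ("NP", ("N",)): 0.4,
    ("VP", ("V",)): 1.0,
    ("DET", ("the",)): 1.0, ("N", ("girl",)): 1.0, ("V", ("speaks",)): 1.0,
}

def tree_prob(tree):
    """tree = (label, [children]); a leaf word is a plain string."""
    if isinstance(tree, str):
        return 1.0
    lhs, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(lhs, rhs)] * prod(tree_prob(c) for c in children)

t = ("S", [("NP", [("DET", ["the"]), ("N", ["girl"])]), ("VP", [("V", ["speaks"])])])
print(tree_prob(t))                       # 0.6
# t_best = max(candidate_trees, key=tree_prob)   -- the argmax over parses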
IA161 Syntactic Formalisms for Parsing Natural Languages 126 / 476

Lecture 4 . ...... Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 127 / 476

Lecture 4 . ...... Dependency Syntax and Parsing IA161 Syntactic Formalisms for Parsing Natural Languages 128 / 476

Lecture 4 Outline 1 Motivation 2 Dependency Syntax 3 Dependency Parsing IA161 Syntactic Formalisms for Parsing Natural Languages 129 / 476

Lecture 4 Motivation
what you have seen so far: applying the analysis of formal languages to a natural language – creating a phrase-structure derivation tree according to some grammar; PS accounts for one important syntactic property: constituency; is that all? but what about: discontinuous phrases, structure sharing IA161 Syntactic Formalisms for Parsing Natural Languages 130 / 476

Lecture 4 Motivation
another crucial syntactic phenomenon is dependency; what is a dependency? ”some relation between two words“; what is the difference to phrase structure? what does constituency express? what does dependency express? IA161 Syntactic Formalisms for Parsing Natural Languages 131 / 476

Lecture 4 Dependency Syntax (Mel'čuk 1988)
A more formal account – what is a dependency? A relation!
. Dependency Relation .. ......
Let W be the set of all words within a sentence; then the dependency relation → is D ⊆ W × W such that:
D is anti-reflexive: a → b ⇒ a ≠ b
D is anti-symmetric: a → b ∧ b → a ⇒ a = b; together with anti-reflexivity this yields asymmetry: a → b ⇒ b ↛ a
D is anti-transitive: a → b ∧ b → c ⇒ a ↛ c
optionally: D is labeled: there is a mapping l : D → L, L being the set of labels
IA161 Syntactic Formalisms for Parsing Natural Languages 132 / 476

Lecture 4 Dependency Representation
a → b: a depends on b, a is a dependent of b, b is the head of a; a dependency graph; a dependency tree IA161 Syntactic Formalisms for Parsing Natural Languages 133 / 476

Lecture 4 Dependency Tree vs. PS Tree
[figure: for “Green ideas sleep furiously”, a dependency tree (sleep heading ideas and furiously, ideas heading Green) next to a PS tree (S → NP VP with A N V ADV)]
IA161 Syntactic Formalisms for Parsing Natural Languages 134 / 476

Lecture 4 Non-projectivity
a property of a dependency tree: a sentence is non-projective whenever drawing (projecting) a line from a node to the surface of the tree crosses an arc; a lot of attention has been paid to this problem; the practical implications are rather limited (in most cases non-projectivity can be easily handled or avoided); hard cases: ”koupil Malou chaloupku“ IA161 Syntactic Formalisms for Parsing Natural Languages 135 / 476

Lecture 4 Czech Tradition of Dependency Syntax
a long tradition of dependency syntax in the Prague linguistic school (Sgall, Hajičová, Panevová); Institute of Formal and Applied Linguistics at Charles University; formalized as the Functional Generative Description (FGD) of language; Prague Dependency Treebank (PDT) IA161 Syntactic Formalisms for Parsing Natural Languages 136 / 476

Lecture 4 Dependencies vs. PS
is one of the formalisms clearly better than the other one? No.
dependencies: ⊕ account for relational phenomena, ⊕ simple
phrase-structure: ⊕ accounts for constituency, ⊕ easy chunking
can we perform a transformation from one of the formalisms to the other and vice versa? Technically yes, but . . . It is not a problem to convert the structure between a dependency tree and a PS tree ... ...
but it is a problem to transform the information included ⇒ both of the formalisms are convertible, but not mutually equivalent IA161 Syntactic Formalisms for Parsing Natural Languages 137 / 476

Lecture 4 Dependency Parsing
rule-based vs. statistical; transition-based (→ deterministic parsing); graph-based (→ spanning tree algorithms); various other approaches (ILP, PS conversion, . . . ); very recent advances (vs. long studied PS parsing algorithms) IA161 Syntactic Formalisms for Parsing Natural Languages 138 / 476

Lecture 4 Introduction to Dependency parsing
Motivation
a. dependency-based syntactic representations seem to be useful in many applications of language technology: machine translation, information extraction → transparent encoding of predicate-argument structure
b. dependency grammar is better suited than phrase-structure grammar for languages with free or flexible word order → analysis of diverse languages within a common framework
c. leading to the development of accurate syntactic parsers for a number of languages → combination with machine learning from syntactically annotated corpora (e.g. treebanks)
IA161 Syntactic Formalisms for Parsing Natural Languages 139 / 476

Lecture 4 Introduction to Dependency parsing
Dependency parsing: “Task of automatically analyzing the dependency structure of a given input sentence”
Dependency parser: “Task of producing a labeled dependency structure of the kind depicted in the following figure, where the words of the sentence are connected by typed dependency relations”
[figure: labeled dependency tree for “ROOT Economic news had little effect on financial markets .” with the relations PRED, SBJ, OBJ, ATT, PC, PU]
IA161 Syntactic Formalisms for Parsing Natural Languages 140 / 476

Lecture 4 Definitions of dependency graphs and dependency parsing
Dependency graphs: syntactic structures over sentences
Def. 1: A sentence is a sequence of tokens denoted by S = w0 w1 . . . wn
Def. 2: Let R = {r1, . . . , rm} be a finite set of possible dependency relation types that can hold between any two words in a sentence. A relation type r ∈ R is additionally called an arc label.
IA161 Syntactic Formalisms for Parsing Natural Languages 141 / 476

Lecture 4 Definitions of dependency graphs and dependency parsing
Def. 3: A dependency graph G = (V, A) is a labeled directed graph consisting of nodes V and arcs A, such that for the sentence S = w0 w1 . . . wn and the label set R the following holds:
1 V ⊆ {w0, w1, . . . , wn}
2 A ⊆ V × R × V
3 if (wi, r, wj) ∈ A then (wi, r′, wj) ∉ A for all r′ ≠ r
IA161 Syntactic Formalisms for Parsing Natural Languages 142 / 476

Lecture 4 Approach to dependency parsing
a. data-driven: makes essential use of machine learning from linguistic data in order to parse new sentences
b. grammar-based: relies on a formal grammar, defining a formal language, so that it makes sense to ask whether a given input is in the language defined by the grammar or not.
→ Data-driven approaches have attracted the most attention in recent years. IA161 Syntactic Formalisms for Parsing Natural Languages 143 / 476

Lecture 4 Data-driven approach
. ...... characterized by the type of parsing model adopted, the algorithms used to learn the model from data, and the algorithms used to parse new sentences with the model
a. transition-based: start by defining a transition system, or state machine, for mapping a sentence to its dependency graph (see the sketch below).
b. graph-based: start by defining a space of candidate dependency graphs for a sentence.
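A minimal sketch of the transition-based idea, using the well-known arc-standard system (SHIFT / LEFT-ARC / RIGHT-ARC); the encoding is our own and, where a real parser would predict the next transition with a learned classifier, the "oracle" argument here simply plays back a fixed script.

def parse(n_words, oracle):
    # configuration: (stack, buffer, arcs); token 0 is the artificial ROOT
    stack, buf, arcs = [0], list(range(1, n_words + 1)), set()
    while buf or len(stack) > 1:
        action = oracle(stack, buf, arcs)
        if action == "SHIFT":             # move next input word onto the stack
            stack.append(buf.pop(0))
        elif action == "LEFT-ARC":        # second-top becomes dependent of top
            dep = stack.pop(-2)
            arcs.add((stack[-1], dep))
        elif action == "RIGHT-ARC":       # top becomes dependent of second-top
            dep = stack.pop()
            arcs.add((stack[-1], dep))
    return arcs

# "John saw Mary": 1=John, 2=saw, 3=Mary
script = iter(["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"])
print(parse(3, lambda s, b, a: next(script)))
# {(2, 1), (2, 3), (0, 2)}  i.e. saw->John, saw->Mary, ROOT->saw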
IA161 Syntactic Formalisms for Parsing Natural Languages 144 / 476

Lecture 4 Data-driven approach
a. transition-based
learning problem: induce a model for predicting the next state transition, given the transition history
parsing problem: construct the optimal transition sequence for the input sentence, given the induced model
b. graph-based
learning problem: induce a model for assigning scores to the candidate dependency graphs for a sentence
parsing problem: find the highest-scoring dependency graph for the input sentence, given the induced model
IA161 Syntactic Formalisms for Parsing Natural Languages 145 / 476

Lecture 4 Transition-based Parsing
A transition system consists of a set C of parser configurations and a set D of transitions between configurations. Main idea: a sequence of valid transitions, starting in the initial configuration for a given sentence and ending in one of several terminal configurations, defines a valid dependency tree for the input sentence:
D1,m = d1(c1), . . . , dm(cm)
IA161 Syntactic Formalisms for Parsing Natural Languages 146 / 476

Lecture 4 Transition-based Parsing
Definition: the score of D1,m factors by configuration–transition pairs (ci, di): s(D1,m) = Σi=1..m s(ci, di)
Learning: scoring function s(ci, di) for di(ci) ∈ D1,m
Inference: search for the highest-scoring sequence D*1,m given s(ci, di)
IA161 Syntactic Formalisms for Parsing Natural Languages 147 / 476

Lecture 4 Transition-based Parsing
Inference for transition-based parsing. Common inference strategies:
Deterministic [Yamada and Matsumoto 2003, Nivre et al. 2004]
Beam search [Johansson and Nugues 2006, Titov and Henderson 2007]
Complexity is given by an upper bound on the transition sequence length. Transition systems:
Projective O(n) [Yamada and Matsumoto 2003, Nivre 2003]
Limited non-projective O(n) [Attardi 2006, Nivre 2007]
Unrestricted non-projective O(n2) [Nivre 2008, Nivre 2009]
IA161 Syntactic Formalisms for Parsing Natural Languages 148 / 476

Lecture 4 Transition-based Parsing – Nivre algorithm IA161 Syntactic Formalisms for Parsing Natural Languages 149 / 476

Lecture 4 Transition-based Parsing
Learning for transition-based parsing. Typical scoring function: s(ci, di) = w · f(ci, di), where f(ci, di) is a feature vector over configuration ci and transition di, and w is a weight vector [wi = weight of feature fi(ci, di)]
Transition systems:
Projective O(n) [Yamada and Matsumoto 2003, Nivre 2003]
Limited non-projective O(n) [Attardi 2006, Nivre 2007]
Unrestricted non-projective O(n2) [Nivre 2008, Nivre 2009]
Problem: learning is local, but the features are based on the global history
IA161 Syntactic Formalisms for Parsing Natural Languages 150 / 476

Lecture 4 Transition-based Parsing
Projectivization to pseudo-projectivity: IA161 Syntactic Formalisms for Parsing Natural Languages 151 / 476

Lecture 4 Graph-based Parsing
For an input sentence S we define a graph Gs = (Vs, As) where Vs = {w0, w1, . . . , wn} and As = {(wi, wj, l) | wi, wj ∈ V and l ∈ L}
The score of a dependency tree T factors by subgraphs G1, . . . , Gm: s(T) = Σi=1..m s(Gi)
Learning: scoring function s(Gi) for a subgraph Gi ∈ T
Inference: search for the maximum spanning tree T* of Gs given s(Gi)
IA161 Syntactic Formalisms for Parsing Natural Languages 152 / 476

Lecture 4 Graph-based Parsing
Learning graph-based models. Typical scoring function: s(Gi) = w · f(Gi), where f(Gi) is a high-dimensional feature vector over subgraphs and w is a weight vector [wj = weight of feature fj(Gi)]
Structured learning [McDonald et al.
2005a, Smith and Johnson 2007]: learn weights that maximize the score of the correct dependency tree for every sentence in the training set
Problem: learning is global (trees), but the features are local (subgraphs)
IA161 Syntactic Formalisms for Parsing Natural Languages 153 / 476

Lecture 4 Graph-based Parsing – Eisner algorithm IA161 Syntactic Formalisms for Parsing Natural Languages 154 / 476

Lecture 4 Graph-based Parsing – Chu-Liu-Edmonds algorithm IA161 Syntactic Formalisms for Parsing Natural Languages 155 / 476

Lecture 4 Grammar-based approach
a. context-free dependency parsing: exploits a mapping from dependency structures to CFG structure representations and reuses parsing algorithms originally developed for CFGs → chart parsing algorithms
b. constraint-based dependency parsing: parsing viewed as a constraint satisfaction problem; the grammar is defined as a set of constraints on well-formed dependency graphs; finding a dependency graph for a sentence that satisfies all the constraints of the grammar (having the best score)
IA161 Syntactic Formalisms for Parsing Natural Languages 156 / 476

Lecture 4 Grammar-based approach
a. context-free dependency parsing. Advantage: well-studied parsing algorithms such as CKY or Earley's algorithm can be used for dependency parsing as well → need to convert dependency grammars into efficiently parsable context-free grammars (e.g. bilexical CFG, Eisner and Smith, 2005)
b. constraint-based dependency parsing: defines the problem as constraint satisfaction; Weighted Constraint Dependency Grammar (WCDG, Foth and Menzel, 2005); Transformation-based CDG
IA161 Syntactic Formalisms for Parsing Natural Languages 157 / 476

Lecture 4 Conclusions 1 Dependency syntax vs. constituency (phrase-structure) syntax 2 Non-projectivity 3 Graph-based and transition-based methods IA161 Syntactic Formalisms for Parsing Natural Languages 158 / 476

Lecture 5 . ...... Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 159 / 476

Lecture 5 . ...... Parsing with (L)TAG and LFG IA161 Syntactic Formalisms for Parsing Natural Languages 160 / 476

Lecture 5 (Lexicalized) Tree Adjoining Grammar (TAG) and Lexical Functional Grammar (LFG)
A) Same goal: a formal system to model human speech; model the syntactic properties of natural language; a syntactic framework which aims to provide a computationally precise and psychologically realistic representation of language
B) Properties: unification-based; constraint-based; lexicalized grammar
IA161 Syntactic Formalisms for Parsing Natural Languages 161 / 476

Lecture 5 How to parse the sentence in TAG? by Joshi, A., Levy, L. and Takahashi, M., 1975 IA161 Syntactic Formalisms for Parsing Natural Languages 162 / 476

Lecture 5 TAG's basic component
Representation structure: phrase-structure trees; a finite set of elementary trees; two kinds of elementary trees: initial trees (α): trees that can be substituted; auxiliary trees (β): trees that can be adjoined IA161 Syntactic Formalisms for Parsing Natural Languages 163 / 476

Lecture 5 TAG's basic component
The trees in the union of these two sets are called elementary trees.
[figure: an initial tree X whose frontier consists of terminal nodes or substitution nodes; an auxiliary tree with root Z and foot node Z*]
IA161 Syntactic Formalisms for Parsing Natural Languages 164 / 476

Lecture 5 TAG's basic component
An initial tree (α): all interior nodes are labeled with non-terminal symbols; the nodes on the frontier of an initial tree are either labeled with terminal symbols, or with non-terminal symbols marked for substitution (↓)
An auxiliary tree (β): one of its frontier nodes must be marked as the foot node (∗); the foot node must be labeled with a non-terminal symbol which is identical to the label of the root node
A derived tree (γ): a tree built by the composition of two other trees; the two composition operations that TAG uses are adjoining and substitution
IA161 Syntactic Formalisms for Parsing Natural Languages 165 / 476

Lecture 5 Main operations of combination (1): adjunction
Sentences of the language of a TAG are derived from the composition of an α and any number of β by this operation. It allows inserting a complete structure at an interior node of another complete structure.
Three constraints possible: null adjunction (NA), obligatory adjunction (OA), selectional adjunction (SA)
IA161 Syntactic Formalisms for Parsing Natural Languages 166 / 476

Lecture 5 Main operations of combination (1): adjunction
[figure: adjoining an auxiliary tree (β) with root X and foot X* into a tree (α) at an interior node X, yielding the derived tree (γ); example: adjoining β1 “has” into α2 “loved” at the VP node]
IA161 Syntactic Formalisms for Parsing Natural Languages 167 / 476

Lecture 5 Main operations of combination (2): substitution
It inserts an initial tree or a lexical tree into an elementary tree. One constraint possible: selectional substitution
[figure: substituting an initial tree α3 rooted in A at a frontier node A↓ of α2; example: substituting the tree for “woman” at a D↓ N frontier of “loved”]
IA161 Syntactic Formalisms for Parsing Natural Languages 168 / 476

Lecture 5 Adjoining constraints
Selective adjunction (SA(T)): only members of a set T ⊆ A can be adjoined at the given node, but the adjunction is not mandatory
Null adjunction (NA): any adjunction is disallowed for the given node (NA = SA(ϕ))
Obligatory adjunction (OA(T)): an auxiliary tree which is a member of the set T ⊆ A must be adjoined at the given node; for short, OA = OA(A)
IA161 Syntactic Formalisms for Parsing Natural Languages 169 / 476

Lecture 5 Example 1: selective adjunction (SA)
One possible analysis of “send” could involve selective adjunction:
[figure: α1 S → NP↓ VP[SA(β1, β2, ...)] with V “send” NP↓; β1 VP → VP* “away”; β2 VP → VP* PP (“to” NP↓) – covering “send”, “send away”, “send to”, “send something”]
IA161 Syntactic Formalisms for Parsing Natural Languages 170 / 476

Lecture 5 Example 2: obligatory adjunction
For when you absolutely must have adjunction at a node:
[figure: α1 S → NP↓ VP[OA(β1, β2)] with V “seen”; β1 Aux “has” VP*; β2 Aux “is” VP* – “has seen”, “is seen”]
IA161 Syntactic Formalisms for Parsing Natural Languages 171 / 476

Lecture 5 Elementary trees (initial trees and auxiliary trees)
Yesterday a man saw Mary
[figure: (βyest) S → Adv “yesterday” S*; (αa) D “a”; (αman) NP → D↓ N “man”; S → NP0↓ VP with V “saw” NP1↓; NP → N “Mary”; *: foot node, ↓: substitution node]
IA161 Syntactic Formalisms for Parsing Natural Languages 172 / 476

Lecture 5 Elementary trees (initial trees and auxiliary trees)
[figure: derived tree (α5): S → Ad “yesterday” S, where S → NP (D “a”, N “man”) VP (V “saw”, NP (N “Mary”))]
IA161 Syntactic Formalisms for Parsing Natural Languages 173 / 476

Lecture 5 Derivation tree
Specifies how a derived tree was constructed. The root node is labeled by an S-type initial tree. Other nodes are labeled by auxiliary trees in the case of adjoining, or by initial trees in the case of substitution.
A tree address of the parent tree is associated with each node.
[derivation tree: saw with daughters man (1), Mary (2.2), yest (0); man with daughter a (1)]
IA161 Syntactic Formalisms for Parsing Natural Languages 174 / 476

Lecture 5 Derivation tree and derived tree
[figure: the derivation tree above shown next to the derived tree (α5) for “Yesterday a man saw Mary”; dashed lines mark substitution operations, solid lines mark adjunction operations]
IA161 Syntactic Formalisms for Parsing Natural Languages 175 / 476

Lecture 5 Example 1: Harry likes peanuts passionately
Step 1: the elementary trees – NP “Harry”; NP “peanuts”; S → NP VP with VP → V “likes” NP; the auxiliary tree VP → VP* ADV “passionately”
Step 2: substitution – substituting “Harry” and “peanuts” into the NP slots of “likes” yields S → NP “Harry” VP (V “likes”, NP “peanuts”)
Step 3: adjunction – adjoining the “passionately” auxiliary tree at the VP node yields S → NP “Harry” VP (VP (V “likes”, NP “peanuts”), ADV “passionately”)
IA161 Syntactic Formalisms for Parsing Natural Languages 176 / 476

Lecture 5 Derivation tree and derived tree of Harry likes peanuts passionately
[derivation tree: likes with daughters Harry (1), peanuts (2.2), passionately (2); derived tree: S → NP “Harry” VP (VP (V “likes”, NP “peanuts”), ADV “passionately”)]
IA161 Syntactic Formalisms for Parsing Natural Languages 177 / 476

Lecture 5 Two important properties of TAG
Elementary trees can be of arbitrary size, so the domain of locality is increased: extended domain of locality (EDL)
Small initial trees can have multiple adjunctions inserted within them, so what are normally considered non-local phenomena are treated locally: factoring recursion from the domain of dependency (FRD)
IA161 Syntactic Formalisms for Parsing Natural Languages 178 / 476

Lecture 5 Extended domain of locality (EDL): Agreement
The lexical entry for a verb like “loves” will contain a tree like the following: S → NP3.sg↓ VP (V “loves”, NP↓). With EDL, we can easily state the agreement between the subject and the verb in a lexical entry. IA161 Syntactic Formalisms for Parsing Natural Languages 179 / 476

Lecture 5 Factoring recursion from the domain of dependency (FRD): Extraction
[figure: initial tree S’ → NPi[+wh] “who” S’ (COMP “that”, S → NP “Bill” VP (V “likes”, NP ei)); auxiliary tree S’ → COMP Φ S (INFL “did”, NP “John”, VP (V “tell”, NP “Sam”, S’*))]
. ...... The above trees for the sentence “who did John tell Sam that Bill likes?” allow the insertion of the auxiliary tree between the WH-phrase and its extraction site, resulting in a long-distance dependency; yet this is factored out from the domain of locality in TAG.
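A toy rendering of the two combination operations from the slides above, using nested Python lists ["Label", child1, ...] for trees; the encoding, including the "X*" marker for foot nodes and addresses as paths of child indices, is our own and purely illustrative.

def replace_at(tree, address, new):
    if not address:
        return new
    i = address[0]
    return [tree[0]] + [replace_at(c, address[1:], new) if j == i else c
                        for j, c in enumerate(tree[1:])]

def subtree_at(tree, address):
    for i in address:
        tree = tree[i + 1]
    return tree

def plug_foot(aux, subtree):
    if isinstance(aux, str):
        return aux
    if aux[0].endswith("*"):
        return subtree                    # the excised subtree fills the foot node
    return [aux[0]] + [plug_foot(c, subtree) for c in aux[1:]]

def substitute(tree, address, initial):   # substitution at a frontier node
    return replace_at(tree, address, initial)

def adjoin(tree, address, aux):           # adjoining an auxiliary tree
    return replace_at(tree, address, plug_foot(aux, subtree_at(tree, address)))

# "Harry likes peanuts" plus the auxiliary tree for "passionately":
alpha = ["S", ["NP", "Harry"], ["VP", ["V", "likes"], ["NP", "peanuts"]]]
beta = ["VP", ["VP*"], ["ADV", "passionately"]]
print(adjoin(alpha, [1], beta))
# ['S', ['NP', 'Harry'],
#  ['VP', ['VP', ['V', 'likes'], ['NP', 'peanuts']], ['ADV', 'passionately']]]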
IA161 Syntactic Formalisms for Parsing Natural Languages 180 / 476

Lecture 5 Factoring recursion from the domain of dependency (FRD): Extraction
[figure: the derived tree for “who did John tell Sam that Bill likes ei”, with the auxiliary “did John tell Sam” tree adjoined between “who” and “that Bill likes ei”]
IA161 Syntactic Formalisms for Parsing Natural Languages 181 / 476

Lecture 5 Variations of TAG
Feature-structure based TAG (FTAG: Joshi and Shanker, 1988): each of the nodes of an elementary tree is associated with two feature structures: top and bottom
[figure: substitution and adjoining with features – on substitution the top feature structures are unified (t ⊔ tr); on adjoining the top features of the adjunction site unify with those of the auxiliary tree's root and the bottom features with those of its foot]
IA161 Syntactic Formalisms for Parsing Natural Languages 182 / 476

Lecture 5 Variations of TAG
Synchronous TAG (STAG: Shieber and Schabes, 1990): a pair of TAGs characterizes correspondences between languages; semantic interpretation, language generation and translation
Multi-component TAG (MCTAG: Chen-Main and Joshi, 2007): a set of auxiliary trees can be adjoined to a given elementary tree
Probabilistic TAG (PTAG: Resnik, 1992, Shieber, 2007): associating a probability with each elementary tree; compute the probability of a derivation
IA161 Syntactic Formalisms for Parsing Natural Languages 183 / 476

Lecture 5 XTAG Project (UPenn, ongoing since 1987)
A long-term project to develop a wide-coverage grammar for English using the Lexicalized Tree-Adjoining Grammar (LTAG) formalism. Provides a grammar engineering platform consisting of a parser, a grammar development interface, and a morphological analyzer. The project extends to variants of the formalism, and to languages other than English. IA161 Syntactic Formalisms for Parsing Natural Languages 184 / 476

Lecture 5 XTAG system
[figure: pipeline – input sentence → morph analyzer and tagger (Morph DB, Stat DB) → P.O.S blender → tree selection (Trees DB, Syn DB, Lex Prob DB) → tree grafting → parser → derivation structure]
IA161 Syntactic Formalisms for Parsing Natural Languages 185 / 476

Lecture 5 Components in XTAG system
Morphological analyzer & Morph DB: 317K inflected items derived from over 90K stems
POS tagger & Lex Prob DB: Wall Street Journal-trained 3-gram tagger with N-best POS sequences
Syntactic DB: over 30K entries, each consisting of: the uninflected form of the word, its POS, a list of trees or tree families associated with the word, a list of feature equations
Tree DB: 1004 trees, divided into 53 tree families and 221 individual trees
IA161 Syntactic Formalisms for Parsing Natural Languages 186 / 476

Lecture 5 (a) Morphology database (b) syntactic database: interfaces to the database maintenance tools IA161 Syntactic Formalisms for Parsing Natural Languages 187 / 476

Lecture 5 Interface to the XTAG system
Parser evaluation in the XTAG Project by [Bangalore, S. et al., 1998] http://www.cis.upenn.edu/~xtag/ IA161 Syntactic Formalisms for Parsing Natural Languages 188 / 476

Lecture 5 How to parse the sentence in LFG? by Bresnan, J. and Kaplan, R.M., 1982 IA161 Syntactic Formalisms for Parsing Natural Languages 189 / 476

Lecture 5 Main representation structures
c-structure: constituent structure; the level where the surface syntactic form, including categorial information, word order and the phrasal grouping of constituents, is encoded
f-structure: functional structure; the internal structure of language where grammatical relations are represented; it is largely invariant across languages (e.g. SUBJ, OBJ, OBL, (X)COMP, (X)ADJ)
a-structure: argument structure; it encodes the number, type and semantic roles of the arguments of a predicate
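As a minimal illustration of the first two levels, here is an ad-hoc Python encoding (our own, for the phrase analysed in detail on the following slides): the c-structure as a nested tuple and the f-structure as a nested dictionary.

# c-structure: a plain phrase-structure tree for "with friends"
c_structure = ("PP", ("P", "with"), ("NP", ("N", "friends")))

# f-structure: grammatical relations as attribute-value pairs
f_structure = {
    "PRED": "'with'",
    "OBJ": {"PRED": "'friend'", "NUM": "PLURAL"},
}

# Note that a dict gives functional uniqueness (one value per attribute)
# for free; consistency, coherence and completeness, discussed below,
# still have to be checked explicitly.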
IA161 Syntactic Formalisms for Parsing Natural Languages 190 / 476

Lecture 5 Level of structures and their interaction in LFG
[figure: the functional projection architecture – argument structure, functional structure and constituent structure (LFG's focus), interfacing with semantic structure, information structure and phonological structure]
IA161 Syntactic Formalisms for Parsing Natural Languages 191 / 476

Lecture 5 Level of structures and their interaction in LFG
In LFG, the parsing result is grammatically correct only if it satisfies 2 criteria: 1 the grammar must be able to assign a correct c-structure; 2 the grammar must be able to assign a correct, well-formed f-structure IA161 Syntactic Formalisms for Parsing Natural Languages 192 / 476

Lecture 5 c-structure
[figure: c-structure trees – PP → P “with” NP (N “friends”); S → NP (N “I”) VP (V “saw”, NP (Det “the”, N “girl”))]
The constituent structure represents the organization of overt phrasal syntax. It provides the basis for phonological interpretation. Languages are very different on the c-structure level: external factors that usually vary by language.
. Properties of c-structure .. ......
c-structures are conventional phrase-structure trees: they are defined in terms of syntactic categories, terminal nodes, dominance and precedence. They are determined by a context-free grammar that describes all possible surface strings of the language. LFG does not reserve constituent-structure positions for affixes: all leaves are individual words.
IA161 Syntactic Formalisms for Parsing Natural Languages 193 / 476

Lecture 5 f-structure
[figure: attribute-value notation for the f-structure of “with friends”: PRED 'with', OBJ [PRED 'friend', NUM PLURAL]]
. ......
1 representation of the functional structure of a sentence
2 the f-structure matches with the c-structure
3 it has to satisfy three formal constraints: consistency, coherence, completeness
4 languages are similar on this level: this allows explaining cross-linguistic properties of phenomena
IA161 Syntactic Formalisms for Parsing Natural Languages 194 / 476

Lecture 5 Examples of f-structure
[figure: two example f-structures – 1: PRED 'e-mail ⟨SUBJ, OBJ, OBJ2⟩', TENSE PAST, SUBJ [PRED 'Sabine'], OBJ [...], OBJ2 [PRED 'homework', DEF +, NUM SG]; 2: PRED 'insist ⟨SUBJ, OBLon⟩', TENSE PAST, SUBJ [PRED 'Veit'], OBLon [PCASE OBLon, PRED 'teacher', DEF -, NUM SG]]
IA161 Syntactic Formalisms for Parsing Natural Languages 195 / 476

Lecture 5 Constraint 1: f-structure must be consistent
1 Two paths in the graph structure may designate the same element – called unification, or structure sharing. Ex: John must leave
[figure: f-structure with PRED 'must', XCOMP [PRED 'leave', SUBJ 1], SUBJ 1 [PRED 'John'] – the two SUBJ paths designate the same element]
IA161 Syntactic Formalisms for Parsing Natural Languages 196 / 476

Lecture 5 Constraint 1: f-structure must be consistent
2 Attributes are functionally unique – there may not be two arcs with the same attribute from the same f-structure
[figure: *inconsistent f-structure for 'sleep' with two SUBJ values (PRED 'Veit', PRED 'Tom') and two TENSE values (PAST, FUT)]
IA161 Syntactic Formalisms for Parsing Natural Languages 197 / 476

Lecture 5 Constraint 1: f-structure must be consistent
3 The symbols used for atomic f-structures are distinct – it is impossible to have two names for a single atomic f-structure (a “clash”)
[figure: *They sleeps is excluded – SUBJ [PRED 'pro', NUM SINGULAR/PLURAL clash] with PRED 'sleep']
IA161 Syntactic Formalisms for Parsing Natural Languages 198 / 476

Lecture 5 Constraint 2: f-structure must be coherent
All argument functions in an f-structure must be selected by the local PRED feature.
[figure: coherent f-structure for “John fell”: SUBJ [PRED 'John', NUM SG, PERS 3], TENSE PRES, PRED 'fall ⟨SUBJ⟩'; the *incoherent variant additionally contains an OBJ [PRED 'Mary', NUM SG, PERS 3] that is not selected by the PRED]
IA161 Syntactic Formalisms for Parsing Natural Languages 199 / 476

Lecture 5 Constraint 3: f-structure must be complete
All functions specified in the value of a PRED feature must be present in the f-structure of that PRED.
[figure: complete f-structure for 'like ⟨SUBJ, OBJ⟩' with SUBJ [PRED 'John', NUM SG, PERS 3] and OBJ [PRED 'Mary', NUM SG, PERS 3]; the *incomplete variant lacks the OBJ]
IA161 Syntactic Formalisms for Parsing Natural Languages 200 / 476

Lecture 5 Correspondence between different levels in LFG
[figure: the c-structure PP → P “with” NP (N “friends”) mapped node by node onto the f-structure PRED 'with', OBJ [PRED 'friend', NUM PLURAL]]
IA161 Syntactic Formalisms for Parsing Natural Languages 201 / 476

Lecture 5 Structural correspondence
c-structures and f-structures represent different properties of an utterance. How can these structures be associated properly with a particular sentence? Words and their ordering carry information about the linguistic dependencies in the sentence. This is represented by the c-structure (licensed by a CFG). LFG proposes simple mechanisms that map between elements of one structure and those of another: correspondence functions. A function allows mapping c-structures to f-structures: Φ : N → F IA161 Syntactic Formalisms for Parsing Natural Languages 202 / 476

Lecture 5 Mapping the c-structure into the f-structure
Since there is no isomorphic relationship between structure and function, LFG assumes both c-structure and f-structure. The mapping between c-structure and f-structure is the core of LFG's descriptive power. The mapping between c-structure and f-structure is located in the grammar (PS) rules.
[figure: c-structure of “the mouse admired the elephant” next to its f-structure – SUBJ [PRED 'mouse', DEF +, NUM SG, PERS 3], TENSE PAST, PRED 'admire ⟨SUBJ, OBJ⟩', OBJ [PRED 'elephant', DEF +, NUM SG, PERS 3]]
IA161 Syntactic Formalisms for Parsing Natural Languages 203 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 1: PS rules .. ......
Context-free phrase-structure rules, annotated with functional schemata. The mother node carries no functional schemata; the daughter nodes carry (a list of) functional schemata. Ex.: S → NP VP with (↑SUBJ)=↓ on NP and ↑=↓ on VP; VP → V (NP) with ↑=↓ on V and (↑OBJ)=↓ on NP.
Note: ↑=↓ is sometimes omitted (this means nodes without functional schemata percolate their entire functional schema unchanged to the mother node).
IA161 Syntactic Formalisms for Parsing Natural Languages 204 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 2: Lexicon entries .. ......
Lexicon entries consist of three parts: the representation of the word, its syntactic category, and a list of functional schemata. Ex.:
mouse N (↑PRED)='mouse' (↑PERS)=3 (↑NUM)=SG
the D (↑DEF)=+
admire V (↑PRED)='admire ⟨(↑SUBJ)(↑OBJ)⟩'
-ed Aff (↑TENSE)=PAST
IA161 Syntactic Formalisms for Parsing Natural Languages 205 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 3: c-structure .. ......
Like the PS rules, each node in the tree is associated with functional schemata. With the functional schemata of the lexical entries at the leaves we obtain a complete c-structure.
[figure: annotated c-structure – S → NP ((↑SUBJ)=↓) VP (↑=↓); NP → D (↑=↓) N (↑=↓); VP → V (↑=↓) NP ((↑OBJ)=↓); leaves: the (↑DEF)=+; mouse (↑PRED)='mouse', (↑PERS)=3, (↑NUM)=SG; admired (↑PRED)='admire ⟨(↑SUBJ)(↑OBJ)⟩', (↑TENSE)=PAST; the (↑DEF)=+; elephant (↑PRED)='elephant', (↑PERS)=3, (↑NUM)=SG]
IA161 Syntactic Formalisms for Parsing Natural Languages 206 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 4: Co-indexation .. ......
An f-structure is assigned to each node of the c-structure. Each of these f-structures obtains a name (f1 − fn). Nodes in the c-structure and the associated f-structures are co-indexed, i.e. they obtain the same name. The f-structure names f1 − fn can be chosen freely, but no name may occur twice.
[figure: the annotated c-structure with co-indexed empty f-structures f1[ ] … f9[ ]: S = f1, NP = f2 (D = f3, N = f4), VP = f5 (V = f6, NP = f7 (D = f8, N = f9))]
IA161 Syntactic Formalisms for Parsing Natural Languages 207 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 5: Metavariable binding .. ......
All meta-variables are replaced by the names of the f-structure representations, e.g. S → NP ((↑SUBJ)=↓) VP (↑=↓) with S = f1, NP = f2, VP = f5 becomes (f1 SUBJ) = f2 and f1 = f5.
[figure: the fully instantiated tree – S: (f1 SUBJ)=f2 on NP, f1=f5 on VP; NP: f2=f3 on D, f2=f4 on N; VP: f5=f6 on V, (f5 OBJ)=f7 on NP; NP: f7=f8 on D, f7=f9 on N; leaves: (f3 DEF)=+; (f4 PRED)='mouse', (f4 PERS)=3, (f4 NUM)=SG; (f6 PRED)='admire ⟨(f6 SUBJ)(f6 OBJ)⟩', (f6 TENSE)=PAST; (f8 DEF)=+; (f9 PRED)='elephant', (f9 PERS)=3, (f9 NUM)=SG]
IA161 Syntactic Formalisms for Parsing Natural Languages 208 / 476

Lecture 5 Mapping mechanism: 6 steps
. ...... We introduce at this point the notion of a functional equation. By listing all functional equations from a c-structure we obtain the functional description, called the f-description:
(f1 SUBJ) = f2        (f6 PRED) = 'admire ⟨(f6 SUBJ)(f6 OBJ)⟩'
f2 = f3               (f6 TENSE) = PAST
(f3 DEF) = +          (f5 OBJ) = f7
f2 = f4               f7 = f8
(f4 PRED) = 'mouse'   (f8 DEF) = +
(f4 PERS) = 3         f7 = f9
(f4 NUM) = SG         (f9 PRED) = 'elephant'
f1 = f5               (f9 PERS) = 3
f5 = f6               (f9 NUM) = SG
Table: f-description
IA161 Syntactic Formalisms for Parsing Natural Languages 209 / 476

Lecture 5 Mapping mechanism: 6 steps
. STEP 6: From f-description to f-structure .. ......
The computation of an f-structure is based on the f-description. For the derivation of f-structures from the f-description it is important that no information is lost and that no information is added. The derivation is done by the application of the functional equations.
List of functional equations:
a) simple equations of the form (fn A) = B
b) f-equations of the form fn = fm
c) f-equations of the form (fn A) = fm
→ Functional equations with the same name are grouped into an f-structure of the same name.
IA161 Syntactic Formalisms for Parsing Natural Languages 210 / 476

Lecture 5 Application of the functional equation (a): (fn A) = B
[figure: the simple equations DEF=+, PRED='mouse', PERS=3, NUM=SG, PRED='admire ⟨SUBJ, OBJ⟩', TENSE=PAST, DEF=+, PRED='elephant', PERS=3, NUM=SG are collected into small f-structures]
Application of the functional equation (b): fn = fm
[figure: the f-structures named by equated variables are unified, e.g. [DEF +] merges with [PRED 'mouse', PERS 3, NUM SG]]
IA161 Syntactic Formalisms for Parsing Natural Languages 211 / 476

Lecture 5 Application of the functional equation (c): (fn A) = fm
[figure: the unified sub-structures are embedded under the attributes SUBJ and OBJ, yielding SUBJ [PRED 'mouse', PERS 3, NUM SG, DEF +], PRED 'admire ⟨SUBJ, OBJ⟩', TENSE PAST, OBJ [PRED 'elephant', PERS 3, NUM SG, DEF +]]
IA161 Syntactic Formalisms for Parsing Natural Languages 212 / 476

Lecture 5
. STEP 1: lexical entries .. ......
made: V (↑PRED)='MAKE⟨SUBJ,OBJ,XCOMP⟩' (↑XCOMP SUBJ)=(↑OBJ) (↑TENSE)=SIMPLEPAST
gave: V (↑PRED)='GIVE⟨SUBJ,OBJ,OBJ2⟩' (↑TENSE)=SIMPLEPAST
had said: V (↑PRED)='SAY⟨SUBJ,OBJ⟩' (↑TENSE)=PASTPERFECT
the: D (↑PRED)='THE' (↑SPECTYPE)=DEF
about: P (↑PRED)='ABOUT⟨OBJ⟩'
which: N (↑PRED)='PRO' (↑PRONTYPE)=REL
John's: D (↑PRED)='JOHN' (↑SPECTYPE)=POSS
many: D (↑PRED)='MANY' (↑SPECTYPE)=QUANT
things: N (↑PRED)='THINGS' (↑NUM)=PLURAL
. STEP 2: c-structure .. ......
a. S → NP ((↑SUBJ)=↓) VP (↑=↓)
b. NP → { A (↑=↓) N (↑=↓) }
c. VP → V (↑=↓) NP ((↑OBJ)=↓) V ((↑XCOMP)=↓, (↑XCOMP PRED)='be ⟨SUBJ, PREDIC⟩')
d. V → NP ((↑PREDIC)=↓)
IA161 Syntactic Formalisms for Parsing Natural Languages 213 / 476

Lecture 5
. STEP 3: f-structure .. ......
'John made Peter angry'
[figure: annotated c-structure – S → NP ((↑SUBJ)=↓, N “John”) VP (↑=↓, V “made”, NP ((↑OBJ)=↓, N “Peter”), V ((↑XCOMP)=↓, NP ((↑PREDIC)=↓, A “angry”)))]
. STEP 4: unification .. ......
[figure: resulting f-structure – PRED 'make ⟨SUBJ, OBJ, XCOMP⟩', TENSE SIMPLEPAST, SUBJ [PRED 'John'], OBJ 1 [PRED 'Peter'], XCOMP [PRED 'be ⟨SUBJ, PREDIC⟩', SUBJ 1, PREDIC [PRED 'angry']]]
IA161 Syntactic Formalisms for Parsing Natural Languages 214 / 476

Lecture 6 . ...... Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 215 / 476

Lecture 6 . ......
Parsing with CCG IA161 Syntactic Formalisms for Parsing Natural Languages 216 / 476

Lecture 6 Outline 1 A-B categorial system 2 Lambek calculus 3 Extended categorial grammar: variations based on the Lambek calculus (Abstract Categorial Grammar, Categorial Type Logic); variations based on combinatory logic (Combinatory Categorial Grammar (CCG), Multi-modal Combinatory Categorial Grammar) IA161 Syntactic Formalisms for Parsing Natural Languages 217 / 476

Lecture 6 Categorial Grammar
is: a lexicalized theory of grammar, along with other theories of grammar such as HPSG, TAG, LFG, …
is: linguistically and computationally attractive → language-invariant combination rules, highly efficient parsing
IA161 Syntactic Formalisms for Parsing Natural Languages 218 / 476

Lecture 6 Main idea in CG and the application operation
All natural language consists of operators and operands. Operator (functor) and operand (argument); application: (operator(operand)); categorial type: typed operator and operand IA161 Syntactic Formalisms for Parsing Natural Languages 219 / 476

Lecture 6 1. A-B categorial system
. ...... The product of the directional adaptation by Bar-Hillel (1953) of Ajdukiewicz's calculus of syntactic connection (Ajdukiewicz, 1935)
Definition 1 (AB categories). Given A, a finite set of atomic categories, the set of categories C is the smallest set such that: A ⊆ C; (X\Y), (X/Y) ∈ C if X, Y ∈ C
IA161 Syntactic Formalisms for Parsing Natural Languages 220 / 476

Lecture 6 1. A-B categorial system
Categories (types): primitive categories and derived categories. Primitive: S for sentence, N for nominal phrase. Derived: S/N, N/N, (S\N)/N, NN/N, S/S . . .
Forward (>) and backward (<) functional application:
a. X/Y Y ⇒ X (>)
b. Y X\Y ⇒ X (<)
IA161 Syntactic Formalisms for Parsing Natural Languages 221 / 476

Lecture 6 1. A-B categorial system
The calculus on types in CG is analogous to algebraic operations:
. ...... x/y y → x ≈ 3/5 ∗ 5 = 3
Brazil defeated Germany: n, (s\n)/n, n; “defeated Germany” ⇒ s\n (>); “Brazil” with s\n ⇒ s (<)
IA161 Syntactic Formalisms for Parsing Natural Languages 222 / 476

Lecture 6 1. A-B categorial system
Applicative tree of Brazil defeated Germany:
[figure: defeated (operator) applies to Germany (operand), giving (defeated(Germany)); this in turn applies to Brazil (operand), giving ((defeated(Germany))Brazil)]
IA161 Syntactic Formalisms for Parsing Natural Languages 223 / 476

Lecture 6 Limitations of the AB system
1 Relative constructions:
a. teami that ti defeated Germany – [that](n\n)/(s\n) [defeated Germany]s\n
b. teami that Brazil defeated ti – [that](n\n)/(s/n) [Brazil defeated]s/n (?) – but with functional application alone, no category s/n can be derived for “Brazil defeated” from Brazil n and defeated (s\n)/n
2 Many other complex phenomena: coordination, object extraction, phrasal verbs, ...
3 AB's generative power is too weak – context-free
IA161 Syntactic Formalisms for Parsing Natural Languages 224 / 476

Lecture 6 2. Lambek calculus (Lambek, 1958, 1961)
the calculus of syntactic types; still context-free. The axioms of the Lambek calculus are the following (axioms 1 and 2, with inference rules 3, 4, 5):
1 x → x
2 (xy)z → x(yz), x(yz) → (xy)z
3 if xy → z then x → z/y; if xy → z then y → x\z
4 if x → z/y then xy → z; if y → x\z then xy → z
5 if x → y and y → z then x → z
IA161 Syntactic Formalisms for Parsing Natural Languages 225 / 476

Lecture 6 2. Lambek calculus (Lambek, 1958, 1961)
The rules obtained from the previous axioms are the following:
1 Hypothesis: if x and y are types, then x/y and y\x are types.
2 Application rules: (x/y)y → x, y(y\x) → x. Ex: Poor John works.
3 Associativity rule: (x\y)/z ↔ x\(y/z). Ex: John likes Jane.
4 Composition rules: (x/y)(y/z) → x/z, (x\y)(y\z) → x\z. Ex: He likes him – he s/(n\s), likes (n\s)/n, composing to s/n.
5 Type-raising rules: x → y/(x/y), x → (y/x)\y
IA161 Syntactic Formalisms for Parsing Natural Languages 226 / 476

Lecture 6 3. Combinatory Categorial Grammar
Developed originally by M. Steedman (1988, 1990, 2000, ...). Combinatory Categorial Grammar (CCG) is a grammar formalism equivalent to Tree Adjoining Grammar, i.e.: it is lexicalized; it is parsable in polynomial time (see Vijay-Shanker and Weir, 1990); it can capture cross-serial dependencies. Just like TAG, CCG is used for grammar writing. CCG is especially suitable for statistical parsing. IA161 Syntactic Formalisms for Parsing Natural Languages 227 / 476

Lecture 6 3. Combinatory Categorial Grammar
several of the combinators which Curry and Feys (1958) use to define the λ-calculus and applicative systems in general are of considerable syntactic interest (Steedman, 1988)
The relationships of these combinators to terms of the λ-calculus are defined by the following equivalences (Steedman, 2000b):
a. B f g ≡ λx.f(gx) ... composition
b. T x ≡ λf.fx ... type-raising
c. S f g ≡ λx.fx(gx) ... substitution
IA161 Syntactic Formalisms for Parsing Natural Languages 228 / 476

Lecture 6 CCG categories
Atomic categories: S, N, NP, PP, TV, … Complex categories are built recursively from atomic categories and slashes. Example complex categories for verbs: intransitive verb: S\NP (walked); transitive verb: (S\NP)/NP (respected); ditransitive verb: ((S\NP)/NP)/NP (gave) IA161 Syntactic Formalisms for Parsing Natural Languages 229 / 476

Lecture 6 Lexical categories in CCG
An elementary syntactic structure – a lexical category – is assigned to each word in a sentence, e.g. walked: S\NP – ‘give me an NP to my left and I return a sentence’. Think of the lexical category for a verb as a function: NP is the argument, S the result, and the slash indicates the direction of the argument. IA161 Syntactic Formalisms for Parsing Natural Languages 230 / 476

Lecture 6 The typed lexicon item
The CCG lexicon assigns categories to words, i.e. it specifies which categories a word can have. Furthermore, the lexicon specifies the semantic counterpart of the syntactic rules, e.g. love := (S\NP)/NP : λx.λy.loves′xy. The combinatory rules determine what happens with the category and the semantics on combination. IA161 Syntactic Formalisms for Parsing Natural Languages 231 / 476

Lecture 6 The typed lexicon item
Attribution of types to lexical items: examples
Predicate, ex. is: as an identificator of a nominal, as an operator of predication – from a nominal (S\NP)/NP; from an adjective (S\NP)/(N/N); from an adverb (S\NP)/((S\NP)\(S\NP)); from a preposition (S\NP)/((S\NP)\(S\NP)/NP)
ex. verbs: unary (S\NP); binary (S\NP)/NP; ternary (S\NP)/NP/NP
IA161 Syntactic Formalisms for Parsing Natural Languages 232 / 476

Lecture 6 The typed lexicon item
Adverbs: adverb of verb: (S\NP)/(S\NP), (S\NP)/NP/(S\NP)/NP; adverb of adverb: (S\NP)/(S\NP)/(S\NP)/(S\NP), (S\NP)/NP/(S\NP)/NP/(S\NP)/NP/(S\NP)/NP; adverb of adjective: (N/N)/(N/N), (N\N)/(N\N); adverb of proposition: S/S
. ...... Adverb: operator of determination, of type (X/X)
IA161 Syntactic Formalisms for Parsing Natural Languages 233 / 476

Lecture 6 The typed lexicon item
Prepositions: Prep. 1: constructor of adverbial phrases – (S\NP)\(S\NP)/NP, (S/S)/NP, (S/S)/N; Prep. 2: constructor of adjectival phrases – (N\N)/NP, (N\N)/N
. ......
Preposition: constructor of determination, of type (X/X) IA161 Syntactic Formalisms for Parsing Natural Languages 234 / 476

Lecture 6 Dictionary of typed words
Syntactic category | syntactic types | lexical entries
Nom. | N | Olivia, apple, …
Completed nom. | NP | an apple, the school
Pron. | NP | she, he, …
Adj. | (N/N), (N\N) | pretty woman, …
Adv. | (N/N)/(N/N), (S\NP)\(S\NP), … | very delicious, …
Vb | (S\NP), (S\NP)/NP, … | run, give, …
Prep. | (S\NP)\(S\NP)/NP, (NP\NP)/NP, … | run in the park, book of John, …
Relative | (S\NP)/S, … | I believe that …
IA161 Syntactic Formalisms for Parsing Natural Languages 235 / 476

Lecture 6 Combinatory categorial rules
Functional application (>, <); functional composition (> B, < B); type-raising (> T, < T); substitution (> S, < S); coordination (> Φ, < Φ) IA161 Syntactic Formalisms for Parsing Natural Languages 236 / 476

Lecture 6 Functional application (FA)
X/Y : f  Y : a ⇒ X : fa (forward functional application, >)
Y : a  X\Y : f ⇒ X : fa (backward functional application, <)
Combine a function with its argument:
[derivations: “John likes Mary” → ((likes(Mary))John): likes (S\NP)/NP applies to Mary NP giving S\NP, which applies to John NP giving S; “Mary sleeps” → (sleeps(Mary)): NP and S\NP give S]
The direction of the slash indicates the position of the argument with respect to the function. IA161 Syntactic Formalisms for Parsing Natural Languages 237 / 476

Lecture 6 Derivation in CCG
The combinatory rule used in each derivation step is usually indicated on the right of the derivation line. Note especially what happens with the semantic information:
John NP : John′;  loves (S\NP)/NP : λx.λy.loves′xy;  Mary NP : Mary′
loves Mary ⇒ S\NP : λy.loves′Mary′y (>)
John loves Mary ⇒ S : loves′Mary′John′ (<)
IA161 Syntactic Formalisms for Parsing Natural Languages 238 / 476

Lecture 6 Function composition (FC)
Generalized forward composition (> Bn): X/Y : f  Y/Z : g ⇒B X/Z : λx.f(gx) (> B)
Functional composition composes two complex categories (two functions):
(S\NP)/PP  PP/NP ⇒B (S\NP)/NP
S/(S\NP)  (S\NP)/NP ⇒B S/NP
[derivation: “birds like bugs” – birds NP ⇒T S/(S\NP), composed with like (S\NP)/NP gives S/NP (> B), which applies to bugs NP giving S (>)]
IA161 Syntactic Formalisms for Parsing Natural Languages 239 / 476

Lecture 6 Function composition (FC)
Generalized backward composition (< Bn): Y\Z : g  X\Y : f ⇒B X\Z : λx.f(gx) (< B)
[derivation fragment: “The referee gave Unsal a card and Rivaldo the ball” – the type-raised argument pairs combine by backward composition into s\((s/np)/np), which applies to gave (s/np)/np, giving s (<)]
IA161 Syntactic Formalisms for Parsing Natural Languages 240 / 476

Lecture 6 Type-raising (T)
Forward type-raising (> T): X : a ⇒ T/(T\X) : λf.fa (> T)
Type-raising turns an argument into a function (e.g. for case assignment): NP ⇒ S/(S\NP) (nominative)
[derivations: “birds fly” analysed either as NP S\NP ⇒ S (<), or with birds ⇒T S/(S\NP) applying to fly S\NP ⇒ S (>)]
This must be used e.g. in the case of WH-questions. IA161 Syntactic Formalisms for Parsing Natural Languages 241 / 476

Lecture 6 Example of functional composition (> B) and type-raising (T)
[derivation: “team that I thought that Brazil defeated” – that (n\n)/(s/np); I np ⇒T s/(s\np); thought (s\np)/s; that s/s; Brazil np ⇒T s/(s\np); defeated (s\np)/np; repeated forward composition (> B) yields s/np, then that applies (>) giving n\n, which combines with team n (<)]
IA161 Syntactic Formalisms for Parsing Natural Languages 242 / 476

Lecture 6 Example of functional composition (> B) and type-raising (T)
Backward type-raising (< T): X : a ⇒ T\(T/X) : λf.fa (< T)
Type-raising turns an argument into a function (e.g.
for case assignment): NP ⇒ (S\NP)\((S\NP)/NP) (accusative)
[derivation fragment: “The referee gave Unsal …” – the type-raised arguments combine into s\((s/np)/np), which applies to gave (s/np)/np, giving s (<)]
IA161 Syntactic Formalisms for Parsing Natural Languages 243 / 476

Lecture 6 Coordination (&)
X CONJ X ⇒Φ X (coordination, Φ)
[derivation fragment: give (VP/NP)/NP combined with the coordinated argument pairs “a dog …” of type VP\((VP/NP)/NP) yields VP (<)]
IA161 Syntactic Formalisms for Parsing Natural Languages 244 / 476

Lecture 6 Substitution (S)
Forward substitution (> S): (X/Y)/Z  Y/Z ⇒S X/Z
Applies to parasitic gaps such as the following: a. team that I persuaded every detractor of to support
[derivation: that (n\n)/(s/np); I np ⇒T s/(s\np); persuaded ((s\np)/(s\np))/np; every detractor of np/np; to support (s\np)/np; composition gives ((s\np)/(s\np))/np (> B), substitution gives (s\np)/np (> S), composition gives s/np (> B), then that applies (>) giving n\n]
IA161 Syntactic Formalisms for Parsing Natural Languages 245 / 476

Lecture 6 Substitution (S)
Backward crossed substitution (< S×): Y/Z  (X\Y)/Z ⇒S X/Z
Applies to parasitic gaps such as the following:
a. John watched without enjoying the game between Germany and Paraguay.
b. game that John watched without enjoying
. ...... game that John [watched](s\np)/np [without enjoying]((s\np)\(s\np))/np
[derivation: that (n\n)/(s/np); John np ⇒T s/(s\np); watched (s\np)/np; without enjoying ((s\np)\(s\np))/np; backward crossed substitution gives (s\np)/np (< S×), composition gives s/np (> B), then that applies (>) giving n\n]
IA161 Syntactic Formalisms for Parsing Natural Languages 246 / 476

Lecture 6 Limits on possible rules
The Principle of Adjacency: combinatory rules may only apply to entities which are linguistically realised and adjacent.
The Principle of Directional Consistency: all syntactic combinatory rules must be consistent with the directionality of the principal function. Ex: X\Y Y ≠> X
The Principle of Directional Inheritance: if the category that results from the application of a combinatory rule is a function category, then the slash defining directionality for a given argument in that category will be the same as the one defining directionality for the corresponding arguments in the input functions. Ex: X/Y Y/Z ≠> X\Z
IA161 Syntactic Formalisms for Parsing Natural Languages 247 / 476

Lecture 6 Semantics in CCG
CCG offers a syntax-semantics interface. The lexical categories are augmented with an explicit identification of their semantic interpretation, and the rules of functional application are accordingly expanded with an explicit semantics. Every syntactic category and rule has a semantic counterpart. The lexicon is used to pair words with syntactic categories and semantic interpretations: love := (S\NP)/NP : λx.λy.loves′xy IA161 Syntactic Formalisms for Parsing Natural Languages 248 / 476

Lecture 6 Semantics in CCG
The semantic interpretation of all combinatory rules is fully determined by the Principle of Type Transparency:
Categories: all syntactic categories reflect the semantic type of the associated logical form.
Rules: all syntactic combinatory rules are type-transparent versions of one of a small number of semantic operations over functions, including application, composition and type-raising.
IA161 Syntactic Formalisms for Parsing Natural Languages 249 / 476

Lecture 6 Semantics in CCG
proved := (S\NP3s)/NP : λx.λy.prove′xy – the semantic type of the reduction is the same as its syntactic type, here functional application.
[derivation: Marcel NP3sm : marcel′; proved (S\NP3s)/NP : λx.λy.prove′xy; completeness NP : completeness′; proved completeness ⇒ S\NP3s : λy.prove′completeness′y (>); Marcel proved completeness ⇒ S : prove′completeness′marcel′ (<)]
IA161 Syntactic Formalisms for Parsing Natural Languages 250 / 476

Lecture 6 Semantics in CCG
CCG with semantics: Mary will copy and file without reading these articles
[derivation: Mary will S/VP; copy VP/NP : copy′; and CONJ : and′; file VP/NP : file′; without (VP\VP)/VPing : λp.λq.without′pq; reading VPing/NP : read′; these articles NP : articles′. Composition gives “without reading” (VP\VP)/NP : λx.λq.without′(read′x)q (> B); coordination and backward application give “copy and file without reading” VP/NP : λx.and′(without′(read′x)(file′x))(copy′x) (<); applying to “these articles” gives VP, and “Mary will” yields S : will′(and′(without′(read′articles′)(file′articles′))(copy′articles′))mary′ (>)]
IA161 Syntactic Formalisms for Parsing Natural Languages 251 / 476

Lecture 6 Parsing a sentence in CCG
Step 1: tokenization
Step 2: tagging the concatenated lexicon
Step 3: calculating on the types attributed to the concatenated lexicon by applying the adequate combinatory rules; eliminating the applied combinators (we will see how to do this next week)
Step 4: finding the parsing results, presented in the form of an operator/operand structure (predicate-argument structure)
IA161 Syntactic Formalisms for Parsing Natural Languages 252 / 476

Lecture 6 Parsing a sentence in CCG
Example: I requested and would prefer musicals
STEP 1: tokenization/lemmatization → ex) POS tagger, tokenizer, lemmatizer
a. I-requested-and-would-prefer-musicals
b. I-request-ed-and-would-prefer-musical-s
STEP 2: tagging the concatenated expressions → ex) supertagger, inventory of typed words
I NP; requested (S\NP)/NP; and CONJ; would (S\NP)/VP; prefer VP/NP; musicals NP
IA161 Syntactic Formalisms for Parsing Natural Languages 253 / 476

Lecture 6 Parsing a sentence in CCG
STEP 3: categorial calculus
c. apply the coordination rules – coordination (< & >): X conj X ⇒ X
b. apply the functional composition rules – forward composition (> B): X/Y : f  Y/Z : g ⇒ X/Z : Bfg
a.
apply the type-raising rules – subject type-raising (> T): NP : a ⇒ T/(T\NP) : Ta
7/ S
6/ S/NP  NP (>)
5/ S/(S\NP)  (S\NP)/NP  NP (> B)
4/ S/(S\NP)  (S\NP)/NP  NP (> Φ)
3/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/NP  NP (> B)
2/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP (> T)
1/ NP  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP
I - requested - and - would - prefer - musicals
IA161 Syntactic Formalisms for Parsing Natural Languages 254 / 476

Lecture 6 Parsing a sentence in CCG
STEP 4: semantic representation (predicate-argument structure)
7/ S : and′(will′(prefer′musicals′)i′)(request′musicals′i′)
6/ : λy.and′(would′(prefer′musicals′)y)(request′musicals′y)
5/ : λx.λy.and′(will′(prefer′x)y)(request′xy)
4/ : λx.λy.and′(will′(prefer′x)y)(request′xy)
3/ : λx.λy.will′(prefer′x)y
2/ : λf.f i′
1/ : i′ : request′ : and′ : will′ : prefer′ : musicals′
I requested and would prefer musicals
IA161 Syntactic Formalisms for Parsing Natural Languages 255 / 476

Lecture 6 Variation of CCG: Multi-modal CCG (Baldridge, 2002)
A modalized CCG system: a combination of Categorial Type Logic (CTL, Morrill, 1994; Moortgat, 1997) with CCG (Steedman, 2000). Rule restrictions are expressed by introducing the modalities ⋆, ×, •, ♢.
Modalized functional composition rules:
(> B) X/♢Y Y/♢Z ⇒ X/♢Z
(< B) X\♢Y Y\♢Z ⇒ X\♢Z
We invite you to read the paper “Multi-Modal CCG” by Baldridge and M. Kruijff (2003).
IA161 Syntactic Formalisms for Parsing Natural Languages 256 / 476

Lecture 6 The positions of several formalisms on the Chomsky hierarchy
[figure: unrestricted (Turing complete): CTL; context-sensitive: CTL with non-expanding rules; mildly context-sensitive: Multiset-CCG, CCG, TAG; context-free: AB, CTL Base Logic, Lambek calculus]
IA161 Syntactic Formalisms for Parsing Natural Languages 257 / 476

Lecture 7 . ...... Syntactic Formalisms for Parsing Natural Languages Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang) ia161@nlp.fi.muni.cz Autumn 2013 IA161 Syntactic Formalisms for Parsing Natural Languages 258 / 476

Lecture 7 . ...... Parsing with HPSG IA161 Syntactic Formalisms for Parsing Natural Languages 259 / 476

Lecture 7 Overview of syntactic formalisms
Unification-based grammars: HPSG, LFG, TAG, UCG, ...
Dependency-based grammars: Tesnière's model; the Meaning-Text theory of Mel'čuk, ...
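Since the formalisms in the first group all rest on unification of attribute-value structures, a minimal Python sketch of that operation may help before the HPSG details; the dict encoding and names are our own, illustrative simplifications (no types, no structure sharing).

class UnificationFailure(Exception):
    pass

def unify(f, g):
    if f == g:                            # identical values (atomic or complex)
        return f
    if isinstance(f, dict) and isinstance(g, dict):
        out = dict(f)
        for attr, val in g.items():       # merge attribute by attribute
            out[attr] = unify(out[attr], val) if attr in out else val
        return out
    raise UnificationFailure(f"{f!r} clashes with {g!r}")

print(unify({"NUM": "SG"}, {"PERS": "3"}))   # {'NUM': 'SG', 'PERS': '3'}
try:
    unify({"NUM": "SG"}, {"NUM": "PL"})      # atomic clash, cf. *They sleeps
except UnificationFailure as e:
    print("fails:", e)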
IA161 Syntactic Formalisms for Parsing Natural Languages 260 / 476

Lecture 7 Heritage of HPSG
GPSG – Generalized Phrase-Structure Grammar (Gerald Gazdar): linear order vs. hierarchical order; feature structures for the representation of information
LFG: the lexicon contains lexical rules
CG: subcategorization
IA161 Syntactic Formalisms for Parsing Natural Languages 261 / 476

Lecture 7 Key points of HPSG
A monostratal theory without derivation; sharing of given information without movement and transformation; one representation for different levels of analysis: phonology, syntax, semantics; constraint-based analysis; unification of given information; a computational formalism IA161 Syntactic Formalisms for Parsing Natural Languages 262 / 476

Lecture 7 Syntactic representation in HPSG
Typed feature structures consist of attribute/value pairs; the types are organized into a hierarchy, e.g. sign > phrase, case > nominative; a feature structure is a directed acyclic graph (DAG), with arcs representing features going between values IA161 Syntactic Formalisms for Parsing Natural Languages 263 / 476

Lecture 7 Features
The basic element of structure in HPSG; each feature should be appropriate to a type. Most frequently used features: PHON, SYNSEM, LOC/NON-LOC, CAT, CONTEXT, CONTENT, HEAD, SUBJ, COMPS, S-ARG IA161 Syntactic Formalisms for Parsing Natural Languages 264 / 476

Lecture 7 Types
Types are attributed to features → typed features
[figure: features of sign pointing to the types synsem, head, phrase, content, index, ....]
Each of these feature values is itself a complex object: the type sign has the features PHON and SYNSEM appropriate for it; the feature SYNSEM has a value of type synsem; this type itself has relevant features (LOCAL and NONLOCAL) IA161 Syntactic Formalisms for Parsing Natural Languages 265 / 476

Lecture 7 Types
sign is the basic type in HPSG, used to describe lexical items (of type word) and phrases (of type phrase). All signs carry the following two features: PHON encodes the phonological representation of the sign; SYNSEM its syntax and semantics:
sign [ PHON list(phon-string), SYNSEM synsem ]
IA161 Syntactic Formalisms for Parsing Natural Languages 266 / 476

Lecture 7 Types
In attribute-value matrix (AVM) form, here is the skeleton of a sign object:
sign [ PHON list(phon-string), SYNSEM synsem [ LOCAL local, NON-LOCAL non-local ], DTRS list(sign) ]
IA161 Syntactic Formalisms for Parsing Natural Languages 267 / 476

Lecture 7 Structure of signs in HPSG
synsem introduces the features LOCAL and NONLOCAL; local introduces CATEGORY (CAT), CONTENT (CONT) and CONTEXT (CONX); non-local will be discussed in connection with unbounded dependencies; category includes the syntactic category and the grammatical arguments of the word/phrase IA161 Syntactic Formalisms for Parsing Natural Languages 268 / 476

Lecture 7 Description of an object in HPSG: lexical sign and phrasal sign
sign [ PHON list(phon-string), SYNSEM synsem ], with subtypes word and phrase [ DTRS constituent-struc ]
synsem [ LOCAL local, NON-LOCAL non-local ]
local [ CATEGORY category, CONTENT content, CONTEXT context ]
category [ HEAD head, VAL ... ]
CATEGORY
CATEGORY encodes the sign's syntactic category
Given via the feature [HEAD head], where head is the supertype of noun, verb, adjective, preposition, determiner and marker; each of these types selects a particular set of head features
Given via the feature [VALENCE ...], which makes it possible to combine the sign with other signs into larger phrases
SYNSEM|LOC|CAT|VALENCE: [valence | SUBJECT list(synsem) | SPECIFIER list(synsem) | COMPLEMENTS list(synsem)]

Sub-categorization of the head type
vform: finite, infinitive, base, gerund, present-part., past-part., passive-part.
case: nominative, accusative
pform: of, to, ...

Description of an object in HPSG
sign: [PHON list(phon-string) | SYNSEM synsem], with subtypes word and phrase [DTRS constituent-struc]
synsem: [LOCAL local | NON-LOCAL non-local]
local: [CATEGORY category | CONTENT content | CONTEXT context]
category: [HEAD head | VAL ...]

Semantic representation: the CONTENT (& CONTEXT) feature
The semantic interpretation of the sign is given as the value of CONTENT
nominal-object: an individual/entity (or a set of them), associated with a referring index and bearing agreement features → INDEX, RESTR
parameterized-state-of-affairs (psoa): a partial state of affairs; an event relation together with role names identifying the participants of the event → BACKGR
quantifier: some, all, every, a, the, ...
Note: many of these have been reformulated by Minimal Recursion Semantics (MRS), which allows underspecification of quantifier scope.

Sub-categorization of the content type
content: psoa, nom-obj [INDEX index | RESTR set(psoa)], ...
laugh': [LAUGHER ref]
give': [GIVER ref | GIVEN ref | GIFT ref]
drink': [DRINKER ref | DRUNKEN ref]
think': [THINKER ref | THOUGHT psoa]
Note: semantic restrictions on the index are represented as the value of RESTR. RESTR is an attribute of a nominal object, and its value is a set of psoas. In turn, RESTR has the attribute REL, whose value can be either referential indices or psoas.

Sub-categorization of the index type
index: [PERSON person | NUMBER number | GENDER gender], with subtypes referential, there, it
person: first, second, third
number: singular, plural
gender: masculine, feminine, neuter

Lexical input of she
[AVM figure: the word she, with PHON ⟨she⟩; CATEGORY with HEAD noun [CASE nom] and empty SUBJ, COMPS and SPR lists; CONTENT a ppro whose INDEX ref is [PER 3rd, NUM sing, GEND fem]; CONTEXT BACKGR a psoa [RELN female, INST the shared index]]

Lexical input of she
sign: [PHON list(phon-string) | SYNSEM synsem], with subtypes word and phrase [DTRS constituent-struc]
Each phrase has a DTRS attribute which has a constituent-structure value
This DTRS value corresponds to what we view in a tree as daughters (with additional grammatical role information, e.g. adjunct, complement, etc.)
By distinguishing different kinds of constituent structures, we can define different kinds of constructions in a language
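As an illustration of how such typed feature structures might be encoded in practice, here is a minimal sketch using nested Python dictionaries. The encoding and the toy type hierarchy are our own simplifications, not the representation used by any actual HPSG implementation.

```python
# A lexical sign for "she", encoded as a nested dict mirroring the AVM above.
# Feature names follow the slides; the dict encoding itself is illustrative.
she = {
    "PHON": ["she"],
    "SYNSEM": {
        "LOCAL": {
            "CATEGORY": {
                "HEAD": {"type": "noun", "CASE": "nom"},
                "VALENCE": {"SUBJ": [], "SPR": [], "COMPS": []},
            },
            "CONTENT": {
                "type": "ppro",
                "INDEX": {"PER": "3rd", "NUM": "sing", "GEND": "fem"},
            },
        }
    },
}

# A toy type hierarchy (sign > word/phrase, head > noun/verb/...):
SUBTYPES = {"sign": {"word", "phrase"},
            "head": {"noun", "verb", "adj", "prep", "det", "marker"}}

def is_subtype(t, supertype):
    """True if t is the supertype itself or listed below it in the toy hierarchy."""
    return t == supertype or t in SUBTYPES.get(supertype, set())
```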
Structure of phrase
[Type hierarchy figure: constituent-struc branches into head-struc and coord-struc; head-struc has subtypes head-adj-struc (ADJ-DTR sign), head-filler-struc (FILL-DTR sign), head-mark-struc (MARK-DTR sign), head-spr-struc (SPR-DTR), head-subj-struc (SUBJ-DTR) and head-comps-struc (COMPS-DTR); coord-struc has HEAD-DTR word, CONJ-DTRS set(sign) and CONJUNCTION-DTR]

head-subject/complement structure
[AVM figure: a head-subj-struc phrase whose head daughter is a head-comps-struc phrase headed by a finite verb; the SUBJ and COMPS valence requirements of the head are checked off against the daughters as one moves up the tree]

Questions! (1)
How exactly did the last example work?
drink has head information specifying that it is a finite verb and subcategorizes for a subject and an object
The head information gets percolated up (the HEAD feature principle)
The valence information gets "checked off" as one moves up the tree (the VALENCE principle)
Such principles are treated as linguistic universals in HPSG

HEAD-feature principle
The value of the HEAD feature of any headed phrase is token-identical with the HEAD value of the head daughter
[AVM: phrase with SYNSEM|LOC|CAT|HEAD [1] and DTRS head-struc [HEAD-DTR|SYNSEM|LOC|CAT|HEAD [1]]]

VALENCE principle
In a headed phrase, for each valence feature F, the F value of the head daughter is the concatenation of the phrase's F value with the list of the F-DTR's SYNSEM values (Pollard and Sag, 1994:348)
[AVM: phrase with SS|LOC|CAT|VAL [SUBJ [a], COMPS [b]]; HEAD-DTR with SS|LOC|CAT|VAL [SUBJ ⟨[1]⟩ ⊕ [a], COMPS ⟨[2],...,[n]⟩ ⊕ [b]]; SUBJ-DTR with SS [1]; COMP-DTRS with SS [2], ..., [n]]
Note: the Valence Principle constrains the way in which information is shared between phrases and their head daughters.
F can be any one of SUBJ, COMPS, SPR
When the F-DTR is empty, the F valence feature of the head daughter is copied to the mother phrase

Questions! (2)
Note that agreement is handled neatly, simply by the fact that the SYNSEM values of a word's daughters are token-identical to the items on the VALENCE lists
How exactly do we decide on a syntactic structure?
Why is the subject checked off at a higher point in the tree?

Immediate Dominance (ID) Principle
Every headed phrase must satisfy exactly one of the ID schemata
The exact inventory of valid ID schemata is language-specific
We will introduce a set of ID schemata for English

Immediate Dominance (ID) Schemata
[AVM figure: a phrase with DTRS head-struc satisfies one of: head-subj-struc with empty COMPS (head-subject), head-comps-struc (head-complement), head-spr-struc (head-specifier), head-marker-struc with MARK-DTR|SS|LOC|CAT|HEAD marker (head-marker), or head-adj-struc with ADJ-DTR|SS|LOC|CAT|HEAD|MOD token-identical to the HEAD-DTR's SS (head-adjunct)]

head-adjunct structure
[AVM figure: a head-adj-struc combining an adjective (PRD −, MOD [1]) with a noun head; the adjunct's MOD value is token-identical to the head daughter's SYNSEM, and the shared SPR requirement (a determiner) is passed up to the mother]

Semantic principle
The CONTENT value of a headed phrase is token-identical to the CONTENT value of the semantic head daughter
The semantic head daughter is identified as
  the ADJ-DTR in a head-adjunct phrase
  the HEAD-DTR in other headed phrases
[AVM figure: in a head-adj-struc the mother's SYNSEM|LOC|CONT is shared with ADJ-DTR|SYNSEM|LOC|CONT; in other headed phrases it is shared with HEAD-DTR|SYNSEM|LOC|CONT]

Example 2: Kim likes bagels
[AVM figure: lexical sign for Kim, a word with PHON ⟨Kim⟩, HEAD noun, empty SUBJ/SPR/COMPS and ARG-ST, CONTENT with a 3sg INDEX and a named_rel(Kim) restriction]

Example 2: Kim likes(1) bagels
[AVM figure: lexical sign for likes, a word with PHON ⟨likes⟩, HEAD verb [FORM fin], SUBJ ⟨NP[3sg]⟩, empty SPR, COMPS ⟨NP⟩, ARG-ST the concatenation of the two lists, and CONTENT whose RELS contain like_rel (EVENT, ARG1, ARG2) and t_overlap_rel (ARG1 the event, ARG2 now)]

Example 2: Kim likes(2) bagels
[AVM figure: the same entry for likes in abbreviated form, with SUBJ ⟨NP[3sg]⟩, COMPS ⟨NP⟩ and the like_rel/t-overlap_rel relations]

Example 2: Kim likes bagels
[AVM figure: lexical sign for bagels, a word with PHON ⟨bagels⟩, HEAD noun [AGR pl], SPR ⟨DetP⟩, empty SUBJ/COMPS, and CONTENT restricted by bagel_rel]

Example 2: head-complement schema
[AVM figure: a head-comps-ph whose PHON and RELS values are the concatenations of the head daughter's and the non-head daughters' values; the head daughter's COMPS list is checked off against the SYNSEM values of the non-head daughters, while HEAD, SUBJ, SPR, INDEX and KEY are shared with the mother]
Example 2: head-complement schema headed by likes
[AVM figure: the head-complement schema instantiated with HEAD-DTR likes (HEAD verb [FORM fin], SUBJ ⟨NP[3sg]⟩, COMPS ⟨NP⟩) and the complement NP as non-head daughter; the mother's RELS concatenate the like_rel/t-overlap_rel relations with the complement's relations]

Example 2: Kim likes bagels
[AVM figure: the resulting head-comps-ph with PHON ⟨likes, bagels⟩, HEAD verb [FORM fin], SUBJ ⟨NP[3sg]⟩, empty COMPS, and RELS containing like_rel, t-overlap_rel(now) and bagel_rel]

Example 2: head-subject schema
[AVM figure: a head-subj-ph whose head daughter is a finite verbal phrase with SUBJ ⟨[1]⟩ and empty SPR/COMPS; the subject's SYNSEM is checked off against [1], and PHON and RELS are concatenated]

Example 2: head-subject schema headed by likes bagels
[AVM figure: the head-subject schema instantiated with head daughter likes bagels and the subject NP as non-head daughter]

Example 2: Kim likes bagels
[AVM figure: the complete head-subj-ph with PHON ⟨Kim, likes, bagels⟩, HEAD verb [FORM fin], all valence lists empty, and RELS containing named_rel(Kim), like_rel, t-overlap_rel(now) and bagel_rel]

Example 2: tree of Kim likes bagels
[Tree figure: a head-subj-ph dominating the word Kim (HEAD noun) and a head-comps-ph, which in turn dominates the words likes (HEAD verb) and bagels (HEAD noun); the SUBJ and COMPS valence lists are checked off at each level]

Compare HPSG to CFG
Each sign or HPSG rule consists of SYNSEM, DTRS, and PHON parts.
The SYNSEM part specifies how the syntax and semantics of the phrase (or word) are constrained. It corresponds roughly to the left-hand side of CFG rules but contains much more information.
The DTRS part specifies the constituents that make up the phrase (if it is a phrase). (Each of these constituents is a complete sign.) This corresponds to part of the information on the right-hand side of CFG rules, but not to the ordering information.
The PHON part specifies the ordering of the constituents in DTRS (where this is constrained) and their pronunciation (if this is specifiable). This corresponds to the ordering information on the right-hand side of CFG rules.

Simulation of a bottom-up parsing algorithm in HPSG
Unify input lexical signs with lexical signs in the lexicon.
Until no more such unifications are possible, unify instantiated signs with the daughters of instantiated phrasal signs, or with phrasal signs in the grammar.
If all instantiated signs but one saturated sign (S) are associated with daughters of other instantiated signs, and the PHON value of every instantiated sign is completely specified, return the complete S structure; else fail.

Example 2: processing of unification, Kim walks
The words in the sentence specify only their pronunciations and their positions.
1 [PHON ((0 1 kim))]
2 [PHON ((1 2 walks))]
STEP 1: Unifying 1 with the lexical entry for Kim gives
3 [PHON ((0 1 kim))
   SYNSEM [CAT [HEAD noun SUBCAT ()]
           CONTENT [INDEX 1 [PER 3rd NUM sing]]
           CONTEXT [BACKGR {[RELN naming BEARER 1 NAME Kim]}]]]
We now know something about the meaning of Kim (it refers to somebody named Kim) and something about its syntactic properties (it is third person singular).

Example 2: processing of unification
STEP 2: Unifying 2 with the lexical entry for walks gives
4 [PHON ((1 2 walks))
   SYNSEM [CAT [HEAD [VFORM fin]
                SUBCAT ([CAT [HEAD noun SUBCAT ()]
                         CONTENT [INDEX 1 [PER 3rd NUM sing]]])]
           CONTENT [RELN walk WALKER 1]]]
We know that walks refers to walking and that it requires a subject noun phrase which refers to the walker, but doesn't require any object.

Example 2: processing of unification
HEAD-DTR rule:
[SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4]
 DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4] PHON 3]
       SUBJ-DTRS ()]
 PHON 3]
STEP 3: Unifying 4 with the HEAD-DTR of this rule gives
5 [SYNSEM [CAT [HEAD [VFORM fin]
                SUBCAT 2([CAT [HEAD noun SUBCAT ()]
                          CONTENT [INDEX 1 [PER 3rd NUM sing]]])]
           CONTENT 4[RELN walk WALKER 1]]
   DTRS [HEAD-DTR [SYNSEM [CAT [HEAD [VFORM fin] SUBCAT (2)]] CONTENT [4]
                   PHON 3((1 2 walks))]
         SUBJ-DTRS ()]
   PHON 3((1 2 walks))]
Now we have a VP with the intransitive verb walks as its head (and only constituent).

Example 2: processing of unification
HEAD-DTR rule:
6 [SYNSEM [CAT [HEAD 1 SUBCAT ()] CONTENT 4]
   DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4] PHON 3]
         SUBJ-DTRS ([PHON 5 SYNSEM 2])]
   PHON (5 < 3)]
STEP 4: Unifying 5 with the HEAD-DTR of this rule gives
7 [SYNSEM [CAT [HEAD 1[VFORM fin SUBCAT ()]] CONTENT 4[RELN walk WALKER ]]
   DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1[VFORM fin]
                                SUBCAT 2([CAT [HEAD noun SUBCAT ()]
                                          CONTENT [INDEX [PER 3rd NUM sing]]])]
                           CONTENT [RELN walk WALKER 4]]
                   PHON 3((1 2 walks))]
         SUBJ-DTRS ([PHON 5 SYNSEM 2[CAT [HEAD noun SUBCAT ()] CONTENT [INDEX ]]])]
   PHON (5 < 3((1 2 walks)))]

Example 2: processing of unification
STEP 5: Unifying 3 with the SUBJ-DTR of 7 gives
8 [SYNSEM [CAT [HEAD [VFORM fin SUBCAT ()]]
           CONTENT [RELN walk WALKER [PER 3rd NUM sing]]]
   DTRS [HEAD-DTR [SYNSEM [CAT [HEAD [VFORM fin]
                                SUBCAT ([CAT [HEAD noun SUBCAT ()]
                                         CONTENT [INDEX [PER 3rd NUM sing]]])
                           CONTENT [RELN walk WALKER [PER 3rd NUM sing]]]
                   PHON ((1 2 walks))]
         SUBJ-DTRS ([PHON ((0 1 kim))
                     SYNSEM [CAT [HEAD noun SUBCAT ()]
                             CONTENT [INDEX [PER 3rd NUM sing]]]])]
   PHON ((0 1 kim) (1 2 walks))]
Now the subject of the sentence is pronounceable, and we're done.
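The walkthrough above is driven entirely by unification. Here is a minimal sketch of unification over dictionary-encoded feature structures; it has no reentrancy (structure sharing) and no type hierarchy, so it only approximates the real HPSG operation.

```python
def unify(a, b):
    """Unify two dict-encoded feature structures; return None on a clash.
    Atoms unify iff equal; dicts unify feature by feature. No reentrancy."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for feat, bval in b.items():
            if feat in out:
                u = unify(out[feat], bval)
                if u is None:
                    return None          # feature clash
                out[feat] = u
            else:
                out[feat] = bval
        return out
    return a if a == b else None

# "Kim" from the input vs. the lexicon: the positional PHON value unifies
# with the lexical entry, enriching the sign as in STEP 1 above.
token = {"PHON": ("0", "1", "kim")}
lexical = {"PHON": ("0", "1", "kim"),
           "SYNSEM": {"CAT": {"HEAD": "noun", "SUBCAT": ()},
                      "CONTENT": {"INDEX": {"PER": "3rd", "NUM": "sing"}}}}
sign = unify(token, lexical)                       # the full sign for Kim
clash = unify({"NUM": "sing"}, {"NUM": "plur"})    # -> None (agreement clash)
```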
Phenomena covered by HPSG parsers
Case assignment
Word order: scrambling
Long-distance dependencies
Coordination
Scope of adverbs and negation
Topic drop
Agreement
Relative clauses
...

Example 3: unbounded dependency constructions
An unbounded dependency construction
  involves constituents with different functions
  involves constituents of different categories
  is in principle unbounded
Two kinds of unbounded dependency constructions (UDCs): strong UDCs and weak UDCs

Strong UDCs
An overt constituent occurs in a non-argument position:
Topicalization: Kim_i, Sandy loves _i.
Wh-questions: I wonder [who_i Sandy loves _i].
Wh-relative clauses: This is the politician [who_i Sandy loves _i].
It-clefts: It is Kim_i [who_i Sandy loves _i].
Pseudoclefts: [What_i Sandy loves _i] is Kim_i.

Weak UDCs
No overt constituent in a non-argument position:
Purpose infinitives (for-to clauses): I bought it_i for Sandy to eat _i.
Tough movement: Sandy_i is hard to love _i.
Relative clauses without an overt relative pronoun: This is [the politician]_i [Sandy loves _i].
It-clefts without an overt relative pronoun: It is Kim_i [Sandy loves _i].

Using the feature SLASH
To account for UDCs, we will use the feature SLASH (so named because it comes from notation like S/NP, meaning an S missing an NP)
This is a non-local feature which originates with a trace, gets passed up the tree, and is finally bound by a filler

The bottom of a UDC: traces
[AVM figure: a word with empty PHON whose LOCAL value [1] is structure-shared with NONLOC|INHERITED|SLASH {[1]}, with empty TO-BIND|SLASH]
A trace is phonologically null, but structure-shares its LOCAL and SLASH values

Traces
Because the LOCAL value of a trace is structure-shared with the SLASH value, constraints on the trace will be constraints on the filler.
For example, hates specifies that its object be accusative, and this case information is local
So, the trace has [SYNSEM|LOCAL|CAT|HEAD|CASE acc] as part of its entry, and thus the filler will also have to be accusative
*He_i/Him_i, John likes _i

The middle of a UDC: the Nonlocal Feature Principle (NFP)
For each NON-LOCAL feature, the inherited value on the mother is the union of the inherited values on the daughters minus the to-bind value on the head daughter.
In other words, the slash information (which is part of INHERITED) percolates "up" the tree
This allows all the local information of a trace to "move up" to the filler

The top of a UDC: filler-head structures
Example of a structure licensed by the filler-head schema:
[AVM figure: the filler daughter's LOCAL value [1] is a member of the head daughter's INHERITED|SLASH and of its TO-BIND|SLASH, so it does not propagate further]

The analysis of the UDC example John_i we know she likes _i
[Tree figure: the trace's INHERITED|SLASH value [3] percolates up from the object position of likes through the V, VP and S nodes, and is bound by the filler John in the topmost filler-head structure]

Example 4: John reads a new book
[AVM figure: lexical sign for reads, a word with HEAD verb [VFORM fin, AUX bool, INV bool], SUBJ ⟨NP[nom, −PRD]⟩, COMPS ⟨NP[acc, −PRD]⟩, and CONTENT read with READER and READEE roles]

Example 4: John reads a new book
[AVM figure: lexical sign for new, a word with HEAD adj [PRD −, MOD a nominal synsem]; its CONTENT is a nom-obj which adds the restriction new(x) to the modified noun's RESTR]

Example 4: John reads a new book
Note: apply the head-adjunct schema
[AVM figure: new book as a head-adjunct structure; the adjunct new modifies the head book [PER 3rd, NUM sg, GEND neut], the shared SPR requirement is passed up, and the mother's RESTR contains both the book and the new psoas]

Example 4: John reads a new book
[AVM figure: a new book as a head-specifier structure; the determiner a is checked off against the SPR value of new book]

Example 4: John reads a new book
[AVM figure: reads a new book as a head-complement structure; the NP[acc, −PRD] a new book is checked off against the COMPS list of reads, and its index fills the READEE role]

Example 4: John reads a new book, completed analysis
[AVM figure: the complete head-subject structure; John (CASE nom, PER 3rd, NUM sg, GEND masc, named John) is checked off against the SUBJ list of reads a new book, and its index fills the READER role]

Lecture 7 (continued)

Outline
Applicative system
Combinators
Combinators vs. λ-expressions
Application to natural language parsing
Combinators used in CCG
Applicative system
CL – Combinatory Logic (Curry & Feys, 1958, 1972) as an applicative system
CL is an applicative system because the single basic operation in CL is the application of an operator to an operand:
Operator(Operand)

Combinators
CL defines general operators, called combinators.
Complex combinators are defined by composing the elementary combinators.
Certain combinators are taken as basic and used to define the other combinators.

Elementary combinators
I =def λx.x (identificator)
K =def λx.λy.x (cancellator)
W =def λx.λy.xyy (duplicator)
C =def λx.λy.λz.xzy (permutator)
B =def λx.λy.λz.x(yz) (compositor)
S =def λx.λy.λz.xz(yz) (substitution)
Φ =def λx.λy.λz.λu.x(yu)(zu) (distribution)
Ψ =def λx.λy.λz.λu.x(yz)(yu) (distribution)

β-reductions
The combinators are associated with β-reductions in a canonical form:
β-reduction relation between X and Y: X ≥β Y means that Y was obtained from X by a β-reduction

β-reductions
Ix ≥β x
Kxy ≥β x
Wxy ≥β xyy
Cxyz ≥β xzy
Bxyz ≥β x(yz)
Sxyz ≥β xz(yz)
Φxyzu ≥β x(yu)(zu)
Ψxyzu ≥β x(yz)(yu)
Each combinator is an operator which takes a certain number of arguments (operands); the sequence of arguments that follows the combinator is called the scope of the combinator.

β-reductions
Intuitive interpretations of the elementary combinators are given by the associated β-reductions.
The combinator I expresses identity.
The combinator K expresses the constant function.
The combinator W expresses diagonalisation, i.e. the duplication of an argument.
The combinator C expresses conversion, that is, the permutation of the two arguments of a binary operator.
The combinator B expresses the functional composition of two operators.
The combinator S expresses functional composition with duplication of an argument.
The combinator Φ expresses the composition in parallel of operators acting on common data.
The combinator Ψ expresses composition by distribution.

Introduction and elimination rules of combinators
Introduction and elimination rules of combinators can be presented in the style of Gentzen (natural deduction).
Elimination rules / introduction rules:
[e-I] If ⇒ f          [i-I] f ⇒ If
[e-K] Kfx ⇒ f         [i-K] f ⇒ Kfx
[e-C] Cfx ⇒ xf        [i-C] xf ⇒ Cfx
[e-B] Bfxy ⇒ f(xy)    [i-B] f(xy) ⇒ Bfxy
[e-Φ] Φfxyz ⇒ f(xz)(yz)    [i-Φ] f(xz)(yz) ⇒ Φfxyz

Combinators vs. λ-expressions
The most important difference between CL and the λ-calculus is the use of bound variables.
Every combinator is a λ-expression:
Bfg ≡ λx.f(gx)
Tx ≡ λf.fx
Sfg ≡ λx.fx(gx)
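Since every combinator is a λ-expression, the elementary combinators can be written directly as curried functions. The following small Python sketch (our own encoding, purely for illustration) checks the β-reductions above on symbolic operators that just build tuples.

```python
# Elementary combinators as curried Python functions; the asserts check the
# beta-reductions above on symbolic operators that construct tuples.
I = lambda x: x
K = lambda x: lambda y: x
W = lambda x: lambda y: x(y)(y)
C = lambda x: lambda y: lambda z: x(z)(y)
B = lambda x: lambda y: lambda z: x(y(z))
S = lambda x: lambda y: lambda z: x(z)(y(z))
Phi = lambda x: lambda y: lambda z: lambda u: x(y(u))(z(u))

f2 = lambda a: lambda b: ("f", a, b)   # a binary symbolic operator
f1 = lambda a: ("f", a)                # a unary symbolic operator
g1 = lambda a: ("g", a)
h1 = lambda a: ("h", a)

assert I("x") == "x"                                  # Ix    >= x
assert K("x")("y") == "x"                             # Kxy   >= x
assert W(f2)("y") == ("f", "y", "y")                  # Wfy   >= fyy
assert C(f2)("y")("z") == ("f", "z", "y")             # Cfyz  >= fzy
assert B(f1)(g1)("z") == ("f", ("g", "z"))            # Bfgz  >= f(gz)
assert S(f2)(g1)("z") == ("f", "z", ("g", "z"))       # Sfgz  >= fz(gz)
assert Phi(f2)(g1)(h1)("u") == ("f", ("g", "u"), ("h", "u"))  # f(gu)(hu)
```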
Application to natural language parsing
John is brilliant
The predicate is brilliant is an operator which operates on the operand John to construct the final proposition.
The applicative representation associated with this analysis is the following: (is-brilliant) John
We define the operator John* as constructed from the lexical item John by [John* = C* John].
1 John* (is-brilliant)
2 [John* = C* John]
3 C* John (is-brilliant)
4 is-brilliant (John)

Application to natural language parsing
John is brilliant in λ-terms
The operator John* as a λ-expression: [John* = λx.x (John')]
1 John*(λx.is-brilliant'(x))
2 [John* = λx.x (John')]
3 (λx.x(John'))(λx.is-brilliant'(x))
4 (λx.is-brilliant'(x))(John')
5 is-brilliant'(John')

Passivisation
Consider the following sentences:
a. The man has been killed.
b. One has killed him.
→ invariance of meaning
→ relation between the two sentences:
a. unary passive predicate (has-been-killed)
b. active transitive predicate (has-killed)

Definition of the passivisation operator PASS
[PASS = B Σ C = Σ ∘ C]
where B and C are the combinators of composition and of conversion, and where Σ is the existential quantifier which, applied to a binary predicate, transforms it into a unary predicate.

Definition of the passivisation operator PASS
[PASS = B Σ C = Σ ∘ C]
1/ has-been-killed (the-man) : hypothesis
2/ [has-been-killed = PASS(has-killed)] : passive lexical predicate
3/ PASS (has-killed)(the-man) : repl. 2., 1.
4/ [PASS = B Σ C] : definition of PASS
5/ B Σ C (has-killed)(the-man) : repl. 4., 3.
6/ Σ (C(has-killed))(the-man) : [e-B]
7/ (C(has-killed)) x (the-man) : [e-Σ]
8/ (has-killed)(the-man) x : [e-C]
9/ [x in the agentive subject position = one] : definition of one
10/ (has-killed)(the-man) one : repl. 9., 8., normal form

Definition of the passivisation operator PASS
We establish the paraphrastic relation between the passive sentence with an expressed agent and its active counterpart:
The man has been killed by the enemy
↓
The enemy has killed the man

Definition of the passivisation operator PASS
Relation between give-to and receive-from:
z gives y to x ↕ x receives y from z
The lexical predicate give-to has an associated converse predicate receive-from:
[receive-from z y x = give-to x y z]

Definition of the passivisation operator PASS
1/ (receive-from) z y x
2/ C((receive-from) z) x y
3/ BC(receive-from) z x y
4/ C(BC(receive-from)) z x y
5/ C(C(BC(receive-from)) x) y z
6/ BC(C(BC(receive-from))) x y z
7/ [give-to = BC(C(BC(receive-from)))]
8/ give-to x y z

Combinators used in CCG
Motivations for applying the combinators to natural language parsing
Linguistic: complex phenomena of natural language; applicable to various languages
Computational: left-to-right (LR) parsing, e.g. reducing spurious ambiguity

Parsing a sentence in CCG
Step 1: tokenization
Step 2: tag the concatenated expressions
Step 3: calculate over the types assigned to the concatenated expressions by applying the adequate combinatory rules
Step 4: eliminate the applied combinators (we will see how to do this next week)
Step 5: produce the parsing results in the form of an operator/operand structure (predicate-argument structure)

Parsing a sentence in CCG
Example: I requested and would prefer musicals
STEP 1: tokenization/lemmatization → e.g. POS tagger, tokenizer, lemmatizer
a. I-requested-and-would-prefer-musicals
b. I-request-ed-and-would-prefer-musical-s
STEP 2: tagging the concatenated expressions → e.g. supertagger, inventory of typed words
I : NP
requested : (S\NP)/NP
and : CONJ
would : (S\NP)/VP
prefer : VP/NP
musicals : NP

Parsing a sentence in CCG
STEP 3: categorial calculus
c. apply the coordination rule; Coordination (<&>): X conj X ⇒ X
b. apply the functional composition rules; Forward Composition (>B): X/Y : f Y/Z : g ⇒ X/Z : Bfg
a. apply the type-raising rules; Subject Type-raising (>T): NP : a ⇒ T/(T\NP) : Ta
7/ S
6/ S/NP  NP  (>)
5/ S/(S\NP)  (S\NP)/NP  NP  (>B)
4/ S/(S\NP)  (S\NP)/NP  NP  (>Φ)
3/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/NP  NP  (>B)
2/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP  (>T)
1/ NP  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP
   I - requested - and - would - prefer - musicals

Parsing a sentence in CCG
STEP 4: semantic representation (predicate-argument structure)
7/ S : and'(will'(prefer' musicals') i')(request' musicals' i')
6/ : λy.and'(would'(prefer' musicals') y)(request' musicals' y)
5/ : λx.λy.and'(will'(prefer' x) y)(request' x y)
4/ : λx.λy.and'(will'(prefer' x) y)(request' x y)
3/ : λx.λy.will'(prefer' x) y
2/ : λf.f i'
1/ : i'  : request'  : and'  : will'  : prefer'  : musicals'
   I requested and would prefer musicals

Semantic representation in terms of combinators
I - requested and - would - prefer musicals
1/ NP  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP
2/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP  NP  (>T) : C*I requested and would prefer musicals
3/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/NP  NP  (>B) : C*I requested and B would prefer musicals
4/ S/(S\NP)  (S\NP)/NP  NP  (>Φ) : C*I Φ and requested (B would prefer) musicals
5/ S/NP  NP  (>B) : B((C*I)(Φ and requested (B would prefer))) musicals
6/ S  (>) : B((C*I)(Φ and requested (B would prefer))) musicals

Semantic representation in terms of combinators
I requested and would prefer musicals
S: B((C*I)(Φ and requested (B would prefer))) musicals
1/ B((C*I)(Φ and requested (B would prefer))) musicals
2/ (C*I)((Φ and requested (B would prefer)) musicals)   [e-B]
3/ ((Φ and requested (B would prefer)) musicals) I      [e-C*]
4/ (and (requested musicals) ((B would prefer) musicals)) I   [e-Φ]
5/ ((and (requested musicals) (would (prefer musicals))) I)   [e-B]

Normal form
A normal form is a combinatory expression which is irreducible, in the sense that it does not contain any occurrence of a redex.
If a combinatory expression X reduces to a combinatory expression N which is in normal form, then N is called the normal form of X.
Example:
Bxyz is reducible to x(yz).
x(yz) is the normal form of the combinatory expression Bxyz.

Normal form
Example: prove that xyz is the normal form of BCCxyz.
BCCxyz →β xyz
1/ BCCxyz
2/ C(Cx)yz [e-B]
3/ Cxzy [e-C]
4/ xyz [e-C]

Classwork
Give the semantic representation in terms of combinators. Please refer to the paper given in the last lecture on CCG parsing.

Lecture 9
Outline
HPSG parser: Enju (parsing method, description of the parser, results)
CCG parser: C&C tools (parsing method, description of the parser, results)

Theoretical backgrounds
Lecture 3 on HPSG parsing
Lectures 6 & 7 on CCG parsing and combinatory logic

Enju (Y. Miyao, J. Tsujii, 2004, 2008)
Syntactic parser for English
Developed by the Tsujii Lab of the University of Tokyo
Based on wide-coverage probabilistic HPSG
HPSG theory [Pollard and Sag, 1994]
Useful links to Enju:
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/

Motivations
Parsing based on a proper linguistic formalism is one of the core research fields in CL and NLP.
But it has been a monolithic, esoteric and inward-looking field, largely dissociated from real-world applications.

Motivations
So why not integrate linguistic grammar formalisms with statistical models, to obtain parsing that is robust, efficient and open to eclectic sources of information other than syntactic ones?

Motivations
Two main ideas:
development of wide-coverage linguistic grammars
a deep parser which produces semantic representations (predicate-argument structures)

Parsing method
Application of a probabilistic model to the HPSG grammar and development of an efficient parsing algorithm
Accurate deep analysis: disambiguation
Wide coverage
High speed: useful for high-level NLP applications

Parsing method
1. Parsing based on HPSG
Mathematically well-defined, with a sophisticated constraint-based system
Linguistically justified
A deep syntactic grammar that provides semantic analysis

Parsing method
Difficulties in parsing based on HPSG:
it is difficult to develop a broad-coverage HPSG grammar
it is difficult to disambiguate
low efficiency: very slow

Parsing method
Solution: corpus-oriented development of an HPSG grammar
The principal aim of grammar development is treebank construction
The Penn Treebank is converted into an HPSG treebank
A lexicon and a probabilistic model are extracted from the HPSG treebank

Parsing method
Approach: develop grammar rules and an HPSG treebank; collect lexical entries from the HPSG treebank
How to make an HPSG treebank? Convert the Penn Treebank into HPSG and develop the grammar by restructuring the treebank in conformity with the HPSG grammar rules

Parsing method
HPSG = lexical entries and grammar rules
The Enju grammar has 12 grammar rules and 3,797 lexical entries for 10,536 words (Miyao et al. 2004)

Parsing method
Overview of grammar development:
1. Treebank conversion: modify constituent structures by adding feature structures
2. Grammar rule application: apply a grammar rule when a parse tree contains the correct analysis and the specified feature values are filled
3. Lexical entry collection: collect the terminal nodes of HPSG parse trees and assign predicate-argument structures

Parsing method
2. Probabilistic model and HPSG: log-linear models for unification-based grammars (Abney 1997, Johnson et al. 1999, Riezler et al. 2000, Miyao et al. 2003, Malouf and van Noord 2004, Kaplan et al. 2004, Miyao and Tsujii 2005)
p(T|w), where w = "A blue eyes girl with white hair and skin walked" and T is a parse tree over w
[Figure: a parse tree T for the example sentence, with S, NP, PP and VP nodes]

Parsing method
T1, T2, T3, T4, ..., Tn: all possible parse trees derived from w with the grammar.
For example, p(T3|w) is the probability of selecting T3 from T1, T2, ..., and Tn.

Parsing method
Log-linear model for unification-based grammars
Input sentence: w = w1/P1, w2/P2, ..., wn/Pn; output: parse tree T
p(T|w) = (1/Z_w) exp(Σ_i λ_i f_i(T, w)), where Z_w is the normalization factor, λ_i the weight for the feature function f_i

Description of parser
Parsing proceeds in the following steps:
1. Preprocessing: the preprocessor converts an input sentence into a word lattice.
2. Lexicon lookup: the parser looks up the lexicon to find lexical entries for the word lattice.
3. Kernel parsing: the parser performs phrase analysis using the defined grammar rules.

Description of parser
Chart data structure: a two-dimensional table; we call each cell in the table a 'CKY cell'.
Example: let the input sentence be s = (w1, ..., w7), with w1 = "I", w2 = "saw", w3 = "a", w4 = "girl", w5 = "with", w6 = "a", w7 = "telescope". For the sentence "I saw a girl with a telescope", the chart is arranged as follows (cells indexed by (start, end) spans):
length 1: (0,1) (1,2) (2,3) (3,4) (4,5) (5,6) (6,7)
length 2: (0,2) (1,3) (2,4) (3,5) (4,6) (5,7)
length 3: (0,3) (1,4) (2,5) (3,6) (4,7)
length 4: (0,4) (1,5) (2,6) (3,7)
length 5: (0,5) (1,6) (2,7)
length 6: (0,6) (1,7)
length 7: (0,7)
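A minimal sketch of this chart arrangement: cells are indexed by (start, end) spans and filled bottom-up in CKY fashion. The recursion below only counts the binary bracketings of each span; a real parser such as Enju would instead apply its grammar rule schemata at each split point.

```python
# Chart cells indexed by (start, end) spans for a 7-word sentence.
words = "I saw a girl with a telescope".split()
n = len(words)
chart = {(i, j): [] for i in range(n) for j in range(i + 1, n + 1)}

# CKY-style fill: length-1 spans get one (simplified) lexical sign each;
# longer spans combine the analyses of every split point k.
count = {}
for i in range(n):
    count[(i, i + 1)] = 1
for length in range(2, n + 1):
    for i in range(n - length + 1):
        j = i + length
        count[(i, j)] = sum(count[(i, k)] * count[(k, j)]
                            for k in range(i + 1, j))

print(count[(0, n)])   # 132 binary bracketings over 7 words (a Catalan number)
```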
Description of parser
System overview: supertagger → enumeration of lexical-entry assignments → deterministic disambiguation
[Figure: for "Mary loved John", the supertagger proposes candidate lexical signs per word (e.g. HEAD noun with empty SUBJ/COMPS lists for Mary and John, HEAD verb with SUBJ and COMPS requirements for loved); the assignments are enumerated and then disambiguated deterministically]

Demonstration
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html

Results
Fast, robust and accurate analysis
Phrase structures
Predicate-argument structures
Accurate deep analysis – the parser can output both phrase structures and predicate-argument structures.
The accuracy of predicate-argument relations is around 90% for newswire articles and biomedical papers.
High speed – parsing speed is less than 500 msec per sentence by default (faster than most Penn Treebank parsers), and less than 50 msec when using the high-speed setting ("mogura").

C&C tools
Developed by Curran and Clark [Clark and Curran, 2002; Curran, Clark and Bos, 2007], University of Edinburgh
A wide-coverage statistical parser based on CCG: the CCG parser
A computational semantics tool named Boxer
Useful links:
http://svn.ask.it.usyd.edu.au/trac/candc
http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

CCG parser [Clark, 2007]
Statistical parsing and CCG
Advantages of CCG:
it provides a compositional semantics for the grammar → a completely transparent interface between syntax and semantics
the recovery of long-range dependencies can be integrated into the parsing process in a straightforward manner

Parsing method
Penn Treebank conversions: TAG, LFG, HPSG and CCG
CCGBank [Hockenmaier and Steedman, 2007]: the CCG version of the Penn Treebank
Grammar used in the CCG parser:
CCGBank: some rules used as the grammar; the lexical category set
Training data for the statistical models: supertagger, parser

Parsing method: CCGBank
Translated from the Penn Treebank, CCGBank contains
syntactic derivations
word-word dependencies
predicate-argument structures

Parsing method: CCGBank
Semi-automatic conversion of the phrase-structure trees in the Penn Treebank into CCG derivations
Consists mainly of newspaper text
Grammar: lexical category set, combinatory rules, unary type-changing rules, normal-form constraints, punctuation rules

Parsing method: supertagging [Clark, 2002]
Uses conditional maximum entropy models
Implements a maximum entropy supertagger
[Figure: supertagging a French example; each word (tout, commentaire, sur, cette, proposition, et, prefere, avancer, les, ...) receives a POS tag and a set of candidate lexical categories such as np/n, (s\1 s)/np, (np\s)/np]

Parsing method: supertagger
A set of 425 lexical categories from CCGbank
The per-word accuracy of the supertagger is around 92% on unseen WSJ text.
→ Using the multi-supertagger increases the accuracy significantly – to over 98% – with only a small cost in increased ambiguity.
Parsing method: supertagger
Log-linear models in NLP applications:
POS tagging
Named entity recognition
Chunking
Parsing
→ also referred to as maximum entropy models and random fields

Parsing method: supertagger
Log-linear parsing models for CCG:
1. the probability of a dependency structure
2. the normal-form model: the probability of a single derivation
→ modeling 2) is simpler than 1)
1. is defined as P(π|S) = Σ_{d∈Δ(π)} P(d, π|S)
2. is defined using a log-linear form as follows:
P(ω|S) = (1/Z_S) e^{λ·f(ω)}, where Z_S = Σ_{ω′∈ρ(S)} e^{λ·f(ω′)}
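A minimal sketch of this log-linear form, with invented feature names and weights: each candidate derivation is scored by the dot product λ·f(ω), exponentiated, and normalized by Z_S over the candidate set.

```python
import math

# Candidate derivations for one sentence, each with sparse feature counts.
# Feature names and weights are invented purely for illustration.
candidates = {
    "omega1": {"Rule: S[dcl] -> NP S[dcl]\\NP": 1, "LexCat+word: (S\\NP)/NP + bought": 1},
    "omega2": {"Rule: S[dcl] -> NP S[dcl]\\NP": 1, "LexCat+word: (S\\NP)/PP + bought": 1},
}
weights = {
    "Rule: S[dcl] -> NP S[dcl]\\NP": 0.5,
    "LexCat+word: (S\\NP)/NP + bought": 1.2,
    "LexCat+word: (S\\NP)/PP + bought": -0.3,
}

def dot(feats):
    """lambda . f(omega): weighted sum of the derivation's feature counts."""
    return sum(weights.get(name, 0.0) * count for name, count in feats.items())

# P(omega|S) = exp(lambda . f(omega)) / Z_S, normalizing over all candidates.
z_s = sum(math.exp(dot(f)) for f in candidates.values())
p = {name: math.exp(dot(f)) / z_s for name, f in candidates.items()}
print(p)   # the two probabilities sum to 1; omega1 is preferred
```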
Parsing method: supertagger
Features common to the dependency and normal-form models (feature type: example):
LexCat + word: (S/S)/NP + Before
LexCat + POS: (S/S)/NP + IN
RootCat: S[dcl]
RootCat + word: S[dcl] + was
RootCat + POS: S[dcl] + VBD
Rule: S[dcl] → NP S[dcl]\NP
Rule + word: S[dcl] → NP S[dcl]\NP + bought
Rule + POS: S[dcl] → NP S[dcl]\NP + VBD

Parsing method: supertagger
Predicate-argument dependency features for the dependency model (feature type: example):
Word-Word: ⟨bought, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
Word-POS: ⟨bought, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
POS-Word: ⟨VBD, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
POS-POS: ⟨VBD, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
Word + Distance(words): ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
Word + Distance(punct): ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
Word + Distance(verbs): ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
POS + Distance(words): ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
POS + Distance(punct): ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
POS + Distance(verbs): ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0

Parsing method: supertagger
Rule dependency features for the normal-form model (feature type: example):
Word-Word: ⟨company, S[dcl] → NP S[dcl]\NP, bought⟩
Word-POS: ⟨company, S[dcl] → NP S[dcl]\NP, VBD⟩
POS-Word: ⟨NN, S[dcl] → NP S[dcl]\NP, bought⟩
POS-POS: ⟨NN, S[dcl] → NP S[dcl]\NP, VBD⟩
Word + Distance(words): ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + > 2
Word + Distance(punct): ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 2
Word + Distance(verbs): ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 0
POS + Distance(words): ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + > 2
POS + Distance(punct): ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 2
POS + Distance(verbs): ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 0

Description of parser
[Pipeline figure: input sentence → C&C taggers (POS tagger, chunker, supertaggers, trained on CCGBank) → parser → Boxer]

Demonstration
http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

Results
Supertagger ambiguity and accuracy on section 00:
β      k    CATS/WORD  ACC    SENT ACC  ACC(POS)  SENT ACC(POS)
0.075  20   1.27       97.34  67.43     96.34     60.27
0.030  20   1.43       97.92  72.87     97.05     65.50
0.010  20   1.72       98.37  77.73     97.63     70.52
0.005  20   1.98       98.52  79.25     97.86     72.24
0.001  150  3.57       99.17  87.19     98.66     80.24

Results
Parsing accuracy on DepBank (Prec / Rec / F for the CCG parser, then for CCGbank, then # GRs):
dependent     84.07 / 82.19 / 83.12   88.83 / 84.19 / 86.44   10,696
aux           95.03 / 90.75 / 92.84   96.47 / 90.33 / 93.30   400
conj          79.02 / 75.97 / 77.46   83.07 / 80.27 / 81.65   595
ta            51.52 / 11.64 / 18.99   62.07 / 12.59 / 20.93   292
det           95.23 / 94.97 / 95.10   97.27 / 94.09 / 95.66   1,114
arg_mod       81.46 / 81.76 / 81.61   86.75 / 84.19 / 85.45   8,295
mod           71.30 / 77.23 / 74.14   77.83 / 79.65 / 78.73   3,908
ncmod         73.36 / 78.96 / 76.05   78.88 / 80.64 / 79.75   3,550
xmod          42.67 / 53.93 / 47.64   56.54 / 60.67 / 58.54   178
cmod          51.34 / 57.14 / 54.08   64.77 / 69.09 / 66.86   168
pmod           0.00 /  0.00 /  0.00    0.00 /  0.00 /  0.00   12
arg           85.76 / 80.01 / 82.78   89.79 / 82.91 / 86.21   4,387
DepBank: PARC Dependency Bank [King et al. 2003]

Results
subj_or_dobj  86.08 / 83.08 / 84.56   91.01 / 85.29 / 88.06   3,127
subj          84.08 / 75.57 / 79.60   89.07 / 78.43 / 83.41   1,363
ncsubj        83.89 / 75.78 / 79.63   88.86 / 78.51 / 83.37   1,354
xsubj          0.00 /  0.00 /  0.00   50.00 / 28.57 / 36.36   7
csubj          0.00 /  0.00 /  0.00    0.00 /  0.00 /  0.00   2
comp          86.16 / 81.71 / 83.88   89.92 / 84.74 / 87.25   3,024
obj           86.30 / 83.08 / 84.66   90.42 / 85.52 / 87.90   2,328
dobj          87.01 / 88.44 / 87.71   92.11 / 90.32 / 91.21   1,764
obj2          68.42 / 65.00 / 66.67   66.67 / 60.00 / 63.16   20
iobj          83.22 / 65.63 / 73.38   83.59 / 69.81 / 76.08   544
clausal       77.67 / 72.47 / 74.98   80.35 / 77.54 / 78.92   672
xcomp         77.69 / 74.02 / 75.81   80.00 / 78.49 / 79.24   381
ccomp         77.27 / 70.10 / 73.51   80.81 / 76.31 / 78.49   291
pcomp          0.00 /  0.00 /  0.00    0.00 /  0.00 /  0.00   24
macroaverage  65.71 / 62.29 / 63.95   71.73 / 65.85 / 68.67
microaverage  81.95 / 80.35 / 81.14   86.86 / 82.75 / 84.76

Lecture 10

Study materials
Course materials and homeworks are available on the following web site:
https://is.muni.cz/course/fi/autumn2011/IA161

Outline
Introduction to statistical parsing methods
Statistical parsers: the RASP system, the Stanford parser, the Collins parser, the Charniak parser, the Berkeley parser

1. Introduction to statistical parsing
The main theoretical approaches behind modern statistical parsers
Over the last 12 years, statistical parsing has advanced significantly!
NLP researchers have produced a range of statistical parsers → wide-coverage and robust parsing accuracy
They continue to improve the parsers year on year.

Application domains of statistical parsing
High-precision question answering systems
Named entity extraction
Syntactically based sentence compression
Extraction of people's opinions about products
Improved interaction in computer games
Helping linguists find data

NLP parsing problem and solution
The structure of language is ambiguous!
→ local and global ambiguities
Classical parsing problem:
→ a simple grammar of 10 rules can generate 592 parses
→ a real-size wide-coverage grammar generates millions of parses

NLP parsing problem and solution
NLP parsing solution: we need mechanisms that allow us to find the most likely parses
→ statistical parsing lets us work with very loose grammars that admit millions of parses for sentences, but still quickly find the best parses

Improved methodology for robust parsing
The annotated data: the Penn Treebank (early 90's)
Building a treebank seems a lot slower and less useful than building a grammar, but it has many helpful properties:
reusability of the labor
broad coverage
frequencies and distributional information
a way to evaluate systems

Characterization of statistical parsing
What is the grammar which determines the set of legal syntactic structures for a sentence? How is that grammar obtained?
What is the algorithm for determining the set of legal parses for a sentence?
What is the model for determining the probability of different parses for a sentence?
What is the algorithm which, given the model and a set of possible parses, finds the best parse?

Characterization of statistical parsing
T_best = argmax_T Score(T, S)
Two components:
the model: a function Score which assigns scores (probabilities) to tree-sentence pairs
the parser: the algorithm which implements the search for T_best

Characterization of statistical parsing
Statistical parsing is seen as more of a pattern recognition / machine learning problem, plus search
The grammar is only implicitly defined by the training data and the method used by the parser for generating hypotheses

Statistical parsing models
The probabilistic approach would suggest the following for the Score function: Score(T, S) = P(T|S)
Lots of research on different probability models for Penn Treebank trees
Generative models, log-linear (maximum entropy) models, ...

2. Statistical parsers
Many kinds of parsers based on statistical methods: probability, machine learning
Different objectives: research, commercial, pedagogical
RASP, the Stanford parser, the Berkeley parser, ...

RASP system
Robust Accurate Statistical Parsing (2nd release): [Briscoe & Carroll, 2002; Briscoe et al. 2006]
A system for the syntactic annotation of free text
Semantically motivated output representation
Enhanced grammar and part-of-speech tagger lexicon
Flexible and semi-supervised training method for the structural parse ranking model
Useful links to RASP:
http://ilexir.co.uk/applications/rasp/download/
http://www.informatics.susx.ac.uk/research/groups/nlp/rasp/

Components of the system
Pipeline: raw text → tokeniser → PoS tagger → lemmatiser → parser/grammar → parse ranking model
Input: unannotated text or transcribed (and punctuated) speech
1st step: sentence boundary detection and tokenisation modules
2nd step: tokenized text is tagged with one of 150 POS and punctuation labels (derived from the CLAWS tagset)
→ a first-order ('bigram') HMM tagger
→ trained on the manually corrected tagged versions of the Susanne, LOB and BNC corpora

Components of the system
3rd step: morphological analyzer
4th step: manually developed wide-coverage tag-sequence grammar in the parser
→ 689 unification-based phrase structure rules
→ the preterminals of this grammar are the POS and punctuation tags
→ the terminals are featural descriptions of the preterminals
→ non-terminals project information up the tree using an X-bar scheme with 41 attributes and a maximum of 33 atomic values

Components of the system
5th step: generalized LR parser
→ a non-deterministic LALR table is constructed automatically from a CF 'backbone' compiled from the feature-based grammar
→ the parser builds a packed parse forest using this table to guide the actions it performs
→ the n-best parses can be efficiently extracted by unpacking sub-analyses, following pointers to contained subanalyses and choosing alternatives in order of probabilistic ranking

Components of the system
Output: a set of named grammatical relations (GRs), organized in a hierarchy (dependent > ta, arg_mod, det, aux, conj, mod, arg; mod > ncmod, xmod, cmod, pmod; arg > subj_or_dobj, subj, comp; subj > ncsubj, xsubj, csubj; comp > obj, pcomp, clausal; obj > dobj, obj2, iobj; clausal > xcomp, ccomp)
→ the resulting set of ranked parses can be displayed or passed on for further processing
→ transformation of derivation trees into a set of named GRs
→ the GR scheme captures those aspects of predicate-argument structure

Evaluation
The system has been evaluated using the re-annotation of the PARC dependency bank (DepBank; King et al., 2003)
It consists of 560 sentences chosen randomly from section 23 of the WSJ, with grammatical relations compatible with the RASP system.
Form of relations: (relation subtype head dependent initial)
type of relationship between the head and the dependent
encoding additional specifications of the relation type for some relations, and the initial or underlying logical relation of the grammatical subject in constructions such as passives

Evaluation
Parsing accuracy on DepBank [Briscoe et al., 2006] (Precision / Recall / F1, then std GRs):
dependent     79.76 / 77.49 / 78.61   10696
aux           93.33 / 91.00 / 92.15   400
conj          72.39 / 72.27 / 72.33   595
ta            42.61 / 51.37 / 46.58   292
det           87.73 / 90.48 / 89.09   1114
arg_mod       79.18 / 75.47 / 77.28   8295
mod           74.43 / 67.78 / 70.95   3908
ncmod         75.72 / 69.94 / 72.72   3550
xmod          53.21 / 46.63 / 49.70   178
cmod          45.95 / 30.36 / 36.56   168
pmod          30.77 / 33.33 / 32.00   12
arg           77.42 / 76.45 / 76.94   4387
subj_or_dobj  82.36 / 74.51 / 78.24   3127
subj          78.55 / 66.91 / 72.27   1363
ncsubj        79.16 / 67.06 / 72.61   1354
xsubj         33.33 / 28.57 / 30.77   7
csubj         12.50 / 50.00 / 20.00   2
comp          75.89 / 79.53 / 77.67   3024
obj           79.49 / 79.42 / 79.46   2328
dobj          83.63 / 79.08 / 81.29   1764
obj2          23.08 / 30.00 / 26.09   20
iobj          70.77 / 76.10 / 73.34   544
clausal       60.98 / 74.40 / 67.02   672
xcomp         76.88 / 77.69 / 77.28   381
ccomp         46.44 / 69.42 / 55.55   291
pcomp         72.73 / 66.67 / 69.57   26
macroaverage  62.12 / 63.77 / 62.94
microaverage  77.66 / 74.98 / 76.29
Micro-averaged precision, recall and F1 are calculated from the counts for all relations in the hierarchy
Macro-averaged scores are the mean of the individual scores for each relation
Micro-averaged F1 score of 76.3% across all relations
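A minimal sketch of the two averaging modes, computed from per-relation counts of true positives, system-proposed relations and gold relations; the counts here are invented for illustration.

```python
# Per-relation counts: (true positives, system-proposed, gold). Invented numbers.
counts = {"ncsubj": (908, 1147, 1354), "dobj": (1395, 1668, 1764)}

def prf(tp, system, gold):
    """Precision, recall and F1 from raw counts."""
    p = tp / system if system else 0.0
    r = tp / gold if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Macro-average: the mean of the per-relation F1 scores.
macro_f1 = sum(prf(*c)[2] for c in counts.values()) / len(counts)

# Micro-average: pool all counts first, then compute a single P/R/F1.
tp, system, gold = (sum(col) for col in zip(*counts.values()))
micro_f1 = prf(tp, system, gold)[2]
print(round(macro_f1, 4), round(micro_f1, 4))
```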
Stanford parser
Java implementation of probabilistic natural language parsers (version 1.6.9): [Klein and Manning, 2003]
A parsing system for English; it has also been used for Chinese, German, Arabic, Italian, Bulgarian and Portuguese
Implements both a highly optimized PCFG and lexicalized dependency parsers, and a lexicalized PCFG parser
Useful links:
http://nlp.stanford.edu/software/lex-parser.shtml
http://nlp.stanford.edu:8080/parser/

Stanford parser
Input: various forms of plain text
Output: various analysis formats
→ Stanford Dependencies (SD): typed dependencies as GRs
→ phrase structure trees
→ POS-tagged text
[Figure: graphical representation of the SD graph for the sentence "Bell, based in Los Angeles, makes and distributes electronic, computer and building products.", with arcs such as nsubj, conj_and, dobj, partmod, prep_in, nn and amod]

Stanford typed dependencies [de Marneffe and Manning, 2008]
provide a simple description of the grammatical relationships in a sentence
represent all sentence relationships uniformly as typed dependency relations
quite accessible to non-linguists thinking about tasks involving information extraction from text, and quite effective in relation extraction applications

Stanford typed dependencies [de Marneffe and Manning, 2008]
For an example sentence: Bell, based in Los Angeles, makes and distributes electronic, computer and building products.
Lecture 10 Output
A lineup of masseurs was waiting to take the media in hand.
POS tagged text
Parsing [sent. 4 len. 13]: [A, lineup, of, masseurs, was, waiting, to, take, the, media, in, hand, .]
CFPSG representation
(ROOT (S (NP (NP (DT A) (NN lineup)) (PP (IN of) (NP (NNS masseurs)))) (VP (VBD was) (VP (VBG waiting) (S (VP (TO to) (VP (VB take) (NP (DT the) (NNS media)) (PP (IN in) (NP (NN hand)))))))) (. .)))
Typed dependencies representation
det(lineup-2, A-1)
nsubj(waiting-6, lineup-2)
xsubj(take-8, lineup-2)
prep_of(lineup-2, masseurs-4)
aux(waiting-6, was-5)
root(ROOT-0, waiting-6)
aux(take-8, to-7)
xcomp(waiting-6, take-8)
det(media-10, the-9)
dobj(take-8, media-10)
prep_in(take-8, hand-12)
IA161 Syntactic Formalisms for Parsing Natural Languages 415 / 476
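The bracketed CFPSG output above can be inspected programmatically. A small sketch using NLTK's Tree class (assuming NLTK is installed; this is a post-processing convenience, not part of the Stanford parser itself):

from nltk import Tree

cfpsg = ("(ROOT (S (NP (NP (DT A) (NN lineup)) (PP (IN of) (NP (NNS masseurs))))"
         " (VP (VBD was) (VP (VBG waiting) (S (VP (TO to) (VP (VB take)"
         " (NP (DT the) (NNS media)) (PP (IN in) (NP (NN hand)))))))) (. .)))")

tree = Tree.fromstring(cfpsg)          # parse the bracketed string
print(tree.leaves())                   # the 13 tokens
print(tree.pos())                      # (word, POS) pairs, e.g. ('lineup', 'NN')
# every NP constituent in the tree:
for np in tree.subtrees(lambda t: t.label() == "NP"):
    print(" ".join(np.leaves()))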
Lecture 10 Berkeley parser
Learning PCFGs, statistical parser (release 1.1, version 09.2009): [Petrov et al., 2006; Petrov and Klein, 2007]
Parsing system for English; has also been used for Chinese, German, Arabic, Bulgarian, Portuguese, French
Implementation of an unlexicalized PCFG parser
Useful links
http://nlp.cs.berkeley.edu/
http://tomato.banatao.berkeley.edu:8080/parser/parser.html
http://code.google.com/p/berkeleyparser/
IA161 Syntactic Formalisms for Parsing Natural Languages 416 / 476
Lecture 10 Comparison of parsing an example sentence
A lineup of masseurs was waiting to take the media in hand.
IA161 Syntactic Formalisms for Parsing Natural Languages 417 / 476
Lecture 10 Charniak parser
Probabilistic LFG f-structure parsing: [Charniak, 2000; Bikel, 2002]
Parsing system for English
PCFG-based wide-coverage LFG parser
Useful links
http://nclt.computing.dcu.ie/demos.html
http://lfg-demo.computing.dcu.ie/lfgparser.html
IA161 Syntactic Formalisms for Parsing Natural Languages 418 / 476
Lecture 10 Collins parser
Head-driven statistical models for natural language parsing (release 1.0, version 12.2002): [Collins, 1999]
Parsing system for English
Useful links
http://www.cs.columbia.edu/~mcollins/code.html
IA161 Syntactic Formalisms for Parsing Natural Languages 419 / 476
Lecture 10 Bikel's parser
Multilingual statistical parsing engine (release 1.0, version 06.2008): [Charniak, 2000; Bikel, 2002]
Parsing system for English, Chinese, Arabic, Korean
http://www.cis.upenn.edu/~dbikel/#stat-parser
http://www.cis.upenn.edu/~dbikel/software.html
IA161 Syntactic Formalisms for Parsing Natural Languages 420 / 476
Lecture 10 Comparing parser speed on section 23 of the WSJ Penn Treebank
Parser    Time (min.)
Collins   45
Charniak  28
Sagae     11
CCG       1.9
IA161 Syntactic Formalisms for Parsing Natural Languages 421 / 476
Lecture 11
. ......
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages 422 / 476
Lecture 11 Study materials
Course materials and homeworks are available on the following web site:
https://is.muni.cz/course/fi/autumn2011/IA161
Refer to Dependency Parsing, Synthesis: Lectures on Human Language Technologies, S. Kübler, R. McDonald and J. Nivre, 2009
IA161 Syntactic Formalisms for Parsing Natural Languages 423 / 476
Lecture 11 Outline
Introduction to Dependency parsing methods
Dependency Parsers
IA161 Syntactic Formalisms for Parsing Natural Languages 424 / 476
Lecture 11 Introduction to Dependency parsing
Motivation
a. dependency-based syntactic representations seem to be useful in many applications of language technology: machine translation, information extraction
→ transparent encoding of predicate-argument structure
b. dependency grammar is better suited than phrase structure grammar for languages with free or flexible word order
→ analysis of diverse languages within a common framework
c. leading to the development of accurate syntactic parsers for a number of languages
→ in combination with machine learning from syntactically annotated corpora (e.g. treebanks)
IA161 Syntactic Formalisms for Parsing Natural Languages 425 / 476
Lecture 11 Introduction to Dependency parsing
Dependency parsing: "Task of automatically analyzing the dependency structure of a given input sentence"
Dependency parser: "Task of producing a labeled dependency structure of the kind depicted in the following figure, where the words of the sentence are connected by typed dependency relations"
(Figure: labeled dependency tree for "Economic news had little effect on financial markets ." with arc labels PRED, SBJ, ATT, OBJ, PC, PU, rooted in ROOT)
IA161 Syntactic Formalisms for Parsing Natural Languages 426 / 476
Lecture 11 Definitions of dependency graphs and dependency parsing
Dependency graphs: syntactic structures over sentences
Def. 1: A sentence is a sequence of tokens denoted by S = w0 w1 . . . wn
Def. 2: Let R = {r1, . . . , rm} be a finite set of possible dependency relation types that can hold between any two words in a sentence. A relation type r ∈ R is additionally called an arc label.
IA161 Syntactic Formalisms for Parsing Natural Languages 427 / 476
Lecture 11 Definitions of dependency graphs and dependency parsing
Dependency graphs: syntactic structures over sentences
Def. 3: A dependency graph G = (V, A) is a labeled directed graph, consisting of nodes V and arcs A, such that for a sentence S = w0 w1 . . . wn and a label set R the following holds:
1. V ⊆ {w0, w1, . . . , wn}
2. A ⊆ V × R × V
3. if (wi, r, wj) ∈ A then (wi, r′, wj) ∉ A for all r′ ≠ r
IA161 Syntactic Formalisms for Parsing Natural Languages 428 / 476
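Def. 3 translates almost directly into code. A minimal sketch (the class name and representation are my own, not from the lecture): nodes are token positions, arcs are labeled head-dependent pairs, and condition 3 (at most one label per head-dependent pair) is enforced on insertion.

class DependencyGraph:
    """Labeled directed graph G = (V, A) over tokens w0..wn (Def. 3)."""

    def __init__(self, tokens):
        self.tokens = tokens                  # token 0 is the artificial ROOT w0
        self.nodes = set(range(len(tokens)))  # V ⊆ {w0, ..., wn}
        self.arcs = {}                        # (head, dependent) -> label r

    def add_arc(self, head, dep, label):
        # condition 3: if (wi, r, wj) ∈ A, then no (wi, r', wj) with r' ≠ r
        if self.arcs.get((head, dep), label) != label:
            raise ValueError(f"arc {head}->{dep} already has a different label")
        self.arcs[(head, dep)] = label

g = DependencyGraph(["ROOT", "Economic", "news", "had", "little",
                     "effect", "on", "financial", "markets", "."])
g.add_arc(0, 3, "PRED")   # had      <- ROOT
g.add_arc(3, 2, "SBJ")    # news     <- had
g.add_arc(2, 1, "ATT")    # Economic <- news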
Lecture 11 Approach to dependency parsing
a. data-driven: makes essential use of machine learning from linguistic data in order to parse new sentences
b. grammar-based: relies on a formal grammar, defining a formal language, so that it makes sense to ask whether a given input is in the language defined by the grammar or not
→ Data-driven approaches have attracted the most attention in recent years.
IA161 Syntactic Formalisms for Parsing Natural Languages 429 / 476
Lecture 11 Data-driven approach
characterized by the type of parsing model adopted, the algorithms used to learn the model from data, and the algorithms used to parse new sentences with the model
a. transition-based: start by defining a transition system, or state machine, for mapping a sentence to its dependency graph
b. graph-based: start by defining a space of candidate dependency graphs for a sentence
IA161 Syntactic Formalisms for Parsing Natural Languages 430 / 476
Lecture 11 Data-driven approach
a. transition-based
learning problem: induce a model for predicting the next state transition, given the transition history
parsing problem: construct the optimal transition sequence for the input sentence, given the induced model
b. graph-based
learning problem: induce a model for assigning scores to the candidate dependency graphs for a sentence
parsing problem: find the highest-scoring dependency graph for the input sentence, given the induced model
IA161 Syntactic Formalisms for Parsing Natural Languages 431 / 476
Lecture 11 Transition-based Parsing
A transition system consists of a set C of parser configurations and a set D of transitions between configurations.
Main idea: a sequence of valid transitions, starting in the initial configuration for a given sentence and ending in one of several terminal configurations, defines a valid dependency tree for the input sentence:
D1,m = d1(c1), . . . , dm(cm)
IA161 Syntactic Formalisms for Parsing Natural Languages 432 / 476
Lecture 11 Transition-based Parsing
Definition: the score of D1,m factors by configuration-transition pairs (ci, di):
s(D1,m) = Σi=1..m s(ci, di)
Learning: scoring function s(ci, di) for di(ci) ∈ D1,m
Inference: search for the highest-scoring sequence D*1,m given s(ci, di)
IA161 Syntactic Formalisms for Parsing Natural Languages 433 / 476
Lecture 11 Transition-based Parsing
Inference for transition-based parsing
Common inference strategies:
Deterministic [Yamada and Matsumoto 2003, Nivre et al. 2004]
Beam search [Johansson and Nugues 2006, Titov and Henderson 2007]
Complexity given by an upper bound on the transition sequence length:
Transition system
Projective O(n) [Yamada and Matsumoto 2003, Nivre 2003]
Limited non-projective O(n) [Attardi 2006, Nivre 2007]
Unrestricted non-projective O(n²) [Nivre 2008, Nivre 2009]
IA161 Syntactic Formalisms for Parsing Natural Languages 434 / 476
Lecture 11 Transition-based Parsing
Learning for transition-based parsing
Typical scoring function: s(ci, di) = w · f(ci, di)
where f(ci, di) is a feature vector over configuration ci and transition di, and w is a weight vector [wi = weight of feature fi(ci, di)]
Problem: learning is local but features are based on the global history
IA161 Syntactic Formalisms for Parsing Natural Languages 435 / 476
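A deterministic (greedy) transition-based parser is only a few lines of code once the transition system and scoring function are fixed. The sketch below uses the arc-standard system (SHIFT, LEFT-ARC, RIGHT-ARC) and a stub scorer standing in for the learned model s(ci, di); everything here is illustrative, not any particular parser's implementation.

SHIFT, LEFT, RIGHT = "SHIFT", "LEFT-ARC", "RIGHT-ARC"

def legal(stack, buf):
    """Legal arc-standard transitions in the configuration (stack, buf)."""
    moves = []
    if buf:
        moves.append(SHIFT)
    if len(stack) >= 2:
        moves.append(RIGHT)                 # stack[-1] becomes dependent of stack[-2]
        if stack[-2] != 0:                  # the artificial root takes no head
            moves.append(LEFT)              # stack[-2] becomes dependent of stack[-1]
    return moves

def parse(n_words, score):
    """Greedy parse of w1..wn; score(stack, buf, move) plays the role of s(ci, di)."""
    stack, buf, arcs = [0], list(range(1, n_words + 1)), []
    while buf or len(stack) > 1:
        move = max(legal(stack, buf), key=lambda m: score(stack, buf, m))
        if move == SHIFT:
            stack.append(buf.pop(0))
        elif move == LEFT:
            arcs.append((stack[-1], stack.pop(-2)))   # (head, dependent)
        else:
            arcs.append((stack[-2], stack.pop()))
    return arcs

# Stub model: always prefer SHIFT, yielding a right-branching chain
# (a real model would be learned from treebank data).
print(parse(3, lambda s, b, m: 1.0 if m == SHIFT else 0.0))
# [(2, 3), (1, 2), (0, 1)]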
Lecture 11 Graph-based Parsing
For an input sentence S we define a graph Gs = (Vs, As) where
Vs = {w0, w1, . . . , wn} and As = {(wi, wj, l) | wi, wj ∈ V and l ∈ L}
The score of a dependency tree T factors by subgraphs G1, . . . , Gm:
s(T) = Σi=1..m s(Gi)
Learning: scoring function s(Gi) for a subgraph Gi ∈ T
Inference: search for the maximum spanning tree T* of Gs given s(Gi)
IA161 Syntactic Formalisms for Parsing Natural Languages 436 / 476
Lecture 11 Graph-based Parsing
Learning graph-based models
Typical scoring function: s(Gi) = w · f(Gi)
where f(Gi) is a high-dimensional feature vector over subgraphs and w is a weight vector [wj = weight of feature fj(Gi)]
Structured learning [McDonald et al. 2005a, Smith and Johnson 2007]: learn weights that maximize the score of the correct dependency tree for every sentence in the training set
Problem: learning is global (trees) but features are local (subgraphs)
IA161 Syntactic Formalisms for Parsing Natural Languages 437 / 476
Lecture 11 Grammar-based approach
a. context-free dependency parsing: exploits a mapping from dependency structures to CFG structure representations and reuses parsing algorithms originally developed for CFG → chart parsing algorithms
b. constraint-based dependency parsing: parsing viewed as a constraint satisfaction problem; the grammar is defined as a set of constraints on well-formed dependency graphs; parsing means finding a dependency graph for a sentence that satisfies all the constraints of the grammar (having the best score)
IA161 Syntactic Formalisms for Parsing Natural Languages 438 / 476
Lecture 11 Grammar-based approach
a. context-free dependency parsing
Advantage: well-studied parsing algorithms such as CKY and Earley's algorithm can be used for dependency parsing as well
→ need to convert dependency grammars into efficiently parsable context-free grammars (e.g. bilexical CFG, Eisner and Smith, 2005)
b. constraint-based dependency parsing
defines the problem as constraint satisfaction
Weighted constraint dependency grammar (WCDG, Foth and Menzel, 2005)
Transformation-based CDG
IA161 Syntactic Formalisms for Parsing Natural Languages 439 / 476
Lecture 11 Dependency parsers
Trainable parsers
Probabilistic dependency parser (Eisner, 1996, 2000)
MSTParser (McDonald, 2006) – graph-based
MaltParser (Nivre, 2007, 2008) – transition-based
K-best Maximum Spanning Tree Dependency Parser (Hall, 2007)
Vine Parser
ISBN Dependency Parser
Parsers for specific languages
Minipar (Lin, 1998)
WCDG Parser (Foth et al., 2005)
Pro3Gres (Schneider, 2004)
Link Grammar Parser (Lafferty et al., 1992)
CaboCha (Kudo and Matsumoto, 2002)
IA161 Syntactic Formalisms for Parsing Natural Languages 440 / 476
Lecture 11 MaltParser
Data-driven dependency parsing system (last version 1.6.1; J. Hall, J. Nilsson and J. Nivre)
Transition-based parsing system
Implementation of inductive dependency parsing
Useful for inducing a parsing model from treebank data
Useful for parsing new data using an induced model
Useful links
http://maltparser.org
IA161 Syntactic Formalisms for Parsing Natural Languages 441 / 476
Lecture 11 Components of system
Deterministic parsing algorithms – building labeled dependency graphs
History-based models – predicting the next parser action at nondeterministic choice points
Discriminative learning – mapping histories to parser actions
IA161 Syntactic Formalisms for Parsing Natural Languages 442 / 476
Lecture 11 MaltParser
Running system
Input: part-of-speech tags or word forms (CoNLL format)
1 Den _ PO PO DP 2 SS _ _
2 blir _ V BV PS 0 ROOT _ _
3 gemensam _ AJ AJ _ 2 SP _ _
4 för _ PR PR _ 2 OA _ _
5 alla _ PO PO TP 6 DT _ _
6 inkomsttagare _ N NN HS 4 PA _ _
7 oavsett _ PR PR _ 2 AA _ _
8 civilstånd _ N NN SS 7 PA _ _
9 . _ P IP _ 2 IP _ _
Output: a column containing a dependency label
IA161 Syntactic Formalisms for Parsing Natural Languages 443 / 476
Lecture 11 MSTParser
Minimum Spanning Tree Parser (last version 0.2; R. McDonald et al., 2005, 2006)
Graph-based parsing system
Useful links
http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html
IA161 Syntactic Formalisms for Parsing Natural Languages 444 / 476
Lecture 11 MSTParser
Running system
Input data format:
w1 w2 . . . wn
p1 p2 . . . pn
l1 l2 . . . ln
d1 d2 . . . dn
where w1 . . . wn are the n words of the sentence (tab-delimited), p1 . . . pn are the POS tags for each word, l1 . . . ln are the labels of the incoming edge to each word, and d1 . . . dn are integers representing the position of each word's parent
Example: the sentence "John hit the ball" would be:
John hit the ball
N V D N
SBJ ROOT MOD OBJ
2 0 4 2
IA161 Syntactic Formalisms for Parsing Natural Languages 445 / 476
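A reader for this four-line format is straightforward. A minimal sketch (my own helper, not MSTParser code), run on the example above:

def read_mst_sentence(block):
    """Turn one 4-line MSTParser-format block into (word, pos, label, head) tuples."""
    words, pos, labels, heads = (line.split("\t") for line in block.splitlines())
    return [(w, p, l, int(d)) for w, p, l, d in zip(words, pos, labels, heads)]

block = "John\thit\tthe\tball\nN\tV\tD\tN\nSBJ\tROOT\tMOD\tOBJ\n2\t0\t4\t2"
for i, (word, p, label, head) in enumerate(read_mst_sentence(block), start=1):
    print(f"{i}: {word}/{p} --{label}--> head {head}")
# e.g. "1: John/N --SBJ--> head 2" (head 0 denotes the artificial root)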
Lecture 11 MSTParser
Running system
Output: a column containing a dependency label
IA161 Syntactic Formalisms for Parsing Natural Languages 446 / 476
Lecture 11 Comparing parsing accuracy
Graph-based vs. transition-based: MST vs. Malt
Language    MST    Malt
Arabic      66.91  66.71
Bulgarian   87.57  87.41
Chinese     85.90  86.92
Czech       80.18  78.42
Danish      84.79  84.77
Dutch       79.19  78.59
German      87.34  85.82
Japanese    90.71  91.65
Portuguese  86.82  87.60
Slovene     73.44  70.30
Spanish     82.25  81.29
Swedish     82.55  84.58
Turkish     63.19  65.68
Average     80.83  80.75
Presented in Current Trends in Data-Driven Dependency Parsing by Joakim Nivre, 2009
IA161 Syntactic Formalisms for Parsing Natural Languages 447 / 476
Lecture 11 Link Parser
Syntactic parser of English, based on Link Grammar (version 4.7.4, Feb. 2011; D. Temperley, D. Sleator, J. Lafferty, 2004)
Words as blocks with connectors (+ or −)
Rules for defining the connection between the connectors
Deep syntactic parsing system
Useful links
http://www.link.cs.cmu.edu/link/index.html
http://www.abisource.com/
IA161 Syntactic Formalisms for Parsing Natural Languages 448 / 476
Lecture 11 Link Parser
Example of parsing in the Link Grammar: let's test our own sentences!
http://www.link.cs.cmu.edu/link/submit-sentence-4.html
IA161 Syntactic Formalisms for Parsing Natural Languages 449 / 476
Lecture 11 Link Parser
John gives a book to Mary.
IA161 Syntactic Formalisms for Parsing Natural Languages 450 / 476
Lecture 11 Link Parser
Some fans on Friday will be seeking to add another store-opening shirt to collections they've assembled as if they were rare baseball cards.
IA161 Syntactic Formalisms for Parsing Natural Languages 451 / 476
Lecture 11 WCDG parser
Weighted Constraint Dependency Grammar Parser (version 0.97-1, May 2011; W. Menzel, N. Beuck, C. Baumgärtner)
incremental parsing
syntactic predictions for incomplete sentences
Deep syntactic parsing system
Useful links
http://nats-www.informatik.uni-hamburg.de/view/CDG/ParserDemo
IA161 Syntactic Formalisms for Parsing Natural Languages 452 / 476
Lecture 12
. ......
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages 453 / 476
Lecture 12
. ......
Parsing Evaluation
IA161 Syntactic Formalisms for Parsing Natural Languages 454 / 476
Lecture 12 Parsing
Results: usually some complex (i.e. non-scalar) structure, mostly a tree or a graph-like structure
Crucial question: how to measure the "goodness" of the result?
IA161 Syntactic Formalisms for Parsing Natural Languages 455 / 476
Lecture 12 Extrinsic vs. Intrinsic Evaluation
Intrinsic: by comparing to a "gold", i.e. correct, representation
Extrinsic: by exploiting the result in a 3rd-party task and evaluating its results
Which is better?
IA161 Syntactic Formalisms for Parsing Natural Languages 456 / 476
Lecture 12 Intrinsic Evaluation – Phrase-Structure Syntax
i.e. compare two phrase-structure trees and tell a number
PARSEVAL metric
LAA (leaf-ancestor assessment) metric
IA161 Syntactic Formalisms for Parsing Natural Languages 457 / 476
Lecture 12 PARSEVAL metric
basic idea: penalize crossing brackets in the tree
i.e. compare all constituents in the test tree to the gold tree
⇒ parsing viewed as a classification problem
IA161 Syntactic Formalisms for Parsing Natural Languages 458 / 476
Lecture 12 Precision, recall
for classification problems in NLP, the standard evaluation is by means of precision and recall
precision = |test ∩ gold| / |test|
recall = |test ∩ gold| / |gold|
two numbers, and we just want to have one – the F-score
F1 score = 2 · precision · recall / (precision + recall)
IA161 Syntactic Formalisms for Parsing Natural Languages 459 / 476
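The three formulas in code, applied to sets of constituents; a minimal sketch. The span sets correspond to the bracketing example shown a few slides below, encoded as (label, start, end) over token positions (an assumption about the encoding, made only for illustration):

def precision_recall_f1(test, gold):
    """Precision, recall and F1 over two sets of items (here: constituents)."""
    correct = len(test & gold)
    p, r = correct / len(test), correct / len(gold)
    return p, r, (2 * p * r / (p + r) if p + r else 0.0)

# Constituents as (label, start, end) spans over
# "John likes ice cream with chocolate" (tokens 0..5):
test = {("S", 0, 6), ("NP", 0, 1), ("VP", 1, 6), ("V", 1, 2),
        ("NP", 2, 4), ("PP", 4, 6)}
gold = test | {("NP", 2, 6)}            # the gold tree has one extra NP bracket

print(precision_recall_f1(test, gold))  # (1.0, 0.857..., 0.923...)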
Lecture 12 F-score
also F-measure
general form: Fβ score
Fβ = (1 + β²) · precision · recall / (β² · precision + recall)
the special case of β = 1 corresponds to the harmonic mean of precision and recall
β can be used for favouring precision over recall (for β < 1) or vice versa (for β > 1)
IA161 Syntactic Formalisms for Parsing Natural Languages 460 / 476
Lecture 12 PARSEVAL metric
basic idea: penalize crossing brackets in the tree
i.e. compare all constituents in the test tree to the gold tree
⇒ parsing viewed as a classification problem
⇒ F-score on correct bracketings/constituents
might even disregard non-terminal names
a sort-of standardized tool is available: the evalb script at http://nlp.cs.nyu.edu/evalb/
IA161 Syntactic Formalisms for Parsing Natural Languages 461 / 476
Lecture 12 PARSEVAL metric – example
test vs. gold
test: [S [NP John] [VP [V likes] [NP ice cream] [PP with chocolate]]]
gold: [S [NP John] [VP [V likes] [NP [NP ice cream] [PP with chocolate]]]]
precision = 6/6 = 1.0, recall = 6/7 = 0.86, F-score = 0.92
IA161 Syntactic Formalisms for Parsing Natural Languages 462 / 476
Lecture 12 PARSEVAL metric
often subject to criticism (see e.g. Sampson, 2000)
Sampson proposed another metric, the leaf-ancestor assessment (LAA)
IA161 Syntactic Formalisms for Parsing Natural Languages 464 / 476
Lecture 12 LAA metric
basic idea: for each leaf (word), compare the path to the root of the tree, compute the edit distance between both paths, and finally take the average over all words
in the previous example, the paths (lineages) are:
(John) NP S vs. (John) NP S
(likes) V VP S vs. (likes) V VP S
(ice cream) NP VP S vs. (ice cream) NP NP VP S
(with chocolate) PP VP S vs. (with chocolate) PP NP VP S
IA161 Syntactic Formalisms for Parsing Natural Languages 465 / 476
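A minimal sketch of this computation, using the lineages listed above. The per-word similarity is taken here as 1 − dist/(len(test) + len(gold)); treat the exact normalization as an assumption of this sketch rather than the canonical definition.

def edit_distance(a, b):
    """Plain Levenshtein distance over sequences of symbols."""
    d = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, y in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (x != y))
    return d[-1]

def laa(test_lineages, gold_lineages):
    """Average per-word lineage similarity."""
    scores = [1 - edit_distance(t, g) / (len(t) + len(g))
              for t, g in zip(test_lineages, gold_lineages)]
    return sum(scores) / len(scores)

test = [["NP", "S"], ["V", "VP", "S"], ["NP", "VP", "S"], ["PP", "VP", "S"]]
gold = [["NP", "S"], ["V", "VP", "S"], ["NP", "NP", "VP", "S"], ["PP", "NP", "VP", "S"]]
print(laa(test, gold))  # ≈ 0.93: two lineages match exactly, two differ by one NP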
Lecture 12 Intrinsic Evaluation – Dependency Syntax
much easier: just precision, labeled or unlabeled (the proportion of correct dependencies)
IA161 Syntactic Formalisms for Parsing Natural Languages 466 / 476
Lecture 12 Intrinsic Evaluation – Building Treebanks
treebank = a syntactically annotated text corpus
manual annotation according to some guidelines
from the evaluation point of view: inter-annotator agreement (IAA) is a crucial property
IA161 Syntactic Formalisms for Parsing Natural Languages 467 / 476
Lecture 12 Measuring IAA
naïve approach: count how many times people agreed
problem: it does not account for agreement by chance
IA161 Syntactic Formalisms for Parsing Natural Languages 468 / 476
Lecture 12 Chance-corrected coefficients for IAA
S (Bennett, Alpert and Goldstein, 1954)
π (Scott, 1955)
κ (Cohen, 1960)
(there is a lot of terminology confusion; we follow Ron Artstein and Massimo Poesio: Inter-coder Agreement for Computational Linguistics, 2008)
Ao – observed agreement
Ae – expected (chance) agreement
all the coefficients are computed as:
S, π, κ = (Ao − Ae) / (1 − Ae)
IA161 Syntactic Formalisms for Parsing Natural Languages 469 / 476
Lecture 12 Chance-corrected coefficients for IAA
S (Bennett, Alpert and Goldstein, 1954): assumes that all categories and all annotators have a uniform probability distribution
π (Scott, 1955): assumes that different categories have different distributions, shared across annotators
κ (Cohen, 1960): assumes that different categories and different annotators have different distributions; devised for 2 annotators, with various modifications available for more than 2 annotators
IA161 Syntactic Formalisms for Parsing Natural Languages 470 / 476
Lecture 12 Intrinsic Evaluation – Conclusions
generally not easy
builds on the assumption of having THE correct parse
there is evidence that it does not correlate with extrinsic evaluation, i.e. with how good the tool is for some particular job
IA161 Syntactic Formalisms for Parsing Natural Languages 471 / 476
Lecture 12 Extrinsic Evaluation
= evaluation on a particular task/application
advantages: measures direct fitness for that task
disadvantages: may not generalize to other tasks
leads to the crucial question: what can parsing be used for?
IA161 Syntactic Formalisms for Parsing Natural Languages 472 / 476
Lecture 12 What can parsing be used for?
in theory, (full) parsing is suitable/appropriate/necessary for many NLP tasks
practically it turns out to be:
often not accurate enough
often too complicated to exploit
sometimes just an overkill compared to shallow parsing or yet simpler approaches
IA161 Syntactic Formalisms for Parsing Natural Languages 473 / 476
Lecture 12 What can parsing be used for?
in theory, (full) parsing is suitable/appropriate/necessary for many NLP tasks:
information extraction
information retrieval
machine translation
corpus linguistics
computer lexicography
question answering
…
IA161 Syntactic Formalisms for Parsing Natural Languages 474 / 476
Lecture 12 Where is parsing actually used now?
prototype systems
academia work
production systems ???
IA161 Syntactic Formalisms for Parsing Natural Languages 475 / 476
Lecture 12 What to evaluate parsing on
Sample (more or less well defined) applications:
(partial) morphological disambiguation
text correcting systems
word sketches
phrase extraction
simple treebank of high IAA
IA161 Syntactic Formalisms for Parsing Natural Languages 476 / 476