Japanese IL0 Manual
Features on All Nodes
Nouns
Verbs and Auxiliaries (excluding copulas)
Copular Constructions
Adjectives and Adverbials
Conjunctions
Postpositions (Case-markers)
Sentence-final Particles
Punctuation
Abbreviations in glosses:
1 -- first person
3 -- third person
CAUSE -- causative
CONT -- continuative base form
CONJ -- conjunction
DAT -- dative
f -- female
GEN -- genitive
IND -- indicative
INF -- infinitive
OBJ -- object marker
sg -- singular
SUBJ -- subject marker
TOP -- topic marker
Features on All Nodes
Each node in the dependency tree can be thought of as an
attribute-value matrix, i.e., a bundle of features with values. All
values must be set for each node in the tree. This will require
checking each node before finishing the analysis. Here is a list of
features:
Position (wpos). The linear position of the word in the sentence. This
should not be modified or annotated, except for new empty nodes created
by the annotator, which should be given a wpos greater than that of the
word immediately before them and less than that of the word immediately
after them.
Word. This is the surface form associated with the node. It is almost
always displayed correctly already. It is displayed in romaji (for
technical reasons). Example: itta.
Part-of-Speech (POS). This is
the lexical class, taken from a short list. Example: verb. The
following list was used for the Prague Dependency Treebank, and will be
edited in the future:
- V -- verbs, but not auxiliary verbs (=Aux)
- N -- common nouns
- PN -- proper nouns
- Adj -- adjectives
- Adv -- adverbs
- P -- adpositions and subordinating conjunctions
- G -- genitives
- Conj -- coordinating conjunctions, but not subordinating
conjunctions; also includes the comma used in enumerations instead of a
repeated "and"
- Det -- determiners
- Aux -- auxiliary verbs
- Pun -- punctuation marks, but not the comma used in conjunctions
- Sym -- various symbols (dollar signs and the like)
- Uh -- speech-specific sounds, even if meaningful (such as /UH HUH/)
- Misc -- everything else, including greetings (Hi, Hello) and
interjections (Okay)
Lexeme. This is the base form (lexeme)
of the inflected form. A first "guess" will be included, which needs to
be checked and corrected. Example: 行く go-IND.
Morphological Features (Feat). These are inflections that
do not appear as detachable morphemes, for example, 読み yomi, 読め yome.
Possibilities are:
VERBS (and variants of the copula)
- indicative (読む yomu)
- imperative (読め yome)
- continuative (読み yomi)
- potential (読める yomeru)
Deep syntactic role (DSyntRole). This is the role of the node with
respect to its mother, in some deeper representation. This is a little
murky. We will use strictly syntactic criteria. Specifically, DRole is
different from SRole only if there is a form of the verb in which it is
realized with more arguments. DRole reflects the argument patterns of
the verb if it were in its active, agentive form.
- Subj -- deep subject. The surface subject, except for passives
or non-agentive verbs (the door opened), in which case there is no deep
subject or it is expressed (for passives) by the by phrase. The deep
subject may, but need not, agree in person and number with the tensed
verb; empty surface subjects are like overt surface subjects in that
they may or may not be deep subjects.
- Obj -- deep object. This will never have a preposition
associated with it in the underlying form (on the surface it might).
This is in surface subject position for passives and non-agentives.
This is also the deep role of a complement of a preposition.
- Obj2 -- deep indirect object. This is for ditransitive verbs.
Whether or not the dependent has a preposition associated with it, if
it can be realized without a preposition, it is an Obj2 (typically, the
recipient). This will most likely not be used in Japanese.
- PObj -- deep prepositional object. An object which is
always dominated by a preposition. It is in fact the preposition which
gets the role "PObj". Examples (the PObj is the に-marked phrase in
each): 本をテーブルの上に置く book OBJ table GEN above at put.
本を太郎にあげた book OBJ Taroo to give-PAST. Since
all adpositional phrases can be omitted from a sentence, a distinction
between PP adjuncts and arguments is hard to make. Therefore,
adjuncts are also classified as PObj.
- Mod -- includes modifiers, auxiliaries,
appositions, and the like.
- Root -- root of sentence
Done. This feature is only a check to make sure that the default values
have been checked. Set it to "Y" when you are done with the features
for one node.
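The feature bundle described above can be sketched as a small record type. This is only an illustration, not the annotation tool's actual data model: the names `Node`, `problems`, and `wpos_between` are hypothetical, and the sketch assumes that wpos may take fractional values, since the manual only requires that a new empty node's wpos fall between those of its neighbours.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Role labels listed in this section.
DSYNT_ROLES = {"Subj", "Obj", "Obj2", "PObj", "Mod", "Root"}


@dataclass
class Node:
    """One node of the dependency tree; None means "not yet set"."""
    wpos: Optional[float] = None
    word: Optional[str] = None
    pos: Optional[str] = None
    lexeme: Optional[str] = None
    feat: Optional[str] = None
    dsynt_role: Optional[str] = None
    done: Optional[str] = None  # "Y" once the node has been checked


def problems(node: Node) -> List[str]:
    """List what still has to be fixed before the node is finished."""
    issues = [f"{f.name} is not set" for f in fields(node)
              if getattr(node, f.name) is None]
    if node.dsynt_role is not None and node.dsynt_role not in DSYNT_ROLES:
        issues.append(f"unknown DSyntRole: {node.dsynt_role}")
    return issues


def wpos_between(before: float, after: float) -> float:
    """Return a wpos strictly between two neighbouring wpos values."""
    if not before < after:
        raise ValueError("'before' must be smaller than 'after'")
    return (before + after) / 2  # assumption: fractional wpos is allowed


node = Node(wpos=1, word="itta", pos="V", lexeme="行く",
            feat="indicative", dsynt_role="Root")
print(problems(node))      # ['done is not set']
print(wpos_between(3, 4))  # 3.5
```

Running `problems` on every node before finishing a sentence is one way to carry out the check that all values have been set.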
Nouns
Nominal modifiers
Compound nouns
Proper nouns
Quantifier-headed NPs
Numeral Nouns and Classifiers & Counters
Suffix nouns
Dependent nouns (Functional nouns)
Two or more nouns in Conjunction
Nominal modifiers
The head of a noun phrase is the head noun. Any adnominals (e.g.
demonstrative (pro)nouns この、その/あの、これらの、それらの/あれらの this, that,
these, those; indefinite pronouns いくつかの、どんな~も some, any;
and adjectival noun-based adnominals 大きな、めったな、いろんな big, rare,
various) and adjectives are dependents of the noun. If there are
multiple adjectives, the default structure will simply have each
adjective as a direct dependent of the noun.
This includes cases where multiple adnominals are present. For example,
both the demonstrative adnominal その “that” and the adjectival
noun-based adnominal 小さな “small” are direct dependents of the noun
島 “island” in the noun phrase その小さな島 “that small island”. As
explained in more detail below, in the case of compound nouns such as
歴史年表 annals of history, chronological table for history and 大型合併
big amalgamation, the second (right-most) noun is the head and the
first one is its modifier and dependent.
Compound nouns
Compound noun phrases, when clear, can have (multiple) noun phrases as
dependents. For example, in 航空割引運賃 “air discount fare,” “fare” is
the head and “air” and “discount” are its direct dependents. A good
test for this is to remove each noun in turn, to see if the phrase
still retains part of its original sense. Because an air discount fare
is both an airfare and a discount fare, this analysis is the one we
want. It is also the case for 官営八幡製鉄所 “government-managed Yawata
ironworks”; “government-managed” and “Yawata (place name)” both are
dependents of the right-most noun “ironworks” because each modifies
“ironworks” as in “government-managed ironworks” and “Yawata
ironworks”.
In contrast, a phrase like 先行期間拡大 “elongation of a trial period
(lit. preceding period expansion)” should be annotated with
“expansion” as the head, “period” as its dependent, and “preceding” as
the dependent of “period” (period expansion vs. *preceding expansion).
The following example also belongs to the second type: in 経済成長率
economic growth rate, “economic” is the dependent of “growth” and
“growth” is the dependent of the head N “rate”.
In 設備能力過剰対策 “equipment capacity excess counter-measure”
(counter-measure for the excess of equipment capacity), where there
are four nouns, the rightmost noun “counter-measure” is the head. The
first noun “equipment” is a dependent of the noun “capacity”, and
“capacity” is a dependent of “excess”. The noun “excess” is in turn a
dependent of the noun “counter-measure”.
In ambiguous cases, or where it is not clear whether or which nouns
modify each other, the default compound structure will have all
modifying nouns as direct dependents of the rightmost noun.
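The two compound structures described in this section can be sketched as head maps, where each word points to its head and the head of the whole phrase points to None. The function names are hypothetical, the sketch assumes the compound has already been split into distinct nouns, and choosing between the two structures remains the annotator's judgment.

```python
from typing import Dict, List, Optional

Heads = Dict[str, Optional[str]]  # word -> its head (None for the phrase head)


def flat_compound(nouns: List[str]) -> Heads:
    """Default structure: every noun depends directly on the rightmost noun."""
    head = nouns[-1]
    heads: Heads = {noun: head for noun in nouns[:-1]}
    heads[head] = None
    return heads


def chained_compound(nouns: List[str]) -> Heads:
    """Each noun depends on the noun immediately to its right."""
    heads: Heads = {left: right for left, right in zip(nouns, nouns[1:])}
    heads[nouns[-1]] = None
    return heads


# "air discount fare": both modifiers attach to "fare".
print(flat_compound(["航空", "割引", "運賃"]))
# {'航空': '運賃', '割引': '運賃', '運賃': None}

# "preceding period expansion": "preceding" modifies "period" only.
print(chained_compound(["先行", "期間", "拡大"]))
# {'先行': '期間', '期間': '拡大', '拡大': None}
```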
Proper Nouns
Compound noun phrases with a proper noun are analyzed as if the proper
noun functions as an adjective to the following common noun(s). So in
馬島漁協 “Umashima (fishermen’s) cooperative”, “cooperative” is the head
and has “Umashima”, the name of the area, as its dependent. The same
applies to 北九州市 “Kitakyushu city”, in which “city” is the head and
“Kitakyushu”, the name of the city, is a dependent that modifies
“city”.
Proper nouns that actually consist of more than one noun may be
interpreted as one single noun instead of a compound noun phrase,
especially if the proper nouns stand for brands (prop) or names of
companies/organizations (prop org). For example, 日産自動車 “Nissan
Motors” should be annotated as one single unit instead of being
analyzed as a compound proper noun as above, with “Nissan” modifying
“Motors”. The same applies to words such as 農林水産省 “the Ministry of
Agriculture, Forestry and Fisheries”, 三菱化学 “Mitsubishi Chemical”,
and 富士写真フィルム “Fuji Photo Film”.
Interpretation of a full personal name depends on whether it is used
alone or followed by a title. For example, in 田中正太郎が Tanaka
(family) Shotaro (first) particle (casemk), “Tanaka” becomes a
dependent of “Shotaro” (“Tanaka” functions as if it were a modifier of
“Shotaro”: “Shotaro” from “the Tanakas”). On the other hand, in
田中正太郎首相が Tanaka Shotaro Prime minister particle (casemk),
“Shusho” (prime minister) is the head of the NP, and this head has
both the first and family names equally as dependents.
If a noun phrase consists of more than one proper noun indicating
place names (prop lo), the smaller unit should be the head of the NP.
That is, if the NP consists of a prefecture and city name such as
島根県出雲市に in Shimane prefecture Izumo city, “city” is the head and
“Izumo” is a direct dependent along with “prefecture”, and “Shimane”
is a dependent of “prefecture”. So in 北九州市小倉北区から from
Kitakyushu city Kokurakita ward, “ward” is the head, with the
dependents “Kokurakita”, “city” (which in turn has the dependent
“Kitakyushu”), and “from”.
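The nesting of place names can be sketched as follows. The function name is hypothetical, and the sketch assumes each place name has been split into a name token and a unit token (e.g. 島根 + 県) and that the pairs are given from the largest area to the smallest. The case marker is left out.

```python
from typing import Dict, List, Optional, Tuple


def place_name_heads(units: List[Tuple[str, str]]) -> Dict[str, Optional[str]]:
    """Map each token to its head; the smallest unit heads the whole NP.

    units: (name, unit) pairs ordered from largest to smallest area.
    """
    heads: Dict[str, Optional[str]] = {}
    for i, (name, unit) in enumerate(units):
        heads[name] = unit  # the name modifies its own unit noun
        # A larger unit depends on the next smaller unit; the last one is the head.
        heads[unit] = units[i + 1][1] if i + 1 < len(units) else None
    return heads


print(place_name_heads([("島根", "県"), ("出雲", "市")]))
# {'島根': '県', '県': '市', '出雲': '市', '市': None}
```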
Quantifier headed NPs
In a noun phrase consisting of a quantifier, the quantifier should be
the head of the NP. Any modifying phrases are directly dependent on it.
少年たちみんなが一生懸命に勉強して 5人/何人か/たくさん/ほとんど が大学に行った。
All of the boys worked hard and five/some/many/most went to college.
Numeral Nouns and Classifiers & Counters
Numeral nouns such as 二十、十三、七 twenty, thirteen, seven are usually
followed by counters such as 才、年、回、円、人、分 year old, year,
times, yen, people, minutes. These counters function as suffixes to
numeral nouns, but the head of an NP with a numeral noun is the
counter. So in 九十人 ninety people, “people” is the head and “ninety”
modifies “people”, giving the information on how many people there are.
Numeral noun phrases may be preceded by classifiers such as 第、約、計
ordinal number, about, in total. For example, 第二番 means the second
or number two, 約二億人 about 200,000,000 people, and 計三万円 total of
30,000 yen. These classifiers are also dependents of the head noun;
thus in 約二億人 about 200,000,000 people, both “about” and
“200,000,000” come under the node “people”.
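As a small illustration of the rule above, the counter can be treated as the head, with the numeral and any preceding classifier attached directly to it. The function name and the three-way token split are assumptions made for this sketch.

```python
from typing import Dict, Optional


def numeral_phrase_heads(numeral: str, counter: str,
                         classifier: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Head map for a numeral NP: the counter is the head of the phrase."""
    heads: Dict[str, Optional[str]] = {counter: None, numeral: counter}
    if classifier is not None:
        heads[classifier] = counter  # e.g. 約 "about" also attaches to the counter
    return heads


print(numeral_phrase_heads("二億", "人", classifier="約"))
# {'人': None, '二億': '人', '約': '人'}
```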
Suffix Nouns
As mentioned above, nouns that function as a suffix to other nouns,
called “suffix nouns”, often become the head when the NP containing the
suffix noun has no other nouns following it. For example, in the NP
若年層の of young generation, 層 “generation” is the head of 若年 “young”
(in Japanese, it is a noun, ‘youth’) and of the particle “of”. The
following are some other examples of NPs with suffix nouns.
投資家が investor particle SUBJ (lit. invest person SUBJ): the head is
家 “person”
知識層が the intellectual class SUBJ: the head is 層 “class”
However, when these NPs with suffix nouns are followed by other nouns,
it is often the case that the nouns following the suffix nouns are the
head. For example, in 年齢別構成を distribution depending on ages –wo
(lit. age different distribution –wo), though there is a suffix noun
別 “difference”, the following noun 構成 “distribution” is the head.
Dependent Nouns (Functional Nouns)
In dependent noun phrases, dependent nouns such as の、こと、もの、くせ
(dependent noun-general), 限り、最中、うち (dependent noun-adverbial),
and ふう、みたい (dependent noun-adjstem) usually follow a VP, ADJP, or
NP, but the head is the dependent noun and the preceding VPs, ADJPs,
and NPs are dependents of this noun. For example, in
強化したのも三十四年である It was also in 1934 that (they) strengthened
(regulations), の, a dependent noun-general that nominalizes the VP in
front of it, is the head and the VP 強化した strengthened is a dependent
of the noun. In 今回のように like this time, the dependent noun よう is
the head, and the noun 今回 and the particles の and に are dependents
of the noun.
Two or more nouns in Conjunction
When two or more nouns are connected by commas, as in
釜石、輪西、三菱、九州、富士製鉄の(大合同による) (by means of amalgamation) by
Kamaishi, Wanishi, Mitsubishi, Kyushu, Fuji Ironworks, the right-most
noun (ironworks) is the head, and the dependency tree goes downwards to
the left, with each noun being a dependent of the noun to its right.
For example, “Fuji” is a dependent of “ironworks” with “Kyushu” as its
own dependent, “Kyushu” has “Mitsubishi” as its dependent,
“Mitsubishi” has its dependent “Wanishi” on the left, and so on.
Verbs and Auxiliaries (excluding copulas)
Choosing a head
Grammatical Relations
Missing Constituents (Empty Nodes)
Non-finite clauses
Questions
Imperatives
Relative Clauses
Passive
Causative
Choosing a head
Main verb: If the independent clause of the sentence consists of only
one verb, that verb is the main verb.
Otherwise, if the independent clause consists of more than one verb,
the main verb is the last verb in the verb complex that changes the
argument structure (e.g. see Causative section).
The head of any complete clausal utterance is the root of the main verb.
Incomplete utterances (NPs, PPs, greetings) should have as their head
the usual head for that type of phrase.
Unless otherwise specified, the inflected verb is represented by one node.
Grammatical Relations
Every node must have a DSyntRole specified, in relation to its parent.
Example: 私は太郎に本をあげた。1sg TOP Taroo DAT book OBJ give-PAST
(私 1sg is Subj, 太郎 Taroo is PObj, and 本 book is Obj)
Topicalization:
Topicalized NPs (marked by は) should be assigned one of the
DSyntRoles, if possible. Otherwise, mark them as subject.
Examples:
猫は魚を食べた。cat TOP fish OBJ eat-PAST (Subj)
魚は猫が食べた。fish TOP cat SUBJ eat-PAST (Obj)
太郎には私がこれを食べさせた。Taroo DAT-TOP 1sg SUBJ this OBJ
eat-CAUSE-PAST (Obj of させる CAUSE, Subj of 食べる eat)
太郎には本をあげた。Taroo DAT-TOP book OBJ give-PAST (PObj)
駅には人が多い。station DAT-TOP people SUBJ many (Mod)
太郎は背が高い。Taroo TOP height SUBJ tall (Subj of 背が高い height SUBJ
tall, though not of 高い tall, so mark as Subj)
Missing Constituents (Empty Nodes)
Constituents not in the sentence but whose presence is implied should
be represented by an empty node.
Empty nominal nodes: big-PRO, and related cases
Japanese does not have cases requiring big-PRO. In the
following examples, the main verbs take a VP as argument:
- 太郎は話しつづけた Taroo TOP speak-CONT continue-PAST: つづけた takes
the VP 太郎は話す as an argument.
- 私は彼女に来るように頼んだ 1sg TOP 3fsg DAT come manner DAT
request-PAST: 頼んだ takes 私は...来るように as an argument.
- 私は猫に魚を食べさせた 1sg TOP cat DAT fish OBJ eat-CAUSE-PAST:
-させた takes 私は...食べ as an argument.
Empty nominal nodes: little-pro, missing arguments
- Omitted categories. For example, in 魚を食べた fish OBJ eat-PAST, the subject of
食べた eat-PAST is not
specified, but is usually implied in context.
In these cases, we label both the lexeme and
the word feature of the new node "<pro>". In case of doubt
("<pro>" or "<太郎>"), ask yourself: can I tell from
syntax alone what this node means? If no, "<pro>". If yes, fill
in the lexeme.
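The labelling rule for empty nodes can be sketched as a small helper. The function name is hypothetical; whether the referent can be told from syntax alone is the annotator's decision and is simply passed in here.

```python
from typing import Optional


def empty_node_label(referent: Optional[str] = None) -> str:
    """Word/lexeme label for an empty node.

    Pass a referent only when syntax alone identifies it;
    otherwise the node is labelled "<pro>".
    """
    return f"<{referent}>" if referent else "<pro>"


print(empty_node_label())        # <pro>
print(empty_node_label("太郎"))  # <太郎>
```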
Non-finite clauses
In general, non-finite clauses will be dependents of main verbs.
Morphemes attached to verbs that make them non-finite (e.g. -て -te,
-たら -tara, -れば -reba) should be dependents of the verbs they are
attached to.
Missing constituents in these clauses should be represented by empty
nodes, as they should be for other sentences.
Questions
Questions are treated as any other sentence, with the question word
taking the same position and grammatical relation as the answer to the
question would take.
The question marker か ka is a
dependent of the head of the question (see
Sentence-final Particles).
Imperatives
Include an empty node for the subject (usually second-person) if
missing. Otherwise, analysis is same as for declarative sentence.
The head of the sentence is the imperative verb, and not verbs like
なさる nasaru, 下さる kudasaru, くれる kureru, もらう morau, etc., that may follow it.
Relative Clauses
The head of the relative clause will be the dependent of the head node
of whatever the relative clause modifies.
The arc is labeled Mod.
The structure of the relative clause is the same as for regular
sentences.
An empty node should be inserted in the relative
clause, in place of the relativized NP. The word and lexeme of
this node should be the word/lexeme of the node that it co-references,
but in angled brackets.
Example: In 車がある人 car SUBJ have person/people (People who have
cars), the relative clause 車がある would contain the empty node <人>
as the object.
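The two arcs that the relative-clause rules add can be sketched as (dependent, head, role) triples. The function name and the triple layout are assumptions made for illustration, and the remaining nodes of the clause (such as 車 and its case marker) are left out.

```python
from typing import List, Tuple

Arc = Tuple[str, str, str]  # (dependent, head, role)


def relative_clause_arcs(noun: str, clause_head: str, gap_role: str) -> List[Arc]:
    """Arcs added around a noun that is modified by a relative clause."""
    return [
        (clause_head, noun, "Mod"),            # clause head depends on the noun
        (f"<{noun}>", clause_head, gap_role),  # empty node for the relativized NP
    ]


for arc in relative_clause_arcs("人", "ある", "Obj"):
    print(arc)
# ('ある', '人', 'Mod')
# ('<人>', 'ある', 'Obj')
```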
Passive
The passive morpheme (-れる -reru or
-られる -rareru) has its own
node, and is the parent of the verb it is attached to.
The deep syntactic role is subject for the deep subject (surface
oblique object), and
object for the deep object (surface subject).
The subject of the passive morpheme is the causer, while its object is
the entire caused action.
The subject of the caused action is the causee.
For example, in 太郎は弁当を私に食べられた, 私 is the subject of -られた, while 太郎は弁当を食べ
is its object. 太郎 and 弁当 are the subject and object of 食べる,
respectively.
Causative
The causative morpheme (-せる -seru or -させる -saseru) has its own
node, and is the parent of the verb it is attached to.
The subject of the causative morpheme is the causing NP, while the
subject of the verb is the caused NP.
The object of the causative morpheme is the VP it inflects, while the
subject of the verb is an empty node coreferencing the caused NP (see
the section "Missing Constituents (Empty Nodes)").
For example, in 私は猫に魚を食べさせた 1sg TOP
cat DAT fish OBJ eat-CAUSE-PAST, the subject of させる CAUSE is 私; the subject of 食べる eat is 猫, and the
object of 食べる eat is 魚 fish; the entire clause 私は猫に魚を食べ is
the object of させた.
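The roles named in this example can be written down as (head, role, dependent) triples and looked up by head and role. This is a sketch of the example only: the lookup function is hypothetical, case markers are omitted, and 猫 is recorded directly as the subject of 食べる as in the example, although under the rule above that slot would be filled by an empty node coreferencing 猫.

```python
from typing import List, Optional, Tuple

Arc = Tuple[str, str, str]  # (head, role, dependent)

# 私は猫に魚を食べさせた, as analyzed in the example above.
ARCS: List[Arc] = [
    ("させる", "Subj", "私"),     # the causing NP
    ("させる", "Obj", "食べる"),  # the caused action
    ("食べる", "Subj", "猫"),     # the caused NP
    ("食べる", "Obj", "魚"),
]


def dependent_of(head: str, role: str, arcs: List[Arc] = ARCS) -> Optional[str]:
    """Return the dependent filling `role` under `head`, if any."""
    for h, r, d in arcs:
        if h == head and r == role:
            return d
    return None


print(dependent_of("させる", "Subj"))  # 私
print(dependent_of("食べる", "Subj"))  # 猫
```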
Copular Constructions
In copula constructions, the predicate is the sentence head, and the
subject will have the DSyntRole "Subj."
If present, the copular verb, which is some variation of だ da
(including の no, な na, and ではありませんでした dewa arimasendeshita),
will be omitted.
Note: Sentences ending with
forms of のです no desu are
considered to be copular constructions.
Examples (predicates underlined):
猫は静かです。cat TOP quiet be
静かな猫はここにいます。quiet be cat TOP here at be
この白の車は私のです。this white be car TOP 1sg GEN be (note: この kono and 私の watashi no are not copular
constructions: この kono is a
word, and the の no after 私 watashi is a possessive marker.)
Adjectives and Adverbials
Attributive Adjectives
Predicate Adjectives
Verbs with adjectival inflections
Dependent Adjectives
Adverbials
Attributive Adjectives
As in English, adjectives that modify a noun and are placed in front of
it are dependents of that noun. For example, in 深いつきあいが (なかった)
(there was no) close relationship, the adjective 深い close modifies
the noun つきあい relationship and is a dependent of the noun. In
小さい組合で in a small union, 小さい small is a dependent of the
following noun 組合 union, and in 新しい年が a new year (begins), 新しい
new is a dependent of the noun 年 year.
Predicate Adjectives
If an adjective functions as a predicate and is not followed by a
copula or a noun that it modifies, it becomes the head of the sentence.
For example, in 語る島民が多い there are a lot of islanders who talk …
(lit. islanders who talk … are many), the adjective 多い many is the
head and the noun 島民が islanders has the DSyntRole of Subj. When this
predicate adjective is followed by the copula です to be, the copular
verb is the head of the sentence. See the manual section on "copular
constructions" on how to handle sentences with copular verbs.
Verbs with adjectival inflections (-ない, -たい)
Verbs ending with -ない (-nai, not) and -たい (-tai, want to) inflect
like adjectives, and are considered adjectives. They share a node with
the verb they inflect, even when both are together.
There may also be other adjectival inflections (e.g. -ぬ nu).
Dependent Adjectives
In Japanese, there are adjectives that act as suffixes, attaching to
the stem form of verbs. These adjectives should take the verbs they
attach to as dependents. For example, in わかりやすい easy to understand,
the adjective -やすい easy is the parent of the inflected form of the
head verb わかる to understand. In 投資しにくい hard to invest, the
adjective -にくい hard is the parent of the inflected form of the head
verb 投資する to invest.
Adverbs
Adverbs are dependents of the head of the phrase they modify. The
following are some examples in Japanese:
初めて知った knew for the first time: 初めて (adv.) modifies 知った (past
tense “to know”)
やはり(口座を)開く open (the account) as I planned/as well/after all:
やはり (adv.) modifies 開く (“to open”)
きちんとする set (things) right: きちんと (adv.) modifies する (“to do”)
Some adverbials can be followed by particles that add meaning to the
adverbs (usually emphasis); these particles should take the adverbials
as dependents. For example, in はっきりと(する) (make it) clear, the
adverb はっきり is followed by the particle と and the whole phrase
means “clear(ly)”.
Conjunctions
Coordinating Conjunctions
Subordinating Conjunctions
Conjunction has its own part-of-speech (Conj).
Coordinating Conjunctions
The first conjunct is dependent (with role Obj) on the conjunct that
follows it, which is dependent (with role Mod) on the second conjunct,
etc.
The coordinating conjunction (や ya,
と to, も mo, etc.) is placed as a dependent
of the first conjunct with role
Mod, and the second conjunct is a dependent of the conjunction with
role Obj.
If a comma acts as a conjunction, it is treated as such (given
part-of-speech Conj and analyzed as in the above paragraph).
When there is a case assigned to the entire conjunct (e.g. 猫と犬と鳥を cat CONJ dog CONJ bird OBJ), the
head of the whole coordinate NP becomes the dependent of the case
marker, as usual.
Subordinating Conjunctions
Subordinating conjunctions should be dependents of the head of the
principal clause in the sentence.
They should take as a dependent the head of the clause they subordinate.
Postpositions (Case markers)
Postpositions and case markers should take as a dependent the head of
the constituent they are attached to.
For case markers such as には DAT-TOP,
the latter postposition is the head of the
entire phrase before it (i.e. に is dependent on は).
Note: includes の no when it nominalizes a VP, which includes forms of
のです no desu.
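The stacking of postpositions can be sketched as a chain in which each particle takes whatever precedes it as its dependent. The function name is hypothetical, and None here only means that the last particle attaches somewhere further up in the sentence, which the sketch does not model.

```python
from typing import Dict, List, Optional


def attach_postpositions(head_word: str, particles: List[str]) -> Dict[str, Optional[str]]:
    """Map each token to its head for a constituent followed by particles."""
    heads: Dict[str, Optional[str]] = {}
    current = head_word
    for particle in particles:
        heads[current] = particle  # the particle governs what precedes it
        current = particle
    heads[current] = None          # attachment of the last particle is not modelled
    return heads


# 駅には: 駅 depends on に, and に depends on は.
print(attach_postpositions("駅", ["に", "は"]))
# {'駅': 'に', 'に': 'は', 'は': None}
```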
Sentence-final Particles
Sentence-final particles should be dependent on the head of the
sentence they are attached to.
Punctuation
Remove all punctuation, except meaningful punctuation. Examples:
- Quotes -- leave them (open and closed)
attached to the constituent which is quoted. If the quoted passage is
not a constituent, quote each piece separately.
- Commas that act as conjunctions (see Conjunctions).
Do remove:
- All non-conjunctive commas.
- All sentence-final punctuation.
- All dashes and so on.
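The removal rules can be sketched as a filter over (word, POS) pairs. The sketch relies on two assumptions: conjunctive commas have already been tagged Conj rather than Pun, as described under Conjunctions, and quotation marks are tagged Pun. The set of quote characters and the function name are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (surface form, POS tag)

# Assumed inventory of quotation marks; extend as needed.
QUOTES = {"「", "」", "『", "』", '"', "“", "”"}


def strip_punctuation(tokens: List[Token]) -> List[Token]:
    """Drop punctuation tokens, keeping quotes and conjunctive commas."""
    kept = []
    for word, pos in tokens:
        if pos == "Pun" and word not in QUOTES:
            continue  # non-conjunctive commas, final punctuation, dashes
        kept.append((word, pos))
    return kept


sentence = [("猫", "N"), ("、", "Conj"), ("犬", "N"), ("を", "P"),
            ("、", "Pun"), ("見た", "V"), ("。", "Pun")]
print(strip_punctuation(sentence))
# [('猫', 'N'), ('、', 'Conj'), ('犬', 'N'), ('を', 'P'), ('見た', 'V')]
```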