Constructing web-accessible semantic role labels and frames for Japanese as additions to the NPCMJ parsed corpus

Koichi Takeuchi, Alastair Butler, Iku Nagasaki, Takuya Okamura, Prashant Pardeshi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

As part of constructing the NINJAL Parsed Corpus of Modern Japanese (NPCMJ), a web-accessible language resource, we are adding frame information for predicates, together with two types of semantic role labels that mark the contributions of arguments. One role type consists of numbered semantic roles, as in PropBank, to capture relations between arguments across different syntactic patterns. The other role type consists of semantic roles with conventional names. Both role types are compatible with hierarchical frames that belong to related predicates. Adding semantic role and frame information to the NPCMJ will support a web environment where language learners and linguists can search Japanese examples by syntactic and semantic features. The annotation will also provide a language resource for NLP researchers building semantic parsing models (e.g., for AMR parsing) with machine learning approaches. In this paper, we describe how the two types of semantic role labels are defined under the frame-based approach, i.e., both types can be applied consistently when linked to their corresponding frames. We then present special cases of syntactic patterns and the current status of the annotation work.

Original language: English
Title of host publication: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Editors: Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Pages: 3153-3161
Number of pages: 9
ISBN (Electronic): 9791095546344
Publication status: Published - 2020
Event: 12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France
Duration: May 11 2020 to May 16 2020

Conference

Conference: 12th International Conference on Language Resources and Evaluation, LREC 2020
Country/Territory: France
City: Marseille
Period: 5/11/20 to 5/16/20

Keywords

  • Predicate frames
  • PropBank
  • Semantic roles
  • Sentence level meaning
  • Thesaurus

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Library and Information Sciences
  • Linguistics and Language
