Examination of effective features for CRF-based bibliography extraction from reference strings

Daiki Matsuoka, Manabu Ohta, Atsuhiro Takasu, Jun Adachi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

Metadata such as bibliographic information about documents are indispensable in the effective use of digital libraries. In particular, the reference fields of academic papers contain much bibliographic information such as authors' names and document titles. We are therefore developing a method for automatically extracting bibliographic information from reference strings using a conditional random field (CRF). The features used by the CRF determine the accuracy of this method. We examine effective features for accurate extraction by experimentally changing the features used. The experiments showed that lexical features were quite effective in accurate extraction and augmenting lexicons properly could lead to further improvements in accuracy.
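To make "lexical features" in the abstract concrete, here is a minimal sketch of per-token feature extraction for a reference string. It is an illustration under stated assumptions, not the authors' feature set: the lexicon contents, the feature names, the tokenizer, and the helper names `tokenize`, `token_features`, and `reference_features` are all hypothetical. The abstract does not specify which features or lexicons the paper used.

```python
import re

# Hypothetical toy lexicons (assumption): the abstract does not list
# the lexicons actually used in the paper.
NAME_LEXICON = {"matsuoka", "ohta", "takasu", "adachi"}
VENUE_LEXICON = {"conference", "proceedings", "journal", "transactions"}


def tokenize(reference):
    """Split a reference string into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", reference)


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    lower = tok.lower()
    return {
        # Surface-form features
        "lower": lower,
        "is_capitalized": tok[:1].isupper(),
        "is_digit": tok.isdigit(),
        "is_year": bool(re.fullmatch(r"(19|20)\d{2}", tok)),
        "is_punct": not any(c.isalnum() for c in tok),
        # Lexical features: membership in a lexicon
        "in_name_lexicon": lower in NAME_LEXICON,
        "in_venue_lexicon": lower in VENUE_LEXICON,
        # Context features: neighbouring tokens
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def reference_features(reference):
    """Return the tokens of a reference string and one feature dict per token."""
    tokens = tokenize(reference)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]
```

The result is one feature dictionary per token, which a sequence labeller such as a CRF could consume to assign field labels (author, title, venue, year). Augmenting a lexicon, in this sketch, simply means adding entries to `NAME_LEXICON` or `VENUE_LEXICON`.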

Original language: English
Title of host publication: 2016 11th International Conference on Digital Information Management, ICDIM 2016
Editors: Ramiro Robles, Pit Pichappan, Antonio J. Tallon-Ballesteros
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 243-248
Number of pages: 6
ISBN (Electronic): 9781509026401
Publication status: Published - 2016
Event: 2016 11th International Conference on Digital Information Management, ICDIM 2016 - Porto, Portugal
Duration: Sept 19 2016 to Sept 21 2016

Publication series

Name: 2016 11th International Conference on Digital Information Management, ICDIM 2016

Other

Other: 2016 11th International Conference on Digital Information Management, ICDIM 2016
Country/Territory: Portugal
City: Porto
Period: 9/19/16 to 9/21/16

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems and Management
