TY - GEN
T1 - Table-structure recognition method using neural networks for implicit ruled line estimation and cell estimation
AU - Ohta, Manabu
AU - Yamada, Ryoya
AU - Kanazawa, Teruhito
AU - Takasu, Atsuhiro
N1 - Funding Information:
This work was supported by a JSPS Grant-in-Aid for Scientific Research (C) (18K11989), the Cross-ministerial Strategic Innovation Promotion Program (SIP) Second Phase, “Big-data and AI-enabled Cyberspace Technologies” by NEDO, and ROIS NII Open Collaborative Research 2020 (20FC07) and 2021 (21FC04).
Publisher Copyright:
© 2021 ACM.
PY - 2021/8/16
Y1 - 2021/8/16
AB - Tables are often used to summarize accurate values in academic papers, while graphs are used to show them visually. Automatic graph generation from a table is therefore a topic of research interest. Given that the way tables are written varies depending on the author, in earlier work we proposed a cell-detection-based table-structure recognition method. Our method achieved fair performance in experiments using the ICDAR 2013 table competition dataset, but could not outperform the top-ranked participant in the competition. This paper proposes an improved method using two neural networks: one estimates implicit ruled lines that are necessary to separate cells but are undrawn, and the other estimates cells by merging detected tokens in a table. We demonstrated the effectiveness of the proposed method by experiments using the same ICDAR 2013 dataset. It achieved an F-measure of 0.955, thereby outperforming the other methods including the top-ranked participant.
KW - neural network
KW - PDF
KW - table recognition
KW - table-structure analysis
KW - XML
UR - http://www.scopus.com/inward/record.url?scp=85113679708&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113679708&partnerID=8YFLogxK
U2 - 10.1145/3469096.3469870
DO - 10.1145/3469096.3469870
M3 - Conference contribution
AN - SCOPUS:85113679708
T3 - DocEng 2021 - Proceedings of the 2021 ACM Symposium on Document Engineering
BT - DocEng 2021 - Proceedings of the 2021 ACM Symposium on Document Engineering
PB - Association for Computing Machinery, Inc
T2 - 21st ACM Symposium on Document Engineering, DocEng 2021
Y2 - 24 August 2021 through 27 August 2021
ER -