ParsCit: an Open-source CRF Reference String Parsing Package (Paper)

2008 · Citations: 288
Natural Language Processing Techniques; Topic Modeling; Speech Recognition and Synthesis

Abstract

We describe ParsCit, a freely available, open-source implementation of a reference string parsing package. At the core of ParsCit is a trained conditional random field (CRF) model used to label the token sequences in the reference string. A heuristic model wraps this core with added functionality to identify reference strings from a plain text file, and to retrieve the citation contexts. The package comes with utilities to run it as a web service or as a standalone utility. We compare ParsCit on three distinct reference string datasets and show that it compares well with other previously published work.
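To make the idea of labeling a token sequence concrete, the following is a minimal Python sketch of the kind of per-token feature extraction that typically precedes CRF labeling of a reference string. It is an illustration only: the whitespace tokenizer, the function names (`tokenize`, `token_features`, `featurize`), and the feature set are assumptions made for this example and are not taken from ParsCit itself.

```python
import re


def tokenize(reference):
    """Split a reference string on whitespace (an assumed, simplistic tokenizer)."""
    return reference.split()


def token_features(tokens, i):
    """Build a hypothetical feature dictionary for the token at index i."""
    tok = tokens[i]
    # Strip surrounding punctuation so lexical tests see the bare word.
    core = tok.strip(".,;:()[]\"'")
    return {
        "lower": core.lower(),
        "is_capitalized": core[:1].isupper(),
        "is_year": bool(re.fullmatch(r"(19|20)\d{2}", core)),
        "has_digit": any(c.isdigit() for c in core),
        "ends_with_punct": tok[-1] in ".,;:",
        # Relative position in the string: 0.0 for the first token, 1.0 for the last.
        "position": i / max(len(tokens) - 1, 1),
    }


def featurize(reference):
    """Return one feature dictionary per token, in order."""
    tokens = tokenize(reference)
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for tok, feats in zip(tokenize("Smith, J. 2008. A title."),
                          featurize("Smith, J. 2008. A title.")):
        print(tok, feats)
```

A sequence of such feature dictionaries would then be passed to a trained CRF, which assigns each token a field label such as author, date, or title. The training and decoding steps are outside the scope of this sketch.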