Generation that exploits corpus-based statistical knowledge (paper)

1998, cited by 357
Natural Language Processing Techniques; Topic Modeling; Biomedical Text Mining and Ontologies

Abstract

We describe novel aspects of a new natural language generator called Nitrogen. This generator has a highly flexible input representation that allows a spectrum of input from syntactic to semantic depth, and shifts the burden of many linguistic decisions to the statistical post-processor. The generation algorithm is compositional, making it efficient, yet it also handles non-compositional aspects of language. Nitrogen's design makes it robust and scalable, operating with lexicons and knowledge bases of one hundred thousand entities.
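
The idea of shifting linguistic decisions to a statistical post-processor can be illustrated with a small sketch. This is not Nitrogen's implementation: the abstract does not specify the statistical model, so the add-one-smoothed bigram model, the toy corpus, and the names `train_bigrams`, `log_prob`, and `choose_best` are all assumptions made for illustration. The sketch assumes a generator has already produced several candidate realizations that differ in article, number, and preposition choice, and lets corpus statistics pick among them.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for a large text collection.
CORPUS = [
    "the visitors arrived in japan",
    "the visitors admired the gardens",
    "a visitor arrived in the city",
]


def train_bigrams(sentences):
    """Count context tokens and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])          # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def log_prob(sentence, contexts, bigrams, vocab_size):
    """Add-one-smoothed bigram log-probability of a candidate sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def choose_best(candidates, sentences):
    """Return the candidate that the bigram model scores highest."""
    contexts, bigrams = train_bigrams(sentences)
    vocab = {tok for s in sentences for tok in s.split()} | {"</s>"}
    return max(candidates, key=lambda c: log_prob(c, contexts, bigrams, len(vocab)))


# Candidate realizations an over-generating front end might emit (assumed).
CANDIDATES = [
    "a visitors arrived on japan",
    "the visitor arrived at japan",
    "the visitors arrived in japan",
]

if __name__ == "__main__":
    print(choose_best(CANDIDATES, CORPUS))
```

All three candidates have the same length, so the comparison is not biased toward shorter strings; a realistic ranker would need length normalization and a far larger corpus.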