A statistical model for domain-independent text segmentation (paper)

2001 · Cited by 283

Details

Publication date
2001-01-01
Publication year
2001

Keywords

Natural Language Processing Techniques; Topic Modeling; Advanced Text Analysis Techniques

Abstract

We propose a statistical method that finds the maximum-probability segmentation of a given text. This method does not require training data because it estimates probabilities from the given text. Therefore, it can be applied to any text in any domain. An experiment showed that the method is more accurate than or at least as accurate as a state-of-the-art text segmentation system.
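
The abstract does not spell out the formulation, so the following is only a minimal sketch of what a training-free, maximum-probability segmentation could look like. It rests on several assumptions that are not taken from the abstract: word probabilities are Laplace-smoothed unigram estimates computed within each candidate segment, each segment incurs a prior penalty of log(total word count), and the best split is found by dynamic programming over sentence boundaries. The function name `segment` and its parameters are hypothetical.

```python
import math
from collections import Counter


def segment(sentences, penalty=None):
    """Return the sentence indices at which segments start (minimum-cost split).

    `sentences` is a list of token lists. All statistics come from the input
    itself, so no training data is needed. The cost model is an assumption.
    """
    n_sent = len(sentences)
    if n_sent == 0:
        return []

    words = [w for s in sentences for w in s]
    vocab_size = len(set(words))
    total = len(words)
    if penalty is None:
        # Assumed per-segment prior cost; discourages over-segmentation.
        penalty = math.log(total) if total > 1 else 0.0

    def cost(i, j):
        # Negative log-probability of sentences[i:j] as one segment,
        # using Laplace-smoothed unigram estimates from that segment only.
        counts = Counter(w for s in sentences[i:j] for w in s)
        length = sum(counts.values())
        nll = sum(c * math.log((length + vocab_size) / (c + 1))
                  for c in counts.values())
        return nll + penalty

    # best[j] = minimum cost of segmenting the first j sentences.
    best = [0.0] + [math.inf] * n_sent
    back = [0] * (n_sent + 1)
    for j in range(1, n_sent + 1):
        for i in range(j):
            c = best[i] + cost(i, j)
            if c < best[j]:
                best[j] = c
                back[j] = i

    # Walk the back-pointers to recover the segment start indices.
    starts = []
    j = n_sent
    while j > 0:
        starts.append(back[j])
        j = back[j]
    return starts[::-1]
```

On a toy input of four sentences that repeat one word followed by four that repeat another, for example `segment([["cat"] * 4] * 4 + [["dog"] * 4] * 4)`, this sketch returns `[0, 4]`, placing the single boundary where the vocabulary changes. Minimizing the summed negative log-probability is equivalent to maximizing the probability of the segmentation under the assumed model.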