Discovering Newsworthy Themes from Sequenced Data: A Step Towards Computational Journalism


Abstract: Automatic discovery of newsworthy themes from sequenced data can relieve journalists of manually poring over large amounts of data to find interesting news. In this paper, we propose a novel k-Sketch query that aims to find the k striking streaks that best summarize a subject. Our scoring function takes both streak strikingness and streak coverage into account. We study k-Sketch query processing in both offline and online scenarios, and propose various streak-level pruning techniques to find striking candidates. Among those candidates, we then develop approximate methods with theoretical bounds to discover the k most representative streaks. We conduct experiments on four real datasets, and the results demonstrate the efficiency and effectiveness of our proposed algorithms: running time improves by up to 500 times, and the quality of the generated summaries is endorsed by anonymous users from Amazon Mechanical Turk.
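
To make the idea of scoring a summary by strikingness and coverage together more concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes each candidate streak is a hypothetical tuple (start, end, strikingness), that the score of a selection is a weighted sum of total strikingness and the fraction of events covered, and that a simple greedy loop stands in for the approximate selection step. The names `sketch_score`, `greedy_k_sketch`, and the weight `alpha` are assumptions introduced here for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: (start index, end index inclusive, strikingness).
Streak = Tuple[int, int, float]


def sketch_score(chosen: List[Streak], n_events: int, alpha: float = 0.5) -> float:
    """Assumed score: weighted sum of total strikingness and covered fraction."""
    covered = set()
    for start, end, _ in chosen:
        covered.update(range(start, end + 1))
    total_strikingness = sum(s for _, _, s in chosen)
    coverage = len(covered) / n_events
    return alpha * total_strikingness + (1 - alpha) * coverage


def greedy_k_sketch(
    candidates: List[Streak], k: int, n_events: int, alpha: float = 0.5
) -> List[Streak]:
    """Pick up to k streaks, each time adding the one that raises the score most."""
    chosen: List[Streak] = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        best = max(
            remaining,
            key=lambda c: sketch_score(chosen + [c], n_events, alpha),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data: two overlapping striking streaks and one weaker, disjoint streak.
candidates = [(0, 4, 0.9), (1, 3, 0.8), (5, 9, 0.5)]
result = greedy_k_sketch(candidates, k=2, n_events=10)
print(result)  # [(0, 4, 0.9), (5, 9, 0.5)]
```

In this toy run the second pick is the weaker streak (5, 9, 0.5) rather than the more striking (1, 3, 0.8), because the latter overlaps the first pick and adds no coverage. This is the kind of trade-off a combined strikingness and coverage score is meant to capture.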

