SMERP Data Challenge Track - ECIR 2017
Objective
The SMERP Data Challenge, held at the 39th European Conference on Information Retrieval (ECIR 2017), focused on extracting and summarizing information relevant to a set of practical information needs (topics) that are critical for post-disaster relief operations, such as the need for and availability of resources, and infrastructure damage and restoration. Each topic was provided in TREC format, with a title, a brief description, and a more detailed narrative describing which tweets are considered relevant to the topic. The track used a dataset of microblogs posted during the August 2016 earthquake in central Italy.
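To make the topic format concrete, here is a minimal sketch of reading TREC-style topics into Python dictionaries. It assumes the classic TREC layout (`<top>` blocks with `<num>`, `<title>`, `<desc>` and `<narr>` fields and no inner closing tags). The function name `parse_topics` and the sample topic text are hypothetical and are not taken from the challenge data.

```python
import re

# Fields of a classic TREC topic; each field runs until the next field tag.
FIELD_RE = re.compile(
    r"<(num|title|desc|narr)>(.*?)(?=<(?:num|title|desc|narr)>|\Z)", re.S
)
# Optional human-readable labels that often follow the tags.
LABEL_RE = re.compile(r"^(Number|Description|Narrative):\s*")


def parse_topics(text):
    """Return one dict per <top>...</top> block found in `text`."""
    topics = []
    for block in re.findall(r"<top>(.*?)</top>", text, re.S):
        topic = {}
        for tag, body in FIELD_RE.findall(block):
            body = " ".join(body.split())  # collapse line breaks and spaces
            topic[tag] = LABEL_RE.sub("", body)
        topics.append(topic)
    return topics


# Hypothetical topic, written only to illustrate the layout.
SAMPLE = """
<top>
<num> Number: T1
<title> Resources available
<desc> Description: Identify tweets that mention available resources.
<narr> Narrative: A relevant tweet names a specific resource
that is available, such as food, water or shelter.
</top>
"""

if __name__ == "__main__":
    for topic in parse_topics(SAMPLE):
        print(topic["num"], "|", topic["title"])
```

The title and narrative fields are a natural starting point for the manually chosen query terms described under Approach.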
Approach
We demonstrate that topic-wise summarization of tweets during disaster events can be improved by first summarizing the tweets with a basic summarizer, then classifying the tweets as relevant or non-relevant, and finally summarizing again. We present a semi-automatic extractive summarization method that combines the SumBasic summarizer with different classifiers to summarize the topic-wise relevant microblogs (tweets), which are extracted by matching against manually identified query terms. Our method obtained the overall first place in the SMERP 2017 Data Challenge - Summarization Track, suggesting that it is an effective approach to summarizing tweets in a disaster scenario and that it may be replicable in other domains.
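As an illustration of the summarization step, the following is a minimal sketch of a SumBasic-style extractive summarizer over tweets. It is a simplified variant, not the exact implementation used for the challenge: each tweet is scored by the average probability of its words, the best tweet is selected, and the probabilities of its words are squared to discourage redundancy. The stopword list, the tokenizer, the function name `sumbasic` and the example tweets are assumptions made for this sketch.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the one we used).
STOPWORDS = {"a", "an", "and", "the", "in", "of", "to", "is", "are", "for", "on", "at"}


def tokenize(text):
    """Lowercase the text and keep word-like tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9#@']+", text.lower()) if w not in STOPWORDS]


def sumbasic(tweets, max_tweets=3):
    """Select up to `max_tweets` tweets with a SumBasic-style greedy loop."""
    tokenized = [tokenize(t) for t in tweets]
    counts = Counter(w for toks in tokenized for w in toks)
    total = sum(counts.values())
    if total == 0:
        return []
    # Unigram probability of each word over the whole collection.
    prob = {w: c / total for w, c in counts.items()}

    remaining = [i for i, toks in enumerate(tokenized) if toks]
    summary = []
    while remaining and len(summary) < max_tweets:
        # Score = average word probability of the tweet.
        best = max(
            remaining,
            key=lambda i: sum(prob[w] for w in tokenized[i]) / len(tokenized[i]),
        )
        summary.append(tweets[best])
        remaining.remove(best)
        # Square the probabilities of covered words to penalize repetition.
        for w in set(tokenized[best]):
            prob[w] **= 2
    return summary


if __name__ == "__main__":
    # Hypothetical tweets, written only for this example.
    example = [
        "Bridge collapsed near Amatrice",
        "Bridge collapsed near Amatrice, roads blocked",
        "Need water and blankets in Amatrice",
        "Good morning everyone",
    ]
    for line in sumbasic(example, max_tweets=2):
        print(line)
```

Because the redundancy penalty lowers the scores of tweets that repeat already-covered words, an off-topic tweet can outrank on-topic ones in later rounds. This is one reason for placing a relevance classifier between the two summarization passes.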