Abstract:
Over the past decade, the Internet has changed the way people work, shop and socialize.
These changes have led to a growth in User-Generated Content (UGC) such as ratings,
reviews, wikis, and videos. UGC contains relevant information for decision-making,
especially with regard to the acquisition of goods and services. However, the large volume
and dispersion of this content make it difficult to obtain relevant information. Text
summarization offers a way to make this content more accessible to people.
A summary A can be considered better than a summary B when A is shorter than B while
maintaining the same content relevance, or when A, despite being longer, presents more
relevant content. From our analysis of the literature, we observed that it is possible to
produce higher-quality summaries than those produced by state-of-the-art text
summarization algorithms. We present a multilingual automatic text summarizer that
combines and extends the Latent Dirichlet Allocation (LDA) and TextRank algorithms.
Compared to the state of the art, our approach generates better text summaries in
terms of size and content relevance.