AUTOMATIC IDENTIFICATION OF USER INTEREST FROM SOCIAL MEDIA
Date
2015-04-01
Authors
Kumar, Mathavan
Abstract
Automatic identification of user interests from social media has gained much attention in recent years. On Twitter, users post tweets about a wide range of topics. These tweets can be analyzed to identify a user’s interests, which can then be used to personalize recommendations for that user. However, the short length of tweets makes them difficult to classify with traditional classification algorithms. This thesis proposes a hybrid approach to overcome this challenge. All tweets containing URLs are grouped into sessions with a session duration of 1 hour, which increases the text length considerably. These sessions are then classified into 8 predefined categories using logistic regression. The 3 categories that appear most frequently across a user’s sessions are identified as that user’s interests. Experiments show that the proposed approach identifies user interests precisely.
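The session grouping and top-3 selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions: a session is taken to be a fixed 1-hour window that opens at the first URL-bearing tweet, URLs are detected by a simple `http://` or `https://` substring check, and the category classifier is passed in as a hypothetical `classify` callable standing in for the trained logistic regression model. The names `has_url`, `build_sessions`, and `top_interests` are illustrative only.

```python
from collections import Counter
from datetime import timedelta

# Assumption: a session is a fixed 1-hour window opened by its first tweet.
SESSION_LENGTH = timedelta(hours=1)


def has_url(text):
    """Crude URL check (assumption: a substring test is good enough here)."""
    return "http://" in text or "https://" in text


def build_sessions(tweets, session_length=SESSION_LENGTH):
    """Group URL-bearing tweets into sessions and return one text per session.

    `tweets` is an iterable of (timestamp, text) pairs, where timestamp is a
    datetime. Tweets without a URL are skipped.
    """
    sessions = []
    current, start = [], None
    for timestamp, text in sorted(tweets, key=lambda t: t[0]):
        if not has_url(text):
            continue
        # Open a new session when the current 1-hour window has elapsed.
        if start is None or timestamp - start >= session_length:
            if current:
                sessions.append(" ".join(current))
            current, start = [], timestamp
        current.append(text)
    if current:
        sessions.append(" ".join(current))
    return sessions


def top_interests(session_texts, classify, k=3):
    """Return the k categories assigned most often across the sessions.

    `classify` is a hypothetical callable mapping a session text to one
    category label; in the thesis this role is played by logistic regression.
    """
    counts = Counter(classify(text) for text in session_texts)
    return [category for category, _ in counts.most_common(k)]
```

Concatenating the tweets of a session gives the classifier a longer text than any single tweet, which is the motivation the abstract gives for sessions. Passing the classifier as an argument keeps the sketch independent of any particular model or library.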
Keywords
Text classification, Social media, Twitter