AUTOMATIC IDENTIFICATION OF USER INTEREST FROM SOCIAL MEDIA
Automatic identification of user interests from social media has gained much attention in recent years. On Twitter, users post tweets about a wide range of topics. These tweets can be analyzed to identify a user’s interests, which in turn can be used to personalize recommendations for that user. However, the short length of tweets makes them difficult to classify with traditional classification algorithms. This thesis proposes a hybrid approach to overcome this challenge. All tweets containing URLs are grouped into sessions with a session duration of 1 hour, which considerably increases the length of the text to be classified. These sessions are then classified into 8 predefined categories using logistic regression. The 3 categories that appear most frequently across a user’s sessions are identified as that user’s interests. Experiments show that the proposed approach identifies user interests precisely.
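The session grouping and top-3 selection described above can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that tweets arrive as (timestamp, text) pairs, that a session opens at its first URL-bearing tweet and covers the following hour, and that each session has already been assigned one category label by some classifier (logistic regression in the thesis). The names `build_sessions`, `top_interests`, and `URL_RE` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumption: a tweet "contains a URL" if it matches this simple pattern.
URL_RE = re.compile(r"https?://\S+")


def build_sessions(tweets, duration=timedelta(hours=1)):
    """Group URL-bearing tweets into sessions and return one text per session.

    `tweets` is an iterable of (timestamp, text) pairs. A new session starts
    when a tweet falls `duration` or more after the current session's first
    tweet (an assumed reading of "session duration of 1 hour").
    """
    with_urls = sorted(
        (t for t in tweets if URL_RE.search(t[1])), key=lambda t: t[0]
    )
    sessions = []
    session_start = None
    for timestamp, text in with_urls:
        if session_start is None or timestamp - session_start >= duration:
            sessions.append([])
            session_start = timestamp
        sessions[-1].append(text)
    # Concatenating the tweets gives the classifier a longer text to work with.
    return [" ".join(texts) for texts in sessions]


def top_interests(session_labels, k=3):
    """Return the k category labels that occur most often across sessions."""
    return [label for label, _ in Counter(session_labels).most_common(k)]


if __name__ == "__main__":
    tweets = [
        (datetime(2020, 1, 1, 9, 0), "Great match http://example.com/a"),
        (datetime(2020, 1, 1, 9, 30), "Halftime analysis http://example.com/b"),
        (datetime(2020, 1, 1, 9, 45), "no link here"),
        (datetime(2020, 1, 1, 10, 15), "New phone review http://example.com/c"),
    ]
    for session_text in build_sessions(tweets):
        print(session_text)
```

In this sketch the classification step is deliberately left out: `top_interests` takes the per-session labels as input, so any classifier that maps a session text to one of the 8 categories could be placed between the two functions.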