Predicting Political Donations Using Data Driven Lifestyle Profiles Generated from Character N-Gram Analysis of Heterogeneous Online Sources
This paper describes an approach for generating multi-dimensional Activities, Interests, and Opinions (AIO) insights from disparate web sources. The method identifies psychographic profiles through text analysis of social media data. The approach is tested on tweets from 438 Twitter profiles: 219 are integrated with filing records from the United States Federal Election Commission, and the other 219 serve as controls. Profiles were matched using demographic criteria and analyzed using political parties and donation values as labels. Standard probabilistic, entropy-based, and kernel-based approaches are used to make predictions from word n-grams, while the CNG technique, which operates on character n-grams, is explored as an alternative. Using CNG, we created two predictive models that exceeded benchmarks extracted from the literature. Using these models, we demonstrate a method for generating qualitative psychographic profiles, which can in turn be used to label customers for marketing insight.
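To make the character n-gram idea concrete, the following is a minimal sketch of profile-based classification in the spirit of CNG. It is an illustration under stated assumptions, not the implementation used in this work: the abstract does not give the n-gram length, the profile size, or the distance measure, so the defaults `n=3` and `size=100`, the relative-difference dissimilarity, and the function names `char_ngram_profile`, `profile_dissimilarity`, and `classify` are all hypothetical choices.

```python
from collections import Counter


def char_ngram_profile(text, n=3, size=100):
    """Return the `size` most frequent character n-grams of `text`,
    mapped to their relative frequencies.
    ASSUMPTION: n and size are illustrative defaults, not the paper's settings."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.most_common(size)}


def profile_dissimilarity(p1, p2):
    """Sum of squared relative frequency differences over the union of
    n-grams in both profiles (an n-gram missing from a profile counts as 0).
    ASSUMPTION: this distance is one common choice, not confirmed by the source."""
    distance = 0.0
    for gram in set(p1) | set(p2):
        f1 = p1.get(gram, 0.0)
        f2 = p2.get(gram, 0.0)
        # f1 + f2 > 0 because the n-gram occurs in at least one profile.
        distance += ((f1 - f2) / ((f1 + f2) / 2.0)) ** 2
    return distance


def classify(text, class_profiles, n=3, size=100):
    """Return the label whose profile is least dissimilar to the text's profile."""
    profile = char_ngram_profile(text, n, size)
    return min(
        class_profiles,
        key=lambda label: profile_dissimilarity(profile, class_profiles[label]),
    )


# Toy usage with made-up labels and texts (not data from the study).
toy_profiles = {
    "label_a": char_ngram_profile("aaaa bbbb aaaa"),
    "label_b": char_ngram_profile("xyz xyz xyz"),
}
predicted = classify("aaaa bbbb", toy_profiles)
```

In a setting like the one described above, each class profile would be built from the concatenated tweets of the accounts carrying a given label (for example a party or a donation band), and an unseen account would receive the label of the nearest profile.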