
TopVis: Visual Text Analytics for Deep Topic Modeling of Reddit Data

dc.contributor.author: Rajendran, Muthukumar
dc.contributor.copyright-release: Not Applicable [en_US]
dc.contributor.degree: Master of Computer Science [en_US]
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.ethics-approval: Received [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Dr. Michael McAllister [en_US]
dc.contributor.manuscripts: No [en_US]
dc.contributor.thesis-reader: Dr. Fernando Paulovich [en_US]
dc.contributor.thesis-reader: Dr. Ana Maguitman [en_US]
dc.contributor.thesis-supervisor: Dr. Evangelos E. Milios [en_US]
dc.contributor.thesis-supervisor: Dr. Axel Soto [en_US]
dc.date.accessioned: 2021-11-10T19:18:05Z
dc.date.available: 2021-11-10T19:18:05Z
dc.date.defence: 2021-10-29
dc.date.issued: 2021-11-10T19:18:05Z
dc.description.abstract: The COVID-19 pandemic and its broader impact have generated new research questions in social sciences and psychology. Social media remain a crucial resource for scientists to access opinions, concerns, and questions expressed by people. The vast amount of data makes traditional close-reading practices prohibitive for analysts. Several computational methods have focused on the analysis of social media data. In this context, topic modeling approaches have been commonly used to identify salient topics in posts. However, the output of such topic modeling is not easily consumable by non-technical persons, who otherwise need to make sense out of multiple probability distributions. Therefore, we propose TopVis, a novel visual analytics tool for topical analysis of social media data. TopVis uses deep language models to obtain sentence embeddings of posts, which then undergo dimensionality reduction. Embeddings are then hierarchically clustered, and clusters are visualized in the form of a graph. Users can select a cluster and visualize the topic modeling results utilizing Top2Vec. Interactive visualizations allow users to explore and inspect different topics to answer their research questions on a large body of social media posts. We showcase how social scientists and psychologists can benefit from this visual analysis to complement their standard practices. [en_US]
dc.identifier.uri: http://hdl.handle.net/10222/80959
dc.language.iso: en [en_US]
dc.subject: Topic modelling, Deep learning, Visual analytics [en_US]
dc.title: TopVis: Visual Text Analytics for Deep Topic Modeling of Reddit Data [en_US]
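
The abstract describes a pipeline of embedding posts, clustering the embeddings hierarchically, and then summarizing the topics of a selected cluster. The following is only an illustrative sketch of that flow and not the thesis implementation. As loudly stated assumptions: a normalised bag-of-words vector stands in for the deep sentence embeddings, the dimensionality reduction step is omitted because the toy vectors are already small, average-linkage agglomerative clustering stands in for whatever hierarchical method the thesis uses, and a word-frequency count stands in for Top2Vec. All function names and the sample posts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def embed(post, vocab):
    """Toy stand-in for a sentence embedding: an L2-normalised bag-of-words vector."""
    counts = Counter(post.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def average_linkage(vectors, c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_posts(vectors, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns the final clusters (lists of post indices) and the merge history,
    which records the hierarchy as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []
    while len(clusters) > n_clusters:
        dist, a, b = min(
            (average_linkage(vectors, clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        merges.append((clusters[a], clusters[b], dist))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(clusters), merges


def top_words(posts, members, k=3):
    """Crude topic summary: the k most frequent words in a cluster (ties broken alphabetically)."""
    counts = Counter(w for i in members for w in posts[i].lower().split())
    return sorted(counts, key=lambda w: (-counts[w], w))[:k]


# Hypothetical sample posts, invented for illustration only.
posts = [
    "vaccine appointment booking was easy",
    "got my vaccine appointment today",
    "remote work makes me feel lonely",
    "feeling lonely during remote work",
]
vocab = sorted({w for p in posts for w in p.lower().split()})
vectors = [embed(p, vocab) for p in posts]
clusters, merges = cluster_posts(vectors, n_clusters=2)
for members in clusters:
    print(members, top_words(posts, members))
```

The merge history returned by `cluster_posts` is the kind of structure that could be drawn as a graph or tree of clusters, which is roughly the role the interactive cluster view plays in the abstract's description.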

Files

Original bundle

Name: MuthukumarRajendran2021.pdf
Size: 1.61 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission