VISUAL ANALYTICS OF RESEARCH COMMUNITY EXPERTISE IN SPACE AND TIME
Date
2019-08-14T18:23:09Z
Authors
Munjal, Deepak
Abstract
The Association for Computing Machinery (ACM) is an international learned society for computing. ACM operates the Distinguished Speaker Program (ACMDSP), which maintains a list of speakers who can be invited to deliver lectures on computer science topics at locations worldwide. Currently, speakers' lectures are classified into topics manually, and the ACMDSP committee accesses the speaker and lecture data directly through the database. This thesis aims to make access to the database more intuitive through a visualization system and to assist in classifying the lectures on offer into topics. The system uses Google Maps to visualize the speaker, topic, and lecture data, and displays each speaker's location and contact details on the map.
Each lecture delivered by the speakers is assigned to one or more topics from the set of topics defined by the ACMDSP committee. The problem of categorizing lectures into topics is similar to the problem of categorizing research papers into topics. Hence, for each topic, we manually associated a set of keywords from the NSERC list of research topics. These keywords are used to create a training set of research papers for each topic. The title and abstract of each research paper, along with its lecture topic, are used to train the machine learning models, which then classify each lecture title and abstract into one or more topics of a predefined topic structure.
This thesis uses three document representations: bag of words, bag of concepts, and bag of categories. It also uses three consensus methods: linear regression, class with maximum probability, and voting. Each of these is a consensus method in its own right, and each one independently forms an agreement on the topic to predict.
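To illustrate how two of these consensus methods can disagree, the following is a minimal Python sketch, not the implementation used in the thesis. It assumes that each document representation yields one model that outputs a probability per topic, and that the consensus step combines those outputs; this pairing is an assumption, as are the function names, topic names, and probability values, all of which are hypothetical. The linear regression method is omitted because the abstract does not describe how it is fitted.

```python
from collections import Counter


def max_probability_consensus(predictions):
    """Return the topic with the single highest probability over all models."""
    best_topic, best_prob = None, -1.0
    for probs in predictions.values():
        for topic, prob in probs.items():
            if prob > best_prob:
                best_topic, best_prob = topic, prob
    return best_topic


def voting_consensus(predictions):
    """Let each model vote for its most probable topic; return the top vote."""
    votes = Counter(max(probs, key=probs.get) for probs in predictions.values())
    return votes.most_common(1)[0][0]


# Hypothetical per-representation outputs for one lecture (illustrative only).
predictions = {
    "bag_of_words": {"Security": 0.50, "Networks": 0.30, "Databases": 0.20},
    "bag_of_concepts": {"Security": 0.45, "Networks": 0.35, "Databases": 0.20},
    "bag_of_categories": {"Security": 0.10, "Networks": 0.30, "Databases": 0.60},
}

print(max_probability_consensus(predictions))  # Databases
print(voting_consensus(predictions))           # Security
```

In this made-up example the maximum-probability rule follows the single most confident model, while voting follows the topic that most models rank first, which is why the two can return different topics for the same lecture.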
This thesis expands on a previous classification model, based on semantic representations of lecture titles and abstracts, that can classify a large set of lectures into topics. Previous work used the topics to construct the training data, whereas this thesis uses the NSERC keywords to describe the ACMDSP topics and construct the training data. The classifier can predict up to three topics for a single lecture.
Keywords
classification, machine learning, visual analytics, text analytics