Compromised Tweet Detection Using weighted sub-word embeddings

Joshi, Mihir

dc.contributor.author	Joshi, Mihir
dc.date.accessioned	2019-08-07T18:18:46Z
dc.date.available	2019-08-07T18:18:46Z
dc.date.issued	2019-08-07T18:18:46Z
dc.identifier.uri	http://hdl.handle.net/10222/76216
dc.description.abstract	Extracting features and writing styles from short text messages for compromised tweet detection remains a challenge. Short messages, such as tweets, do not contain enough data to perform statistical authorship attribution. In addition, the vocabulary used in these texts is sometimes improvised or misspelled. Therefore, in this thesis, I propose combining four feature extraction techniques, namely character n-grams, word n-grams, flexible patterns, and a new sub-word embedding using the skip-gram model. The proposed system uses a Multi-Layer Perceptron to utilize these features from tweets to analyze short text messages. The system achieves 85% accuracy, which is a considerable improvement over previous systems. Furthermore, Siamese networks are employed to model the representation of user tweets in order to identify them based on a limited amount of ground truth data. The results show that the proposed system achieves promising accuracy as the number of authors increases.	en_US
dc.language.iso	en_US	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Machine Learning	en_US
dc.subject	Security Management	en_US
dc.title	Compromised Tweet Detection Using weighted sub-word embeddings	en_US
dc.type	Thesis	en_US
dc.date.defence	2019-07-31
dc.contributor.department	Faculty of Computer Science	en_US
dc.contributor.degree	Master of Computer Science	en_US
dc.contributor.external-examiner	n/a	en_US
dc.contributor.graduate-coordinator	Michael McAllister	en_US
dc.contributor.thesis-reader	Dr. Srinivas Sampalli	en_US
dc.contributor.thesis-reader	Dr. Malcolm Heywood	en_US
dc.contributor.thesis-supervisor	Dr. Nur Zincir-Heywood	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.manuscripts	Not Applicable	en_US
dc.contributor.copyright-release	Not Applicable	en_US

Files in this item

Name: Joshi-Mihir-MCS-CSCI-July-2019.pdf
Size: 1000.Kb
Format: PDF
Description: Thesis Manuscript

This item appears in the following Collection(s)

Faculty of Graduate Studies Online Theses
