dc.contributor.author: Gujarati, Afsan
dc.date.accessioned: 2019-08-07T17:50:35Z
dc.date.available: 2019-08-07T17:50:35Z
dc.date.issued: 2019-08-07T17:50:35Z
dc.identifier.uri: http://hdl.handle.net/10222/76215
dc.description.abstract: In Authorship Attribution (AA), the task of identifying the author of an unseen document, it is often hard to obtain large amounts of training text written by an author. In our research, we analyze the influence of training data size and propose a novel alternative: using the documents read by the authors for the AA task. Although identifying the author of an unseen document becomes significantly more difficult with less written data, classification performance can be drastically improved by using the documents read by the author. The Support Vector Machine method outperformed all other classifiers in the presence of the read documents, with an average accuracy of 94.35%, a 23.57% increase after the addition of the read documents. Feature analysis showed that a semantic similarity exists between the written and the read documents, and that this similarity played an important role in the improved performance. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Authorship attribution [en_US]
dc.subject: machine learning [en_US]
dc.subject: document classification [en_US]
dc.subject: natural language processing [en_US]
dc.subject: n-grams approach [en_US]
dc.subject: data processing [en_US]
dc.subject: data collection [en_US]
dc.subject: limited training data [en_US]
dc.subject: read documents [en_US]
dc.title: Authorship Attribution using Written and Read Documents [en_US]
dc.date.defence: 2019-07-05
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Master of Electronic Commerce [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Michael McAllister [en_US]
dc.contributor.thesis-reader: Dr. Stan Matwin [en_US]
dc.contributor.thesis-reader: Dr. Evangelos Milios [en_US]
dc.contributor.thesis-supervisor: Dr. Vlado Keselj [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]