dc.contributor.author: Wang, Xiangru
dc.date.accessioned: 2015-08-13T14:05:55Z
dc.date.available: 2015-08-13T14:05:55Z
dc.date.issued: 2015
dc.identifier.uri: http://hdl.handle.net/10222/59903
dc.description.abstract: Traditionally, text document similarity is based on lexical overlap between documents. Documents are represented based on bag of words (BOW), which ignores the relatedness among terms. One existing method to address this problem is to use external resources to enhance the BOW representation. Documents are represented by the background knowledge derived from external resources to create bag of concepts (BOC). Then BOC is used along with or instead of BOW to make a new representation. However, this approach assumes concepts to be independent, which is known as the orthogonality assumption. This work focuses on developing new semantic similarity measures. By employing Wikipedia as the knowledge resource to create a BOC model, we get document similarities by following different concept mapping procedures combined with concept relatedness. We evaluate proposed measures in text clustering. Experimental results show that our BOC based similarity method can improve clustering performance. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Wikipedia [en_US]
dc.subject: Document Clustering [en_US]
dc.subject: Semantic Similarity [en_US]
dc.title: TEXT DOCUMENT SIMILARITIES BASED ON WIKIPEDIA CONCEPT RELATEDNESS [en_US]
dc.type: Thesis [en_US]
dc.date.defence: 2015-07-22
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Master of Computer Science [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Dr. Evangelos Milios [en_US]
dc.contributor.thesis-reader: Dr. Vlado Keselj [en_US]
dc.contributor.thesis-reader: Dr. Stan Matwin [en_US]
dc.contributor.thesis-supervisor: Dr. Evangelos Milios [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]