dc.contributor.author | Tilbury, Kyle | |
dc.date.accessioned | 2018-11-01T18:02:37Z | |
dc.date.available | 2018-11-01T18:02:37Z | |
dc.date.issued | 2018-11-01T18:02:37Z | |
dc.identifier.uri | http://hdl.handle.net/10222/74927 | |
dc.description.abstract | Word embeddings are becoming pervasive in natural language processing (NLP), with one of their main strengths being their ability to capture semantic relationships between words. Rather than training their own embeddings, many NLP practitioners elect to use pre-trained word embeddings. These pre-trained embeddings are typically created and evaluated using general corpora. Thus, their performance within technical domains is not well understood. In this thesis, we explore how the nature of the data used to train embeddings can affect their performance when computing semantic relatedness within different domains. The three main contributions are as follows. Firstly, we find that the performance of general pre-trained embeddings is lacking in the biomedical domain. Secondly, we provide key insights that should be considered when working with word embeddings for any semantic task. Finally, we develop new biomedical word embeddings and make them publicly available for use by others. | en_US |
dc.language.iso | en | en_US |
dc.subject | word embedding | en_US |
dc.subject | word vector | en_US |
dc.subject | semantic relatedness | en_US |
dc.subject | semantic similarity | en_US |
dc.subject | biomedical | en_US |
dc.title | Word Embeddings for Domain Specific Semantic Relatedness | en_US |
dc.date.defence | 2018-10-05 | |
dc.contributor.department | Faculty of Computer Science | en_US |
dc.contributor.degree | Master of Computer Science | en_US |
dc.contributor.external-examiner | n/a | en_US |
dc.contributor.graduate-coordinator | Michael McAllister | en_US |
dc.contributor.thesis-reader | Abidalrahman Mohammad | en_US |
dc.contributor.thesis-reader | Aminul Islam | en_US |
dc.contributor.thesis-supervisor | Evangelos Milios | en_US |
dc.contributor.thesis-supervisor | Meng He | en_US |
dc.contributor.ethics-approval | Not Applicable | en_US |
dc.contributor.manuscripts | Not Applicable | en_US |
dc.contributor.copyright-release | Not Applicable | en_US |