Non-uniform Language Detection in Technical Writing
Authors
Wang, Weibo Jr
Abstract
Technical writing in professional environments, such as user manual authoring, requires uniform language. Non-uniform language detection is a novel task that aims to ensure consistency in technical writing by detecting sentences in a document that are intended to have the same meaning within a similar context but use different wording or writing style. This thesis proposes an approach that uses text similarity algorithms at the lexical, syntactic, semantic, and pragmatic levels. The resulting metrics are integrated by a machine learning classification method. We tested our method on smartphone user manuals and compared its performance against state-of-the-art methods in related areas. The experiments demonstrate that our approach is the most efficient solution to date.
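As a rough illustration of the idea of scoring sentence pairs with several similarity metrics, the sketch below computes two lexical features and combines them with a hand-set threshold. It is not the method of the thesis: the function names, the two chosen features, the example sentences, and the 0.4 threshold are all assumptions made for illustration, and the threshold rule only stands in for the trained classifier that the abstract describes. The syntactic, semantic, and pragmatic levels are not covered.

```python
import re
from difflib import SequenceMatcher


def tokens(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def jaccard(a, b):
    """Lexical feature 1: overlap of the two sentences' word sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def char_ratio(a, b):
    """Lexical feature 2: character-level similarity from difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def features(a, b):
    """Feature vector for one sentence pair (hypothetical, lexical only)."""
    return [jaccard(a, b), char_ratio(a, b)]


def is_non_uniform_candidate(a, b, threshold=0.4):
    """Flag pairs that are similar but not identical.

    Assumption: the mean of the features compared against a hand-set
    threshold replaces a trained classifier here.
    """
    if a.strip().lower() == b.strip().lower():
        return False  # identical wording is already uniform
    feats = features(a, b)
    return sum(feats) / len(feats) >= threshold


if __name__ == "__main__":
    s1 = "Press the power button to turn on the phone."
    s2 = "Press the power key to switch on the phone."
    s3 = "Insert the SIM card into the tray."
    print(is_non_uniform_candidate(s1, s2))
    print(is_non_uniform_candidate(s1, s3))
```

In a supervised setting, the feature vectors returned by `features` would instead be passed, together with labeled sentence pairs, to a classifier, which would learn how to weight the individual metrics.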
Keywords
NLP, text mining, supervised machine learning
