Non-uniform Language Detection in Technical Writing

Authors

Wang, Weibo Jr

Abstract

Technical writing in professional environments, such as user manual authoring, requires uniform language. Non-uniform language detection is a novel task that aims to ensure consistency in technical writing by detecting sentences in a document that are intended to have the same meaning within a similar context but use different wording or writing style. This thesis proposes an approach that uses text similarity algorithms at the lexical, syntactic, semantic, and pragmatic levels. The resulting metrics are combined by a machine learning classification method. We tested our method on smartphone user manuals and compared its performance against state-of-the-art methods in related areas. The experiments demonstrate that our approach is the most efficient solution to date.
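To make the idea of combining several similarity metrics more concrete, the following is a minimal Python sketch, not the method from the thesis. It covers only surface-level (lexical) features; the syntactic, semantic, and pragmatic metrics mentioned in the abstract are omitted. All function names, the fixed weights, and the thresholds are hypothetical: the weighted sum merely stands in for a trained classifier, which would learn how to combine the features from labeled sentence pairs.

```python
from difflib import SequenceMatcher

PUNCT = ".,;:!?\"'()"


def tokens(sentence):
    """Lowercase word tokens with surrounding punctuation stripped."""
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]


def jaccard(a, b):
    """Token-set overlap: |A & B| / |A | B|."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def char_ratio(a, b):
    """Character-level similarity from difflib (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def length_ratio(a, b):
    """Ratio of the shorter token count to the longer one."""
    la, lb = len(tokens(a)), len(tokens(b))
    if max(la, lb) == 0:
        return 1.0
    return min(la, lb) / max(la, lb)


def features(a, b):
    """Feature vector for one sentence pair (lexical features only)."""
    return [jaccard(a, b), char_ratio(a, b), length_ratio(a, b)]


def is_candidate(a, b, weights=(0.5, 0.3, 0.2), low=0.4, high=0.99):
    """Flag pairs that are similar but not identical.

    ASSUMPTION: the weights and thresholds are illustrative placeholders
    for a classifier trained on labeled pairs.
    """
    score = sum(w * f for w, f in zip(weights, features(a, b)))
    return low <= score < high


if __name__ == "__main__":
    pair = ("Tap the Home key.", "Press the Home key.")
    print(features(*pair))
    print(is_candidate(*pair))
```

In this sketch, a pair such as "Tap the Home key." and "Press the Home key." is flagged because the sentences overlap heavily without being identical, while identical sentences and unrelated sentences are not flagged.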

Keywords

NLP, text mining, supervised machine learning