Non-uniform Language Detection in Technical Writing

Date

2016-04-21T12:39:38Z

Authors

Wang, Weibo Jr

Abstract

Technical writing in professional environments, such as user manual authoring, requires uniform language. Non-uniform language detection is a novel task that aims to ensure consistency in technical writing by detecting sentences in a document that are intended to have the same meaning within a similar context but use different wording or writing styles. This thesis proposes an approach that uses text similarity algorithms at the lexical, syntactic, semantic, and pragmatic levels. The different metrics are integrated by applying a machine learning classification method. We tested our method on smartphone user manuals and compared its performance against state-of-the-art methods in related areas. The experiments demonstrate that our approach is the most efficient solution to date.
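
The general idea of integrating several similarity metrics through a classifier can be illustrated with a minimal sketch. This is not the method of the thesis: it assumes only three simple lexical and word-order features, whereas the abstract also names syntactic, semantic, and pragmatic levels, and it swaps in a nearest-centroid classifier for the unspecified machine learning method. The function names, labels, and example sentences are hypothetical and chosen for illustration only.

```python
import difflib
import re


def tokens(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity_features(a, b):
    """Return [Jaccard overlap, token-sequence ratio, length ratio] for a pair."""
    ta, tb = tokens(a), tokens(b)
    sa, sb = set(ta), set(tb)
    union = sa | sb
    # Vocabulary overlap, ignoring word order.
    jaccard = len(sa & sb) / len(union) if union else 1.0
    # Order-sensitive similarity over the two token sequences.
    seq_ratio = difflib.SequenceMatcher(None, ta, tb).ratio()
    # Ratio of the shorter sentence length to the longer one.
    longest = max(len(ta), len(tb))
    length_ratio = min(len(ta), len(tb)) / longest if longest else 1.0
    return [jaccard, seq_ratio, length_ratio]


def train_centroids(pairs, labels):
    """Average the feature vectors of each label (nearest-centroid training)."""
    sums, counts = {}, {}
    for (a, b), label in zip(pairs, labels):
        feats = similarity_features(a, b)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, a, b):
    """Assign the label whose centroid is closest to the pair's feature vector."""
    feats = similarity_features(a, b)

    def squared_distance(centroid):
        return sum((x - y) ** 2 for x, y in zip(feats, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical toy training data in the style of a phone manual.
pairs = [
    ("Press the power key to turn on the phone.",
     "Press the power button to switch on the phone."),
    ("Tap the camera icon.",
     "Charge the battery fully before first use."),
]
labels = ["non-uniform", "unrelated"]
centroids = train_centroids(pairs, labels)
```

With the centroids trained, `classify(centroids, sentence_a, sentence_b)` returns one of the two labels for a new sentence pair. A realistic system would need far more labelled pairs and features that capture meaning rather than surface overlap, since two sentences can share a meaning while sharing few words.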

Keywords

NLP, text mining, supervised machine learning
