Text Embedding Based Topic Modeling on Noisy Historical Drilling Data

Date

2021-12-17T19:30:36Z

Authors

Narravula, Goutham

Abstract

In the oil industry, drilling reports play a vital role in documenting critical events on a drilling rig. The information in these reports can help foresee drilling risks and mitigate unwanted surprises, significantly reducing development costs and saving time on future projects. Going through thousands of reports manually is time-consuming and laborious. This thesis proposes an approach for extracting human-interpretable topics that best summarize clusters of reports, using state-of-the-art text embedding techniques. The generated topics are used to optimize the existing information retrieval system. Because of the complexity of the text, conventional data preprocessing and traditional topic models did not produce the desired results. Hence, we propose an approach that uses distributed representations to capture semantic and syntactic context from a small, domain-specific dataset. Industry experts reviewed the generated topics to examine topic diversity and assign appropriate labels. Detailed analysis shows that our topics are more coherent and diverse than those produced by traditional methods.

Keywords

Topic Model, Text Embedding, Oil and Gas
