Text Embedding Based Topic Modeling on Noisy Historical Drilling Data
Date
2021-12-17T19:30:36Z
Authors
Narravula, Goutham
Abstract
In the oil industry, drilling reports play a vital role in documenting critical events on a drilling rig. The information in these reports helps foresee drilling risks and mitigate unwanted surprises, significantly reducing development costs and saving time on future projects. Manually going through thousands of reports, however, is time-consuming and laborious. This thesis proposes an approach for extracting human-interpretable topics that best summarize clusters of reports, using state-of-the-art text embedding techniques. The generated topics are used to optimize the existing information retrieval system. Because of the complexities of the report text, conventional data preprocessing and traditional topic models could not produce the desired results. Hence, we propose an approach that uses distributed representations to capture semantic and syntactic context from a small, domain-specific dataset. Industry experts reviewed the generated topics to examine topic diversity and assign appropriate labels. Detailed analysis shows that our topics are more coherent and diverse than those produced by traditional methods.
Keywords
Topic Model, Text Embedding, Oil and Gas