Text-mining and Analysis of the Doctors’ Meta-data and Text-reviews using Topic-modeling (LDA) Technique
Abstract
Advances in internet and web technologies, together with smart devices, have empowered consumers to rate, comment on, review, and recommend products and services for others on a plethora of platforms, such as RateMDs.com. Such feedback is critical to improving the overall quality of a process, product or service, and the healthcare industry is no exception. This thesis aims to mine and analyze physicians’ online reviews using web scraping and the topic-modeling technique Latent Dirichlet Allocation (LDA). RateMDs.com was chosen as a case study for the period from September 2013 to January 2019. The thesis employed web scraping to collect physicians’ meta-data, and LDA, a generative probabilistic model of a text corpus, to mine the review text across Canadian provinces. The results revealed that physicians in some specialities, such as plastic surgery, had a higher probability of being rated than those in specialities such as Radiation, Oncology and Osteopathy. The research also revealed that East coast provinces had relatively higher ratings than provinces in the West of Canada. Finally, this thesis validates the use of Python (BeautifulSoup, spaCy, Gensim, NLTK, re) for text-mining with LDA.
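To make the phrase "generative probabilistic model" concrete, the following minimal sketch simulates the generative story that LDA assumes: each document draws a topic mixture from a Dirichlet prior, and each word is produced by first picking a topic from that mixture and then picking a word from that topic's word distribution. The vocabulary, the two topics, their probabilities, and the function names are hypothetical values chosen for illustration; they are not topics or results from this thesis, and the sketch uses only the Python standard library rather than the Gensim pipeline named above.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, vocab, alpha, n_words, rng):
    """Generate one document following LDA's assumed generative process."""
    # Per-document topic mixture, theta ~ Dirichlet(alpha).
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # Choose a topic for this word position, z ~ Categorical(theta).
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # Choose a word from the chosen topic's word distribution.
        w = rng.choices(vocab, weights=topic_word[z])[0]
        words.append(w)
    return theta, words


# Hypothetical toy vocabulary and two hand-written topics (illustrative only).
VOCAB = ["staff", "wait", "rude", "surgery", "result", "caring"]
TOPIC_WORD = [
    [0.35, 0.35, 0.20, 0.02, 0.03, 0.05],  # assumed "office experience" topic
    [0.03, 0.02, 0.05, 0.35, 0.30, 0.25],  # assumed "treatment outcome" topic
]

rng = random.Random(0)  # fixed seed so repeated runs give the same draw
theta, review = generate_document(
    TOPIC_WORD, VOCAB, alpha=[0.5, 0.5], n_words=8, rng=rng
)
print("topic mixture:", theta)
print("generated review words:", review)
```

Fitting LDA to real reviews runs this story in reverse: given only the observed words, inference estimates the topic-word distributions and each document's topic mixture.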