Hybrid Query Expansion assisted Adaptive Visual Interface for Exploratory Information Retrieval
Query expansion is an effective way to improve the performance of an information retrieval (IR) system because it addresses vocabulary mismatch and differences in terminology. Traditional pseudo-relevance feedback (PRF) based query expansion models assume that the top-k retrieved documents are positive feedback and select expansion terms from them. This approach can add out-of-context terms if the initially retrieved list contains a significant number of negative documents. Therefore, it is equally important to consider negative feedback along with positive feedback. Moreover, it has been observed that the terms suggested by global expansion techniques, such as word embeddings, differ from those suggested by local expansion techniques. The proposed hybrid query expansion technique combines a word embedding model with a positive and negative feedback model based on the Expectation-Maximization algorithm. The experiment conducted on the CACM dataset demonstrates that integrating the global and local expansion techniques improves the system's performance over the baselines. Subsequently, we provide an interactive visual interface assisted by the proposed hybrid query expansion technique. Unlike static vector-space models such as TF-IDF and Doc2Vec, this interface represents each document by its relevance scores with respect to the other documents in the space. The document-query space is adaptive: weights for query and expansion terms are added when those terms appear in a document, while the remaining document terms are weighted by their TF-IDF values. The representation also adapts to the relevance feedback that the user provides. A user scenario illustrates the visual interface's usefulness for navigating, analyzing, and providing feedback on documents in a large query-document space.
The results confirmed that the system's adaptive nature, influenced by the expansion terms and user feedback, can improve the ranked list based on the documents closest to the query.
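
The adaptive document weighting described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names (`tfidf_vectors`, `adapt_vector`, `cosine`), the boost values, the smoothed IDF formula, the toy documents, and the use of cosine similarity to rank documents against the query are all assumptions made for illustration. The sketch only shows the general idea that query and expansion terms receive extra weight when they occur in a document, while all other terms keep their TF-IDF weight.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: relative term frequency times a smoothed IDF.
        vectors.append(
            {t: (c / len(doc)) * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        )
    return vectors


def adapt_vector(vec, query_terms, expansion_terms,
                 query_boost=1.0, expansion_boost=0.5):
    """Add weight to query and expansion terms that occur in the document.

    The boost values are hypothetical; terms absent from the document
    are never added, and all other terms keep their TF-IDF weight.
    """
    adapted = dict(vec)
    for term in query_terms:
        if term in adapted:
            adapted[term] += query_boost
    for term, weight in expansion_terms.items():
        if term in adapted:
            adapted[term] += expansion_boost * weight
    return adapted


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data (hypothetical): three tokenized documents, a query, and
# expansion terms with weights as a hybrid expansion model might supply.
docs = [
    ["query", "expansion", "retrieval"],
    ["visual", "interface", "design"],
    ["retrieval", "feedback", "model"],
]
query_terms = ["query", "retrieval"]
expansion_terms = {"expansion": 0.8, "feedback": 0.6}

base = tfidf_vectors(docs)
adapted = [adapt_vector(v, query_terms, expansion_terms) for v in base]

# Query vector built from the same (assumed) boosts.
query_vec = {t: 1.0 for t in query_terms}
query_vec.update({t: 0.5 * w for t, w in expansion_terms.items()})

# Rank documents by closeness to the query in the adapted space.
ranking = sorted(range(len(docs)),
                 key=lambda i: cosine(adapted[i], query_vec),
                 reverse=True)
print(ranking)
```

In this sketch, a document that shares no term with the query or the expansion set keeps its original TF-IDF vector, so only documents touched by the query context move within the space. User relevance feedback could be modeled in the same way, as a further adjustment of the weights, but that step is left out here.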