
MTLV: A Library for Building Deep Multi-Task Learning Architectures

Date

2021-04-27T11:52:49Z

Authors

Rahimi, Fatemeh

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Multi-Task Learning (MTL) for text classification leverages data from multiple related classification tasks to train a single shared model with task-specific layers, improving its generalization performance. We choose pre-trained language models (the BERT family) as the shared part of this architecture. Although these models have achieved notable performance on different downstream NLP tasks, their performance in an MTL setting for the biomedical domain has not been thoroughly investigated. In this work, we investigate the performance of BERT-family models in different MTL settings with the Open-I (radiology reports) and OHSUMED (PubMed abstracts) datasets. We introduce the MTLV (Multi-Task Learning Visualizer) library to facilitate building multi-task learning architectures. The library builds on existing infrastructure (e.g., Hugging Face Transformers and MLflow Tracking) to let users build and compare multi-task, multi-head, and single-task learning designs using the models available in the Transformers library. Following previous work in the computer vision domain, we clustered tasks into a few groups and trained each group either on a separate model (Grouped Multi-Task Learning, GMTL) or on a single model with different heads (Grouped Multi-Head Learning, GMHL), where each head covers one group of tasks. The library uses contextual representations of class labels (tasks) and their descriptions as features for clustering the tasks. In GMTL, the models are trained separately, each on one group of tasks obtained from this clustering. We experimented with a set of binary classification tasks that share the same dataset (multi-label classification). The contributions of this research are: (a) we observed that grouping tasks and training them with a few models (GMTL) outperforms both the multi-task (MTL) and grouped multi-head (GMHL) learning settings; (b) we proposed an approach that uses task (label) names and their description embeddings as clustering features for tasks; (c) although GMTL performs similarly to Single-Task Learning (STL), it is computationally less expensive than the STL setting, where a separate model is trained for each task; (d) the code of the MTLV library is available as open source on GitHub (https://github.com/fatemerhmi/MTLV).
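The clustering step described above can be illustrated with a minimal sketch. This is not the MTLV API; it only shows the general idea of embedding task (label) names and descriptions with a BERT-family encoder via Hugging Face Transformers and grouping them with k-means. The model name, label texts, and number of groups are illustrative assumptions.

```python
# Sketch only: cluster tasks by contextual embeddings of their label names/descriptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-family checkpoint works here
label_texts = [                   # assumption: "label name: short description" strings
    "Cardiomegaly: enlargement of the heart",
    "Pleural effusion: excess fluid around the lungs",
    "Pneumonia: infection that inflames the air sacs",
    "Fracture: break in the continuity of a bone",
]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()

with torch.no_grad():
    batch = tokenizer(label_texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (n_labels, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)         # mean-pool over real tokens only
    embeddings = (hidden * mask).sum(1) / mask.sum(1)    # (n_labels, dim)

# Cluster label embeddings into a few groups; each group of tasks would then be
# trained on its own model (GMTL) or on its own head of a shared model (GMHL).
n_groups = 2                                             # assumption
groups = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(embeddings.numpy())
for text, g in zip(label_texts, groups):
    print(f"group {g}: {text}")
```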

Description

Keywords

Multi-task learning, multi-label classification, pre-trained language models

Citation