Portuguese reading comprehension: A new dataset and a transfer learning model : GERAD


October 9, 2019, 10:30 to 12:00

Eraldo R. Fernandes – Universidade Federal do Mato Grosso do Sul, Brazil

In this talk, I will present FaQuAD, a reading comprehension (RC) dataset in the domain of Brazilian higher education. RC is a complex natural language understanding task whose input comprises a reading passage (usually a paragraph) and a question related to this passage. The task consists of finding the answer to the question within the given reading passage (the context). The correct answer is always a span of the context. FaQuAD follows the format of the well-known SQuAD dataset [Rajpurkar et al. 2016]. It comprises 900 questions related to contexts taken from 39 documents: 18 official documents from the Computer Science College at UFMS and 21 Wikipedia articles related to the Brazilian higher education system. Unlike many question answering (QA) datasets based on predefined question-answer pairs, FaQuAD is based on contexts: the system needs to interpret both the question and the context in order to return the best answer. As far as we know, FaQuAD is the first Portuguese reading comprehension dataset with this challenging format. Additionally, I will describe a deep learning model [Seo et al. 2016] that solves this task by means of transfer learning from an unsupervised language model [Peters et al. 2018]. This model (called BiDAF) can benefit from pre-trained representations at two levels: word and contextual representations. The word representation layer is based on the GloVe model [Pennington et al. 2014], while the contextual representation layer is based on ELMo contextual representations [Peters et al. 2018]. We report on several ablation tests to assess different aspects of both the model and the dataset.
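To make the span-based format concrete, here is a minimal sketch of a SQuAD-style record and a check that each answer really is a span of its context. The passage, question, and the helper names `iter_examples` and `is_valid_span` are invented for illustration and are not taken from FaQuAD; only the field layout (`paragraphs`, `context`, `qas`, `answers`, `text`, `answer_start`) follows the SQuAD v1.1 convention.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical passage and answer, written only for this illustration.
CONTEXT = "O curso de Ciência da Computação tem duração de quatro anos."
ANSWER = "quatro anos"

# One document in SQuAD v1.1 layout: paragraphs -> context + qas -> answers.
record = {
    "title": "exemplo",
    "paragraphs": [
        {
            "context": CONTEXT,
            "qas": [
                {
                    "id": "q1",
                    "question": "Qual é a duração do curso de Ciência da Computação?",
                    "answers": [
                        # answer_start is the character offset of the answer in the context.
                        {"text": ANSWER, "answer_start": CONTEXT.index(ANSWER)}
                    ],
                }
            ],
        }
    ],
}


def iter_examples(doc: Dict) -> Iterator[Tuple[str, str, Dict]]:
    """Yield (context, question, answer) triples from one SQuAD-style document."""
    for paragraph in doc["paragraphs"]:
        for qa in paragraph["qas"]:
            for answer in qa["answers"]:
                yield paragraph["context"], qa["question"], answer


def is_valid_span(context: str, answer: Dict) -> bool:
    """Return True if the answer text occurs in the context at answer_start."""
    start = answer["answer_start"]
    end = start + len(answer["text"])
    return context[start:end] == answer["text"]


if __name__ == "__main__":
    for context, question, answer in iter_examples(record):
        print(question, "->", answer["text"], is_valid_span(context, answer))
```

A check of this kind is useful when building or cleaning such a dataset, because a model trained to predict start and end positions depends on the stored offsets matching the context exactly.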

Free admission.
Everyone is welcome!

Organizer: Daniel Aloise

Location

Room 4488
Pavillon André-Aisenstadt
Université de Montréal campus

2920, chemin de la Tour
Montréal QC H3T 1J4
Canada

Research axis

Axis 1: Data valorization for decision making

Research application

Marketing (business intelligence, revenue management, recommender systems)