WP1: Experimental framework for annotation and analysis of ASR errors from a human perspective

This work package defines and implements the entire experimental framework, in particular everything related to data (study plan, annotation, dissemination…), which will be the keystone of the DIETS project and one of its most original contributions. The aim is to provide a detailed experimental protocol describing both what is expected from the perceptual experiments (sub-task 1.1) and how they will be conducted (sub-task 1.2). We will then address transcription errors and their annotation by defining a precise annotation guide for these errors, based both on how users perceive them and on linguistic studies (sub-task 1.3).

Sub-task 1.1 Definition of the experimental protocol. We propose to define the experimental protocol needed to build the annotated corpus, which we will distribute freely to the community. This corpus will take the form of transcriptions of audio documents in which the errors are manually annotated. Its originality lies in a two-fold annotation: each error will carry information about its reception by end-users, which has never been studied, together with a fine-grained linguistic categorization.
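
As an illustration only, one annotated transcription could be represented along the following lines. This is a minimal sketch and not the project's actual format: the class names, field names, the category label, and the 1-to-5 perception scale are all hypothetical, and the example sentence and scores are invented for demonstration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorAnnotation:
    """One transcription error with its two annotation layers (hypothetical schema)."""
    start_word: int            # index of the first erroneous word in the hypothesis
    end_word: int              # index one past the last erroneous word
    reference_text: str        # what was actually said
    hypothesis_text: str       # what the ASR system produced
    linguistic_category: str   # label from the annotation guide (hypothetical value below)
    perception_scores: List[int] = field(default_factory=list)  # assumed 1-5 ratings from listeners

    def mean_perception(self) -> Optional[float]:
        """Average perceived severity, or None if no listener has rated this error."""
        if not self.perception_scores:
            return None
        return sum(self.perception_scores) / len(self.perception_scores)


@dataclass
class AnnotatedTranscript:
    """An automatic transcription of one audio document and its annotated errors."""
    audio_id: str
    hypothesis: List[str]
    errors: List[ErrorAnnotation] = field(default_factory=list)


# Invented example: a homophone substitution rated by three listeners.
example = AnnotatedTranscript(
    audio_id="doc-001",
    hypothesis=["the", "whether", "is", "nice"],
    errors=[
        ErrorAnnotation(
            start_word=1,
            end_word=2,
            reference_text="weather",
            hypothesis_text="whether",
            linguistic_category="homophone",
            perception_scores=[2, 3, 4],
        )
    ],
)
mean_score = example.errors[0].mean_perception()
```

Keeping both layers on the same error record would make it straightforward to relate the perceived severity of an error to its linguistic category, whatever concrete format is eventually chosen.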

Sub-task 1.2 Set-up of human perceptual tests. We propose to set up the perceptual tests defined in sub-task 1.1.

Sub-task 1.3 Categorization and annotation of transcription errors. This last sub-task will build on the protocol defined in sub-task 1.1. We will seek to provide a fine-grained annotation of transcription errors, complementing the perceptual tests carried out in sub-task 1.2.