All the positions are now filled and the registration process is closed. Thanks to all for the good work.


Eligibility Criteria

- Early Stage Researchers (ESRs) shall, at the time of recruitment by the host organization, be in the first four years (full-time equivalent research experience) of their research careers.
- The ESR may be a national of a Member State, of an Associated Country or of any Third Country.

- The ESR must not have resided or carried out her/his main activity (work, studies, etc.) in the country of her/his host organization for more than 12 months in the 3 years immediately prior to her/his recruitment.

- The ESR must hold a Master’s degree or equivalent that formally entitles the holder to embark on a doctorate.

- The ESR must not hold a PhD degree.

Duration of recruitment: 36 months.


Fellow ESR1 (position filled)
Host institution The University of Sheffield (USFD), United Kingdom
Primary supervisor(s) Dr. Heidi Christensen
Title (WP): Using Speech Analysis to Detect Onset and Monitor Cognitive Decline
Objectives: Develop speech technology that can detect, as early as possible, the onset of cognitive decline which might lead to dementia. This must be done in an unobtrusive manner. If a decline in cognitive ability is detected, the solution will then monitor its progression and predict future cognitive ability. This project will contribute to the state of the art by investigating ways of monitoring and tracking signs of cognitive decline that work on incidental speech as it occurs in people’s homes. This differs from most current research approaches, which tend to focus on planned recordings such as timed naming or picture description tasks.
Fellow ESR2 (position filled)
Host institution Universität Augsburg (UAU), Germany
Primary supervisor(s) Prof. Björn Schuller
Title (WP): Automatic detection and classification of pathological speech conditions based on emotion expression
Objectives: The human-to-human speech communication process can be affected by various conditions, including the affective/emotional state of the speakers. We aim to study the effects of different voice pathologies on emotion expression. Given that emotionally salient pathological speech instances are hard to source, the aim of this ESR is to develop state-of-the-art deep transfer learning and multitask learning topologies, which have yet to be fully explored in pathological speech detection, to enable the detection of both pathological speech conditions and any underlying emotional cues. We focus on disorders and pathologies which affect emotional expression, including mood disorders (depression), neuro-developmental disorders (autism) and neurological disorders (Parkinson’s Disease).
Fellow ESR3 (position filled)
Host institution Idiap Research Institute (Idiap), Switzerland
Primary supervisor(s) Dr. Mathew Magimai Doss
Title (WP): Learning detection of pathological speech
Objectives: The current understanding of the difference between typical speech and pathological speech is limited to the knowledge gained from prior observations in clinical studies. The goal of this project is to learn or discover and understand, in a data-driven manner, the differences between typical speech and pathological speech at various levels, namely acoustic, linguistic and prosodic, using recent advances in machine learning. Specifically, the project will develop end-to-end pathological speech condition detectors.
Fellow ESR4 (position filled)
Host institution Philips Research Eindhoven (Philips-NL), Netherlands
Primary supervisor(s) Dr. Aki Härmä
Title (WP): Acoustic speech-based monitoring for telehealth and predictive analytics
Objectives: The goal is to detect changes in the speech properties and conversational behaviour of telehealth customers during calls with call centre respondents. The main focus is on stroke and other neural, cognitive, and mental conditions which require home care and rehabilitation services. The major technical challenge is the robust detection of relevant changes in the condition of the patient from telephone conversation data, which may have varying quality and spectral characteristics. Part of the work is related to the preprocessing of the call centre recordings and metadata available from the host. The developed technology can be used for new alarm services, to support planning and monitoring of interventions, and as additional features for predictive models.
Fellow ESR5 (position filled)
Host institution imec-Ghent University (imec-IDLab), Belgium
Primary supervisor(s) Prof. dr. ir. Kris Demuynck
Title (WP): Generating phonological feedback for evidence-based speech therapy
Objectives: The aim of this project is to develop speech technology that computes goodness-of-articulation scores for the uttered sounds according to the phonological classes the sounds belong to. Such detailed phonological feedback is expected to provide valuable information to the speech therapist concerning the deficiencies that cause the speech problems and the progress made by the patient during therapy. Starting from our existing system, the focus in this project is on various innovative techniques to make better use of the scarce and very diverse data for pathological speech: (1) detailed analysis techniques that are robust w.r.t. the large diversity in language, pathology, task, annotations etc. encountered when working with pathological speech data, (2) weakly-supervised training techniques to make better use of the large body of normal speech, and (3) cross-lingual data aggregation techniques. The performance of the proposed techniques, and the efficacy of phonological feedback in general, will be validated in the context of clinical trials with a focus on evidence-based speech therapy.
Fellow ESR6 (position filled)
Host institution Université Toulouse III-Paul Sabatier (UPS), France
Primary supervisor(s) Assoc. Prof. Julie Mauclair
Title (WP): Deep learning approaches to assess head and neck cancer voice intelligibility
Objectives: The project focuses on studying the link between the internal representation of deep neural networks (DNNs) and the subjective representation of speech intelligibility. We will explore the saliency detection capabilities of DNNs when used in a regression task for predicting speech intelligibility scores as given by human experts. By saliency detection, we mean identifying which frequency bands are important to a DNN and used by it to make its predictions. We expect to identify regions of interest in the speech signal, both in time and frequency, that characterise the level of speech impairment.
Fellow ESR7 (position filled)
Host institution Friedrich-Alexander-Universität Erlangen Nuernberg (FAU), Germany
Primary supervisor(s) Prof. Dr.-Ing. Elmar Nöth
Title (WP): Modelling the progression of neurological diseases
Objectives: Develop speech technology that can allow unobtrusive monitoring of many kinds of neurological diseases. The state of a patient can degrade slowly between medical check-ups. We want to track the state of a patient unobtrusively, without the feeling of constant supervision. At the same time, the privacy of the patient has to be respected. We will concentrate on Parkinson’s Disease (PD) and thus on acoustic cues of changes. The algorithms should run on a smartphone and track acoustic changes during regular phone conversations over time, and thus have to be low-resource. No speech recognition will be used, and only some analysis parameters of the conversation will be stored on the phone and transferred to the server.
Fellow ESR8 (position filled)
Host institution Radboud Universiteit Nijmegen (RUN), Netherlands
Primary supervisor(s) Assoc. Prof. Helmer Strik
Title (WP): Developing valid measures of pathological speech intelligibility: human ratings and automatic scores

Objectives: Speech disorders that lead to decreased speech intelligibility have severe consequences for patients and can affect their quality of life, as they run the risk of losing contact with friends and relatives and eventually becoming isolated from society. Intelligibility can be improved by means of speech therapy. However, little is known about which deviations in pathological speech most affect intelligibility, and how intelligibility can best be improved. Furthermore, measuring intelligibility is complex, since there are no standard procedures, and it is time-consuming, as it requires a lot of manual work. This project will study the following novel aspects: which deviations in pathological speech have most impact on intelligibility, how therapy can best improve intelligibility, what good procedures for measuring intelligibility are, and how the workload in measuring intelligibility can be reduced by making use of software tools.

For more details please see: http://hstrik.ruhosting.nl/esr8-intelligibility/

Fellow ESR9 (position filled)
Host institution Université Toulouse III-Paul Sabatier (UPS), France
Primary supervisor(s) Assoc. Prof. Julie Mauclair
Title (WP): Clinical relevance of the intelligibility measures
Objectives: The main purpose of this ESR is to study the different measures used today when monitoring a patient according to different clinical objectives such as the observation of the disease progression on voice without treatment, of the effect of a treatment on a type of deficit and the measured ability to allow the adjustment of a therapy. These measures are also associated with different levels of impact in deficits which are also to be studied. The deficits can be at a physiological, acoustic-sound production or communication level. All these levels correspond to different therapeutic modes of action in order to help a person with a speech disability to interact with others.
Fellow ESR10 (position filled)
Host institution Antwerp University Hospital (UZA), Belgium
Primary supervisor(s) Prof. Marc de Bodt
Title (WP): Development of a virtual articulation therapist
Objectives: The goal of the Virtual Articulation Therapist (VAT) project is to develop a software program for patients with articulation deficits in which a virtual therapist, acting like a real speech and language pathologist, guides the patient through an intensive treatment program for improving articulation and consequently speech intelligibility. The main target group is patients with dysarthria who have sufficient cognitive, motor and sensory (auditory and visual) skills to work with a computer and/or those who can rely on assistance. Development of the VAT requires development of an extensive database of exercises and implementation of algorithms for phonological error detection, acoustic-to-articulatory inversion and immediate visual and auditory feedback. These algorithms, based on existing models, will be developed in close collaboration with imec-IDLab and Idiap, in particular on acoustic-to-articulatory inversion. The set of new developments will be embedded in an attractive, patient-friendly user interface.
Fellow ESR11 (position filled)
Host institution The Netherlands Cancer Institute - Antoni van Leeuwenhoek (NKI-AVL), Netherlands
Primary supervisor(s) Dr. Rob van Son
Title (WP): Predicting and synthesizing plausible speech examples after oral cancer treatment
Objectives: This ESR will develop technology that can provide advanced oral cancer patients with information and plausible examples of how they will sound after treatment, e.g., surgery (partial or total glossectomy) or radiochemotherapy. Predictions will be based on modelling changes in the anatomy and movement characteristics of the tongue after treatment. These models are used to build an articulatory synthesiser that will be able to synthesise short fixed sequences with the patient’s voice before and after treatment.
Fellow ESR12 (position filled)
Host institution The University of Sheffield (USFD), United Kingdom
Primary supervisor(s) Dr. Heidi Christensen
Title (WP): Phrase-based speech recognition for people with moderate to severe dysarthria
Objectives: The objective is to explore methods for moving towards handling larger-vocabulary, phrase-based speech recognition of dysarthric speech including different input strategies, better acoustic modelling, better data capture approaches and better machine learning for an inherently sparse data domain.
Fellow ESR13 (position filled)
Host institution Idiap Research Institute (Idiap), Switzerland
Primary supervisor(s) Dr. Mathew Magimai Doss
Title (WP): Rapid development of speech technology for pathological speech
Objectives: The goal of this individual project is to develop methodologies and tools to address acoustic and lexical resource constraint challenges in pathological speech processing for rapid development of state-of-the-art speech technologies. To that end, the research will build upon Idiap’s recent research on under-resourced languages and end-to-end acoustic modelling, and will focus on developing novel acoustic modelling and pronunciation modelling methods that can learn better models for atypical speech recognition with limited resources.
Fellow ESR14 (position filled)
Host institution INESC ID - Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa (INESC), Portugal
Primary supervisor(s) Asst. Prof. Alberto Abad
Title (WP): Developing speech therapy games for children with speech disorder
Objectives: This project aims to provide speech and language therapists with a set of tools that integrate speech and language technology into a serious game environment particularly designed for children. The gamification elements aim to engage the patient during regular therapy sessions and to promote use of the tools at home. This will require novel techniques for modelling, recognising and assessing the correctness of speech produced by children with pathological speech. The project will focus both on speech sound disorders (articulation and phonological problems) and speech motor disorders, such as Childhood Apraxia of Speech (CAS).
Fellow ESR15 (position filled)
Host institution Ludwig-Maximilians-Universität Muenchen (LMU), Germany
Primary supervisor(s) Dr. Maria Schuster
Title (WP): Automatic surveying of speech of Cochlear Implant users
Objectives: Deafness often leads to decreased speech intelligibility even after hearing rehabilitation by cochlear implantation. Though hearing outcome is regularly measured, speech production quality is seldom assessed in outcome evaluations. This project aims to use automatic speech technology to rectify this by implementing a speech assessment tool on an otorhinolaryngology platform, where hearing data and therapy specifics are recorded. This will provide novel opportunities for research into the communication abilities of cochlear implant users, which could lead to increased rehabilitation success.