22 changes: 22 additions & 0 deletions _posts/2026-03-01-LREC-gmichel.md
@@ -0,0 +1,22 @@
---
layout: post
title: "S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature"
date: 2026-03-01 10:00:00 +0200
category: Publication
author: gmichel
readtime: 1
domains:
- NLP
people:
- gmichel
- eepure
publication_type: conference
publication_title: "S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature"
publication_year: 2026
publication_authors: Abigail Berthe-Pardo, Gaspard Michel, Elena V. Epure, Christophe Cerisara
publication_conference: LREC
publication_code: "https://github.com/AbigailBerthe/S-VoCAL"
publication_preprint: "https://arxiv.org/pdf/2603.00958"
---

With recent advances in Text-to-Speech (TTS) systems, synthetic audiobook narration has seen increased interest, reaching unprecedented levels of naturalness. However, large gaps remain in the ability of synthetic narration systems to impersonate fictional characters and convey complex emotions or prosody. A promising direction to enhance character identification is the assignment of a plausible voice to each fictional character in a book. This step typically requires complex inference of character attributes, such as age, gender, origin or physical health, from book-length contexts, which in turn requires dedicated benchmark datasets to evaluate the performance of extraction systems. We present S-VoCAL (Speaking Voice Character Attributes in Literature), the first dataset and evaluation framework dedicated to evaluating the inference of voice-related fictional character attributes. S-VoCAL comprises 8 attributes grounded in sociophonetic studies and 952 character-book pairs derived from Project Gutenberg. Its evaluation framework addresses the particularities of each attribute and includes a novel similarity metric based on recent Large Language Model embeddings. We demonstrate the applicability of S-VoCAL by applying a simple Retrieval-Augmented Generation (RAG) pipeline to the task of inferring character attributes. Our results suggest that the RAG pipeline reliably infers attributes such as Age or Gender, but struggles with others such as Origin or Physical Health.
22 changes: 22 additions & 0 deletions _posts/2026-04-21-ACL-gmichel.md
@@ -0,0 +1,22 @@
---
layout: post
title: "Computational Narrative Understanding for Expressive Text-to-Speech"
date: 2026-04-21 10:00:00 +0200
category: Publication
author: gmichel
readtime: 1
domains:
- NLP
people:
- gmichel
- eepure
publication_type: conference
publication_title: "Computational Narrative Understanding for Expressive Text-to-Speech"
publication_year: 2026
publication_authors: Gaspard Michel, Elena V. Epure, Christophe Cerisara
publication_conference: ACL
publication_code: "https://github.com/deezer/libriquote"
publication_preprint: "https://arxiv.org/pdf/2509.04072"
---

Recent advances in text-to-speech (TTS) have been driven by large, multi-domain speech corpora, yet the expressive potential of audiobook data remains underexamined. We argue that human-narrated audiobooks, particularly fictional works, contain rich and diverse prosodic cues arising from the natural alternation between neutral narration and expressive character dialogue. Building on this observation, we introduce LibriQuote, a large-scale dataset of 5.3K hours of expressive speech drawn from character quotations. Each quote is supplemented with contextual pseudo-labels for speech verbs and adverbs that characterize the intended delivery of direct speech (e.g., "he whispered softly"). We find that fine-tuning a flow-matching model on LibriQuote yields substantial improvements in expressivity and intelligibility, while training an autoregressive TTS model from scratch on it enhances expressiveness. Benchmarking on LibriQuote-test highlights significant variability across systems in generating expressive speech. We publicly release the dataset, code, and evaluation resources to facilitate reproducibility. Audio samples can be found at this [url](https://libriquote.github.io/).
23 changes: 23 additions & 0 deletions _posts/2026-05-28-PrePrint-gmichel.md
@@ -0,0 +1,23 @@
---
layout: post
title: "GraphLit: Learning Text-Enriched Dynamic Character Network Representations for Literary Study"
date: 2026-05-28 10:00:00 +0200
category: Publication
author: gmichel
readtime: 1
domains:
- NLP
people:
- gmichel
- eepure
- rhennequin
publication_type: conference
publication_title: "GraphLit: Learning Text-Enriched Dynamic Character Network Representations for Literary Study"
publication_year: 2026
publication_authors: Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara, Mirella Lapata
publication_conference: Preprint
publication_code: "https://github.com/gasmichel/GraphLit"
publication_preprint: "https://arxiv.org/pdf/2605.28643"
---

Methods that represent literary texts as graphs or sequences of graphs mainly focus on character interactions, and often overlook another crucial aspect: the textual context in which characters interact. We introduce Dynamic Heterogeneous Character Networks (DHCNs), which organize long novels into temporally localized heterogeneous graphs that align characters with their textual contexts. We extract around 20,000 DHCNs from Project Gutenberg, and propose GraphLit, a self-supervised learning framework that learns rich literary representations through a masked graph autoencoder objective. Across 12 diverse character-related tasks, GraphLit improves over text-only and graph-only baselines, particularly on tasks requiring contextual understanding. Finally, we demonstrate the applicability of DHCNs and GraphLit for literary analysis by studying the link between narrative non-linearity and dynamic social features.