Publications | Francis Kulumba

2026

HALvest-Contrastive: Retrieval-Like Authorship Attribution with Patch-Level Late Interaction

Francis Kulumba, Wissam Antoun, Guillaume Vimont, and 2 more authors

2026

Deciding whether two pieces of text share an author is made difficult by topical confound: two writers covering the same topic often look more alike than one writer covering two topics. We tackle this with HALvest, a 17-billion-token multilingual corpus of open-access scholarly papers, and its English contrastive derivative HALvest-Contrastive, in which same-author passages are drawn from distinct papers within a field to minimize topical overlap. We also revisit how documents are compared. Authorship systems traditionally compress each document into a single vector; we instead keep a sequence of vectors and compare them with late interaction, then introduce Patch-Level Late Interaction (PLI), which compresses neighboring tokens into patches before matching. Matching at the sequence level greatly improves performance over the single-vector baseline, but the optimal interaction granularity is subtle.
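
The three comparison schemes named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the mean-pooling patch rule, the `patch_size` parameter, and the averaged MaxSim score are all assumptions chosen for illustration, and token embeddings are represented as plain lists of floats.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def to_patches(tokens, patch_size):
    """Hypothetical patch rule: average each run of `patch_size` neighboring tokens."""
    return [mean_pool(tokens[i:i + patch_size])
            for i in range(0, len(tokens), patch_size)]

def single_vector_score(a, b):
    """Baseline: compress each document to one vector, then take the cosine."""
    return dot(normalize(mean_pool(a)), normalize(mean_pool(b)))

def late_interaction_score(a, b):
    """MaxSim-style score: best cosine match in b for each vector of a, averaged."""
    a_n = [normalize(v) for v in a]
    b_n = [normalize(v) for v in b]
    return sum(max(dot(u, v) for v in b_n) for u in a_n) / len(a_n)

def patch_level_score(a, b, patch_size=2):
    """Late interaction applied to patches instead of individual tokens."""
    return late_interaction_score(to_patches(a, patch_size),
                                  to_patches(b, patch_size))
```

With `patch_size=1` the patch-level score reduces to token-level late interaction, and with a patch covering the whole document it reduces to the single-vector baseline, which is one way to read the "interaction granularity" mentioned above.
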
@misc{kulumba_halvest_2026,
  title = {HALvest-Contrastive: Retrieval-Like Authorship Attribution with Patch-Level Late Interaction},
  author = {Kulumba, Francis and Antoun, Wissam and Vimont, Guillaume and Romary, Laurent and Cafiero, Florian},
  year = {2026},
  archiveprefix = {arXiv},
  primaryclass = {cs.DL},
  url = {https://arxiv.org/abs/2407.20595},
}

Language Triggers Hijack Language Circuits: A Mechanistic Analysis of Backdoor Behaviors in Large Language Models

Théo Lasnier, Wissam Antoun, Francis Kulumba, and 2 more authors

In Mechanistic Interpretability Workshop at ICML 2026, 2026

Backdoor attacks pose significant security risks for Large Language Models (LLMs), yet the internal mechanisms by which triggers operate remain poorly understood. We present the first mechanistic analysis of language-switching backdoors, studying the GAPperon model family (1B, 8B, 24B parameters) which contains triggers injected during pretraining that cause output language switching. Using activation patching, we localize trigger formation to early layers (7.5-25% of model depth) and identify which attention heads process trigger information. Our central finding is that trigger-activated heads substantially overlap with heads naturally encoding output language across model scales, with Jaccard indices between 0.18 and 0.66 over the top heads identified. This suggests that backdoor triggers do not form isolated circuits but instead co-opt the model’s existing language components. These findings have implications for backdoor defense: detection methods may benefit from monitoring known functional components rather than searching for hidden circuits, and mitigation strategies could potentially leverage this entanglement between injected and natural behaviors.
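
The overlap measure quoted above, the Jaccard index, is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the `(layer, head)` pairs are made-up placeholders, not heads reported in the paper, and the function name is an assumption.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical (layer, head) indices, for illustration only.
trigger_heads = {(3, 1), (4, 7), (5, 2)}
language_heads = {(4, 7), (5, 2), (9, 0), (10, 4)}

overlap = jaccard(trigger_heads, language_heads)  # 2 shared heads out of 5 distinct
```

A value of 0 means the two head sets are disjoint and 1 means they are identical, so the reported range of 0.18 to 0.66 corresponds to partial but substantial sharing.
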
@inproceedings{lasnier_triggers_2026,
  title = {Language Triggers Hijack Language Circuits: A Mechanistic Analysis of Backdoor Behaviors in Large Language Models},
  author = {Lasnier, Th{\'e}o and Antoun, Wissam and Kulumba, Francis and Sagot, Beno{\^i}t and Seddah, Djam{\'e}},
  booktitle = {Mechanistic Interpretability Workshop at ICML 2026},
  year = {2026},
  url = {https://openreview.net/forum?id=OitZlTv7xu},
}

Language-Switching Triggers Take a Latent Detour Through Language Models

Francis Kulumba, Wissam Antoun, Théo Lasnier, and 2 more authors

2026

Backdoor attacks on language models pose a growing security concern, yet the internal mechanisms by which a trigger sequence hijacks model computations remain poorly understood. We identify a circuit underlying a language-switching backdoor in an 8B-parameter autoregressive language model, where a three-word Latin trigger (nine tokens) redirects English output to French. We decompose the circuit into three phases: (1) distributed attention heads at early layers compose the trigger tokens into the last sequence position; (2) the resulting signal propagates through mid-layers in a subspace orthogonal to the model’s natural language-identity direction; (3) the MLP at the final layer converts this latent signal into French logits. The entire circuit flows through a serial bottleneck at a single position: corrupting that position at any layer entirely mitigates the trigger but also hinders the model’s capabilities. The orthogonal latent encoding suggests that defenses that search for language-like signals in intermediate representations would miss this trigger entirely.
@misc{kulumba_language_2026,
  title = {Language-Switching Triggers Take a Latent Detour Through Language Models},
  author = {Kulumba, Francis and Antoun, Wissam and Lasnier, Théo and Sagot, Benoît and Seddah, Djamé},
  year = {2026},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url = {https://arxiv.org/abs/2605.18646},
}

Where Does Authorship Signal Emerge in Encoder-Based Language Models?

Francis Kulumba, Guillaume Vimont, Laurent Romary, and 1 more author

2026

Authorship attribution models fine-tuned with the same pretrained encoder, data, and loss can differ four-fold in performance depending only on their scoring mechanism. We use mechanistic interpretability tools to explain this gap. Stylistic features such as word length, punctuation density, and function-word frequency are equally available at every layer in every model, including in an off-the-shelf control encoder, so the gap does not come from representation quality. Instead, causal intervention shows that the scorer determines where the encoder consolidates authorship signal. Mean pooling forces consolidation by early to mid layers, while late interaction defers it to later layers. We further derive this difference from the gradient structure of each scorer, and training dynamics reveal distinct learning trajectories that follow from that difference.
@misc{kulumba_does_2026,
  title = {Where Does Authorship Signal Emerge in Encoder-Based Language Models?},
  author = {Kulumba, Francis and Vimont, Guillaume and Romary, Laurent and Cafiero, Florian},
  year = {2026},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url = {https://arxiv.org/abs/2605.19908},
}

2024

CamemBERT 2.0: A Smarter French Language Model Aged to Perfection

Wissam Antoun, Francis Kulumba, Rian Touchent, and 3 more authors

2024

French language models, such as CamemBERT, have been widely adopted across industries for natural language processing (NLP) tasks, with models like CamemBERT seeing over 4 million downloads per month. However, these models face challenges due to temporal concept drift, where outdated training data leads to a decline in performance, especially when encountering new topics and terminology. This issue emphasizes the need for updated models that reflect current linguistic trends. In this paper, we introduce two new versions of the CamemBERT base model, CamemBERTav2 and CamemBERTv2, designed to address these challenges. CamemBERTav2 is based on the DeBERTaV3 architecture and makes use of the Replaced Token Detection (RTD) objective for better contextual understanding, while CamemBERTv2 is built on RoBERTa, which uses the Masked Language Modeling (MLM) objective. Both models are trained on a significantly larger and more recent dataset with longer context length and an updated tokenizer that enhances tokenization performance for French. We evaluate the performance of these models on both general-domain NLP tasks and domain-specific applications, such as medical field tasks, demonstrating their versatility and effectiveness across a range of use cases. Our results show that these updated models vastly outperform their predecessors, making them valuable tools for modern NLP systems. All our new models, as well as intermediate checkpoints, are made openly available on Hugging Face.
@misc{antoun_camembert20_2024,
  title = {CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
  author = {Antoun, Wissam and Kulumba, Francis and Touchent, Rian and de la Clergerie, Éric and Sagot, Benoît and Seddah, Djamé},
  year = {2024},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL},
  url = {https://arxiv.org/abs/2411.08868},
}