What’s New?

Apr 2025 Keynote at the NTU IAS Frontiers Conference on AI
Feb 2025 1 paper accepted to CVPR
Jan 2025 4 papers accepted to NAACL
Jan 2025 INCLUDE is accepted to ICLR (spotlight)
Nov 2024 Named ELLIS Scholar
Jul 2024 Talk at ICML Large Language Models and Cognition Workshop
Jan 2024 Talk at AI House in Davos about the Swiss AI Initiative
Dec 2023 The Swiss AI Initiative is launched!
Dec 2023 Talk at EMNLP BlackboxNLP Workshop 2023
Nov 2023 Neuro-Symbolic AI Panel at ISWC 2023
Oct 2023 Talk at Johns Hopkins University
Oct 2023 Talk at University of Maryland
Jan 2023 Panel at Infrarouge
Jan 2023 Talk at IBM Neuro-symbolic AI Workshop
Mar 2022 Talk at EPFL Center for Intelligent Systems
Jan 2022 Talk at IBM Research
Dec 2021 Panel at World Congress of Science & Factual Producers
Nov 2021 Talk at ETH Zurich
Nov 2021 Talk at CIKM Workshop: Knowledge Injection in Neural Networks (KINN)
Nov 2021 Talk at KR Workshop: Knowledge Representation for Hybrid and Compositional AI (KRHCAI)
Sep 2021 Talk at Stanford Graph Learning Workshop
Aug 2021 Talk at IJCAI Workshop: Is Neuro-symbolic SOTA still a myth for NLI? (NSNLI)
Apr 2021 Named to the Forbes 30 under 30 list in Science & Healthcare
Mar 2021 Talk at Microsoft Research
Feb 2021 Talk at AAAI Workshop in Hybrid Artificial Intelligence
Feb 2021 Tutorial on Commonsense Knowledge Acquisition and Representation at AAAI 2021
Nov 2020 Tutorial on Neural Language Generation at EMNLP 2020
Nov 2020 Talk at UCSD Health Informatics Seminar
Nov 2020 Talk at Stanford Cognitive Science Seminar
Jul 2020 Tutorial on Commonsense Knowledge at ACL 2020
Sep 2019 Talk at WeCNLP 2019

Research Interests

Reasoning agents are the next frontier of AI. My research investigates how we can develop AI reasoning agents for the benefit of society, focusing both on designing novel AI reasoning methodologies and on adapting them to applications in health, education, and global fairness. My group’s research draws on methods in natural language processing, deep learning, machine learning, and artificial intelligence to investigate these problems.

Topics that I focus on include:

LLM Representations of Knowledge. Figuring out how to go from an LLM to a reasoning agent requires understanding what LLMs know, how they represent that knowledge, and how they compose that information internally. My research investigates how LLM subnetworks (and other LLM representations) encode discrete forms of knowledge, how those representations can be modified, and how closely they align with measurements of the human brain. [1,2,3,4,5,6]

Reasoning Algorithms. LLMs fail dramatically and unexpectedly when presented with seemingly simple reasoning problems that humans effortlessly solve. We draw on methods from diverse research areas (e.g., symbolic systems, neuroscience, cognitive science, psychology) to devise new methods and frameworks for LLM reasoning. [1,2,3,4,5,6,7]

Large-scale AI Development. LLMs behave differently at small scale compared to large scale. Our work bridges the research gap between these two settings by developing open-source, open-weight, and open-data foundation models. We specifically focus on developing multilingual models trained on compliant data to enable use in diverse regulatory and cultural settings. [1,2,3,4]

AI Democratization. Much like previous generations of AI advancement, AI reasoners will be experienced differently by different groups based on their current digital and AI maturity. Across practice areas in health, education, and global fairness, my group works with end users to develop models, theories, and evaluations that enable responsible development of LLM-based AI. [1,2,3,4]

EPFL NLP Group

Publications

Please see my Google Scholar for an up-to-date list of publications.

(2025). Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs. arXiv.

PDF

(2025). From Language to Cognition: How LLMs Outgrow the Human Language Network. arXiv.

PDF

(2025). VinaBench: Benchmark for Faithful and Consistent Visual Narratives. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

PDF Code Dataset Project

(2025). INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge. Proceedings of the 13th International Conference on Learning Representations (ICLR).

PDF Dataset Project

(2025). A Logical Fallacy-Informed Framework for Argument Generation. Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL).

PDF Code

(2025). PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection. Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL).

PDF Code

(2025). The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units. Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL).

PDF Code

(2025). Efficient Tool Use with Chain-of-Abstraction Reasoning. Proceedings of the 31st International Conference on Computational Linguistics (COLING).

PDF Poster

(2024). Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation. arXiv.

PDF Dataset

(2024). Could ChatGPT get an engineering degree? Evaluating higher education vulnerability to AI assistants. Proceedings of the National Academy of Sciences (PNAS).

PDF

(2024). Discovering Knowledge-Critical Subnetworks in Pretrained Language Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF

(2024). Let Me Teach You: Pedagogical Foundations of Feedback for Language Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF

(2024). "Flex Tape Can't Fix That": Bias and Misinformation in Edited Language Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF

(2024). Instruction-tuning Aligns LLMs to the Human Brain. Conference on Language Modeling (COLM).

PDF

(2024). Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning. Findings of EMNLP.

PDF Code Project

(2024). DiffuCOMET: Contextual Commonsense Knowledge Diffusion. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code

(2024). Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code

(2024). Exploring Defeasibility in Causal Reasoning. Findings of the ACL.

PDF

(2024). Enhancing Procedural Writing Through Personalized Example Retrieval: A Case Study on Cooking Recipes. International Journal of Artificial Intelligence in Education (IJAIED).

PDF

(2024). A Design Space for Intelligent and Interactive Writing Assistants. Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI).

PDF Code Project

(2024). ConVQG: Contrastive Visual Question Generation with Multimodal Guidance. Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI).

PDF Project

(2024). Course Recommender Systems Need to Consider the Job Market. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR).

PDF

(2024). ConGeo: Robust Cross-view Geo-localization across Ground View Variations. Proceedings of the European Conference on Computer Vision (ECCV).

PDF Code Project

(2024). Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network. arXiv.

PDF

(2024). ComperDial: Commonsense Persona-grounded Dialogue Dataset and Benchmark. arXiv.

PDF

(2024). Evaluating Language Model Agency through Negotiations. arXiv.

PDF Code Dataset Project

(2024). Improving Autoformalization using Type Checking. arXiv.

PDF

(2024). Large Language Models are Catalyzing Chemistry Education. ChemRxiv.

PDF

(2024). REFINER: Reasoning Feedback on Intermediate Representations. Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL).

PDF Code

(2024). JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching. NLP4HR Workshop - EACL.

PDF

(2024). Rethinking Skill Extraction in the Job Market Domain using Large Language Models. NLP4HR Workshop - EACL.

PDF

(2023). MEDITRON-70B: Scaling Medical Pretraining for Large Language Models. arXiv.

PDF Code Project

(2023). RECKONING: Reasoning through Dynamic Knowledge Encoding. Neural Information Processing Systems (NeurIPS).

PDF Code

(2023). CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code Dataset Project

(2023). CRAB: Assessing the Strength of Causal Relationships Between Real-world Events. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code

(2023). Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code

(2023). Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention. Findings of EMNLP.

PDF Code

(2023). CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering. Findings of EMNLP.

PDF Code

(2023). PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL). Outstanding Paper Award.

PDF Code Dataset Video

(2023). Mitigating Label Biases for In-context Learning. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code Video

(2023). DISCO: Distilling Counterfactuals with Large Language Models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code Dataset

(2023). kogito: A Commonsense Knowledge Inference Toolkit. Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL) - Systems Demonstrations.

PDF Code Video

(2022). ComFact: A Benchmark for Linking Contextual Commonsense Knowledge. Findings of EMNLP.

PDF Code Dataset Video

(2022). Discovering Language-neutral Sub-networks in Multilingual Language Models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code Poster Video

(2022). Conditional set generation using Seq2seq models. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF

(2022). Deep Bidirectional Language-Knowledge Graph Pretraining. Advances in Neural Information Processing Systems (NeurIPS).

PDF Code

(2022). Memory-Based Model Editing at Scale. Proceedings of the 39th International Conference on Machine Learning (ICML).

PDF Code Project Video

(2022). GreaseLM: Graph REASoning Enhanced Language Models for Question Answering. Proceedings of the 10th International Conference on Learning Representations (ICLR). Spotlight (Top 5%).

PDF Code Video

(2022). Fast Model Editing at Scale. Proceedings of the 10th International Conference on Learning Representations (ICLR).

PDF Code Project

(2022). End-to-End Task-Oriented Dialog Modeling with Semi-Structured Knowledge Management. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).

PDF

(2022). Synthetic Disinformation Attacks on Automated Fact Verification Systems. Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI).

PDF Code Video

(2021). Conversational Multi-Hop Reasoning with Neural Commonsense Knowledge and Symbolic Logic Rules. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF

(2021). Analyzing Commonsense Emergence in Few-shot Knowledge Models. Proceedings of the 3rd Conference on Automated Knowledge Base Construction (AKBC).

PDF Code

(2021). On the Opportunities and Risks of Foundation Models. arXiv.

PDF

(2021). Edited Media Understanding Frames: Reasoning About the Intents and Implications of Visual Disinformation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Dataset

(2021). On-the-Fly Attention Modulation for Neural Generation. Findings of the ACL.

PDF

(2021). QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering. Proceedings of the 18th Meeting of the North American Association for Computational Linguistics (NAACL).

PDF Code Project

(2021). I'm Not Mad: Commonsense Implications of Negation and Contradiction. Proceedings of the 18th Meeting of the North American Association for Computational Linguistics (NAACL).

PDF Code Dataset

(2021). Discourse Understanding and Factual Consistency in Abstractive Summarization. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL).

PDF

(2021). (Comet-)Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs. Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI).

PDF Code

(2021). Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering. Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI).

PDF

(2020). Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code Slides Video

(2020). Procedural Reading Comprehension with Attribute-Aware Context Flow. Proceedings of the 2nd Conference on Automated Knowledge Base Construction (AKBC). Best Paper Runner-up.

PDF Video

(2020). Commonsense Knowledge Base Completion with Structural and Semantic Context. Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI).

PDF

(2019). COMET: Commonsense Transformers for Automatic Knowledge Graph Construction. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code Poster Video

(2019). Efficient Adaptation of Pretrained Transformers for Abstractive Summarization. arXiv.

PDF Code

(2019). Counterfactual Story Reasoning and Generation. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code Dataset

(2019). Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Dataset Project

(2019). WIQA: A dataset for "What if..." reasoning over procedural text. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Code Dataset Project

(2019). Be Consistent! Improving Procedural Text Comprehension using Label Consistency. Proceedings of the 17th Annual Meeting of the North American Association for Computational Linguistics (NAACL).

PDF Dataset Project

(2018). Simulating Action Dynamics with Neural Process Networks. Proceedings of the 6th International Conference on Learning Representations (ICLR).

PDF Dataset Poster Video

(2018). Discourse-Aware Neural Rewards for Coherent Text Generation. Proceedings of the 16th Annual Meeting of the North American Association for Computational Linguistics (NAACL).

PDF Dataset Poster

(2018). Deep Communicating Agents for Abstractive Summarization. Proceedings of the 16th Annual Meeting of the North American Association for Computational Linguistics (NAACL).

PDF Project Poster

(2018). Modeling Naive Psychology of Characters in Simple Commonsense Stories. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code Dataset Project Slides

(2018). Learning to Write with Cooperative Discriminators. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Code Project Poster

(2018). Reasoning about Actions and State Changes by Injecting Commonsense Knowledge. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

PDF Dataset Project

(2016). Learning Prototypical Event Structure from Photo Albums. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL).

PDF Dataset Project

Media

My Research in The News

L’AGEFI. L’industrie pharma croit au potentiel de l’intelligence artificielle à tous les niveaux (Dec 2023)

GGB. Meditron, EPFL’s new Large Language Model for medical knowledge (Dec 2023)

ICT Journal. Né à l’EPFL: un LLM open source spécialisé dans le domaine médical (Dec 2023)

RTS CQFD. EPFL: Meditron (Dec 2023)

Communications of the ACM. Seeking Artificial Common Sense (Nov 2020)

The Atlantic. The Easy Questions that Stump Computers (May 2020)

Quanta Magazine. Common Sense Comes Closer to Computers (April 2020)

New York Academy of Sciences. Can Researchers Create Commonsense Artificial Intelligence? (June 2019)

The Gradient. NLP’s generalization problem, and how researchers are tackling it (August 2018)

NLP Highlights Podcast. 54 - Simulating Action Dynamics with Neural Process Networks, with Antoine Bosselut (March 2018)

My Thoughts in the News

Le Temps. Le superordinateur suisse Alps, monstre de puissance de classe mondiale, commence ses activités (Sept 2024)

Blick. Comment la Suisse veut instaurer la confiance en l’IA (Jan 2024)

Le Temps. Un super-ordinateur suisse dédié à l’IA (Dec 2023)

Corriere del Ticino. Ma davvero ChatGPT sta acquisendo tratti sempre più simili ai nostri? (Oct 2023)

Mirage News. Making AI work for everyone (Sept 2023)

RTS Forum. Les IA peuvent-elles comprendre l’humour? (May 2023)

RTS Infrarouge. Intelligence artificielle: le grand remplacement? (Jan 2023)

Tribune de Genève. Intelligence artificielle: Profession? Journaliste sportif virtuel (Jan 2023)

Heidi.news. ChatGPT facilite la triche: et si c’était une bonne nouvelle? (Jan 2023)

Communications of the ACM. The Best of NLP (April 2021)

Joining EPFL NLP

If you’re interested in joining the EPFL NLP group, please read the following:

I am…

Looking for a postdoctoral position: Feel free to contact me about potential postdoctoral positions. Also, check out these opportunities for fully funded postdoctoral positions that I can be a co-advisor on:
Horizon Europe Swiss Postdoctoral Fellowships
EPFLeaders4impact Postdoctoral Fellowships
Applying to the EPFL EDIC PhD program: I will be taking on new PhD students next year! Apply if you’re interested in joining EPFL to work with me. Before you can be considered for the NLP lab, however, you will have to be admitted to the EDIC program, which handles admissions centrally. Feel free to let me know if you apply, but unfortunately I can’t conduct pre-screenings until applications are in.
An EDIC fellow: I’m happy to supervise rotations provided our research interests align and there’s a good chance that the rotation will lead to a permanent position in the lab.
An EPFL Master’s student: I’m happy to supervise Master’s projects and theses every semester! If you’re interested in doing a project with EPFL NLP, send an e-mail to:
nlp-projects-apply@groupes.epfl.ch
Please attach your CV and transcript and include [Masters Project] or [Masters Thesis] in your subject heading. If you want a sense of what a project in our lab would be about, check out my research interests above or those of my lab members! If you would like to complete an industry PDM, please follow the guidelines presented here.
Looking for a summer internship: If you are a Bachelor’s or Master’s student at another university, please apply through the Summer@EPFL program. If you are looking for a PhD internship, contact me directly.