
Valentina Pyatkin

Postdoctoral Researcher
Allen Institute for AI
University of Washington


News

  • Mar. 2025: Invited talk at Stanford.
  • Will be serving as Tutorial Chair of EMNLP 2025.
  • Dec. 2024: Co-organizing the SoLaR workshop on Socially Responsible Language Modelling Research at NeurIPS 2024.
  • Oct. 2024: I'm attending COLM and will also be a mentor at the MLR at Penn workshop!
  • Aug. 2024: 2 paper awards at ACL 2024!
  • Jun. 2024: Serving as Internal Communication Chair for ACL 2024.
  • Apr. 2024: Invited talk at the UMass NLP Seminar.
  • Mar. 2024: Invited talk at the University of Edinburgh.
  • Mar. 2024: Co-organized the UnImplicit workshop at EACL-2024.
  • Mar. 2024: Invited talk at the Harvard Efficient ML Seminar.
  • Mar. 2024: DAAD-sponsored visit to the University of Saarbrücken, the Max Planck Institute for Software Systems and the University of Stuttgart.
  • Feb. 2024: Invited talk at the UBC NLP group.
  • Dec. 2023: Invited talk at Brown/Tübingen.
  • Sep. 2023: Gave an invited talk at the KR 2023 workshop on Computational Machine Ethics.
  • Jan. 2023: Invited talk at the UT Austin Seminar on "Social Implications and Impact of NLP".
  • Jul. 2022: Co-organized the UnImplicit workshop at NAACL-2022.

I am on the academic job market for faculty positions! Feel free to reach out if you have an opening in your department.

Bio

I am a postdoctoral researcher (and Young Investigator) at the Allen Institute for AI and the University of Washington, advised by Prof. Yejin Choi. I completed my PhD in Computer Science at the NLP lab of Bar Ilan University, supervised by Prof. Ido Dagan and Prof. Reut Tsarfaty. I was also a visiting PhD student at UW NLP and had the pleasure of interning twice at the Allen Institute for AI. My work has been awarded an ACL Outstanding Paper Award and the ACL Best Theme Paper Award. I am also very honored to have received the AI2 Outstanding Intern of the Year Award. Previously I did a research internship at Google, obtained an MSc from the University of Edinburgh, and a BA from the University of Zurich. My work has been featured in the press, for example by TechCrunch and GeekWire.


Research

My research focuses on Post-Training and the Adaptation of Language Models, for example to make them better semantic and pragmatic reasoners. In the past, I worked on question generation, natural language representations, and discourse. I am also interested in underspecified, ambiguous and implicit language and in teaching language models how to better deal with such phenomena. More specifically, my research is centered around:

  • Post-Training and LM Adaptation: Finding optimal recipes for LM post-training, from generating (synthetic) preference data to developing reinforcement learning algorithms.
    • I am a core contributor to the Tulu and Open-Instruct projects, where I develop post-training pipelines consisting of supervised finetuning, direct preference optimization, and reinforcement learning with verifiable rewards.
    • I have worked on the open science of language models by contributing to OLMo and OLMo 2.
  • Natural Language Understanding: Improving LMs' semantic reasoning across a broader discourse and making them robust for handling ambiguous and underspecified inputs.
  • Critical Evaluation: Evaluating Reward Models (RewardBench), LMs' values and LMs' abilities to draw pragmatic inferences.
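As a rough illustration of the preference-optimization step in such post-training pipelines, below is a minimal sketch of the DPO loss for a single preference pair. The function name and scalar log-probabilities are illustrative assumptions for exposition, not code from the Tulu or Open-Instruct repositories, which operate on batched, token-level quantities.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trained policy and the frozen reference model.
    """
    # Implicit rewards: how much the policy has moved away from the
    # reference on each response, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(margin): small when the policy prefers the chosen response.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When policy and reference agree exactly, the margin is 0 and the
# loss is log 2 ≈ 0.6931.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))
```

Raising the policy's log-probability of the chosen response (or lowering it on the rejected one) widens the margin and drives the loss below log 2, which is the gradient signal preference tuning relies on.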

Awards


Publications

Below is a selection of my recent publications; for my full publication record, please see my Google Scholar page.

2025


IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance

Paul Röttger, Musashi Hinck, Valentin Hofmann, Kobi Hackenburg, Valentina Pyatkin, Faeze Brahman, Dirk Hovy.
📄 Paper

2 OLMo 2 Furious

Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A Smith, Hannaneh Hajishirzi.
📄 Paper

2024


TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Valentina Pyatkin*, Nathan Lambert*, Jacob Morrison*, Shengyi Huang*, Hamish Ivison*, Faeze Brahman*, Lester James V Miranda*, Alisa Liu, Nouha Dziri, Xinxi Lyu, Yuling Gu, Saumya Malik, Victoria Graf, Jena D Hwang, Jiangjiang Yang, Ronan Le Bras, Oyvind Tafjord, Chris Wilhelm, Luca Soldaini, Noah A Smith, Yizhong Wang, Pradeep Dasigi, Hannaneh Hajishirzi.
📄 Paper

Diverging Preferences: When do Annotators Disagree and do Models Know?

Michael J.Q. Zhang, Zhilin Wang, Jena D. Hwang, Yi Dong, Olivier Delalleau, Yejin Choi, Eunsol Choi, Xiang Ren, Valentina Pyatkin.
📄 Paper

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, Sydney Levine.
📄 Paper

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Lester James V. Miranda*, Yizhong Wang*, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hanna Hajishirzi, Pradeep Dasigi.
📄 Paper

Superlatives in Context: Modeling the Implicit Semantics of Superlatives

Valentina Pyatkin, Bonnie Webber, Ido Dagan, Reut Tsarfaty.
🎓 In: NAACL 2025 | 📄 Paper

Explicating the Implicit: Argument Detection Beyond Sentence Boundaries

Paul Roit, Aviv Slobodkin, Eran Hirsch, Arie Cattan, Ayal Klein, Valentina Pyatkin, Ido Dagan.
🎓 In: ACL 2024 | 📄 Paper

Self-Directed Synthetic Dialogues and Revisions Technical Report

Nathan Lambert, Hailey Schoelkopf, Aaron Gokaslan, Luca Soldaini, Valentina Pyatkin, Louis Castricato.
📄 Paper

The Art of Saying No: Contextual Noncompliance in Language Models

Faeze Brahman*, Sachin Kumar*, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi.
🎓 In: NeurIPS 2024 | 📄 Paper

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A Smith, Yejin Choi, Hannaneh Hajishirzi.
🎓 In: NeurIPS 2024 | 📄 Paper

WILDBENCH: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi
🎓 In: ICLR 2025 | 📄 Paper

RewardBench: Evaluating Reward Models for Language Modeling

Nathan Lambert, Valentina Pyatkin, Jacob Morrison, LJ Miranda, Bill Yuchen Lin, Khyathi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi.
🎓 In: NAACL Findings 2025 | 📄 Paper

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

Paul Röttger*, Valentin Hofmann*, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy.
🎓 In: ACL 2024 | 📄 Paper
⭐Outstanding Paper Award⭐

OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A Smith, Hannaneh Hajishirzi.
🎓 In: ACL 2024 | 📄 Paper
⭐Best Theme Paper Award⭐

Promptly Predicting Structures: The Return of Inference

Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar.
🎓 In: NAACL 2024 | 📄 Paper

2023


Camels in a Changing Climate: Enhancing LM Adaptation with TÜLU 2

Hamish Ivison*, Yizhong Wang*, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi.
📄 Paper

“You Are An Expert Linguistic Annotator”: Limits of LLMs as Analyzers of Abstract Meaning Representation

Allyson Ettinger, Jena D Hwang, Valentina Pyatkin, Chandra Bhagavatula, Yejin Choi.
🎓 In: EMNLP Findings | 📄 Paper

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yuling Gu, Niket Tandon, Nouha Dziri, Faeze Brahman, Yejin Choi.
🎓 In: EMNLP Findings | 📄 Paper

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, Xiang Ren.
🎓 In: ICLR | 📄 Paper

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, Yejin Choi.
🎓 In: AAAI | 📄 Paper

PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi.
🎓 In: ICLR | 📄 Paper

Retrieving Texts based on Abstract Descriptions

Shauli Ravfogel, Valentina Pyatkin, Amir DN Cohen, Avshalom Manevich, Yoav Goldberg.
🎓 In: COLM 2024 | 📄 Paper

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

Valentina Pyatkin, Frances Yung, Merel C.J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg.
🎓 In: TACL | 📄 Paper

ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

Valentina Pyatkin, Jena D. Hwang, Vivek Srikumar, Ximing Lu, Liwei Jiang, Yejin Choi and Chandra Bhagavatula.
🎓 In: ACL | 📄 Paper

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, Ido Dagan.
🎓 In: ACL Findings | 📄 Paper

2022


Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

Yuling Gu, Yao Fu, Valentina Pyatkin, Ian H. Magnusson, Bhavana Dalvi, Peter Clark.
🎓 In: Proceedings of the Workshop on Figurative Language Processing at EMNLP 2022 | 📄 Paper

QASem Parsing: Text-to-text Modeling of QA-based Semantics

Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan.
🎓 In: EMNLP | 📄 Paper

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

Merel C.J. Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg.
🎓 In: LREC | 📄 Paper

Draw Me a Flower: Grounding Formal Abstract Structures Stated in Informal Natural Language

Royi Lachmy, Valentina Pyatkin, Avshalom Manevich, Reut Tsarfaty.
🎓 In: TACL | 📄 Paper

2021


Asking It All: Generating Contextualized Questions for any Semantic Role

Valentina Pyatkin*, Paul Roit*, Julian Michael, Reut Tsarfaty, Yoav Goldberg, Ido Dagan.
🎓 In: EMNLP | 📄 Paper

The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing

Valentina Pyatkin*, Shoval Sadde*, Aynat Rubinstein, Paul Portner, Reut Tsarfaty.
🎓 In: ACL | 📄 Paper

2020


QADiscourse - Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

Valentina Pyatkin, Ayal Klein, Reut Tsarfaty, Ido Dagan.
🎓 In: EMNLP | 📄 Paper

QA-Nom: Question-Answer driven SRL for Nominalizations

Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan.
🎓 In: COLING | 📄 Paper

2017


Discourse Relations and Conjoined VPs: Automated Sense Recognition

Valentina Pyatkin, Bonnie Webber.
🎓 In: EACL SRW 2017 | 📄 Paper

*: Equal contribution.


Misc

Besides this I love rowing (currently at Lake Washington Rowing Club) and going to the “cinémathèque”. I think that Italian Neorealism produced some of the most beautiful movies. My Erdős number is 3 (Paul Erdős → Noga Alon → Ido Dagan → Me) and my Kevin Knight number is 2 (Kevin Knight → Yejin Choi → Me).