Vanya Cohen

I am a PhD student at the University of Texas at Austin (currently on leave) and an AI Scientist at Blackbird.AI, where I created Compass, an agentic fact-checker for multimodal social media content. My PhD advisor is Ray Mooney.

I completed my BSc and MSc at Brown University, advised by Stefanie Tellex and George Konidaris in the Humans to Robots Lab.

Fun facts: I helped create OpenGPT-2, the first publicly downloadable, open-source LLM, and the OpenWebText dataset. My Erdős number is 3.

Email  /  Google Scholar  /  LinkedIn


Research

I'm interested in grounded natural language processing, reinforcement learning, and robotics.


CaT-Bench: Benchmarking Language Model Understanding of Causal and Temporal Dependencies in Plans


Yash Kumar Lal*, Vanya Cohen*, Nathanael Chambers, Niranjan Balasubramanian, Raymond Mooney
EMNLP, 2024
arxiv / dataset / code /

A benchmark that evaluates language models’ ability to reason about step dependencies in task plans using causal and temporal relations. It tests whether LLMs can predict procedure dependencies and explain them. We find that state-of-the-art LLMs perform poorly on this task despite its apparent simplicity.
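For illustration, a dependency query can be sketched as a question over a pair of plan steps with a yes/no answer and an explanation. The field names below are hypothetical, not the benchmark's exact schema:

    # Illustrative dependency query (hypothetical field names, not CaT-Bench's exact schema).
    example_query = {
        "plan": ["Preheat the oven to 180C.", "Mix the batter.", "Bake for 30 minutes."],
        "question": "Must 'Preheat the oven to 180C.' happen before 'Bake for 30 minutes.'?",
        "answer": "yes",
        "explanation": "Baking requires the oven to already be at temperature.",
    }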


A Survey of Robotic Language Grounding: Tradeoffs Between Symbols and Embeddings


Vanya Cohen*, Jason Xinyu Liu*, Raymond Mooney*, Stefanie Tellex*, David Watkins*
IJCAI Survey Track, 2024
arxiv / website /

Robotic language grounding methods can be positioned along an axis from those that map natural language to formal symbolic representations to those that map it to high-dimensional vector spaces. The survey explores the trade-offs in interpretability, generalizability, and scalability across these methods.


CAPE: Corrective Actions from Precondition Errors using Large Language Models


Shreyas Sundara Raman, Vanya Cohen, Ifrah Idrees, Eric Rosen, Raymond Mooney, Stefanie Tellex, David Paulius
ICRA, 2024
arxiv / code / website /

CAPE resolves precondition errors in task planning for robotic agents by leveraging large language models (LLMs). The method re-prompts the LLM with error feedback, allowing robots to take corrective actions in real-world environments. We implement CAPE on a Spot robot and demonstrate improvements over prior LLM planning methods.
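A minimal sketch of this kind of re-prompting loop, assuming a text-completion callable llm and an execute_step function that returns a precondition-error message on failure (both hypothetical interfaces; this is not the authors' implementation):

    # Minimal sketch of re-prompting on precondition errors, in the spirit of CAPE.
    # `llm` and `execute_step` are assumed/hypothetical interfaces.
    def plan_with_corrections(llm, task, execute_step, max_retries=3):
        steps = llm(f"List the steps to accomplish: {task}").splitlines()
        for step in steps:
            for _ in range(max_retries):
                error = execute_step(step)  # None on success, else a precondition-error message
                if error is None:
                    break
                # Re-prompt with the error feedback to obtain a corrective action.
                step = llm(
                    f"While doing '{task}', the step '{step}' failed because: {error}. "
                    "Give a single corrective step that satisfies the missing precondition."
                )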


OpenGPT-2: We Replicated GPT-2 Because You Can Too


Aaron Gokaslan*, Vanya Cohen*, Ellie Pavlick, Stefanie Tellex
NeurIPS Workshop, 2019
article / blog / code /

OpenGPT-2 is a replication of OpenAI’s 1.5B-parameter GPT-2 LLM. It was the first open-source, publicly downloadable LLM. It was trained on our OpenWebText dataset and was followed by several other prominent open-source LLM releases.


Grounding Language Attributes to Objects using Bayesian Eigenobjects


Vanya Cohen*, Benjamin Burchfiel*, Thao Nguyen*, Nakul Gopalan, Stefanie Tellex, George Konidaris
IROS, 2019
arxiv / code / website /

A method for recognizing 3D objects from natural language descriptions and depth images. The system leverages unsupervised learning on 3D object meshes together with a small amount of annotated data, and generalizes to previously unseen viewpoints.


OpenWebText: An Open Source Replication of OpenAI's WebText


Aaron Gokaslan*, Vanya Cohen*, Ellie Pavlick, Stefanie Tellex
NeurIPS Workshop, 2019
code / website /

OpenWebText is a replication of OpenAI’s WebText dataset. It has been downloaded over 4 million times and was used to train our OpenGPT-2 model, as well as several other LLMs.


Design and source code from Jon Barron's website