I am a researcher in computer science, specializing in databases and human-AI interaction. In 2027, I will join Carnegie Mellon University as an Assistant Professor in the Computer Science Department (CSD) and, by courtesy, in the Human-Computer Interaction Institute (HCII).
My research goal is to build the "full stack" (i.e., systems and interfaces) for data-intensive knowledge work. We take an applications-oriented approach: we build open-source software, deploy with real users, and use what we learn to advance research in systems, interfaces, and user behavior.
Current Research Projects
- Efficient & accurate AI-powered data processing
- AI agent observability
- Async interfaces for human-AI collaboration
I am close to completing my PhD in EECS at UC Berkeley, in the Data Systems and Foundations group, advised by Aditya Parameswaran. In my PhD, I designed and built the DocETL and DocWrangler ecosystem for scalable LLM-powered data processing. I also wrote several papers, a course, and a book on evaluating LLM-powered applications. Our work has had real-world impact in (1) databases, e.g., Snowflake, BigQuery; (2) AI tooling, e.g., LangChain, ChromaDB, OpenAI; and (3) society, e.g., DocETL powers a tool helping California public defenders challenge wrongful convictions under the Racial Justice Act.
Academic Service
Reviewer: VLDB (2027–), UIST (2024–), CHI (2024–), NeurIPS (2021, 2022)
Organizer: DEEM Workshop at SIGMOD (2023–2025)
Current Mentees
- Andrew Cheng (undergrad)
- Sasha Singh (undergrad)
Past Mentees
- Parth Asawa (undergrad → PhD student @ Berkeley; CRA Undergraduate Award Honorable Mention)
- Ruiqi Chen (MS → PhD student @ University of Michigan CSE)
- Ankush Garg (MS → Senior Data Scientist @ Clarkson Consulting)
- Rachel Lin (undergrad, MS → Software Engineer @ Opto)
- Aditi Mahajan (undergrad → Google)
- Nikhil & Vinay Rao (high school → undergrads @ UC Berkeley EECS)
- Quentin Romero Lauro (undergrad → CEO @ Inspector, YC 2025; CRA Undergraduate Award Winner)
- Reya Vir (undergrad → PhD student @ Columbia; NSF GRFP recipient)
- Yujie Wang (undergrad → Google)
- Lindsey Wei (undergrad → PhD student @ UC Berkeley EECS; CRA Undergraduate Award Honorable Mention)
Publications
- Can AI Agents Answer Your Data Questions? A Benchmark for Data Agents. Preprint. Co-first author is my mentee.
- Multi-Objective Agentic Rewrites for Unstructured Data Processing. Under revision at VLDB 2026. Co-first author is my mentee.
- Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees. Under revision at VLDB 2026.
- Task Cascades for Efficient Unstructured Data Processing. To appear at SIGMOD 2026.
- Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees. To appear at SIGMOD 2026.
- RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines. CHI 2026. 🏆 Best Paper. Co-first author is my mentee.
- Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First. CIDR 2026.
- Steering Semantic Data Processing with DocWrangler. UIST 2025. 🏆 Best Paper Honorable Mention.
- Rethinking Dataset Discovery with DataScout. UIST 2025. Co-first author is my mentee.
- DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing. VLDB 2025.
- LLM-Powered Proactive Data Systems. IEEE Data Engineering Bulletin 2025.
- Querying Templatized Document Collections with Large Language Models. ICDE 2025.
- PromptEvals: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines. NAACL 2025. 🏆 Selected for Oral Presentation. Co-first author is my mentee.
- Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences. UIST 2024.
- SPADE: Synthesizing Data Quality Assertions for Large Language Model Pipelines. VLDB 2024.
- What We've Learned From a Year of Building with LLMs. O'Reilly Radar.
- Building Reactive Large Language Model Pipelines with Motion. SIGMOD 2024 (Demo).
- It Took Longer Than I Was Expecting: Why Is Dataset Search Still So Hard? HILDA 2024 (Workshop on Human-in-the-Loop Data Analytics).
- Revisiting Prompt Engineering via Declarative Crowdsourcing. CIDR 2024.
- Operationalizing Machine Learning: An Interview Study. CSCW 2024.
- Towards Observability for Production Machine Learning Pipelines. VLDB 2023.
- Bolt-on, Compact, and Rapid Program Slicing for Notebooks. VLDB 2023.
- Automatic and Precise Data Validation for Machine Learning. CIKM 2023.
- Rethinking Streaming Machine Learning Evaluation. ICLR 2022: Workshop on ML Evaluation Standards.
- Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming. NeurIPS 2020.
- Adversarial examples that fool both computer vision and time-limited humans. NIPS 2018.
- No classification without representation: Assessing geodiversity issues in open data sets for the developing world. NIPS 2017: Workshop on Machine Learning for the Developing World.