🚀 Exciting News (March 2025): I'm co-teaching a course on 📊 AI Evals For Engineers in May! This hands-on course for engineers will feature interactive lectures and real homework assignments.

About Me

Shreya Shankar

My dog Papaya 🐕 and me on a hike 🥾

I'm Shreya Shankar, a fourth-year PhD student at UC Berkeley in the EECS department. I am in the data systems and foundations group, advised by Dr. Aditya Parameswaran and supported by the NDSEG Fellowship. Go Bears! 🐻

As of Spring 2025, I am also a visiting student researcher in the Systems Research @ Google group. We are exploring cost models for semantic data processing system optimizers.

Prior to my PhD, I worked as an ML engineer in industry. I completed my BS and MS in computer science at Stanford. Go Trees! 🌲

🔬 Research Interests

I build agentic systems to help people work with data. I am broadly interested in data systems and human-computer interaction research. I also have a side interest in educating engineers in industry about data systems and ML engineering practices. I am fortunate that several of my research projects have been deployed in production at major tech companies and startups.

📝 Bio (for speaking engagements, etc.)

Shreya Shankar is a PhD student in computer science at UC Berkeley. Her research creates practical tools and frameworks that help people build reliable ML systems, with recent work on declarative interfaces and optimization for complex unstructured data analysis.

Shreya is advised by Dr. Aditya Parameswaran. Her work appears in top data management and HCI venues like SIGMOD, VLDB, CIDR, CSCW and UIST, and she co-organizes the DEEM workshop at SIGMOD. She is supported by the NDSEG Fellowship. Prior to Berkeley, she worked as an ML engineer after completing her B.S. in computer science at Stanford University. In her free time, she enjoys roasting coffee and is actively trying to reduce her Twitter usage.

📰 News and Industry Impact

Recent News

  • [March 2025] I'm teaching a course on LLM Evals in May: AI Evals For Engineers. It will be a hands-on, interactive course with real homework assignments and collaborative learning sessions!
  • [Jan 2025] We released DocWrangler, an IDE for writing DocETL pipelines! Read more about it in our blog post and access DocWrangler here.
  • [Oct 2024] New preprint out for DocETL, on our agentic query optimizer! Read it here.

👨‍🏫 Mentorship

I am fortunate to work with many talented students at UC Berkeley. Below is a list of students I am currently mentoring or have mentored for a year or more.

Current Students

  • Quentin Romero Lauro (University of Pittsburgh undergraduate, REU at UC Berkeley) - Developing an interactive debugging tool for RAG pipelines. First-author paper under submission.
  • Rachel Lin (UC Berkeley master's student) - Developing interfaces for iterative dataset search with LLMs; co-mentored with Madelon Hulsebos. First-author paper under submission.

Past Students

  • Reya Vir (UC Berkeley undergraduate) - Built a benchmark for synthesizing data quality constraints for LLM applications. Co-first-authored a publication at NAACL. Will pursue a PhD at Columbia University, with support from the NSF GRFP.
  • Ankush Garg (UC Berkeley master's student) - Built SCIPE, a debugging tool for complex chains and graphs of LLM calls. Deployed SCIPE with LangChain!
  • Parth Asawa (former UC Berkeley undergraduate) - Worked on data quality constraints for LLM applications and declarative LLM workflows. Co-authored two publications at CIDR and VLDB. Now pursuing a PhD at UC Berkeley.
  • Yujie Wang (former UC Berkeley undergraduate) - Worked on monitoring ML performance metrics without ground-truth labels. Co-authored a publication at CIDR. Joined Google after graduation.
  • Aditi Mahajan (former UC Berkeley undergraduate) - Worked on unit tests for end-to-end ML pipelines. Joined Google after graduation.

🗣️ Selected Invited Talks

DocWrangler and Semantic Data Processing

  • [May '25] LangChain Disrupt Conference
  • [April '25] SF Public Defender's Office
  • [April '25] Spring EPIC Lab Retreat
  • [March '25] Montreal HCI Seminar

DocETL and Agentic Data Systems

  • [March '25] UC Berkeley BLISS Lab Seminar
  • [March '25] Brown University DB Seminar
  • [Feb '25] Columbia University DB Seminar
  • [Feb '25] Scottish Climate Intelligence Service
  • [Jan '25] Cloudera
  • [Dec '24] Microsoft: Gray Systems Lab
  • [Nov '24] Snowflake
  • [Nov '24] ByteDance (TikTok)
  • [Nov '24] Google: Systems Research Group
  • [Nov '24] WInE Lab at CMU
  • [Nov '24] Solventum
  • [Oct '24] US Army Research Laboratory

📬 Contact

Email: shreyashankar@berkeley.edu
Twitter | GitHub

Download Outdated CV (PDF)