Jaydeep Borkar (जयदीप बोरकर)
jaijborkar at gmail dot com
I am a PhD Candidate at Northeastern University, advised by David A. Smith.
During my PhD, I spent time as a Visiting Researcher at Meta Superintelligence Labs on the Pretraining Trust team, where I worked on memorization and privacy in language models.
Starting in June 2026, I will return to Meta as a Research Scientist Intern to work on safety in LLMs.
Previously, I was an external research student at the MIT-IBM Watson AI Lab, advised by Pin-Yu Chen, where I worked on adversarial machine learning.
I'm one of the founding organizers of the Trustworthy ML Initiative, which aims to lower the barriers to entry into trustworthy machine learning.
Twitter  / 
Google Scholar  / 
LinkedIn  / 
GitHub  / 
CV
Research
I study privacy and safety in language models. I'm particularly interested in understanding training data memorization through the lens of data and training dynamics.
Please get in touch if you'd like to collaborate on research or go mountain biking.
Papers
Memorization Dynamics in Knowledge Distillation for Language Models
Jaydeep Borkar, Karan Chadha, Niloofar Mireshghallah, Yuchen Zhang, Irina-Elena Veliche, Archi Mitra, David A. Smith, Zheng Xu, and Diego Garcia-Olano.
arXiv preprint, 2026
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar, Matthew Jagielski, Katherine Lee, Niloofar Mireshghallah, David A. Smith, and Christopher A. Choquette-Choo.
Findings of the Association for Computational Linguistics (ACL) 2025
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
USVSN Sai Prashanth, Alvin Deng, Kyle O’Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke, Katherine Lee, and Naomi Saphra.
International Conference on Learning Representations (ICLR) 2025
Mind the gap: Analyzing lacunae with transformer-based transcription
Jaydeep Borkar and David A. Smith.
Workshop on Computational Paleography, ICDAR 2024
What can we learn from Data Leakage and Unlearning for Law?
Jaydeep Borkar.
Workshop on Generative AI + Law (GenLaw), ICML 2023
Simple Transparent Adversarial Examples
Jaydeep Borkar and Pin-Yu Chen.
Workshop on Security and Safety in Machine Learning Systems, ICLR 2021
News
January 2026
- I'm returning to Meta (MSL) for another research role, this time as a Research Scientist Intern starting in June.
- New paper on memorization during distillation in language models.
June 2025 I joined Meta GenAI in NYC as a Visiting Researcher for six months.
March 2025 Had fun attending ACM CS & Law in Munich.
February 2025 My wonderful co-authors and I wrote a new paper on memorization of personal information in language models. Check it out!
January 2025 Our paper on Memorization in LMs got accepted at ICLR!
April 2024 Gave a guest lecture on privacy and security in LLMs for the CS 5100 Foundations of AI class at Northeastern.
June 2023 Stoked to present my work on memorization + law in LLMs at the first Generative AI + Law (GenLaw) workshop at ICML in Honolulu, Hawai'i.
Service
Organizing
The Trustworthy ML Initiative (together with Hima Lakkaraju, Sara Hooker, Sarah Tan, Subho Majumdar, Chhavi Yadav, Chirag Agarwal, Haohan Wang, and Marta Lemanczyk).
Reviewing
- ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2026
- Workshop on Privacy in Natural Language Processing, ACL 2024
Some fun stuff!
In my free time, I enjoy (mountain) biking in Boston/Cambridge (Fells is my favorite), lifting, hanging out at bookstores and libraries, and dancing Bollywood and Bachata.