About Me
Hello! I am Sapana, a final-year PhD student at Texas A&M University, advised by Dr. Dileep Kalathil. I am passionate about using Reinforcement Learning (RL) to solve challenging real-world problems.
I have worked on multiple algorithmic paradigms in RL, ranging from generative adversarial imitation learning to meta-RL. More recently, I have pivoted towards fine-tuning Large Language Models (LLMs) using Reinforcement Learning from Human Feedback (RLHF). This pivot reflects my growing interest in the intersection of natural language processing and reinforcement learning. I have also built abstractive and extractive Q&A systems using retrieval-augmented generation (RAG) and LLMs during an applied science internship at Amazon.
Previously, I was a research fellow at MPI-SWS, Germany, with Dr. Adish Singla. I also did an MS (Research) at IIT Madras with Dr. Balaraman Ravindran and Dr. Radha Krishna Ganti.
Aside from work, I like to hike, cook, paint, and photograph.
News
- [Feb 2024] Paper on Pedagogical Alignment of LLMs out on arXiv!
- [Aug 2023] Paper on Safe distributed OCO accepted to TMLR!
- [May 2023] Back in Seattle for an Applied Scientist internship at Amazon!
- [Apr 2023] Accepted to IJCAI 2023 Doctoral Consortium!
- [Feb 2023] New paper on Safe distributed OCO out on arXiv!
- [Feb 2023] Gave an invited talk on ‘Adaptivity and safety in sequential decision making’ at Rice University!
- [Sep 2022] Paper on meta-RL in sparse reward environments accepted to NeurIPS 2022!
- [Aug 2022] Spent a wonderful summer in Seattle as an Applied Scientist intern at Amazon!
- [Dec 2021] Paper on Safe online convex optimization accepted to AAAI 2022!