Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About Me
This is a page not in the main menu.
Published:
This blog post is an opus of my naive understanding of various ideas from Asymptotic Statistics.
Published:
You might have come across Pinfy in my previous blog post, “Loss function and objective function are different, or are they?” Here is the story of how she came into being, and here is a picture of her, created by my friend Akshat and me.
Published:
This blog post is the result of a discussion (read: borderline heated argument) that I had with a friend about whether the machine learning terms loss function and objective function mean the same thing. We will find out whether they do by the end of this post. I am writing it as a dialogue between Pinfy and Scooby, to let you decide whether it was a discussion or an argument :’). Also, I am using a Colab notebook for this post because it is going to get mathematical.
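To make the distinction concrete before the dialogue begins, here is a minimal sketch of one common convention (my illustration, not necessarily where the post lands): the loss scores a single example, while the objective is the full quantity being optimized, for instance the average loss plus a regularization term. All names below (`squared_loss`, `objective`, `lam`) are hypothetical.

```python
import numpy as np

def squared_loss(w, x, y):
    """Loss on a single example: squared error of a linear prediction."""
    return (np.dot(w, x) - y) ** 2

def objective(w, X, Y, lam=0.1):
    """Objective being optimized: mean loss over the data plus L2 regularization.

    With lam = 0 this is just the average loss; with lam > 0 the two
    quantities differ, which is one reason the two terms get conflated.
    """
    data_term = np.mean([squared_loss(w, x, y) for x, y in zip(X, Y)])
    return data_term + lam * np.dot(w, w)

# Tiny usage example with made-up numbers.
w = np.array([0.5, -1.0])
X = [np.array([1.0, 2.0]), np.array([0.0, 1.0])]
Y = [0.0, -1.0]
print(objective(w, X, Y))  # mean loss (1.125) + L2 penalty (0.125) = 1.25
```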
Published:
Visualizing learning is a great way to gain a better understanding of your machine learning model’s inputs, outputs, and/or parameters. In this article we discuss …
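The excerpt is cut off above, so this is a hedged sketch only: one of the simplest ways to visualize learning is to plot training and validation loss over epochs. The loss values below are synthetic, generated purely for illustration, and nothing here is taken from the article itself.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic training history; in practice these values would come from
# your training loop (e.g. one loss value recorded per epoch).
rng = np.random.default_rng(0)
epochs = np.arange(1, 51)
train_loss = np.exp(-epochs / 15) + 0.05 * rng.random(50)
val_loss = np.exp(-epochs / 20) + 0.08 * rng.random(50)

plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.title("Learning curves: one simple way to visualize learning")
plt.show()
```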
Published:
What is academic research all about? This is a question that I keep asking myself, frequently so. From whatever research experience I have, I have come to realise that there are three major components that determine the quality of research one gets to do (or at least the research I get to do): working, networking, and not-working. Working: because if you are a researcher, you have got to do research. Networking: because you need collaborators, mentors, and reviewers to do good research. Not-working: because you are a human, and you need rest no matter how much you think you don’t.
Published:
Short description of portfolio item number 1
Published:
Short description of portfolio item number 2 
Published in Workshop on Goal Specification in Reinforcement Learning, ICML, 2018
Introduces Lipschitz-constrained cost functions to achieve smoother and more robust imitation learning.
Recommended citation: Sapana Chaudhary, Akshat Dave, Balaraman Ravindran. (2018). SILC: Smoother Imitation with Lipschitz Costs. ICML Workshop on Goal Specification in RL. https://sites.google.com/view/goalsrl/accepted-papers?authuser=0
Published in CoDS-COMAD (ACM Digital Library), 2022
Proposes smooth cost functions and policy regularization to improve stability and performance in imitation learning.
Recommended citation: Sapana Chaudhary, Balaraman Ravindran. (2022). Smooth Imitation Learning via Smooth Costs and Smooth Policies. CoDS-COMAD 2022. https://arxiv.org/abs/2111.02354
Published in AAAI Conference on Artificial Intelligence, 2022
Develops algorithms for online convex optimization that learn and satisfy unknown linear safety constraints during execution.
Recommended citation: Sapana Chaudhary, Dileep Kalathil. (2022). Safe Online Convex Optimization with Unknown Linear Safety Constraints. AAAI 2022. https://arxiv.org/abs/2111.07430
Published in Neural Information Processing Systems (NeurIPS), 2022
Improves meta-RL by incorporating demonstrations to accelerate learning in challenging sparse reward environments.
Recommended citation: Desik Rengarajan, Sapana Chaudhary, Jaewon Kim, Dileep Kalathil, Srinivas Shakkottai. (2022). Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments. NeurIPS 2022. https://arxiv.org/abs/2209.13048
Published in Transactions on Machine Learning Research (TMLR), 2023
Analyzes regret bounds for distributed online optimization under safety constraints in both convex and non-convex settings.
Recommended citation: Ting-Jui Chang, Sapana Chaudhary, Dileep Kalathil, Shahin Shahrampour. (2023). Dynamic Regret Analysis of Safe Distributed Online Optimization for Convex and Non-convex Problems. TMLR 2023. https://openreview.net/forum?id=xiQXHvL1eN
Published in International Joint Conference on Artificial Intelligence (IJCAI) Doctoral Consortium, 2023
Doctoral consortium paper exploring the interplay between safety constraints and adaptive learning in sequential decision problems.
Recommended citation: Sapana Chaudhary. (2023). On Safety and Adaptivity in Sequential Decision Making. IJCAI Doctoral Consortium 2023. https://www.ijcai.org/proceedings/2023/0813.pdf
Published in Empirical Methods in Natural Language Processing (EMNLP), 2024
Aligns LLMs with pedagogical principles to improve their effectiveness as educational tools and tutors.
Recommended citation: Shashank Sonkar, Kangqi Ni, Sapana Chaudhary, Richard G. Baraniuk. (2024). Pedagogical Alignment of Large Language Models. EMNLP 2024. https://arxiv.org/abs/2402.05000
Published in Neural Information Processing Systems (NeurIPS), 2024
Introduces risk-averse objectives for LLM finetuning to ensure robust and safe model behavior across diverse scenarios.
Recommended citation: Sapana Chaudhary, Ujwal Dinesha, Dileep Kalathil, Srinivas Shakkottai. (2024). Risk-Averse Finetuning of Large Language Models. NeurIPS 2024. https://arxiv.org/abs/2501.06911v1
Published in International Conference on Learning Representations (ICLR), 2025
Presents a minimalist approach to building effective web agents using LLMs with strong empirical performance.
Recommended citation: Ke Yang, Yao Liu, Sapana Chaudhary, Rasool Fakoor, Pratik Chaudhari, George Karypis, Huzefa Rangwala. (2025). AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents. ICLR 2025. https://arxiv.org/pdf/2410.13825v1
Published in Transactions on Machine Learning Research (TMLR), 2025
Explores offline learning dynamics and catastrophic forgetting in LLM reasoning tasks using reinforcement learning techniques.
Recommended citation: Tianwei Ni, Allen Nie, Sapana Chaudhary, Yao Liu, Huzefa Rangwala, Rasool Fakoor. (2025). Offline Learning and Forgetting for Reasoning with Large Language Models. TMLR. https://openreview.net/pdf?id=RF6raEUATc
Published in arXiv preprint, 2025
Combines neural and symbolic methods to validate chain-of-thought reasoning through logical consistency verification.
Recommended citation: Yu Feng, Nathaniel Weir, Kaj Bostrom, Sam Bayless, Darion Cassel, Sapana Chaudhary, Benjamin Kiesl-Reiter, Huzefa Rangwala. (2025). VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks. arXiv preprint. https://arxiv.org/abs/2511.04662
Published in arXiv preprint, 2026
A reinforcement learning framework that optimizes code by maximizing reward signals through automated search and refinement.
Recommended citation: Jiefu Ou, Sapana Chaudhary, Kaj Bostrom, Nathaniel Weir, Shuai Zhang, Huzefa Rangwala, George Karypis. (2026). MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization. arXiv preprint. https://arxiv.org/abs/2601.05475
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different field in type. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.