Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Loss function and objective function are different, or are they?

less than 1 minute read

Published:

This blog post is the result of a discussion (read: borderline heated argument) that I had with a friend about whether the machine learning terms loss function and objective function mean the same thing. By the end of this post, we will find out whether they do. I am writing this post as a dialogue between Pinfy and Scooby, to let you decide whether it was a discussion or an argument :’). Also, I am writing the post in a Colab notebook because it is going to get mathematical.
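Without preempting the post’s conclusion, here is a minimal sketch of one common convention (an assumption on my part, not necessarily the one Pinfy and Scooby settle on): the loss is computed on a single example, while the objective is the full quantity actually being optimized, which may add terms such as a regularizer:

    \[
    \ell\big(f(x_i; \theta),\, y_i\big) \quad \text{(loss on example } i\text{)},
    \qquad
    J(\theta) \;=\; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i; \theta),\, y_i\big) \;+\; \lambda\, \Omega(\theta) \quad \text{(objective)},
    \]

where f is the model with parameters θ, ℓ is a per-example loss, and λΩ(θ) is an optional regularization term.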

Visualizing Learning of PyTorch Code using TensorboardX

less than 1 minute read

Published:

Visualizing learning is a great way to gain a better understanding of your machine learning model’s inputs, outputs, and/or parameters. In this article we discuss (a minimal logging sketch follows the list):

  • how to use TensorboardX, a wrapper around Tensorboard, to visualize training of your existing PyTorch models.
  • how to use a conda environment to install tensorboard in case of installation clashes.
  • how to remotely access the web interface for tensorboard.
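As a concrete starting point, here is a minimal sketch of the first bullet; the log directory, tag name, and dummy loss below are illustrative placeholders rather than code from the post:

    # Minimal sketch: logging a scalar training metric with TensorboardX.
    # Assumes tensorboardX is installed (pip install tensorboardX tensorboard).
    from tensorboardX import SummaryWriter

    writer = SummaryWriter(log_dir="runs/demo")  # hypothetical log directory
    for step in range(100):
        loss = 1.0 / (step + 1)  # stand-in for a real PyTorch training loss
        writer.add_scalar("train/loss", loss, step)  # tag, value, global step
    writer.close()

For the other two bullets, one common pattern (again a sketch, not necessarily the post’s exact commands) is to install into a fresh conda environment (conda create -n tb python, conda activate tb, pip install tensorboardX tensorboard), launch the dashboard on the remote machine with tensorboard --logdir runs --port 6006, and forward that port locally with ssh -L 6006:localhost:6006 user@remote-host before opening http://localhost:6006 in a browser.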

Academic Research: What does it involve?

6 minute read

Published:

What is academic research all about? This is a question that I keep asking myself, frequently so. From whatever research experience I have, I have come to realise that there are three major components that determine the quality of research one gets to do (or at least the one I get to do): working, networking and not-working. Working: Because if you are a researcher, you have got to do research. Networking: Because you need collaborators, mentors and reviewers to do good research. Not-working: Because you are a human, and you need rest no matter how much you think you don’t.

Portfolio

Publications

Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments

Published in Neural Information Processing Systems (NeurIPS), 2022

Improves meta-RL by incorporating demonstrations to accelerate learning in challenging sparse reward environments.

Recommended citation: Desik Rengarajan, Sapana Chaudhary, Jaewon Kim, Dileep Kalathil, Srinivas Shakkottai. (2022). Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments. NeurIPS 2022. https://arxiv.org/abs/2209.13048

Dynamic Regret Analysis of Safe Distributed Online Optimization for Convex and Non-convex Problems

Published in Transactions of Machine Learning Research (TMLR), 2023

Analyzes regret bounds for distributed online optimization under safety constraints in both convex and non-convex settings.
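For context, a sketch of the standard notion of dynamic regret in online optimization (the exact comparator sequence and safety constraints analyzed in the paper may differ):

    \[
    \mathrm{Reg}^{d}_{T} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(x_t^{*}),
    \qquad x_t^{*} \in \arg\min_{x \in \mathcal{X}} f_t(x),
    \]

where f_t is the cost function revealed at round t and x_t is the learner’s decision; “dynamic” means the benchmark is the sequence of per-round minimizers rather than a single fixed comparator.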

Recommended citation: Ting-Jui Chang, Sapana Chaudhary, Dileep Kalathil, Shahin Shahrampour. (2023). Dynamic Regret Analysis of Safe Distributed Online Optimization for Convex and Non-convex Problems. TMLR 2023. https://openreview.net/forum?id=xiQXHvL1eN

Pedagogical Alignment of Large Language Models

Published in Empirical Methods in Natural Language Processing (EMNLP), 2024

Aligns LLMs with pedagogical principles to improve their effectiveness as educational tools and tutors.

Recommended citation: Shashank Sonkar, Kangqi Ni, Sapana Chaudhary, Richard G. Baraniuk. (2024). Pedagogical Alignment of Large Language Models. EMNLP 2024. https://arxiv.org/abs/2402.05000

Risk-Averse Finetuning of Large Language Models

Published in Neural Information Processing Systems (NeurIPS), 2024

Introduces risk-averse objectives for LLM finetuning to ensure robust and safe model behavior across diverse scenarios.

Recommended citation: Sapana Chaudhary, Ujwal Dinesha, Dileep Kalathil, Srinivas Shakkottai. (2024). Risk-Averse Finetuning of Large Language Models. NeurIPS 2024. https://arxiv.org/abs/2501.06911v1

VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks

Published in arXiv preprint, 2025

Combines neural and symbolic methods to validate chain-of-thought reasoning through logical consistency verification.

Recommended citation: Yu Feng, Nathaniel Weir, Kaj Bostrom, Sam Bayless, Darion Cassel, Sapana Chaudhary, Benjamin Kiesl-Reiter, Huzefa Rangwala. (2025). VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks. arXiv preprint. https://arxiv.org/abs/2511.04662

MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization

Published in arXiv preprint, 2026

A reinforcement learning framework that optimizes code by maximizing reward signals through automated search and refinement.

Recommended citation: Jiefu Ou, Sapana Chaudhary, Kaj Bostrom, Nathaniel Weir, Shuai Zhang, Huzefa Rangwala, George Karypis. (2026). MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization. arXiv preprint. https://arxiv.org/abs/2601.05475

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.