About Me

Hi, I’m Sid, a second-year PhD candidate in Computer Science at UCLA, advised by Professor Baharan Mirzasoleiman. My research focuses on data efficiency for learning with limited supervision, i.e., selecting the best small subsets of data for training to reduce costs without sacrificing accuracy. I aim to develop approaches to these problems that are both practically effective and theoretically rigorous.

Open Office Hours: To pay forward all the help I’ve received so far in pursuing a career in ML research, I dedicate 1-2 hours each week to open office hours. These are best suited to relatively junior students (undergraduate/masters), since I’m not very experienced myself :). If you’d like to chat about research, grad school, or anything else, please fill out this form.

In my free time, I like to write (https://medium.com/@sjoshi804), read about philosophy, and run.

Highlights

  • SAS: SAS selects subsets of pre-training data to enable data-efficient contrastive self-supervised learning (ICML ‘23). Give it a spin to try out data-efficient SSL!
  • SpuCo: SpuCo is a Python package developed to make research on addressing spurious correlations effortless. Check it out!

Publications

[1] Siddharth Joshi, Arnav Jain, Ali Payani and Baharan Mirzasoleiman, Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity, AISTATS 2024.

[2] Yihao Xue, Siddharth Joshi, Dang Nguyen and Baharan Mirzasoleiman, Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift, ICLR 2024.

[3] Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi and Baharan Mirzasoleiman, Investigating the Benefits of Projection Head for Representation Learning, ICLR 2024.

[4] Siddharth Joshi and Baharan Mirzasoleiman, Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least, ICML 2023.

[5] Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen and Baharan Mirzasoleiman, Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression, ICML 2023 (Oral).

[6] Siddharth Joshi*, Yuhan Liu* and Baharan Mirzasoleiman, Low Rank Pruning via Output Perturbation, Sparsity in Neural Networks Workshop 2022.

* = equal contribution

Preprints

[1] Siddharth Joshi, Yu Yang, Yihao Xue, Wenhan Yang and Baharan Mirzasoleiman, Towards Mitigating Spurious Correlations in the Wild: A Benchmark & a more Realistic Dataset, arXiv.