Resume
Work
-
2022.08 - 2024.08 Data Engineer
IBM - Chief Information Office (CIO)
Led data engineering initiatives as Product Owner for the GHD business unit, optimizing and scaling data pipelines to drive predictive analytics and data mining capabilities.
- Migrated IBM's transactional data warehouse to IBM Cloud using Apache Spark and Scala
- Reduced costs by 81.3% through legacy system migration and decommissioning
- Consolidated 600+ SQL scripts into 22 ETLs through new data modeling and architecture
- Improved processing speed by 63.6% by implementing data lakes and data marts
- Reduced downtime by 37.5% through Jenkins CI/CD and Apache Airflow automation
- Participated in a CDC PoC using Apache Kafka and Debezium for streaming data pipelines
-
2022.01 - 2022.07 Software Developer Intern
IBM - Chief Information Office (CIO)
Assisted 6+ cross-functional teams in streamlining data workflows, integrating data pipelines that supported efficient ML algorithm development for optimal bid pricing.
- Validated and refined 40+ SQL scripts into 6 Fact models
- Documented 17 key business logic rules for bid pricing modules
- Automated data mart schema generation with a Python script
-
2020.06 - 2021.01 Research Assistant
Amrita School of Engineering - AMUDA Lab
Performed EDA on real-time indoor localization data using BLE beacons, applying ML techniques to derive actionable insights.
Education
-
2024.09 - 2026.05 New York City, NY
Master of Science
New York University, Courant Institute of Mathematical Sciences
Computer Science, AI Concentration
-
2018.08 - 2022.06 Coimbatore, IN
Volunteer
-
2025.09 - Present New York City, NY
Recitation Leader
New York University
Recitation leader for the following courses in the CIMS Math department:
- CALC I, MATH-UA.121.019 & MATH-UA.121.020
Projects
- 2025.04 - 2025.05
Gaze-Guided Reinforcement Learning for Visual Search
Implemented a novel RL framework integrating human gaze patterns with the PPO algorithm in AI2-THOR, using three integration methods and custom CNN architectures.
- Achieved 26% better performance than random baselines
- Improved sample efficiency in 3D visual search and object detection tasks
- 2024.10 - 2024.12
MTA Transit Time Prediction
Designed robust regression models to predict NYC bus travel times using MTA BusTime and TomTom Traffic data.
- Achieved an MAE of 43.73 seconds using XGBoost with grid-based modeling
- Evaluated LSTM architectures to optimize short-sequence temporal data predictions
- 2021.03 - 2021.05
Health Insurance Cross-Sell Prediction Case Study
Engineered a predictive machine learning model to forecast customer propensity for purchasing additional insurance products.
Skills
Programming: Python, C++, Scala, SQL, DB2
Frameworks: Apache Spark, Apache Airflow, TensorFlow, Keras, PyTorch, OpenAI Gym, FastAPI, dbt (Data Build Tool)
Cloud: AWS (EC2, S3), GCP
Developer Tools: Git, Docker, Kubernetes, Bazel, Jenkins, LogDNA, VS Code, Jupyter Notebook, Anaconda
Libraries: Pandas, NumPy, Matplotlib, Scikit-learn, XGBoost, SpaCy