Resume
Work
-
2022.08 - 2024.08 Data Engineer
IBM - Chief Information Office (CIO)
Led data engineering initiatives as Product Owner for the GHD business unit, optimizing and scaling data pipelines to drive predictive analytics and data mining capabilities.
- Migrated IBM's transactional data warehouse to IBM Cloud using Apache Spark and Scala
- Reduced costs by 81.3% through legacy system migration and decommissioning
- Consolidated 600+ SQL scripts into 22 ETLs through new data modeling and architecture
- Improved processing speed by 63.6% with data lakes and data marts implementation
- Reduced downtime by 37.5% through Jenkins CI/CD and Apache Airflow automation
- Participated in CDC PoC using Apache Kafka and Debezium for streaming data pipelines
-
2022.01 - 2022.07 Software Developer Intern
IBM - Chief Information Office (CIO)
Collaborated with 6+ cross-functional teams to streamline data workflows, integrating data pipelines that supported efficient ML algorithm development for optimal bid pricing.
- Validated and refined 40+ SQL scripts into 6 Fact models
- Documented 17 key business logic rules for bid pricing modules
- Automated Datamart schema generation with a Python script
-
2020.06 - 2021.01 Research Assistant
Amrita School of Engineering - AMUDA Lab
Performed exploratory data analysis (EDA) on real-time indoor localization data collected from BLE beacons, applying ML techniques to derive actionable insights.
Education
-
2024.09 - 2026.05 New York City, NY
Master of Science
New York University, Courant Institute of Mathematical Sciences
Computer Science, AI Concentration
-
2018.08 - 2022.06 Coimbatore, IN
Volunteer
-
2025.09 - Present New York City, NY
Recitation Leader
New York University
Recitation leader for the following course sections in the CIMS Math department:
- CALC I, MATH-UA.121.019 & MATH-UA.121.020
Projects
- 2025.04 - 2025.05
Gaze-Guided Reinforcement Learning for Visual Search
Implemented a novel RL framework integrating human gaze patterns with the PPO algorithm in AI2-THOR, using three integration methods and custom CNN architectures.
- Achieved 26% better performance than random baselines
- Improved sample efficiency in 3D visual search and object detection tasks
- 2024.10 - 2024.12
MTA Transit Time Prediction
Designed robust regression models to predict NYC bus travel times using MTA BusTime and TomTom Traffic data.
- Achieved MAE of 43.73 seconds using XGBoost with grid-based modeling
- Evaluated LSTM architectures to optimize short-sequence temporal data predictions
- 2021.03 - 2021.05
Health Insurance Cross-Sell Prediction Case Study
Engineered a predictive machine learning model to forecast customer propensity for purchasing additional insurance products.
Skills
| Category | Skills |
| --- | --- |
| Programming | Python, C++, Scala, SQL, DB2 |
| Frameworks | Apache Spark, Apache Airflow, TensorFlow, Keras, PyTorch, OpenAI Gym, FastAPI, dbt (Data Build Tool) |
| Cloud | AWS (EC2, S3), GCP |
| Developer Tools | Git, Docker, Kubernetes, Bazel, Jenkins, LogDNA, VS Code, Jupyter Notebook, Anaconda |
| Libraries | Pandas, NumPy, Matplotlib, Scikit-learn, XGBoost, SpaCy |