Resume | Monishver Chandrasekaran

Work

2022.08 - 2024.08
Data Engineer

IBM - Chief Information Office (CIO)

Led data engineering initiatives as Product Owner for the GHD business unit, optimizing and scaling data pipelines to drive predictive analytics and data mining capabilities.
- Migrated IBM's transactional data warehouse to IBM Cloud using Apache Spark and Scala
- Reduced costs by 81.3% through legacy system migration and decommissioning
- Consolidated 600+ SQL scripts into 22 ETLs through new data modeling and architecture
- Improved processing speed by 63.6% with data lakes and data marts implementation
- Reduced downtime by 37.5% through Jenkins CI/CD and Apache Airflow automation
- Participated in a CDC PoC using Apache Kafka and Debezium for streaming data pipelines

2022.01 - 2022.07
Software Developer Intern

IBM - Chief Information Office (CIO)

Collaborated with 6+ cross-functional teams to streamline data workflows, integrating data pipelines that supported efficient ML algorithm development for optimal bid pricing.
- Validated and refined 40+ SQL scripts into 6 Fact models
- Documented 17 key business logic rules for bid pricing modules
- Automated Datamart schema generation with a Python script

Education

2024.09 - 2026.05

New York City, NY
Master of Science

New York University, Courant Institute of Mathematical Sciences

Computer Science, AI Concentration

GPA: 3.8/4.0

2018.08 - 2022.06

Coimbatore, IN
Bachelor of Technology

Amrita School of Engineering, Amrita Vishwa Vidyapeetham

Computer Science

GPA: 9.37/10.0

Volunteer

2025.09 - 2025.12

New York City, NY
Recitation Leader

New York University

Served as recitation leader for the Calculus I course in the CIMS Math department, holding weekly teaching sessions for 60+ undergraduate students.
- CALC I, MATH-UA.121.019 & MATH-UA.121.020

Projects

2025.09 - 2025.12
From Baseline to DeepSeek - Single-GPU MoE Training Efficiency

Conducted a systems-level study of MoE training efficiency on resource-constrained hardware using the FineWeb-10B dataset, comparing naive PyTorch, ScatterMoE, and MegaBlocks implementations.
- Reproduced a DeepSeek-inspired MoE architecture with shared experts and top-k routing, achieving a validation loss of 3.93 (1.5% improvement over dense models)
- Optimized training throughput using ScatterMoE fused kernels, reducing memory footprint by 18% and latency by 16% relative to a naive MoE implementation
- Benchmarked implementation variants including MegaBlocks and ScatterMoE to identify VRAM bottlenecks in single-GPU scaling

2025.09 - 2025.12
SmallGraphGCN - Accelerating GNN Training on Batched Small Graphs

Engineered a specialized GCN framework optimized for training on large batches of small graphs, eliminating excessive kernel launches through kernel fusion and edge-centric parallelism.
- Wrote custom CUDA kernels for computation aggregation, resulting in a 68% reduction in kernel launches and 4.9x lower memory transfer overhead compared to PyG baselines
- Maintained high model accuracy on molecular datasets while achieving substantial gains in training throughput

2025.04 - 2025.05
Gaze-Guided Reinforcement Learning for Visual Search

Implemented a novel RL framework integrating human gaze patterns with the PPO algorithm in AI2-THOR, using three integration methods and custom CNN architectures.
- Achieved 26% better performance than random baselines
- Improved sample efficiency in 3D visual search and object detection tasks

2024.10 - 2024.12
MTA Transit Time Prediction

Designed robust regression models to predict NYC bus travel times using MTA BusTime and TomTom Traffic data.
- Achieved MAE of 43.73 seconds using XGBoost with grid-based modeling
- Evaluated LSTM architectures to optimize short-sequence temporal data predictions

2021.03 - 2021.05
Health Insurance Cross-Sell Prediction Case Study

Engineered a predictive machine learning model to forecast customer propensity for purchasing additional insurance products.

Skills

	Programming
	Python
	C/C++
	Scala
	SQL
	DB2

	Frameworks
	CUDA
	Apache Spark
	Apache Airflow
	TensorFlow
	Keras
	PyTorch
	OpenAI Gym
	FastAPI
	dbt (Data Build Tool)

	Cloud
	AWS (EC2, S3)
	GCP

	Developer Tools
	Git
	Docker
	Kubernetes
	Bazel
	Jenkins
	LogDNA
	VS Code
	Jupyter Notebook
	Anaconda

	Libraries
	Pandas
	NumPy
	Matplotlib
	Scikit-learn
	XGBoost
	SpaCy
