CV
Namit Shrivastava
Summary
High-performance AI and Data Engineer with 2 years of experience shipping production-grade ML systems and scalable ETL pipelines on Azure and AWS. Specialized in deploying RAG architectures, geospatial analytics, and high-throughput microservices. Currently pursuing an MS in Survey Data Science at the University of Maryland.
Education
- Master of Science, Survey Data Science (Data Science Track) | 2026-05 | University of Maryland, College Park | GPA: 3.814/4.0 | Courses: Survey Methodology, Machine Learning, Data Privacy, Natural Language Processing, Causal Inference, Geospatial Data Analysis
- Bachelor of Engineering (Honours), Civil Engineering; Minor in Data Science | 2024-07 | Birla Institute of Technology and Science (BITS Pilani) | GPA: 3.327/4.0 | Courses: Data Structures & Algorithms, Machine Learning, Database Systems, Statistics, Linear Algebra, Probability Theory
Work Experience
- Graduate Research Assistant | 2026-01 - 2026-05 | Social Data Science Center, University of Maryland | Engineered automation and data infrastructure for university data repository.
- Engineered Python automation suite interacting with CKAN REST API to manage lifecycle of 18 datasets, ensuring 100% resource accessibility and data integrity
- Architected scalable data taxonomy for university repository, transforming flat catalogs into hierarchical thematic groups and improving search discoverability by 35%
- Implemented bulk metadata update scripts, reducing manual maintenance overhead by 90% and ensuring strict schema compliance across the repository
- Teaching and Graduate Assistant | 2025-02 - 2026-05 | Joint Program in Survey Methodology (JPSM), University of Maryland | Translated complex data privacy frameworks into actionable learning modules and optimized course operations.
- Translated complex data privacy frameworks into actionable learning modules for 23 graduate students
- Optimized and coordinated learning operations using Asana; reengineered Canvas LMS infrastructure with automation scripts, reducing course setup time by 40%
- Research Assistant | 2025-05 - 2025-12 | Institute for Social Research, University of Michigan | Built scalable geospatial ETL pipelines and composite imputation models for longitudinal demographic analysis.
- Built scalable geospatial ETL pipeline aggregating multi-source demographic data (FCC, ACS, CDC) for 129,572 U.S. census tracts
- Developed composite imputation models, recovering 28.6% of critical dataset rows and resolving complex missingness patterns for longitudinal analysis
- Designed automated data quality framework using Moran's I statistics to detect spatial autocorrelation anomalies, flagging 15% of tracts for review
- Synthesized disparate government datasets into a unified schema, enabling identification of statistically significant rural health trends
- Machine Learning Engineer | 2024-01 - 2024-06 | Legistify Services Private Limited | Architected and shipped production-grade trademark search engine and OCR pipelines on Azure.
- Architected and shipped production-grade trademark search engine, processing 2.4 million images using FastAPI and Faiss vector embeddings
- Reduced query latency by 60% by optimizing HNSW indexing parameters and implementing asynchronous microservice architecture
- Deployed scalable OCR pipelines on Azure Cognitive Services, processing 50,000 legal documents daily with 95% bilingual text extraction accuracy
- Eliminated database I/O bottlenecks by implementing distributed Redis caching layer, increasing API throughput by 30% for 500 concurrent users
- Integrated CI/CD workflows for automated model deployment, shortening the release cycle
- Advanced Application Engineering Analyst | 2023-06 - 2023-08 | Accenture | Engineered automated threat detection and SOAR playbooks for enterprise security monitoring.
- Engineered automated threat detection logic in Azure Sentinel using KQL, processing 10 million daily log events
- Designed Python-based SOAR playbooks to automate incident response, reducing Mean Time to Respond (MTTR) by 80%
- Mapped detection rules to MITRE ATT&CK framework, achieving 89% classification accuracy
- Hardened application perimeters by identifying and patching 15 critical vulnerabilities, including SQL injection and XSS vectors
- Web Developer | 2022-05 - 2022-07 | Indian Red Cross Society | Delivered Drupal-based CMS to digitize volunteer registry.
- Delivered Drupal-based CMS to digitize volunteer registry, reducing manual data entry effort by 50% for 10k monthly visitors
Skills
Programming
- Python
- R
- Java
- C
- JavaScript
- TypeScript
- HTML/CSS
- Bash/Shell
Data & Databases
- SQL
- MySQL
- PostgreSQL
- MongoDB
- Cassandra
- Snowflake
- Neo4j
- Pinecone
- Apache Spark
- Kafka
AI/ML
- PyTorch
- TensorFlow
- Keras
- Hugging Face
- LangChain
- LlamaIndex
- LangGraph
- PySpark
- NLTK
- SpaCy
- Scikit-Learn
DevOps & Cloud
- Git
- Docker
- Kubernetes
- Jenkins
- CI/CD
- REST APIs
- AWS
- Azure
- GCP
- Terraform
- Ansible
Core Competencies
- LLMs
- Generative AI
- RLHF/DPO
- Deep Learning
- NLP
- Computer Vision
- Survey Methodology
- Causal Inference
- MLOps
Publications
- Causal Inference Methods in Educational Research | 2024 | Journal of Educational Measurement | A comprehensive review of causal inference methods applicable to educational data, including propensity score methods, instrumental variables, and regression discontinuity designs.
Presentations
- Advances in Causal Inference for Educational Research | 2024 | American Educational Research Association (AERA) Annual Meeting | Philadelphia, PA, USA | Presented novel methods for causal inference in educational settings with complex data structures.
Teaching
- Data Privacy and Survey Methodology | 2025 | Joint Program in Survey Methodology (JPSM), University of Maryland | Role: Teaching and Graduate Assistant | Translated complex data privacy frameworks into actionable learning modules for 23 graduate students; reengineered Canvas LMS infrastructure with automation scripts, reducing course setup time by 40%.
Portfolio
- Weather Guardian: Automated Daily Weather Briefing System | 2024 | Automation | Built an intelligent workflow using n8n, OpenWeatherMap API, and Supabase to automatically generate and email daily weather briefings.
- Trademark Search Engine | 2024 | ML | Production-grade trademark search engine processing 2.4 million images using FastAPI, Faiss vector embeddings, and HNSW indexing.
Languages
- English: Professional working proficiency
- Hindi: Native speaker
Interests
- Machine Learning & AI: RAG Architectures, Generative AI, LLMs, MLOps
- Data Engineering: ETL Pipelines, Geospatial Analytics, Data Quality, Scalable Systems
- Survey Data Science: Survey Methodology, Causal Inference, Missing Data, Longitudinal Analysis