Virinderpal Singh Batth
Data Engineer CV
Professional Summary
Data engineering leader with 7+ years building enterprise data platforms in financial services and insurance. Track record of measurable efficiency gains: a 95% reduction in Snowflake compute, 90% faster SCD Type 2 queries, and a 2TB+ Hadoop-to-Snowflake migration. Currently leading a team architecting a unified operational data store and real-time API pipelines serving transactional business insights.
Experience
Lead Data Engineer | Insurance
evolv Consulting | Oct 2025 – Present
- Leading and growing a team of 3+ data engineers to architect an insurance client’s first unified operational data store in Snowflake using dbt
- Defining master data standards across 4+ legacy platforms, resolving data overlaps between sub-companies
- Architecting high-performance data pipelines serving a transactional Kong API layer for real-time business insights
- Built reusable dbt framework for automated data extracts to AWS S3, enabling self-service reporting
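A reusable extract framework like the one above often reduces to templated unload SQL. The sketch below is illustrative only (the table, stage, and prefix names are hypothetical, and the real framework is dbt-based): a small Python helper that renders a Snowflake `COPY INTO` statement unloading a table to an external S3 stage as Parquet.

```python
def render_s3_extract(table: str, stage: str, prefix: str) -> str:
    """Render a Snowflake COPY INTO statement that unloads a table to an
    external S3 stage as Parquet. All names here are illustrative."""
    return (
        f"COPY INTO @{stage}/{prefix}/\n"
        f"FROM {table}\n"
        "FILE_FORMAT = (TYPE = PARQUET)\n"
        "HEADER = TRUE\n"
        "OVERWRITE = TRUE"
    )

# Example: render an extract statement for a hypothetical summary table.
sql = render_s3_extract("analytics.claims_summary", "s3_extracts", "claims")
print(sql)
```

Centralizing the template this way is what makes the extracts self-service: analysts supply only a table name and destination prefix.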
Data Engineer | Financial Credit Risk
USAA | Apr 2021 – Oct 2025
- Resolved critical Snowflake performance bottleneck, reducing compute usage by 95% and significantly decreasing query costs
- Optimized SCD Type 2 queries achieving 90% reduction in running time (3 hours → 15 minutes) and 70% reduction in data scan volume
- Designed Kafka streaming pipeline for ML model training with staging tables, automated resiliency checks, and idempotent loading
- Led platform modernization migrating 2+ terabytes from Hadoop to Snowflake using PySpark and the Parquet format
- Drove the Secured Card to Credit Card transition, resulting in a 30% increase in member engagement
- Implemented decoupled-push architecture reducing on-call overhead by 90%
- Architected cross-organizational data lake POC with AWS S3, reducing transfer time by 50%
- Enhanced PII/PCI/PHI security with data masking, tokenization, and RBAC
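The idempotent-loading pattern behind the Kafka pipeline above can be sketched in a few lines: key each record on a business key plus event timestamp, so replaying a batch is a no-op and duplicates never double-count. This is a minimal in-memory illustration with invented field names, not the production implementation.

```python
def apply_batch(store: dict, batch: list[dict]) -> dict:
    """Idempotently apply a batch of events. Each record is keyed on
    (account_id, event_ts), so re-delivering the same batch leaves the
    store unchanged. Field names are hypothetical."""
    for rec in batch:
        key = (rec["account_id"], rec["event_ts"])
        store[key] = rec  # upsert: same key, same payload -> no change
    return store

events = [
    {"account_id": "A1", "event_ts": "2024-01-01T00:00:00", "score": 0.82},
    {"account_id": "A1", "event_ts": "2024-01-02T00:00:00", "score": 0.79},
]
state = apply_batch({}, events)
state = apply_batch(state, events)  # replay after a consumer restart
assert len(state) == 2              # duplicates were absorbed, not appended
```

In the real pipeline the same property would come from a keyed MERGE into the staging tables rather than an in-memory dict.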
Big Data Consultant | USAA
HCL America Inc. | Nov 2017 – Apr 2021
- Pioneered USAA’s first project to use AWS cloud and real-time streaming (NiFi/Kafka), moving batch workloads to real-time processing
- Established AWS S3 to HDFS file transfer routes, enabling cloud-to-on-premise data integration
- Developed 30+ critical decisioning data pipelines in Hadoop supporting credit card risk decisioning
- Built core ETL pipelines using IBM DataStage and DB2 with complex transformations and governance checks
- Created an automated schema-check process, eliminating hours of manual inspection
- Supervised Hadoop projects and mentored colleagues on CI/CD best practices
- Ensured PII/non-PII compliance with Information Governance standards
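An automated schema check like the one described above can be as simple as diffing an expected column contract against the columns actually observed in a landing file. The sketch below uses hypothetical column names and type labels; the real process ran against DataStage/Hadoop sources.

```python
def schema_drift(expected: dict[str, str], actual: dict[str, str]) -> dict:
    """Compare an expected {column: type} contract against observed columns,
    returning missing columns, unexpected columns, and type mismatches."""
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "type_mismatch": sorted(
            col for col in expected.keys() & actual.keys()
            if expected[col] != actual[col]
        ),
    }

# Hypothetical contract vs. what actually landed:
expected = {"account_id": "string", "balance": "decimal", "open_dt": "date"}
actual = {"account_id": "string", "balance": "string", "close_dt": "date"}
print(schema_drift(expected, actual))
# {'missing': ['open_dt'], 'unexpected': ['close_dt'], 'type_mismatch': ['balance']}
```

Running a diff like this on every file arrival replaces eyeballing DDL, and a non-empty result can fail the pipeline before bad data propagates.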
Technical Skills
| Category | Technologies |
|---|---|
| Programming | Python (PySpark, Pandas, SQLAlchemy, FastAPI), SQL, Bash, Git |
| Data Engineering | dbt, Kafka, Flink, NiFi, IBM DataStage |
| Cloud & Platforms | AWS (S3, EC2, Redshift, Lambda, Athena), Snowflake, Hadoop, DB2 |
| Data Formats | JSON, Parquet, REST APIs |
| Visualization | Apache Superset |
| Practices | CI/CD, Data Governance, RBAC, Data Masking, Tokenization |
Education
B.S. in Computer Science | Cum Laude
Big Data & Analytics Concentration
New York Institute of Technology, New York