I applied online. The process took 2 weeks. I interviewed at EPAM Systems (Poona) in Jul 2025
Interview
The interview process was smooth and well-organized. The questions were highly relevant to Data Engineering, reflecting real-world scenarios. Additionally, providing food for candidates was a thoughtful gesture. Overall, a positive and professional experience.
Interview questions [1]
Question 1
1. Explain the difference between SCD Type 1, SCD Type 2, and SCD Type 3.
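For anyone preparing for this question, here is a minimal sketch of the three SCD types in plain Python. It is not taken from the interview itself. The customer dimension, its column names (`city`, `start_date`, `end_date`, `is_current`, `previous_city`) and the sample values are all assumptions made for illustration. A real answer would more likely use SQL or PySpark.

```python
from copy import deepcopy
from datetime import date


def scd_type1(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place; no history is kept."""
    out = deepcopy(rows)
    for row in out:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return out


def scd_type2(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row and append a new current version."""
    out = deepcopy(rows)
    for row in out:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["end_date"] = change_date
            row["is_current"] = False
    out.append({
        "customer_id": customer_id,
        "city": new_city,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return out


def scd_type3(rows, customer_id, new_city):
    """Type 3: keep only the previous value, in an extra column."""
    out = deepcopy(rows)
    for row in out:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
    return out


# Hypothetical one-row dimension used only to show the three behaviours.
dim = [{
    "customer_id": 1,
    "city": "Pune",
    "start_date": date(2024, 1, 1),
    "end_date": None,
    "is_current": True,
}]

moved = date(2025, 7, 1)
t1 = scd_type1(dim, 1, "Chennai")          # one row, old city lost
t2 = scd_type2(dim, 1, "Chennai", moved)   # two rows, full history
t3 = scd_type3(dim, 1, "Chennai")          # one row, previous city only
```

In short: Type 1 loses the old value, Type 2 keeps every version as its own row, and Type 3 keeps just one prior value alongside the current one.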
I applied through a recruiter. The process took 2 weeks. I interviewed at EPAM Systems (Madras) in Jul 2025
Interview
The process was a bit tough compared to other companies. It involves three rounds in total: two technical rounds and one techno-managerial round. Each technical round involves 1.5 hours of questions on Spark, SQL, Python, and the cloud service you are experienced with, plus three coding questions: 1. Python DSA, 2. SQL, 3. PySpark. If you clear the first two rounds, the techno-managerial round is quite easy, with questions only about your project.
I applied online. I interviewed at EPAM Systems (Gurgaon, Haryana) in Jan 2021
Interview
Technical round (1 or 2 rounds), followed by a Project Manager or Solution Architect round, then an HR or offer discussion round.
Join the company on your first day with a smile.
Interview questions [1]
Question 1
A. Core Data Engineering Concepts
SQL (joins, window functions, performance tuning)
Data Modeling (star vs snowflake, normalization)
ETL/ELT pipelines (batch vs streaming, orchestration tools like Airflow)
B. Apache Spark / PySpark
Catalyst Optimizer & Tungsten
Narrow vs Wide transformations
Joins (broadcast, sort-merge), Skew handling
AQE (Adaptive Query Execution)
Partitioning, Predicate Pushdown
Execution Plan (DAG → Stage → Tasks)
Spark UI and Job Debugging
SCD Type 2 Implementation in PySpark
C. AWS
S3, Glue, Athena, Lambda, EMR, Redshift
Event-driven design (S3 → EventBridge → Lambda)
Security: IAM roles, bucket policies, encryption
CI/CD in AWS (CodePipeline, CloudFormation)
D. Python
Writing modular, reusable code
Working with Pandas, Boto3 (for AWS interaction)
Exception handling, logging
Lambda functions and decorators
E. Kafka / Streaming
Kafka topic partitioning, consumer groups
Offset management
Integration with Spark Structured Streaming
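The Python items in section D above (exception handling, logging, decorators) can be practised together in one small exercise. The sketch below is not from the interview. The `retry` decorator, the `pipeline` logger name, and the `load_batch` function are hypothetical names chosen for illustration.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")  # hypothetical logger name


def retry(times=3, exceptions=(Exception,)):
    """Decorator factory: re-run the function up to `times` attempts."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    logger.warning(
                        "%s failed (attempt %d/%d): %s",
                        func.__name__, attempt, times, exc,
                    )
                    if attempt == times:
                        raise  # out of attempts: surface the last error
        return wrapper
    return decorator


attempts = {"count": 0}


@retry(times=3, exceptions=(ConnectionError,))
def load_batch():
    """Hypothetical loader that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("source not reachable")
    return [{"id": 2}, {"id": 1}]


# A lambda as a sort key, the other item listed under section D.
records = sorted(load_batch(), key=lambda r: r["id"])
```

Limiting `exceptions` to the error types that are worth retrying lets anything unexpected fail immediately instead of being retried.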