Data Engineer Interview Questions

Data engineers are IT professionals who are needed in nearly every industry. They track data trends to determine the best next steps for businesses. A crucial part of a data engineer's work is turning raw data into usable data by creating data pipelines and building data systems.

Most frequently asked interview questions for a data engineer (m/f/x) and how to answer them

Question 1: Can you describe in detail your level of knowledge of programming languages?

How to answer: Before the interview, review your résumé and/or portfolio and list the programs you are most proficient in. If it becomes clear that you lack the required expertise in a program the company primarily uses, describe yourself as a highly motivated, self-directed person who will work tirelessly to learn these programs.

Question 2: Explain in your own words what data engineering involves.

How to answer: Explain your role in relation to the wider organization and to other roles, such as data scientists, to make clear what you contribute to the overall business system. Clarify the difference between a database-focused engineer and a pipeline-focused engineer.

Question 3: Can you describe your experience with Apache Hadoop and data management in a cloud environment?

How to answer: Prepare for this question by researching the company's software, cloud data storage products, and the use of Apache Hadoop. Data engineers must be able to work with programming languages and data management systems that are used throughout the industry, such as Apache Hadoop.

20,944 data engineer interview questions shared by candidates

Given a dictionary, print the key for the nth highest value in the dictionary. If more than one key has the nth highest value, sort those keys and print the first one alphabetically. N can be larger than the number of elements in the dictionary.

Data Engineer

Interviewed at Meta

3.5
Aug 17, 2021
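
One possible reading of the nth-highest-value question can be sketched as follows. The question does not say whether ties share a rank or what to do when N is out of range, so this sketch assumes that values are ranked by distinct value and that an out-of-range N yields None. The function name and the sample dictionary are hypothetical.

```python
from typing import Dict, Optional


def key_for_nth_highest(d: Dict[str, int], n: int) -> Optional[str]:
    """Return the alphabetically first key holding the n-th highest distinct value."""
    if n < 1:
        return None
    # Rank the distinct values from highest to lowest.
    distinct_values = sorted(set(d.values()), reverse=True)
    if n > len(distinct_values):
        # Assumption: an out-of-range n returns None instead of raising.
        return None
    target = distinct_values[n - 1]
    # Several keys may share the target value; take the smallest key.
    return min(key for key, value in d.items() if value == target)


scores = {"a": 10, "b": 30, "c": 30, "d": 20}  # hypothetical sample data
print(key_for_nth_highest(scores, 1))  # b
print(key_for_nth_highest(scores, 2))  # d
print(key_for_nth_highest(scores, 5))  # None
```

If the interviewer instead wants ties to occupy separate ranks, sorting the items by (-value, key) and indexing position n - 1 would be the natural variant.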

Given a list of ints, balance the list so that each int appears the same number of times. Return a dictionary where the key is the int and the value is the number of copies needed to balance the list.
[1, 1, 2] => {2: 1}
[1, 1, 1, 5, 3, 2, 2] => {5: 2, 3: 2, 2: 1}

Data Engineer

Interviewed at Meta

3.5
Aug 17, 2021
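
A minimal sketch of the list-balancing question, assuming that "balanced" means every int is brought up to the count of the most frequent int (which is what the two examples in the question imply). The function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def balance_counts(nums: List[int]) -> Dict[int, int]:
    """Map each under-represented int to the number of copies it is missing."""
    counts = Counter(nums)
    if not counts:
        return {}
    # Every int must reach the frequency of the most common one.
    target = max(counts.values())
    return {num: target - count for num, count in counts.items() if count < target}


print(balance_counts([1, 1, 2]))              # {2: 1}
print(balance_counts([1, 1, 1, 5, 3, 2, 2]))  # {5: 2, 3: 2, 2: 1}
```

Counter keeps keys in first-seen order, so the result lists the ints in the order they first appear in the input.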

SQL questions on a promotions and sales schema:
What percentage of products are both non fat and trans fat?
Find the top 5 products by sales that have promotions.
What percentage of sales happened on the first and last day of the promotion?
MySQL was used, and the interviewer asked whether this could be done without a subquery.
Python questions:
[1, None, 1, 2, None] --> [1, 1, 1, 2, 2]. Make sure you handle the input [None], where the element is the None object.
Find "s" in "mississippi".

Data Engineer

Interviewed at Meta

3.5
Jun 29, 2020
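
The Python transformation in this report ([1, None, 1, 2, None] to [1, 1, 1, 2, 2]) looks like a forward fill: each None is replaced by the most recent non-None value. A minimal sketch under that assumption follows. It also assumes that a None with no earlier value stays None, which is one way to handle the [None] input; the function name is hypothetical.

```python
from typing import List, Optional


def forward_fill(values: List[Optional[int]]) -> List[Optional[int]]:
    """Replace each None with the most recent non-None value before it."""
    filled: List[Optional[int]] = []
    last_seen: Optional[int] = None
    for value in values:
        if value is not None:
            last_seen = value
        # Assumption: a leading None has nothing to copy and stays None.
        filled.append(last_seen)
    return filled


print(forward_fill([1, None, 1, 2, None]))  # [1, 1, 1, 2, 2]
print(forward_fill([None]))                 # [None]
```

Comparing with `is not None` rather than truthiness matters here, because a legitimate 0 in the list would otherwise be treated as missing.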

products                                sales
+------------------+---------+          +------------------+---------+
| product_id       | int     |--------->| product_id       | int     |
| product_class_id | int     |    +---->| store_id         | int     |
| brand_name       | varchar |    |  +->| customer_id      | int     |
| product_name     | varchar |    |  |  | promotion_id     | int     |
| price            | int     |    |  |  | store_sales      | decimal |
+------------------+---------+    |  |  | store_cost       | decimal |
                                  |  |  | units_sold       | decimal |
                                  |  |  | transaction_date | date    |
                                  |  |  +------------------+---------+
                                  |  |
stores                            |  |  customers
+-------------------+---------+   |  |  +---------------------+---------+
| store_id          | int     |---+  +--| customer_id         | int     |
| type              | varchar |         | first_name          | varchar |
| name              | varchar |         | last_name           | varchar |
| state             | varchar |         | state               | varchar |
| first_opened_date | datetime|         | birthdate           | date    |
| last_remodel_date | datetime|         | education           | varchar |
| area_sqft         | int     |         | gender              | varchar |
+-------------------+---------+         | date_account_opened | date    |
                                        +---------------------+---------+

Question 1: What brands have an average price above $3 and contain at least 2 different products?

Question 2: To improve sales, the marketing department runs various types of promotions. The marketing manager would like to analyze the effectiveness of these promotion campaigns. In particular, what percent of our sales transactions had a valid promotion applied?

Question 3: We want to run a new promotion for our most successful category of products (we call these categories "product classes"). Can you find out what are the top 3 selling product classes by total sales?

Question 4: We are considering running a promo across brands. We want to target customers who have bought products from two specific brands. Can you find out which customers have bought products from both the "Fort West" and the "Golden" brands?

Data Engineer

Interviewed at Meta

3.5
May 22, 2020
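
The four questions on the products and sales schema can be tried out with a small script. This is a sketch, not a reference answer: it uses Python's built-in sqlite3 module instead of the database used in the interview, creates only the products and sales tables (the four queries do not need stores or customers), and fills them with invented rows. It also assumes that a "valid promotion" means a non-NULL promotion_id, which the report does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INT, product_class_id INT, brand_name TEXT,
                       product_name TEXT, price INT);
CREATE TABLE sales (product_id INT, store_id INT, customer_id INT,
                    promotion_id INT, store_sales REAL, store_cost REAL,
                    units_sold REAL, transaction_date TEXT);
-- Hypothetical rows, invented only to exercise the queries.
INSERT INTO products VALUES
    (1, 10, 'Fort West', 'Cola', 4),
    (2, 10, 'Fort West', 'Lemonade', 6),
    (3, 20, 'Golden', 'Waffles', 2),
    (4, 30, 'Golden', 'Pancakes', 3),
    (5, 30, 'Acme', 'Syrup', 9);
INSERT INTO sales VALUES
    (1, 1, 100, 7,    8.0, 3.0, 2, '2020-01-01'),
    (3, 1, 100, NULL, 2.0, 1.0, 1, '2020-01-02'),
    (2, 2, 200, NULL, 6.0, 2.0, 1, '2020-01-03'),
    (5, 2, 300, 9,    9.0, 4.0, 1, '2020-01-04');
""")

# Question 1: brands with average price above 3 and at least 2 products.
brands = conn.execute("""
    SELECT brand_name FROM products
    GROUP BY brand_name
    HAVING AVG(price) > 3 AND COUNT(DISTINCT product_id) >= 2
    ORDER BY brand_name
""").fetchall()

# Question 2: percent of sales rows with a promotion.
# COUNT(promotion_id) skips NULLs; COUNT(*) counts every row.
promo_pct = conn.execute(
    "SELECT 100.0 * COUNT(promotion_id) / COUNT(*) FROM sales"
).fetchone()[0]

# Question 3: top 3 product classes by total sales.
top_classes = conn.execute("""
    SELECT p.product_class_id, SUM(s.store_sales) AS total_sales
    FROM sales s JOIN products p ON p.product_id = s.product_id
    GROUP BY p.product_class_id
    ORDER BY total_sales DESC
    LIMIT 3
""").fetchall()

# Question 4: customers who bought from both brands, without a subquery.
both_brands = conn.execute("""
    SELECT s.customer_id
    FROM sales s JOIN products p ON p.product_id = s.product_id
    WHERE p.brand_name IN ('Fort West', 'Golden')
    GROUP BY s.customer_id
    HAVING COUNT(DISTINCT p.brand_name) = 2
""").fetchall()

print(brands)       # [('Fort West',)]
print(promo_pct)    # 50.0
print(top_classes)  # [(10, 14.0), (30, 9.0), (20, 2.0)]
print(both_brands)  # [(100,)]
```

The GROUP BY with HAVING COUNT(DISTINCT ...) pattern in Question 4 avoids both a subquery and a self-join. If the real data marks "no promotion" with a sentinel such as 0 rather than NULL, the Question 2 filter would need to change accordingly.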
