Data Engineer Interview Questions

Data Engineer

Interviewed at Meta

3.5★

Jun 8, 2020

#               sales                                  products
#      +------------------+---------+         +---------------------+---------+
#      | product_id       | INTEGER |>--------| product_id          | INTEGER |
#      | store_id         | INTEGER |    +---<| product_class_id    | INTEGER |
#      | customer_id      | INTEGER |    |    | brand_name          | VARCHAR |
# +---<| promotion_id     | INTEGER |    |    | product_name        | VARCHAR |
# |    | store_sales      | DECIMAL |    |    | is_low_fat_flg      | TINYINT |
# |    | store_cost       | DECIMAL |    |    | is_recyclable_flg   | TINYINT |
# |    | units_sold       | DECIMAL |    |    | gross_weight        | DECIMAL |
# |    | transaction_date | DATE    |    |    | net_weight          | DECIMAL |
# |    +------------------+---------+    |    +---------------------+---------+
# |                                      |
# |            promotions                |            product_classes
# |    +------------------+---------+    |    +---------------------+---------+
# +----| promotion_id     | INTEGER |    +----| product_class_id    | INTEGER |
#      | promotion_name   | VARCHAR |         | product_subcategory | VARCHAR |
#      | media_type       | VARCHAR |         | product_category    | VARCHAR |
#      | cost             | DECIMAL |         | product_department  | VARCHAR |
#      | start_date       | DATE    |         | product_family      | VARCHAR |
#      | end_date         | DATE    |         +---------------------+---------+
#      +------------------+---------+
#
# Question 1:
# -- What percent of all products in the grocery chain's catalog
# -- are both low fat and recyclable?
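One way to sanity-check an answer to Question 1 is to run it against a toy copy of the products table. The sketch below uses Python's built-in sqlite3 module; the four sample rows are made up for illustration, and only the columns the question needs are created.

```python
import sqlite3

# Hypothetical sample data: only the columns the question needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    " product_id INTEGER PRIMARY KEY,"
    " is_low_fat_flg TINYINT,"
    " is_recyclable_flg TINYINT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 0), (3, 0, 1), (4, 0, 0)],
)

# Share of catalog products that are both low fat and recyclable.
# Multiplying by 100.0 forces floating-point division.
QUERY = """
SELECT 100.0 * SUM(CASE WHEN is_low_fat_flg = 1 AND is_recyclable_flg = 1
                        THEN 1 ELSE 0 END) / COUNT(*) AS pct_low_fat_recyclable
FROM products
"""
pct = conn.execute(QUERY).fetchone()[0]
print(pct)  # 25.0 for the four sample rows above
```

The CASE expression counts qualifying rows, and COUNT(*) counts the whole catalog, so the ratio is taken over all products rather than over a filtered subset.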


Data Engineer

Interviewed at Amazon

3.5★

Apr 8, 2021

First round (written): three SQL questions, plus one on how to speed up inserts into a very large table.

Second round: find the players with the longest streak; get the details of the employee who has the most members on their team; Python: return the numbers that occur most often in a list.

Third round: behavioral questions and one question on Python lists. Given two lists, find the numbers common to both and return them as follows: [1,2,3,3,1,1,1], [1,1,2,2,3] -> return [1,1,2,3].
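The two Python list questions can both be approached with collections.Counter. This is only a sketch of one possible answer: the function names are made up here, and the common-elements output is sorted on the assumption that the expected [1,1,2,3] is meant to be in ascending order.

```python
from collections import Counter


def most_frequent(nums):
    """Return every value that ties for the highest count, in first-seen order."""
    if not nums:
        return []
    counts = Counter(nums)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


def common_with_counts(a, b):
    """Multiset intersection: each value repeated min(count in a, count in b) times."""
    overlap = Counter(a) & Counter(b)
    return sorted(overlap.elements())


print(most_frequent([1, 2, 2, 3, 3]))                              # [2, 3]
print(common_with_counts([1, 2, 3, 3, 1, 1, 1], [1, 1, 2, 2, 3]))  # [1, 1, 2, 3]
```

In the example from the question, 1 appears four times in the first list and twice in the second, so it is kept twice; 2 and 3 are each limited to one copy by the list where they appear only once.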


Data Engineer

Interviewed at Amazon

3.5★

Jun 8, 2020

How do you count the occurrences of a word in a sentence? [Python]
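A minimal sketch of one answer, assuming the match should be case-insensitive and limited to whole words (the question as reported does not say either way). The function name count_word is made up for this example.

```python
import re


def count_word(sentence, word):
    """Count whole-word, case-insensitive occurrences of `word` in `sentence`."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence, flags=re.IGNORECASE))


print(count_word("The cat sat near the other cat.", "the"))  # 2
print(count_word("The cat sat near the other cat.", "cat"))  # 2
```

The word boundaries keep "other" from counting as a match for "the", which a plain substring count such as str.count would include.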


Data Engineer

Interviewed at Amazon

3.5★

Apr 29, 2016

Real time business problems and solutions


Data Engineer

Interviewed at Meta

3.5★

May 17, 2020

Basic Python questions using loops.


Data Engineer

Interviewed at Meta

3.5★

Nov 16, 2020

Python question: given a two-dimensional list, for example [[2,3],[3,4],[5]], where person 2 is friends with 3, and so on, find how many friends each person has. Note that one person has no friends.

SQL question: find the top 10 colleges/companies that an average social person interacts with, or something along those lines. I split the query in two. I was not able to finish coding, but I explained and wrote both parts; I didn't have time to test them.

There were also data modeling questions about a social network website. I can't give details.
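A possible sketch for the friends question. It assumes each inner list starts with a person followed by that person's friends, and that friendship is mutual; neither is stated explicitly in the question, and the function name friend_counts is made up here.

```python
from collections import defaultdict


def friend_counts(rows):
    """Map each person to their number of distinct friends."""
    friends = defaultdict(set)
    for row in rows:
        if not row:
            continue  # skip empty rows
        person, *others = row
        friends[person]  # register the person even if the row lists no friends
        for other in others:
            friends[person].add(other)
            friends[other].add(person)  # assumption: friendship is mutual
    return {person: len(group) for person, group in sorted(friends.items())}


print(friend_counts([[2, 3], [3, 4], [5]]))  # {2: 1, 3: 2, 4: 1, 5: 0}
```

Storing friends in sets means a pair listed twice is only counted once, and touching friends[person] before the inner loop is what keeps person 5 in the result with a count of zero.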


Data Engineer

Interviewed at Meta

3.5★

Apr 1, 2017

Implement a data pipeline to answer a business question.


Data Engineer

Interviewed at Meta

3.5★

Aug 25, 2020

Python

#1. Return the number of times a given character occurs in a given string.
s1 = 'missisipi'
count = 0
for ch in s1:
    if ch == 's':
        count += 1
print(count)  # 3; s1.count('s') gives the same result

#2. Fill each None with the previous value: [1,None,1,2,None] --> [1,1,1,2,2]
arr = [1, None, 1, 2, None]
new_l = []
last = None  # most recent non-None value; a leading None stays None
for x in arr:
    if x is not None:
        last = x
    new_l.append(last)
print(new_l)  # [1, 1, 1, 2, 2]

#3. Given two sentences, construct an array of the words that appear in one sentence and not the other.
A = "Geeks for Geeks"
B = "Learning from Geeks for Geeks"
# Symmetric difference of the two word sets: words present in exactly one sentence.
unmatched_w = sorted(set(A.split()) ^ set(B.split()))
print(unmatched_w)  # ['Learning', 'from']

#4. Sort a dictionary's items by value, descending.
d = {"a": 4, "c": 3, "b": 12}
print(sorted(d.items(), key=lambda x: x[1], reverse=True))  # [('b', 12), ('a', 4), ('c', 3)]

SQL

#               sales                                  products
#      +------------------+---------+         +---------------------+---------+
#      | product_id       | INTEGER |>--------| product_id          | INTEGER |
#      | store_id         | INTEGER |    +---<| product_class_id    | INTEGER |
#      | customer_id      | INTEGER |    |    | brand_name          | VARCHAR |
# +---<| promotion_id     | INTEGER |    |    | product_name        | VARCHAR |
# |    | store_sales      | DECIMAL |    |    | is_low_fat_flg      | TINYINT |
# |    | store_cost       | DECIMAL |    |    | is_recyclable_flg   |…

1. Find the top 5 products by sales among sales that had a promotion.

select p.product_id, p.product_name, p.brand_name, sum(s.store_sales) as total_sales
from products p
inner join sales s on p.product_id = s.product_id
where s.promotion_id is not null
group by p.product_id, p.product_name, p.brand_name
order by total_sales desc
limit 5

2. Of the sales that had a valid promotion, the VP of marketing wants to know what % of transactions occur on either the very first day or the very last day of a promotion campaign.

select 100.0 * sum(case when s.transaction_date in (p.start_date, p.end_date) then 1 else 0 end) / count(*) as percentage
from sales s
join promotions p on s.promotion_id = p.promotion_id

or, reporting the first and last day separately:

select 100.0 * sum(case when s.transaction_date = p.start_date then 1 else 0 end) / count(*) as first_day_pct,
       100.0 * sum(case when s.transaction_date = p.end_date then 1 else 0 end) / count(*) as last_day_pct
from sales s
join promotions p on s.promotion_id = p.promotion_id

or, in MySQL, where a boolean expression evaluates to 0 or 1:

select avg(transaction_date in (p.start_date, p.end_date)) * 100 as first_last_pct
from sales s
join promotions p using (promotion_id)
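The first-day/last-day percentage from SQL question 2 can be checked on toy data with Python's built-in sqlite3 module. This is a sketch under assumptions: the rows below are invented, only the columns the query touches are created, and a sale counts as having a valid promotion when its promotion_id matches a row in promotions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE promotions (promotion_id INTEGER PRIMARY KEY, start_date DATE, end_date DATE);
CREATE TABLE sales (promotion_id INTEGER, transaction_date DATE);
INSERT INTO promotions VALUES (1, '2020-01-01', '2020-01-10');
INSERT INTO sales VALUES
    (1, '2020-01-01'),     -- first day of the campaign
    (1, '2020-01-05'),     -- mid-campaign
    (1, '2020-01-10'),     -- last day of the campaign
    (1, '2020-01-07'),     -- mid-campaign
    (NULL, '2020-01-01');  -- no promotion, dropped by the inner join
""")

# Percentage of promoted transactions that fall on a campaign's first or last day.
QUERY = """
SELECT 100.0 * SUM(CASE WHEN s.transaction_date IN (p.start_date, p.end_date)
                        THEN 1 ELSE 0 END) / COUNT(*) AS first_last_pct
FROM sales s
JOIN promotions p ON s.promotion_id = p.promotion_id
"""
first_last_pct = conn.execute(QUERY).fetchone()[0]
print(first_last_pct)  # 50.0: two of the four promoted sales hit a boundary day
```

The inner join is what restricts the denominator to sales with a matching promotion, so the unpromoted sale on 2020-01-01 does not affect the result.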


Data Engineer

Interviewed at Enjoy Technology

3★

Jan 11, 2017

I asked an employee (not the manager) on the data team: tell me about your team and company.


Data Engineer

Interviewed at Meta

3.5★

Jan 24, 2016

1) Programming: find the maximum number in a given array of elements (without using the max function).
2) Find the minimum absolute difference between elements of an array.
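Both questions can be sketched in a few lines of Python. The function names are made up for this example, and the second function assumes "minimum absolute difference" means the smallest |a - b| over all pairs of elements.

```python
def find_max(nums):
    """Largest element, found with a single pass instead of max()."""
    if not nums:
        raise ValueError("find_max() needs at least one element")
    largest = nums[0]
    for value in nums[1:]:
        if value > largest:
            largest = value
    return largest


def min_abs_difference(nums):
    """Smallest |a - b| over all pairs; after sorting, the closest pair is adjacent."""
    if len(nums) < 2:
        raise ValueError("need at least two elements")
    ordered = sorted(nums)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


print(find_max([3, 9, -2, 7]))            # 9
print(min_abs_difference([3, 9, -2, 7]))  # 2
```

Sorting first brings the cost of the second question down from comparing every pair to O(n log n), since only neighbouring values in sorted order can be the closest pair.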


Most frequently asked interview questions for a data engineer (M/F/X) and how to answer them

Question 1: Can you describe in detail your level of knowledge of programming languages?

Question 2: Explain in your own words what data engineering involves.

Question 3: Can you describe your experience with Apache Hadoop and data management in a cloud environment?

20,944 data engineer interview questions shared by candidates
