Implement a function(int[][] matrix, int rownum, int colnum) that prints a matrix spiraling out from a given index in a multi-threaded fashion.
Big Data Analyst Interview Questions
1,784 big data analyst interview questions shared by candidates
Serialize and deserialize binary tree
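One common way to approach this question is a pre-order traversal that writes an explicit marker for every missing child, so the tree shape can be rebuilt unambiguously. The sketch below is only an illustration: the `Node` class, the `#` null marker, the comma separator, and the restriction to integer values are all assumptions, not part of the original question.

```python
class Node:
    """Hypothetical binary tree node holding an integer value."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def serialize(root):
    """Pre-order traversal; '#' marks a missing child (assumed format)."""
    parts = []

    def walk(node):
        if node is None:
            parts.append("#")
            return
        parts.append(str(node.val))
        walk(node.left)
        walk(node.right)

    walk(root)
    return ",".join(parts)


def deserialize(data):
    """Rebuild the tree by consuming tokens in the same pre-order."""
    tokens = iter(data.split(","))

    def build():
        tok = next(tokens)
        if tok == "#":
            return None
        node = Node(int(tok))  # assumes integer values
        node.left = build()
        node.right = build()
        return node

    return build()


if __name__ == "__main__":
    tree = Node(1, Node(2), Node(3, Node(4), Node(5)))
    text = serialize(tree)
    print(text)
    print(serialize(deserialize(text)) == text)
```

Both functions are recursive, so a very deep tree could hit Python's recursion limit; an explicit stack would avoid that at the cost of slightly longer code.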
Given a list of logs, find the IP address that is repeated.
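A minimal sketch of one possible answer, assuming each log line starts with the client IP address followed by whitespace (the question does not specify the log format, so this layout and the function name `repeated_ips` are hypothetical):

```python
from collections import Counter


def repeated_ips(log_lines):
    """Return {ip: count} for every IP that appears more than once.

    Assumes the IP address is the first whitespace-separated token
    on each non-empty line.
    """
    counts = Counter(
        line.split()[0] for line in log_lines if line.strip()
    )
    return {ip: n for ip, n in counts.items() if n > 1}


if __name__ == "__main__":
    logs = [
        "10.0.0.1 GET /index.html",
        "10.0.0.2 GET /about.html",
        "10.0.0.1 POST /login",
    ]
    print(repeated_ips(logs))
```

Counting with a hash map takes a single pass over the logs. For logs too large for one machine, the same count-by-key idea maps onto a distributed group-by.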
1. SQL:

**d_customers**
+-------------+-----------------------+---------------------+
| customer_id | membership_start_date | membership_end_date |
+-------------+-----------------------+---------------------+
| 114         | 2015-01-01            | 2015-02-15          |
| 116         | 2015-02-01            | 2015-03-15          |
| 120         | 2015-02-15            | 2015-04-01          |
| 221         | 2015-03-15            | 2015-10-01          |
| 120         | 2015-05-15            | 2015-07-01          |
+-------------+-----------------------+---------------------+

**d_shipments**
+-------------+------------+-----------------------+----------+
| shipment_id | ship_date  | receiving_customer_id | quantity |
+-------------+------------+-----------------------+----------+
| 1           | 2015-02-13 | 114                   | 2        |
| 2           | 2015-03-01 | 116                   | 4        |
| 2           | 2015-03-01 | 116                   | 1        |
| 3           | 2015-06-01 | 116                   | 1        |
| 4           | 2015-03-01 | 120                   | 6        |
| 5           | 2015-10-01 | 120                   | 3        |
| 6           | 2015-03-01 | 321                   | 10       |
+-------------+------------+-----------------------+----------+

Populate **a_shipments**
+------------+-------------+-----------+----------+
| ship_date  | customer_id | is_member | quantity |
+------------+-------------+-----------+----------+

The column [is_member]: if [ship_date] is between [membership_start_date] and [membership_end_date] then 'Y', else 'N'.

Sample of output:
| 2015-03-01 | 116         | Y         | 5        |
| 2015-06-01 | 116         | N         | 1        |

2. Coding task: check whether a string is a palindrome. I was asked to code both an iterative and a recursive solution.

3. Big Data questions:
3.1. Which file formats in Hadoop do I know? What is the difference between the Avro and Parquet formats?
3.2. How is compression used in the Avro and Parquet formats?
3.3. What are the most difficult big data performance challenges you have faced and resolved?
3.4. Spark optimization. Spark cost-based optimizer.
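For the SQL task in item 1, one possible query can be tried against an in-memory SQLite database loaded with the sample rows. This is a sketch under assumptions, not the expected interview answer: dates are stored as ISO-format text (so `BETWEEN` compares them correctly as strings), quantities are summed per ship date and customer as the sample output suggests, and `EXISTS` is used instead of a plain join so that customer 120, who has two membership periods, does not produce duplicated rows. The function name `populate_a_shipments` is hypothetical.

```python
import sqlite3

CUSTOMERS = [
    (114, "2015-01-01", "2015-02-15"),
    (116, "2015-02-01", "2015-03-15"),
    (120, "2015-02-15", "2015-04-01"),
    (221, "2015-03-15", "2015-10-01"),
    (120, "2015-05-15", "2015-07-01"),
]

SHIPMENTS = [
    (1, "2015-02-13", 114, 2),
    (2, "2015-03-01", 116, 4),
    (2, "2015-03-01", 116, 1),
    (3, "2015-06-01", 116, 1),
    (4, "2015-03-01", 120, 6),
    (5, "2015-10-01", 120, 3),
    (6, "2015-03-01", 321, 10),
]

# Flag each shipment first, then aggregate per date and customer.
QUERY = """
SELECT ship_date, customer_id, is_member, SUM(quantity) AS quantity
FROM (
    SELECT s.ship_date,
           s.receiving_customer_id AS customer_id,
           CASE WHEN EXISTS (
               SELECT 1
               FROM d_customers c
               WHERE c.customer_id = s.receiving_customer_id
                 AND s.ship_date BETWEEN c.membership_start_date
                                     AND c.membership_end_date
           ) THEN 'Y' ELSE 'N' END AS is_member,
           s.quantity
    FROM d_shipments s
) AS flagged
GROUP BY ship_date, customer_id, is_member
ORDER BY ship_date, customer_id
"""


def populate_a_shipments():
    """Load the sample data and return the rows for a_shipments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE d_customers (
            customer_id INTEGER,
            membership_start_date TEXT,
            membership_end_date TEXT
        );
        CREATE TABLE d_shipments (
            shipment_id INTEGER,
            ship_date TEXT,
            receiving_customer_id INTEGER,
            quantity INTEGER
        );
        """
    )
    conn.executemany("INSERT INTO d_customers VALUES (?, ?, ?)", CUSTOMERS)
    conn.executemany("INSERT INTO d_shipments VALUES (?, ?, ?, ?)", SHIPMENTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in populate_a_shipments():
        print(" | ".join(str(col) for col in row))
```

Shipments for customers that never appear in d_customers (such as 321) are flagged 'N' because the `EXISTS` subquery finds no matching membership row.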
A very broad range of questions covering data engineering, data science, distributed computing, architecture... and specialties like record linkage / deduplication, plus multiple code exercises.
SQL question: how to retrieve data using a join condition together with window functions.
What is a ROC curve?
What is the difference between a private IP and a public IP?
What exactly happens during the shuffle phase of a MapReduce job?
Behavioural questions