Cheran Ilango Blog: Some useful big-data ecosystem interview questions

Friday, 8 February 2019

Some useful big-data ecosystem interview questions

Here are some useful big-data ecosystem questions, shared with the community to help with interview preparation.

HDFS

What are the main components and node types that make up HDFS?
Briefly explain the purpose and function of the FsImage and the EditLog.
What is the difference between a Secondary NameNode and a Standby NameNode?
What are the options for replicating an HDFS cluster, for example between a production cluster and a DR cluster?
How are corrupted HDFS files detected and subsequently recovered?
Describe the HDFS tiered storage concepts.

Kafka

Describe the functions of Kafka MirrorMaker.
How do you integrate Kafka MirrorMaker into an existing architecture?

Spark

How does Spark define jobs, stages, and tasks?
Describe each of the execution modes that Spark supports.

YARN

What is YARN used for? Describe each of its core components.

Cassandra

Explain how Cassandra stores data.
What is the difference between partitioning and clustering?

Thanks to the interview panel for sharing a copy of these questions.
