Friday 8 February 2019

Some useful big-data ecosystem interview questions

Here are some good questions on the big-data ecosystem, shared with the community to help with interview preparation.

HDFS
  • What are the main components and node types that make up HDFS?
  • Briefly explain the purpose and function of the FsImage and the EditLog.
  • What is the difference between a Secondary NameNode and a Standby NameNode?
  • What are the options for replicating an HDFS cluster, for example between a production cluster and a DR cluster?
  • How are corrupted HDFS files detected and subsequently recovered?
  • Describe the HDFS tiered storage concepts.
Kafka
  • Describe the functions of Kafka MirrorMaker.
  • How do you integrate Kafka MirrorMaker into an existing architecture?
Spark
  • How does Spark define jobs, stages, and tasks?
  • Describe each of the execution modes that Spark supports.
YARN
  • What is YARN used for? Describe each of its core components.
Cassandra
  • Explain how Cassandra stores data.
  • What is the difference between partitioning and clustering?
Thanks to the interview panel for sharing a copy of these questions.
