Hadoop Interview Questions and Answers for Beginners

What is the difference between a NAS and HDFS?

In NAS, data is stored on dedicated hardware. In HDFS, data blocks are distributed across the local drives of all machines in a cluster.

In other words, NAS is not well suited to MapReduce because the data is stored separately from the computation. HDFS is designed to work with MapReduce, where the computation is moved to the data.

What is the Hadoop MapReduce API contract for a key and value class?

The Hadoop MapReduce API contract for key and value classes has two parts:

  1. The value class must implement the org.apache.hadoop.io.Writable interface.
  2. The key class must implement the org.apache.hadoop.io.WritableComparable interface.
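
The two requirements above come down to this: a value must be able to serialize itself and read itself back, and a key must also be comparable so that keys can be sorted. The real interfaces are Java; the sketch below only mimics that contract in Python for illustration. The class names IntValue and IntKey, the method name read_fields, and the helper round_trip are hypothetical and are not part of Hadoop.

```python
import io
import struct
from functools import total_ordering


class IntValue:
    """Value-side contract (illustrative): serialize and deserialize itself."""

    def __init__(self, value=0):
        self.value = value

    def write(self, out):
        # Encode as a big-endian 4-byte signed integer.
        out.write(struct.pack(">i", self.value))

    def read_fields(self, inp):
        # Restore the state from the same 4-byte encoding.
        (self.value,) = struct.unpack(">i", inp.read(4))


@total_ordering
class IntKey(IntValue):
    """Key-side contract (illustrative): serializable AND comparable."""

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value


def round_trip(obj, cls):
    """Hypothetical helper: write obj to bytes, then read it into a new cls."""
    buf = io.BytesIO()
    obj.write(buf)
    buf.seek(0)
    copy = cls()
    copy.read_fields(buf)
    return copy
```

Sorting a list of IntKey objects works because the key defines an ordering; an IntValue only needs to survive the round trip through bytes.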

How does the client interact with the Hadoop Distributed File System?

The client interacts with the Hadoop Distributed File System through the HDFS API. Client applications talk to the NameNode whenever they need to locate a file, or when they need to add, copy, move, or delete a file on HDFS.

What is the role of a RecordReader in Hadoop?

An InputSplit defines a slice of work but does not describe how to access it. The RecordReader class is responsible for loading the data from its source and converting it into key-value pairs that the Mapper can read. The RecordReader instance is supplied by the InputFormat.
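
To make the split-to-records step concrete, here is a minimal Python simulation of a line-oriented reader. It is not Hadoop's implementation; the function name read_records and its parameters are hypothetical. It assumes a split is a byte range (start, length) and that each record is a (byte offset, line) pair, which is the shape of the pairs produced for plain text input.

```python
import io


def read_records(stream, start, length):
    """Yield (byte_offset, line) pairs for the lines that START in the split."""
    end = start + length
    if start > 0:
        # Step back one byte and discard through the next newline, so a line
        # that began in the previous split is left to that split's reader.
        stream.seek(start - 1)
        stream.readline()
    else:
        stream.seek(0)
    while True:
        offset = stream.tell()
        if offset >= end:
            break  # the next line belongs to the following split
        line = stream.readline()
        if not line:
            break  # end of data
        yield offset, line.rstrip(b"\n").decode("utf-8")


data = io.BytesIO(b"alpha\nbeta\ngamma\n")
first_split = list(read_records(data, 0, 8))
second_split = list(read_records(data, 8, 9))
```

The split boundary at byte 8 falls in the middle of "beta". The first reader finishes that line, and the second reader skips the partial line, so every line is read exactly once.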

What is the JobTracker in Hadoop?

The JobTracker is a service within Hadoop that runs MapReduce jobs on the cluster.

Is it necessary to write jobs for Hadoop in Java?

No, there are several ways to work with non-Java code. Hadoop Streaming allows any shell command to be used as a map or reduce function.
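
As an illustration of the Streaming approach, here is a minimal word-count sketch in Python. It assumes the usual Streaming convention that the mapper and reducer read lines from standard input and write tab-separated key and value lines to standard output, and that the reducer receives its input sorted by key. The function names and the "map" / "reduce" command-line argument are hypothetical choices for this example.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'word<TAB>1' for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Run as "script.py map" or "script.py reduce" (hypothetical file name).
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    sys.stdout.writelines(out + "\n" for out in step(sys.stdin))
```

A script like this would be passed to the Hadoop Streaming jar through its -mapper and -reducer options. The same logic can be checked locally by piping a file through the map step, sort, and the reduce step.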

How do you debug Hadoop code?

There are various methods to debug Hadoop code, but the most popular are:

  • By using the web interface provided by the Hadoop framework.
  • By using Counters.
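
As a sketch of the Counters approach, the Python snippet below counts malformed input records. It assumes the job runs under Hadoop Streaming, where a task can increment a counter by writing a line of the form reporter:counter:<group>,<counter>,<amount> to standard error. The group name "Debug", the counter name "MalformedRecords", and the "user,amount" input format are all hypothetical.

```python
import sys


def counter_line(group, name, amount=1):
    """Build the line that Hadoop Streaming interprets as a counter update."""
    return f"reporter:counter:{group},{name},{amount}\n"


def parse_records(lines, err=sys.stderr):
    """Yield (user, amount) from 'user,amount' lines; count malformed ones."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 2 or not parts[1].isdigit():
            # Skip the bad record, but leave a trace in the job's counters.
            err.write(counter_line("Debug", "MalformedRecords"))
            continue
        yield parts[0], int(parts[1])
```

The counter totals then show how many records each job skipped, without having to search the task logs.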

What is a combiner in Hadoop?

The combiner is a "mini-reduce" step that operates only on data generated by a Mapper. When the Mapper emits data, the combiner receives it as input and sends its output to the reducer.
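
The effect of a combiner can be seen in a small Python simulation of a word count. This is not Hadoop code; the function names are hypothetical, each input string stands in for the data handled by one Mapper, and the list named shuffled stands in for the pairs sent on to the reducer.

```python
from collections import Counter, defaultdict


def map_words(line):
    """Mapper: emit (word, 1) for each word."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sum the counts from ONE mapper before they leave it."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def reduce_counts(all_pairs):
    """Reducer: sum the counts for each word across all mappers."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)


def run(splits, use_combiner):
    """Return (number of pairs sent to the reducer, final word counts)."""
    shuffled = []
    for line in splits:  # one simulated mapper per input split
        pairs = map_words(line)
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return len(shuffled), reduce_counts(shuffled)
```

With the splits "a a b" and "b b b a", the reducer receives 7 pairs without the combiner and 4 pairs with it, and the final counts are the same in both cases. This only works because addition gives the same total however the values are grouped. Hadoop may run a combiner zero, one, or several times, so a job must not depend on it for correctness.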

Is it necessary to know Java to learn Hadoop?

A background in any programming language, such as C, C++, Java, Python, or PHP, is really helpful. If you have no knowledge of Java, however, it is necessary to learn Java and also to gain a basic knowledge of Structured Query Language (SQL).