Hadoop Interview Questions and Answers for Beginners

What is the main difference between an InputSplit and a Block?

A block is a physical division of data and does not take the logical boundaries of records into account, so a record can be cut in two at a block boundary. An InputSplit is a logical division of the data that respects record boundaries.
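The difference can be sketched in a few lines of Python. This is a simplified illustration, not Hadoop code: the function name records_for_split, the sample data, and the 12-byte "block size" are all hypothetical, and the skip-the-first-partial-line rule is an assumption modelled on how line-oriented record readers are usually described.

```python
def records_for_split(data, start, length):
    """Return the newline-delimited records read for the split [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Always discard the first (possibly partial) line of a non-first
        # split; the previous split is responsible for reading it.
        newline = data.find(b"\n", start)
        if newline == -1:
            return []
        pos = newline + 1
    records = []
    # Keep reading while a record starts at or before the end of the split.
    # The last record may run past the end into the next block.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            newline = len(data)  # final record without a trailing newline
        records.append(data[pos:newline])
        pos = newline + 1
    return records


data = b"alpha,1\nbravo,2\ncharlie,3\ndelta,4\n"
BLOCK_SIZE = 12  # tiny, hypothetical block size so that records get cut

# Physical view: fixed-size byte ranges that ignore record boundaries.
raw_blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

# Logical view: each split yields only whole records.
split_records = [records_for_split(data, i, BLOCK_SIZE)
                 for i in range(0, len(data), BLOCK_SIZE)]

print(raw_blocks)
print(split_records)
```

The raw blocks contain broken pieces such as "brav" and "o,2", while every record appears exactly once and intact in the split view.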

Can you change the number of mappers for a job in Hadoop?

Not directly. The number of mappers is determined by the number of input splits, so it can only be influenced indirectly, for example by changing the split size.
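As a rough sketch of that relationship, the Python below estimates the mapper count from the split size. It assumes the split size is chosen as max(minSize, min(maxSize, blockSize)), which is how FileInputFormat is commonly described, and it ignores details such as multiple files and the tolerance on the last split. The function names are hypothetical; in a real job the minimum and maximum would typically come from settings such as mapreduce.input.fileinputformat.split.minsize and mapreduce.input.fileinputformat.split.maxsize.

```python
MB = 1024 * 1024
NO_LIMIT = 2**63 - 1  # stand-in for "no maximum split size configured"


def compute_split_size(block_size, min_size=1, max_size=NO_LIMIT):
    """Assumed rule: the block size, clamped between the min and max split size."""
    return max(min_size, min(max_size, block_size))


def estimated_mappers(file_size, block_size, min_size=1, max_size=NO_LIMIT):
    """One mapper per input split, for a single splittable file."""
    split_size = compute_split_size(block_size, min_size, max_size)
    return -(-file_size // split_size)  # integer ceiling division


file_size = 1024 * MB   # hypothetical 1 GB input file
block_size = 128 * MB   # hypothetical block size

print(estimated_mappers(file_size, block_size))                      # default
print(estimated_mappers(file_size, block_size, max_size=64 * MB))    # more mappers
print(estimated_mappers(file_size, block_size, min_size=256 * MB))   # fewer mappers
```

Lowering the maximum split size produces more, smaller splits and therefore more mappers; raising the minimum does the opposite.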

How do you do a file system check in HDFS?

The fsck command (hdfs fsck) is used to do a file system check in HDFS.

How do you override the replication factor in Hadoop?

There are a few ways to do this:

hadoop fs -setrep -R -w 5 hadoop-test
hadoop fs -Ddfs.replication=5 -cp hadoop-meraj/meraj.csv hadoop-meraj/meraj_with_rep5.csv


What is a sequence file in Hadoop?

A sequence file is a binary file format that stores key-value pairs. Sequence files support splitting even when the data within the file is compressed, which is not possible with a regular compressed file.
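The idea behind splittable compression can be illustrated with a toy Python example. This is not the real SequenceFile on-disk format: the SYNC marker value, the 4-byte length prefix, and the function names are all hypothetical. It only shows why independently compressed blocks separated by sync markers can be read from an arbitrary offset, while a single compressed stream cannot.

```python
import zlib

SYNC = b"--toy-sync-marker--"  # hypothetical marker, not the real format


def write_blocks(records, per_block=2):
    """Compress records in independent blocks, each preceded by SYNC and a length."""
    out = bytearray()
    for i in range(0, len(records), per_block):
        payload = zlib.compress(b"\n".join(records[i:i + per_block]))
        out += SYNC + len(payload).to_bytes(4, "big") + payload
    return bytes(out)


def read_from(data, offset):
    """Start at any byte offset, seek to the next sync marker, decode from there."""
    records = []
    pos = data.find(SYNC, offset)
    while pos != -1:
        pos += len(SYNC)
        size = int.from_bytes(data[pos:pos + 4], "big")
        pos += 4
        records += zlib.decompress(data[pos:pos + size]).split(b"\n")
        pos = data.find(SYNC, pos + size)
    return records


def can_decompress_from(data, offset):
    """Try to decompress a plain compressed stream starting mid-file."""
    try:
        zlib.decompress(data[offset:])
        return True
    except zlib.error:
        return False


records = [b"k1\tv1", b"k2\tv2", b"k3\tv3", b"k4\tv4", b"k5\tv5"]
blocked = write_blocks(records)
whole = zlib.compress(b"\n".join(records))

print(read_from(blocked, 0))                         # all records
print(read_from(blocked, 1))                         # resumes at the next sync marker
print(can_decompress_from(whole, len(whole) // 2))   # a mid-stream start fails
```

A reader that starts in the middle of the blocked file loses nothing it is responsible for: it simply resumes at the next marker, which is what lets several mappers work on different parts of the same compressed file.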

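How many input splits will the Hadoop framework create for three files of 64 KB, 65 MB, and 127 MB, assuming a 64 MB block size?

Hadoop will make five splits, one per block:

- 1 split for the 64 KB file

- 2 splits for the 65 MB file

- 2 splits for the 127 MB file

This is the simple per-block count. In practice, FileInputFormat allows the last split of a file to be up to 10% larger than the split size, so the 65 MB file may end up as a single split.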
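The arithmetic can be checked with a short Python sketch. It assumes a 64 MB block size and one file of each size (64 KB, 65 MB, 127 MB); the function names are hypothetical. The second function models the 10% tolerance that FileInputFormat is generally described as applying to the last split, labelled here as an assumption rather than a statement of exact library behaviour.

```python
KB = 1024
MB = 1024 * KB


def splits_per_block(file_size, block_size):
    """Simple rule: one split per block the file occupies."""
    return max(1, -(-file_size // block_size))  # integer ceiling division


def splits_with_slop(file_size, split_size, slop=1.1):
    """Assumed rule: the last split may be up to `slop` times the split size."""
    count = 0
    remaining = file_size
    while remaining / split_size > slop:
        count += 1
        remaining -= split_size
    if remaining > 0:
        count += 1
    return count


block_size = 64 * MB
files = [64 * KB, 65 * MB, 127 * MB]

simple = [splits_per_block(f, block_size) for f in files]
with_slop = [splits_with_slop(f, block_size) for f in files]

print(simple, sum(simple))        # [1, 2, 2] 5
print(with_slop, sum(with_slop))  # [1, 1, 2] 4
```

The per-block count gives the five splits quoted in the answer; under the tolerance assumption the 65 MB file is small enough over the limit to stay in one split.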

What is WebDAV in Hadoop?

WebDAV is a set of extensions to HTTP that supports editing and updating files. On most operating systems, WebDAV shares can be mounted as filesystems, so it is possible to access HDFS as a standard filesystem by exposing HDFS over WebDAV.

What is Sqoop in Hadoop?

Sqoop is a tool used to transfer data between relational database management systems (RDBMS) and Hadoop HDFS. Using Sqoop, data can be imported from an RDBMS such as MySQL or Oracle into HDFS, and exported from HDFS files back to an RDBMS.

What do you mean by SequenceFileInputFormat?

SequenceFileInputFormat is an input format used for reading sequence files. A sequence file is a compressed binary file format that is optimized for passing data from the output of one MapReduce job to the input of another MapReduce job.

What does conf.setMapperClass do?

conf.setMapperClass sets the mapper class and everything related to the map job, such as reading data and generating key-value pairs out of the mapper.