Apache Hadoop is an open-source, scalable, and fault-tolerant framework written in Java. It efficiently processes large volumes of data on a cluster of commodity hardware. Hadoop is not just a storage system; it is a platform for both large-scale data storage and data processing. This Big Data Hadoop tutorial provides a thorough introduction to Hadoop.
What is Hadoop?
Hadoop is an open-source tool from the Apache Software Foundation (ASF). Being an open-source project means it is freely available, and you can even modify its source code to suit your requirements. If certain functionality does not fulfill your needs, you can change it accordingly. Much of Hadoop's code has been contributed by companies such as Yahoo, IBM, Facebook, and Cloudera.
It provides an efficient framework for running jobs across the multiple nodes of a cluster, where a cluster is a group of systems connected via a LAN. Apache Hadoop processes data in parallel, since it works on multiple machines simultaneously.
Hadoop drew its inspiration from papers Google published about its technologies: the MapReduce programming model and the Google File System (GFS). Hadoop was originally written by Doug Cutting and his team for the Nutch search engine project, and due to its huge popularity it soon became a top-level Apache project.
Apache Hadoop is an open-source framework written in Java. Java is Hadoop's primary programming language, but this does not mean you can code only in Java. You can write Hadoop jobs in C, C++, Perl, Python, Ruby, and other languages as well, for example through Hadoop Streaming. Still, Java is usually the better choice, as it gives you lower-level control over the framework, as illustrated by the WordCount sketch below.
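To make this concrete, here is a minimal sketch of the classic WordCount job in Java, modeled on the standard Hadoop MapReduce example. It assumes the org.apache.hadoop.mapreduce API and that the input and output paths are passed as command-line arguments.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every word in its input split.
  // Each node in the cluster runs mappers over its local data in parallel.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each node
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Once compiled into a JAR, such a job is typically submitted with a command like `hadoop jar wordcount.jar WordCount /input /output`, where the paths here are only placeholders for directories in HDFS.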
Big Data and Hadoop efficiently process large volumes of data on a cluster of commodity hardware. Hadoop is built for processing huge volumes of data. Commodity hardware means inexpensive, low-end machines, which makes a Hadoop cluster very economical to build and run.