Performance Hadoop

Apache Hadoop Based on an open-source Java software project. Basically it is a framework to be used by applications running large clustered hardware (servers). It is designed to expand a single server machine have a very high degree of fault tolerance. Instead of high-end hardware, to the reliability of clusters derived from the software is able to recognize and manage their own failures.

Credit goes to create a Hadoop Doug Cutting and Michael J. Cafarella. Doug is a Yahoo employee found apt renamed after his son play elephant "Hadoop". Originally developed to support the Nutch search engine marketer to sort large amounts of indexes.

Hadoop is a layman's term is a way that can handle large amounts of data using a large amount of application servers. The first Google Map-Reduce created to work with large data indexing and create Yahoo! Hadoop MapReduce implementation Function for their own use.

MapReduce : The Investigation Task Framework understands and shares the work of the nodes in a cluster. Use of smaller departments work, and all work can be assigned to different nodes in a cluster. It is designed so that any failure can automatically take care of the frame.

HDFS – Hadoop Distributed File System. This is a large-scale file system that spans all nodes in a Hadoop cluster for data storage. It connects to the local node number of file systems to make them into one big file system. HDFS assumes nodes fail, so to reach the reliability of replicating data to multiple nodes.

Large amounts of data to the talk of the modern IT world, Hadoop shows the way to utilize the large data. It makes the analysis much easier in terms of terabytes of data. Hadoop framework already boast some large users, such as IBM, Google, Yahoo!, Facebook, Amazon, Foursquare, etc. EBay large applications. In fact, Facebook claims to be the largest Hadoop cluster 21PB. Hadoop private commercial data analysis, web crawling, text processing and image processing.

The majority are not in use, and most of the data of enterprises in the world do not even try to use this information to their advantage. Imagine if you could keep all the data generated by the business and if there was a way to analyze the data. Hadoop brings this power to an enterprise.

Source by Aditi Tiwari

Leave a Reply

Your email address will not be published. Required fields are marked *