What is Hadoop ?
Hadoop is a framework written in
Java for running applications on large clusters of commodity hardware
and incorporates features similar to those of the Google File System and of MapReduce. HDFS
is a highly fault-tolerant distributed file system and like Hadoop
designed to be deployed on low-cost hardware. It provides high
throughput access to application data and is suitable for applications
that have large data sets.
Hadoop - Overview
• Hadoop includes:
– Distributed File System - distributes data
– Map/Reduce - distributes application
• Open source from Apache
• Written in Java
• Runs on
– Linux, Mac OS/X, Windows, and Solaris
– Commodity hardware
– Distributed File System - distributes data
– Map/Reduce - distributes application
• Open source from Apache
• Written in Java
• Runs on
– Linux, Mac OS/X, Windows, and Solaris
– Commodity hardware