The Hadoop Development Program is a one-stop course that introduces you to the domain of Hadoop development and gives you the technical know-how to work in it. At the end of the course you will earn a Hadoop Developer credential and be able to handle and analyze terabyte-scale data using MapReduce.
Who is this course for, and who is it not for?
For: Professionals with basic knowledge of software development, programming languages, and databases will find this course helpful; that foundation is enough to succeed in it.
Not For: Absolute beginners in software development will find the course difficult to follow.
What Do You Get?
Who should attend?
Architects and developers who wish to write, build, and maintain Apache Hadoop jobs. This tutorial is appropriate for developers, managers, architects, or anyone who wants or needs to learn more about the complex and rapidly changing world of big data solutions.
This course is for professionals who have either some Java knowledge or some Business Intelligence (ETL or data warehousing) knowledge.
Course Prerequisites: Participants should have a basic understanding of Java and Linux.
Why is Hadoop popular?
Hadoop's popularity is partly due to the fact that it is used by some of the world's largest Internet businesses to analyze unstructured data. Hadoop enables distributed applications to handle data volumes on the order of petabytes.
Where does Hadoop find applicability in business?
Hadoop, as a scalable system for parallel data processing, is useful for analyzing large data sets. Examples include search algorithms, market risk analysis, data mining on online retail data, and analytics on user behaviour data.
Hadoop's scalability makes it attractive to businesses because the data they handle grows exponentially. Another core strength of Hadoop is that it can handle structured as well as unstructured data, drawn from a wide variety of sources.
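The parallel processing model behind these use cases is MapReduce: a map phase emits key-value pairs, a shuffle groups values by key, and a reduce phase aggregates each group. The following is a minimal sketch of that model in plain Python, with no Hadoop dependency; the retail-style records are invented purely for illustration:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (key, value) pairs -- here, (product, sale amount)."""
    for product, amount in records:
        yield product, amount

def shuffle(pairs):
    """Shuffle: group all values by key, as Hadoop does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values -- here, total sales per product."""
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical retail records: (product, sale amount)
sales = [("book", 12.0), ("pen", 1.5), ("book", 8.0), ("pen", 2.5)]
totals = reduce_phase(shuffle(map_phase(sales)))
print(totals)  # {'book': 20.0, 'pen': 4.0}
```

In a real Hadoop job the same map and reduce logic is written as Java Mapper and Reducer classes, and the framework handles the shuffle, distribution, and fault tolerance across the cluster.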
What are the Course Prerequisites?
Participants should have a basic understanding of Java and Linux. Prior knowledge of Hadoop is not required.
I have a lot of data, but how do I know if it's "Big Data?"
Every company that has data likely has "Big Data," and it grows continuously. Big Data is any type of data, structured or unstructured: log files, customer service information, retail data, free text, database records, and so on. All of this data can now be analyzed in aggregate, across types and formats, to help make more informed business decisions and drive new solutions.
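Analyzing unstructured data usually means first imposing structure on it, then aggregating. A small sketch of that idea, using hypothetical web-server log lines and a regular expression (the log format here is invented for illustration):

```python
import re
from collections import Counter

# Hypothetical web-server log lines (unstructured text)
logs = [
    "10.0.0.1 - GET /index.html 200",
    "10.0.0.2 - GET /missing 404",
    "10.0.0.1 - POST /cart 200",
]

# Impose structure with a regex (ip, method, path, status),
# then aggregate -- the same pattern Hadoop applies at scale.
pattern = re.compile(r"(\S+) - (\S+) (\S+) (\d{3})")
status_counts = Counter(
    m.group(4) for line in logs if (m := pattern.match(line))
)
print(status_counts)  # Counter({'200': 2, '404': 1})
```

Hadoop applies this parse-then-aggregate pattern across terabytes of such files in parallel, rather than line by line on one machine.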
How much Hands-on is involved?
Around 50% of the training time is dedicated to hands-on work, comprising 11 lab-based exercises.
What platforms and Java versions does Hadoop run on?
Hadoop is supported on GNU/Linux as a development and production platform, and can also be run on Windows for development. It requires a Java runtime; the minimum version depends on the Hadoop release (for example, Hadoop 2.7 and later require Java 7 or newer, and Hadoop 3.x requires Java 8 or newer).
What kind of hardware scales best for Hadoop?
The short answer is dual-processor/dual-core machines with 4-8 GB of ECC RAM, depending on workflow needs. For the best cost-effectiveness, machines should be moderately high-end commodity hardware: typically 1/2 to 2/3 the cost of normal production application servers, but not desktop-class machines. This tends to be $2-5K per machine. For a more detailed discussion, see the Machine Scaling page.