So you have been hearing the talk about Big Data, and you have decided to dig a little deeper and find out what it means to you. Well then, this is the session for you. In this session, we will first attempt to define what Big Data is. From there, we will explore not only the need for Big Data but also the promise it holds for the technology world. We will finish by talking about the technology used to harness this data. We will talk about some of the parallel technologies like Cassandra, MongoDB, and Bigtable, but will focus more deeply on Hadoop and its companion technologies like Hive and Pig. There will be some demos, but this talk will serve as the introduction to the more in-depth talks I will be doing on Hadoop.
In this discussion we will be talking about using Hadoop for Big Data. We will talk about how all the pieces fit together (HDFS, MapReduce, Pig, Hive, and ZooKeeper). We will talk about the different distributions (Cloudera, Hortonworks) and finish by discussing the Hortonworks implementation on the Microsoft cloud (HDInsight).
One of the key components of HDInsight is Mahout, a scalable machine learning library that provides a number of algorithms relying on the Hadoop platform. In this session we will talk about using Mahout in HDInsight, the R programming language, and many of the use cases for machine learning, showing a few examples along the way.