When
10:45 AM Sunday
Where
SC-127
Silicon Valley Code Camp: October 3rd and 4th, 2015
Session

A Survey of Machine Learning Techniques Using Spark.ml 1.5.0

This session will provide an overview of several core algorithms and common machine learning operations using the preferred Pipelines architecture of the later releases of Spark ml/mllib. The session will focus on the *Scala* APIs.

About This Session

Recent releases of the Spark machine learning libraries have shifted focus from the individual-algorithm approach of the spark.mllib package to the data-driven pipelines approach of spark.ml. We will look at how to structure the ML process of data loading, modeling, prediction, and results analysis and distribution using the latest spark.ml APIs.

Note: this year's session will focus only on the Scala APIs.
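
As a rough illustration of the pipelines approach described above, the following is a minimal sketch of data loading, modeling, and prediction in one spark.ml Pipeline. It assumes Spark 1.5.x with spark-mllib on the classpath and a local master. The object name PipelineSketch, the toy rows, and the parameter values are made up for illustration and are not taken from the session material.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SQLContext

// Hypothetical example object; written against the Spark 1.5.x API.
object PipelineSketch {

  // Assemble the stages: split text into words, hash the words into a
  // feature vector, then fit a logistic regression on the default
  // "features" and "label" columns.
  def buildPipeline(): Pipeline = {
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF()
      .setNumFeatures(1000)
      .setInputCol(tokenizer.getOutputCol)
      .setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.01)
    new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("PipelineSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Data loading: a tiny in-memory training set (illustrative values only).
    val training = sqlContext.createDataFrame(Seq(
      (0L, "a b c d e spark", 1.0),
      (1L, "b d", 0.0),
      (2L, "spark f g h", 1.0),
      (3L, "hadoop mapreduce", 0.0)
    )).toDF("id", "text", "label")

    // Modeling: fitting the pipeline runs every stage in order and
    // returns a PipelineModel.
    val model = buildPipeline().fit(training)

    // Predictions: the fitted model applies the same stages to unseen rows.
    val test = sqlContext.createDataFrame(Seq(
      (4L, "spark i j k"),
      (5L, "l m n")
    )).toDF("id", "text")

    model.transform(test).select("id", "text", "probability", "prediction").show()

    sc.stop()
  }
}
```

Keeping the stage assembly in buildPipeline, separate from the SparkContext setup, is only a design choice for this sketch. It lets the same pipeline definition be fitted against different DataFrames.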

We will touch on one or more of the algorithms in the following areas:

  • Dimensionality Reduction / Feature extraction
  • Clustering
  • Classification and Regression

Depending on time available, we may also touch on the following topics:

  • Statistical tools
  • Data generation and randomization
  • Evaluators
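
Two of the areas listed above, dimensionality reduction and clustering, can be chained in a single pipeline. The following is a minimal sketch, assuming Spark 1.5.x, where spark.ml takes its vectors from org.apache.spark.mllib.linalg. The object name ReduceAndClusterSketch, the sample points, and the choice of two components and two clusters are illustrative assumptions, not content from the session.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.PCA
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.SQLContext

// Hypothetical example object; written against the Spark 1.5.x API.
object ReduceAndClusterSketch {

  // Stage 1 projects the 5-dimensional "features" column onto 2 principal
  // components; stage 2 clusters the projected points with k-means.
  def buildPipeline(): Pipeline = {
    val pca = new PCA()
      .setInputCol("features")
      .setOutputCol("pcaFeatures")
      .setK(2)
    val kmeans = new KMeans()
      .setK(2)
      .setFeaturesCol("pcaFeatures")
      .setPredictionCol("cluster")
      .setSeed(1L)
    new Pipeline().setStages(Array(pca, kmeans))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReduceAndClusterSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Illustrative data: two loose groups of 5-dimensional points.
    val points = Seq(
      Vectors.dense(1.0, 0.5, 1.2, 0.9, 1.1),
      Vectors.dense(0.8, 0.7, 1.0, 1.1, 0.9),
      Vectors.dense(1.1, 0.4, 0.9, 1.0, 1.3),
      Vectors.dense(9.0, 8.5, 9.2, 8.8, 9.1),
      Vectors.dense(8.7, 9.1, 9.0, 9.3, 8.9),
      Vectors.dense(9.2, 8.8, 8.7, 9.0, 9.4)
    )
    val df = sqlContext.createDataFrame(points.map(Tuple1.apply)).toDF("features")

    // Fit both stages, then append the reduced vector and the cluster id
    // to each input row.
    val model = buildPipeline().fit(df)
    model.transform(df).select("features", "pcaFeatures", "cluster").show()

    sc.stop()
  }
}
```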


The Speaker(s)


Stephen Boesch

Scala/Spark/Machine Learning Developer, Intuit

I am a developer focusing on scalable data pipelines and machine learning apps on Spark and Hadoop.