Silicon Valley Code Camp : October 3rd and 4th 2015

Stephen Boesch

Intuit
About Stephen
I am a developer focusing on scalable apps for data pipelines and machine learning on Hadoop and Spark infrastructures. My background is in Java/Oracle/ETL from 1996 until 2011, at which point I started to focus on Hadoop, Spark and Scala. I have worked at a mix of familiar large Internet and systems companies and startups.

Speaking Sessions

  • A Survey of Machine Learning Techniques Using Spark.ml 1.5.0

    10:45 AM Sunday   Room: SC-127
    Recent releases of the Spark machine learning libraries have shifted focus from the individual-algorithm approach of the spark.mllib package to the data-driven pipeline approach of spark.ml. We will look at how to structure the ML process of data loading, modeling, prediction, and results analysis and distribution using the latest spark.ml APIs.

    Note: this year's session will focus only on the Scala APIs.

    We will touch on one or more of the algorithms in the following areas:

    • Dimensionality Reduction / Feature Extraction
    • Clustering
    • Classification and Regression

    Depending on time available, we may also touch on the following topics:

    • Statistical tools
    • Data generation and randomization
    • Evaluators