Silicon Valley Code Camp : October 3rd and 4th 2015
I have been in the software industry for 20+ years. In the past 4 years I have worked on cloud-based distributed systems for Netflix and StubHub. Most of these systems have been associated with data delivery and management. Some of the systems I have worked with are Solr, Cassandra, Zookeeper, Hadoop and StubHub's Big Data Platform. Each of these systems requires a different approach to design, development and access patterns. However, there are underlying patterns in each of these systems, and these patterns radically change the approach of developers and users of these systems. Without them, building and deploying distributed systems, especially at scale, becomes exceedingly problematic. I wish to share some of these design and deployment patterns, as well as their programming models, with the folks at SVCC.
Over the past few years, data has gone from the back office to center stage. Businesses realize that data, and the business intelligence gained from it, are among the most valuable assets they possess. Consequently, data has migrated from vertical big iron systems to distributed, horizontally scalable systems with the potential for massively parallel processing.
And that is not all. Data systems once used for MapReduce and long-running batch processing are now processing data in near real time. The innovations in this area are foundational and seismic. Newer data systems will provide personalization, recommendations and business insights in near real time.
However, building, maintaining and managing these systems is non-trivial. Dealing with distributed and partitioned data requires a paradigm shift. How do we provide a consistent view of the distributed data? How do we maintain 24/7 availability? And how do we handle unexpected failures, given that failures are always unexpected?
This talk goes into some of the principles and practices of distributed data systems, from messaging systems like Kafka to in-memory file systems and distributed caches. It also suggests best-of-breed practices to apply when building your own distributed system.