Apache Spark is a compute engine for parallel and distributed computing. Spark is resilient to machine failures because each computation records its dependencies all the way back to a collection on stable storage, so any intermediate result can be recomputed at any time. Recomputing from stable storage would be slow, though; Spark stays fast by allowing these intermediate results to be cached in cluster memory. Spark also offers a productive programming model: a single general abstraction that supports a wide range of analytical and query tasks.
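To make the lineage-and-caching idea concrete, here is a minimal Scala sketch (the input path `events.log` and the word-splitting logic are hypothetical, introduced only for illustration): each transformation records its parent, and `cache()` keeps the computed result in cluster memory for reuse.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LineageSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("lineage-sketch").setMaster("local[*]"))

    // Each transformation below records its parent RDD, so the chain can be
    // recomputed from the file on stable storage if a partition is lost.
    val lines  = sc.textFile("events.log")          // hypothetical input file
    val errors = lines.filter(_.contains("ERROR"))
    val counts = errors
      .map(line => (line.split(" ")(0), 1))         // assumes space-delimited records
      .reduceByKey(_ + _)

    // cache() pins the result in cluster memory so later actions reuse it
    // instead of re-reading and re-transforming the file.
    counts.cache()

    println(counts.count())         // first action: computes and caches
    counts.take(5).foreach(println) // second action: served from memory

    sc.stop()
  }
}
```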
In this talk, I'll provide a general introduction to Spark. We'll discuss Spark's fundamental abstraction, the resilient distributed dataset (RDD), and examine its rich standard libraries for machine learning, structured queries, graph computation, and stream processing. We'll close with a case study showing how Spark made it easy for me to make sense of some real-world data.