PDSW-DISCS 2016:

1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems


held in conjunction with SC16

Monday, November 14, 2016
Salt Lake City, UT


Program Co-Chairs:

Lawrence Berkeley National Laboratory


IBM
General Co-Chairs:

Carnegie Mellon University


Texas Tech University

KEYNOTE SPEAKER:
Ion Stoica, UC Berkeley


Trends and Challenges in Big Data Processing


Abstract: Almost six years ago we started the Spark project at UC Berkeley. Spark is a cluster computing engine that is optimized for in-memory processing and unifies support for a variety of workloads, including batch, interactive querying, streaming, and iterative computations. Spark is now the most active big data project in the open source community and is already being used by over one thousand organizations. In this talk, I'll take a look back at Spark's humble beginnings, the lessons we learned, and its success as a unified system. I'll also outline the hardware and software trends, as well as the challenges and research opportunities they bring. [slides - coming soon]

Speaker bio: Ion Stoica is a Professor in the EECS Department at the University of California at Berkeley. He does research on cloud computing and networked computer systems. Past work includes the Dynamic Packet State (DPS), Chord DHT, Internet Indirection Infrastructure (i3), declarative networks, replay-debugging, and multi-layer tracing in distributed systems. He is an ACM Fellow and has received numerous awards, including the SIGCOMM Test of Time Award (2011) and the ACM Doctoral Dissertation Award (2001). In 2006, he co-founded Conviva, a startup to commercialize technologies for large-scale video distribution, and in 2013, he co-founded Databricks, a startup to commercialize Apache Spark.