2nd Parallel Data Storage Workshop

held in conjunction with
Supercomputing '07

Chair: Garth Gibson, CMU

Sunday, November 11, 2007
8:30AM - 5:00PM
Atlantis Hotel-Ballroom E
Reno, Nevada

SC07 Workshop Web Page

ACM Digital Library Proceedings

workshop abstract

Petascale computing infrastructures make petascale demands on information storage capacity, performance, concurrency, reliability, availability, and manageability. The last decade has shown that parallel file systems can barely keep pace with high-performance computing along these dimensions; this poses a critical challenge when near-future petascale requirements are considered. This recurring one-day workshop focuses on the data storage problems and emerging solutions found in petascale scientific computing environments, with special attention to issues in which community collaboration can be crucial: problem identification, workload capture, solution interoperability, standards with community buy-in, and shared tools.


All papers presented at this workshop are also online at the ACM Digital Library
(table of contents of the proceedings).

8:30am - 9:00am
Petascale Data Storage Workshop Introduction

Garth Gibson

9:00am - 10:20am
SESSION I: Scalable Systems

On Application-level Approaches to Avoiding TCP Throughput Collapse in Cluster-Based Storage Systems
E. Krevat (presenter), V. Vasudevan, A. Phanishayee, D. Andersen, G. Ganger, G. Gibson, S. Seshan, Carnegie Mellon University
Paper / Slides

pNFS/PVFS2 over InfiniBand: Early Experiences
Lei Chai, Xiangyong Ouyang, Ranjit Noronha (presenter) and Dhabaleswar K. Panda,
Ohio State University
Paper / Slides

Integrated System Models for Reliable Petascale Storage Systems
Brent Welch (presenter), Panasas, Inc.
Paper / Slides

Scalable Locking and Recovery for Network File Systems
Peter Braam, Byron Neitzel (presenter), Sun/Cluster File Systems

10:30am - 11:00am
POSTER SESSION 1 - see info below

11:00am - 12:20pm
SESSION II: Scalable Services

Searching and Navigating Petabyte Scale File Systems Based on Facets
Jonathan Koren (presenter), Yi Zhang, Sasha Ames, Andrew Leung, Carlos Maltzahn, Ethan Miller, Univ. of California, Santa Cruz
Paper / Slides

GIGA+: Scalable Directories for Shared File Systems
Swapnil V. Patil (presenter), Garth A. Gibson, Sam Lang, Milo Polte, Carnegie Mellon University
Paper / Slides / Poster

End-to-end Performance Management for Scalable Distributed Storage
D. Bigelow, S. Iyer, T. Kaldewey, R. Pineiro, A. Povzner, S. Brandt, R. Golding (presenter), T. Wong, C. Maltzahn, Univ. of California, Santa Cruz, IBM-Almaden

RADOS: A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters
Sage A. Weil (presenter), Andrew W. Leung, Scott A. Brandt, Carlos Maltzahn, Univ. of California, Santa Cruz

12:30pm - 2:00pm
2:00pm - 3:20pm
SESSION III: Scalable Behaviors

A Result-Data Offloading Service for HPC Centers
Henry Monti (presenter), Ali R. Butt, Sudharshan S. Vazhkudai, Virginia Tech
Paper / Slides / Poster

Characterizing the I/O Behavior of Scientific Applications on the Cray XT
Philip Roth (presenter), Oak Ridge National Laboratory
Paper / Slides

Towards an I/O Tracing Framework Taxonomy
Andrew Konwinski (presenter), John Bent, Meghan Quist, James Nunez, Los Alamos National Laboratory
Paper / Slides / Poster

A Data Placement Service for Petascale Applications
Ann L. Chervenak (presenter), Robert Schuler, USC Information Sciences Institute
Paper / Slides

3:30pm - 4:00pm

Henry Newman, Instrumental Inc. -- Error Management and Storage Reliability in the Data Path

LBNL/NERSC -- Reliability Results of NERSC Systems

James Lentini, NetApp -- Status of NFS over RDMA in Linux

James Nunez, LANL -- New Failure Data Releases

Bianca Schroeder, U. of Toronto & CMU -- Computer Failure Data Repository

Evan Felix, PNNL -- fsstats Data Release

David Brown, PNNL -- Debian Lustre and PVFS Repository

4:00pm - 5:00pm


All presented papers and PDSW committee members were invited to bring a poster. The following additional posters were also invited.


Garth A. Gibson, Carnegie Mellon University and Panasas Inc.
Darrell Long, University of California, Santa Cruz
Peter Honeyman, University of Michigan, Ann Arbor,
    Center for Information Technology Integration
Gary A. Grider, Los Alamos National Laboratory
William T.C. Kramer, National Energy Research Scientific Computing Center,
   Lawrence Berkeley National Laboratory
Philip C. Roth, Oak Ridge National Laboratory
Evan J. Felix, Pacific Northwest National Laboratory
Lee Ward, Sandia National Laboratories

Other Workshops & Panels of Interest at SC07

Parallel Network File System (pNFS) BOF Session