				      PDSW24 Reproducibility Addendum
				      SUBMISSION DEADLINE EXTENDED: AUG 9, 2024 - final deadline
				       
                      
                     
                      Invited speaker: Dr. İlkay Altintaş, University of California, San Diego
                      
                       
                      
					  
                    
 
						  
				    
                                                   
                        
                      agenda                      
                     
                      
                      Additional agenda information, slides, and abstracts will be posted here as soon as they become available. The official agenda, with the latest information and abstracts for each talk, will also be available on the SC workshop page at a future date.
                      
                
                      
                        | 9am-9:10am | 
                        PDSW 2024 Welcome 
                        Bing Xie, Microsoft  
                        Slides
  | 
                      
                      
                        | INVITED TALK: | 
                        
                      
                        | 9:10am- 10am | 
                        Invited Speaker:  
                        Bridging the Data Gaps in Computing for Science,  
                        Education and Society 
                        Dr. İlkay Altintaş, University of California, San Diego 
                        Slides                         
  | 
                      
                      
                        | MAIN SESSION: | 
                        
                      
                        | 10am- 10:30am | 
                        Morning Break
 
  | 
                      
                      
                        | 10:30am- 11am | 
                        Fault-Tolerant Deep Learning Cache with Hash Ring for Load Balancing in HPC Systems 
Seoyeong Lee,  Sogang University 
Awais Khan, Oak Ridge National Laboratory (ORNL) 
Yoochan Kim,  Sogang University, South Korea 
Junghwan Park,  Sogang University, South Korea 
Soon Hwang, Sogang University, South Korea  
Jae-Kook Lee,  Korea Inst of Science and Technology Information (KISTI) 
Taeyoung Hong, Korea Inst of Science and Technology Information (KISTI) 
Chris Zimmer, Oak Ridge National Laboratory (ORNL) 
Youngjae Kim, Sogang University, South Korea 
Paper | Slides 
  | 
                      
                      
                        | 11am- 11:30am | 
                        MOSAIC: Detection and Categorization of I/O Patterns in HPC Applications 
Théo Jolivel, French Institute for Research in Computer Science and Automation (INRIA) 
François Tessier, INRIA 
Julien Monniot, INRIA 
Guillaume Pallez, INRIA 
Paper | Slides 
                        
  | 
                      
                      
                        | 11:30am- 12pm | 
                        Exploring DAOS Interfaces and Performance 
Nicolau Manubens Gil, European Centre for Medium-Range Weather Forecasts (ECMWF) 
Johann Lombardi, DAOS Foundation 
Simon Smart, ECMWF 
Emanuele Danovaro, ECMWF 
Tiago Quintino, ECMWF 
Dean Hildebrand, Google Cloud 
Adrian Jackson, EPCC, The University of Edinburgh  
Paper | Slides 
                        
  | 
                      
                      
                        | 12pm- 12:05pm | 
                        [WiP] Scalable RPC Layer Towards Millions of IOPS per Server 
                          Hiroki Ohtsuji, Fujitsu Limited 
                          Munenori Maeda, Fujitsu Limited 
                          Reika Kinoshita, Fujitsu Limited 
                          Masahiro Miwa, Fujitsu Limited 
                          Osamu Tatebe, University of Tsukuba 
                        Abstract | Slides 
                        
  | 
                      
 
                      
                        | 12:05pm- 12:10pm | 
                        [WiP] Reducing I/O Bottleneck for Pretraining AI Foundation Models for Climate 
                          Gabriele Padovani, University of Trento, Italy 
Awais Khan, Oak Ridge National Laboratory, USA 
Sandro Fiore, University of Trento, Italy 
Valentine Anantharaj, Oak Ridge National Laboratory, USA 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 12:10pm- 12:15pm | 
                        [WiP] BULKI - Binary Unified Layout for Key-value Interchange 
                          Wei Zhang, Lawrence Berkeley National Laboratory  
                          Houjun Tang, Lawrence Berkeley National Laboratory 
                          Suren Byna, The Ohio State University 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 12:15pm- 12:20pm | 
                        [WiP] Distributed, Resilient and In-Memory Storage of Key-Value Data for HPC 
                          Rüdiger Nather, University of Kassel, Germany 
                          Mia Reitz, University of Kassel, Germany 
                          Claudia Fohry, University of Kassel, Germany 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 12:20pm- 12:25pm | 
                        [WiP] A Global In-Memory Cache and Computation Tier for DAOS 
                          John Byrne, Hewlett Packard Enterprise, HPE  
                          Clarete Crasta, Hewlett Packard Enterprise, HPE  
                          Abhishek Dwaraki, Hewlett Packard Enterprise, HPE 
                          David Emberson, Hewlett Packard Enterprise, HPE 
                          Harumi Kuno, Hewlett Packard Enterprise, HPE 
                          Sekwon Lee, Hewlett Packard Enterprise, HPE 
                          Sharad Singhal, Hewlett Packard Enterprise, HPE 
                          Ramya Ahobala Rao, Hewlett Packard Enterprise, HPE 
                          Shreyas Vinayaka Basri K S, Hewlett Packard Enterprise, HPE 
                          Amitha C, Hewlett Packard Enterprise, HPE 
                          Chinmay Ghosh, Hewlett Packard Enterprise, HPE 
Rishi Kesh Rajak, Hewlett Packard Enterprise, HPE 
Sriram Ravishankar, Hewlett Packard Enterprise, HPE 
Porno Shome, Hewlett Packard Enterprise, HPE 
Lance Evans, Hewlett Packard Enterprise, HPE 
Sherin George, Hewlett Packard Enterprise, HPE  
Kevan Rehm, Hewlett Packard Enterprise, HPE 
Myungjun (MJ) Son, Hewlett Packard Enterprise, HPE 
Taeklim Kim, Hewlett Packard Enterprise, HPE  
Shiyue (Jason) Hou, Hewlett Packard Enterprise, HPE 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 12:25pm- 12:30pm | 
                        [WiP] Are Streaming Engines and Vector Databases Integrated Well? 
                          Yeonwoo Jeong, Sogang University, Republic of Korea 
Sungyong Park, Sogang University, Republic of Korea 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 12:30pm- 2pm | 
                        Lunch Break | 
                      
                      
                        | 2pm-2:30pm | 
                        Initial Experiences With DAOS Object Storage on Aurora 
Rob Latham,  Argonne National Laboratory 
Robert Ross,  Argonne National Laboratory 
Philip Carns,  Argonne National Laboratory 
Shane Snyder,  Argonne National Laboratory 
Kevin Harms,  Argonne National Laboratory 
Kaushik Velusamy,  Argonne National Laboratory 
Paul Coffman,  Argonne National Laboratory 
Gordon McPheeters, Argonne National Laboratory 
Paper | Slides  
                        
  | 
                      
                      
                        | 2:30pm-3pm | 
                         Understanding and Predicting Cross-Application I/O Interference in HPC Storage Systems 
Chris Egersdoerfer,  University of Delaware 
Hasanur Rashid,  University of Delaware 
Dong Dai, University of Delaware 
Bo Fang,  Pacific Northwest National Laboratory (PNNL)  
Nathan Tallent, Pacific Northwest National Laboratory (PNNL)  
Paper | Slides  
                        
  | 
                      
                      
                        | 3pm-3:30pm | 
                        Afternoon Break 
                           
                         | 
                      
                      
                        | 3:30pm-4pm | 
                        Copper: Cooperative Caching Layer for Scalable Data Loading in Exascale Supercomputers 
                          Noah Lewis, Ohio State University 
                          Kevin Harms,  Argonne National Laboratory  
                          Kaushik Velusamy,  Argonne National Laboratory  
                          Huihuo Zheng, Argonne National Laboratory  
                        Paper | Slides 
                        
  | 
                      
                      
                        | 4pm-4:05pm | 
                        [WiP] Jarvis: Towards a Shared, User-Friendly, and Reproducible, I/O Infrastructure 
                          Jaime Cernuda,  
                          Luke Logan, Illinois Institute of Technology 
                          Noah Lewis, Illinois Institute of Technology 
                          Suren Byna, The Ohio State University  
                          Xian-He Sun, Illinois Institute of Technology 
                          Anthony Kougkas, Illinois Institute of Technology 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 4:05pm- 4:10pm | 
                        [WiP] DAOS Project Update - One Year in the DAOS Foundation 
                          Michael Hennecke, Intel Corporation 
                          Johann Lombardi, DAOS Foundation 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 4:10pm- 4:15pm | 
                        [WiP] Improving SQL Query Execution of Distributed Query Engines on Object-Based Computational Storage through Multi-Layere... 
                          Soon Hwang, Sogang University, Republic of Korea 
                          Junhyeok Park, Sogang University, Republic of Korea  
                          Junghyun Ryu, Sogang University, Republic of Korea  
                          Jungahn Park, Memory System Research, SK hynix Inc. 
                          Jeongjin Lee, Memory System Research, SK hynix Inc. 
                          Jungki Noh, Memory System Research, SK hynix Inc. 
                          Soonyeal Yang, Memory System Research, SK hynix Inc. 
                          Woosuk Chung, Memory System Research, SK hynix Inc. 
Youngjae Kim, Sogang University, Republic of Korea 
                          Abstract | Slides 
                        
  | 
                      
                      
                        | 4:15pm- 4:20pm | 
                        [WiP] Lustre for Grace Hopper: Current Status Report 
                          Sohei Koyama, DataDirect Networks, Japan 
Shuichi Ihara, DataDirect Networks, Japan 
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 4:20pm- 4:25pm | 
                        [WiP] Exploring the Proactive Data Containers Runtime System in VAST - A Case Study 
                          Jean Luca Bez (Lawrence Berkeley National Laboratory)  
                          Suren Byna (The Ohio State University)  
                        Abstract | Slides 
                        
  | 
                      
                      
                        | 4:25pm- 4:30pm | 
                        [WiP] Silent Errors to Scientific Applications: Impacts of PFS Metadata Corruptions 
                          Dong Dai (University of Delaware)  
                          Mai Zheng (Iowa State University)  
                          Bo Fang (Pacific Northwest National Laboratory (PNNL)) 
                        Abstract | Slides                         
  | 
                      
                      
                        | 4:30pm- 4:35pm | 
                        [WiP] When Stream Processing Engine Meets Log-structured Merge-tree as State Store 
                          Kyuli Park, Sogang University, Republic of Korea  
                          Sungyong Park, Sogang University, Republic of Korea  
                        Abstract | Slides                         
  | 
                      
                      
                        | 4:35pm - 5:30pm | 
                        Panel: Data, Data Everywhere 
                          Moderator: Kathryn Mohror, Lawrence Livermore Lab 
                        Panelists: 
                        Laura Biven, Jefferson Lab  
                        Eli Dart, Lawrence Berkeley National Laboratory  
                        Sarp Oral, Oak Ridge National Laboratory  
                        Manish Parashar, University of Utah 
Adam Thompson, NVIDIA 
 
                         | 
                      
   
                      
                      
                      
                      WORKSHOP ABSTRACT
                      
                      
                      We are excited to announce the 9th International Parallel Data Systems Workshop (PDSW’24), to be held in conjunction with SC24: The International Conference for High Performance Computing, Networking, Storage, and Analysis, in Atlanta, GA. PDSW’24 builds upon the rich legacy of its predecessor workshops, the Petascale Data Storage Workshop (PDSW, 2006–2015) and the Data Intensive Scalable Computing Systems (DISCS, 2012–2015) workshop. Since their successful merger in 2016, the joint workshop has drawn an average of 200 attendees annually. 
                      The increasing importance of efficient data storage and management continues to drive scientific productivity across traditional simulation-based HPC environments and emerging Cloud, AI/ML, and Big Data analysis frameworks. Challenges are compounded by the rapidly expanding volumes of experimental and observational data, the growing disparity between computational and storage hardware performance, and the rise of novel data-driven algorithms in machine learning. This workshop aims to advance research and development by addressing the most pressing challenges in large-scale data storage and processing. 
                      We invite the community to contribute original research manuscripts that introduce and evaluate novel algorithms or architectures, share significant scientific case studies or workloads, or assess the reproducibility of previously published work. We emphasize the importance of community collaboration for problem identification, workload capture, solution interoperability, standardization, and shared tools. Authors are encouraged to provide comprehensive experimental environment details (software versions, benchmark configurations, etc.) to promote transparency and facilitate collaborative progress. 
                      Topics of Interest: 
                      
                        - Scalable Architectures: Distributed data storage, archival, and virtualization. 
 
                        -  New Data Processing Models and Algorithms: Application of innovative data processing models and algorithms for parallel computing and analysis. 
 
                        -  Performance Analysis: Benchmarking, resource management, and workload studies. 
 
                        -  Cloud and Container-Based Models: Enabling cloud and container-based frameworks for large-scale data analysis. 
 
                        -  Storage Technologies: Adaptation to emerging hardware and computing models. 
 
                        -  Data Integrity: Techniques to ensure data integrity, availability, reliability, and fault tolerance. 
 
                        -  Programming Models and Frameworks: Big data solutions for data-intensive computing. 
 
                        -  Hybrid Cloud Data Processing: Integration of hybrid cloud and on-premise data processing. 
 
                        -  Cloud-Specific Opportunities: Data storage and transit opportunities specific to cloud computing. 
 
                        -  Storage System Programmability: Enhancing programmability in storage systems. 
 
                        -  Data Reduction Techniques: Filtering, compression, and reduction techniques for large-scale data. 
 
                        -  File and Metadata Management: Parallel file systems, metadata management at scale. 
 
                        -  In-Situ and In-Transit Processing: Integrating computation into the memory and storage hierarchy for in-situ and in-transit data processing.
 
                        -  Alternative Storage Models: Object stores, key-value stores, and other data storage models. 
 
                        -  Productivity Tools: Tools for data-intensive computing, data mining, and knowledge discovery. 
 
                        -  Data Movement: Managing data movement between compute and data-intensive components. 
 
                        -  Cross-Cloud Data Management: Efficient data management across different cloud environments.
 
                        -  AI-enhanced Systems: Storage system optimization and data analytics using machine learning. 
 
                        -  New Memory and Storage Systems: Innovative techniques and performance evaluation for new memory and storage systems. 
                          
                         
                      
                      
                      CALL FOR PAPERS
                       
                      Call for papers available now (pdf). 
                        
                      
                      
                      REGULAR PAPER SUBMISSIONS
                        
                      
                      All submissions to PDSW’24 will undergo a rigorous double-anonymous peer review process overseen by the workshop program committee. Accepted submissions will be published in the SC24 Workshop Proceedings and featured on the workshop website alongside the associated talk slides.  
                      Template and Submission
                        
                      
                      
                        - A full paper of up to 6 pages in length, excluding references and AD/AE appendices. 
 
                        -  The Artifact Description (AD) Appendix is mandatory; the Artifact Evaluation (AE) Appendix is optional.  
                          
                            - AD due: Aug 16th, 2024, 11:59 PM AoE - DEADLINE EXTENDED
 
                            -  Submissions with both AD and AE Appendices will be considered favorably for the PDSW Best Paper award. 
 
                          
                         
                        -  Papers must adhere to the IEEE proceedings template. Download it here. 
 
                        -  EXTENDED FINAL DEADLINE - Submit your papers by Aug 9th, 2024, 11:59 PM AoE at https://submissions.supercomputing.org/
 
                      
                      Reproducibility Initiative 
                      
                        Aligned with the SC24 Reproducibility Initiative, we encourage detailed and structured artifact descriptions (AD) using the SC24 format. The AD should include a field for one or more links to data repositories (Zenodo, Figshare, etc.) and code repositories (GitHub, GitLab, Bitbucket, etc.). For artifacts placed in a code repository, we encourage authors to follow the PDSW 2024 Reproducibility Addendum on how to structure the artifact, as this makes the artifact easier to use for the reviewing committee and for future readers of the paper. 
                      
                      Deadlines - Regular Papers and Reproducibility Study Papers
                      
                      Submissions website: https://submissions.supercomputing.org/
                      Submissions due: EXTENDED DEADLINE - Aug 9th, 2024, 11:59 PM AoE 
                        AD due: EXTENDED DEADLINE - Aug 16th, 2024, 11:59 PM AoE 
                        Paper Notification: Sep 6th, 2024, 11:59 PM AoE 
                        Camera ready due: Sep 27th, 2024, 11:59 PM AoE 
                        Final AD/AE due: Oct 15th, 2024, 11:59 PM AoE
                        
                        Copyright forms due: TBD
                        Slides due before workshop: TBD
                        
                      
                      
                      Work In Progress (WIP) Session                      
                      
                      The WIP session will showcase brief 5-minute presentations on ongoing work that may not yet be ready for a full paper submission. WIP papers will not be included in the proceedings. A one-page abstract is required for participation.
                      Submissions due: Sep 13th, 2024, 11:59 PM AoE
                      WIP Notification: On or before Sep 21st, 2024
                      
                     
						
						
						
                      
                      Workshop Registration
                        
                        
                      
                      Registration opens July 10, 2024. To help you prepare, see the further details on registration pricing and on the policies affecting registration changes and cancellations.
                        
                      
                      
                      PDSW’24 Committee Members:
                        
                      
                      Technical Committee                      
                      
                        - Jalil Boukhobza, University of Western Brittany, France 
 
                        - Wei Der Chen, The University of Edinburgh 
 
                        - Dong Dai, University of North Carolina at Charlotte 
 
                        - Hariharan Devarajan, Lawrence Livermore National Lab 
 
                        - Andreas Dilger, Whamcloud 
 
                        - Kira Duwe, EPFL, Switzerland  
 
                        - Qian Gong, Oak Ridge National Laboratory 
 
                        - Kaushik Velusamy, Argonne National Laboratory 
 
                        - Youngjae Kim, Sogang University  
 
                        - Johann Lombardi, DAOS 
 
                        - Xiaoyi Lu, University of California, Merced 
 
                        - Preeti Malakar, Indian Institute of Technology, Kanpur 
 
                        - Qizhong Mao, Bytedance Inc 
 
                        - Sarah Neuwirth, Habilitation Candidate at Goethe University 
 
                        - Joao Paulo, INESC TEC 
 
                        - M. Mustafa Rafique, Rochester Institute of Technology 
 
                        - Woong Shin, Oak Ridge National Laboratory 
 
                        - Masahiro Tanaka, Microsoft 
 
                        - Osamu Tatebe, University of Tsukuba 
 
                        - Lipeng Wan, Georgia State University 
 
                        - Wei Zhang, Lawrence Berkeley National Laboratory 
 
                        - Qing Zheng, Los Alamos National Lab  
 
                        - Mai Zheng, Iowa State University
 
                      
                      Steering Committee
                      
                        -  John Bent, Cray 
 
                        -  Ali R. Butt, Virginia Tech 
 
                        -  Philip Carns, Argonne National Laboratory 
 
                        -  Shane Canon, Lawrence Berkeley National Laboratory 
 
                        -  Raghunath Raja Chandrasekar, Amazon Web Services 
 
                        -  Yong Chen, Texas Tech University 
 
                        -  Evan J. Felix, Pacific Northwest National Laboratory
 
                        -  Gary Grider, Los Alamos National Laboratory 
 
                        -  William D. Gropp, University of Illinois at Urbana-Champaign 
 
                        -  Dean Hildebrand, Google
 
                        -  Shadi Ibrahim, Inria, France 
 
                        -  Dries Kimpe, KCG, USA 
 
                        -  Glenn Lockwood, Lawrence Berkeley National Laboratory 
 
                        -  Jay Lofstead, Sandia National Laboratories 
 
                        -  Xiaosong Ma, Qatar Computing Research Institute, Qatar 
 
                        -  Carlos Maltzahn, University of California, Santa Cruz 
 
                        -  Suzanne McIntosh, New York University 
 
                        -  Kathryn Mohror, Lawrence Livermore National Laboratory 
 
                        -  Robert Ross, Argonne National Laboratory
 
                        -  Philip C. Roth, Oak Ridge National Laboratory 
 
                        -  Kento Sato, RIKEN, Japan
 
                        -  John Shalf, NERSC, Lawrence Berkeley National Laboratory
 
                        -  Xian-He Sun, Illinois Institute of Technology 
 
                        -  Rajeev Thakur, Argonne National Laboratory 
 
                        -  Lee Ward, Sandia National Laboratories 
 
                        -  Brent Welch, Google 
 
                        - Amelie Chi Zhou, Hong Kong Baptist University, China