Call for Papers

Short summary
We seek 4-page position papers and/or 1-page lightning talk proposals that involve database techniques for high-performance computing applications.  Lightning talks will be 5 minutes long and should be as provocative as possible.  Authors of accepted position papers will give a 25-minute talk.
 

Full papers: October 1 

Notification: October 17

Workshop: November 11 
 

Submission site

 
Description
The high-performance computing (HPC) community is facing significant data challenges: the performance of simulations on evolving leadership-class computing architectures is increasingly dominated by the costs of data access, movement, transformation, and analysis on the HPC system. These problems are expected to only get worse as we move toward exascale computing. The database community has developed a collection of approaches that have allowed it to meet similar data challenges in the commercial sector by emphasizing a rigorous data model, a simple but expressive query algebra, cost-based optimization, declarative query languages, and logical and physical data independence. This workshop aims to bring together the HPC and database communities to facilitate discussions that will lead both to a greater awareness of each other and, eventually, to solutions to the myriad data problems facing high-performance computing. Among other topics, workshop participants will discuss the following questions:

  • What are the appropriate data models for HPC data? (arrays, structured and unstructured meshes, graphs, trees, relations?)
  • Which critical systems and subsystems found in HPC environments would benefit from features typically associated with databases? (e.g., filesystems equipped with indexing and query optimization; monitoring implemented as stream queries)
  • Can declarative query languages make HPC systems and datasets accessible to a new class of data-oriented scientists?
  • The hallmark of the database community is to “push the computation to the data,” insulating users and applications from details of data representation, scale, system architecture, and evaluation method, while affording runtime optimization opportunities unavailable to compile-time techniques. In what other HPC contexts might this general approach be applied?
  • Which science domains or specific HPC applications are particularly well-suited to this approach?

 

Similarly, the database community is building new data management systems for the massive-size datasets that users are increasingly accumulating. Today, however, most of the work in that community focuses on commodity, shared-nothing systems with some extensions toward incorporating new hardware advances (e.g., SSDs or GPUs). The database community could greatly benefit from interactions with the HPC community and the development of new database systems capable of leveraging the power of HPC platforms.

 
 
Position Papers
In this workshop, we invite 4-page position papers that define or clarify the data challenges facing HPC, explore the design space between the two communities, or describe work in progress bridging these communities. Applications and platforms of interest to the HPC community will drive the focus of the workshop. We seek new approaches to the problems in these areas, as opposed to providing yet another broad forum for big data or “how fast can I write data to disk” discussions. In particular, we request that each position paper include a section considering how the work could be deployed in the context of leadership-class computing platforms OR address a specific application of interest in HPC (e.g., an application involving simulation, visualization, or another typical HPC area).
 
Lightning talks
We also solicit 1-page lightning talk proposals.  These talks should start from a brief provocation designed to engender discussion.  Outrageous ideas are specifically encouraged, e.g., "All HPC work should be done in the commercial cloud."  "HPC platforms should run relational databases."  "Relational technology has no place in scientific computing."
 
Topics of Interest

  • New techniques for exascale IO
  • Data models and query algebras for HPC (arrays, meshes, graphs, images)
  • Query languages for parallel processing
  • HPC “data challenges”
  • Distributed data structures
  • Extending file-based systems with database features
  • DBMS on massively multicore platforms
  • Very-low footprint and main-memory DBMS architectures
  • In situ analysis via streaming and continuous queries
  • Column-stores for computational science
  • Databases and high-performance visualization
  • Simulations and linear algebra as queries
  • Batch query processing and multiple-query optimization

 
Submission Guidelines
We invite position papers of no more than 4 pages, following ACM conference formatting guidelines. All submissions must be in PDF format. A collection of the best papers may be invited to a special issue of a journal to be determined.  Workshop information is also available on the Supercomputing 2012 website.
 
 
Organizers
Bill Howe (University of Washington), Kirsten Kleese-Van Dam (Pacific Northwest National Laboratory), Terence Critchlow (Pacific Northwest National Laboratory), Magda Balazinska (University of Washington), Jeff Gardner (University of Washington)