The Changing Paradigm of Data-Intensive Computing

Richard T. Kouzes, Gordon A. Anderson, Stephen T. Elbert, Ian Gorton, and
Deborah K. Gracio, Pacific Northwest National Laboratory

Through the development of new classes of software, algorithms, and hardware, data-intensive applications provide timely and meaningful analytical results in response to exponentially growing data complexity and associated analysis requirements.

The continued exponential growth of computational power, data-generation sources, and communication technologies is giving rise to a new era in information processing: data-intensive computing.

According to a 2004 study on data management for science by the US Department of Energy (DOE), “We are entering an information-dominated age. Ability to tame a tidal wave of information will distinguish the most successful scientific, commercial, and national-security endeavors.”1 Another study on systems biology for energy and the environment, when discussing computational models, noted that “these enormously complex and heterogeneous full-scale simulations will require not only petaflop capabilities but also a computational infrastructure that permits model integration. Simultaneously, it must couple to huge databases created by an ever-increasing number of high-throughput instruments.”

More recently, a DOE-sponsored report on visual analysis and data exploration at extreme scale3 found that “datasets being produced by experiments and simulations are rapidly outstripping our ability to explore and understand them, and there is, nationwide, comparatively little basic research in scientific data analysis and visualization for knowledge discovery.”
