Visualizing static ensembles for effective shape and data comparison


L. Hao, C. G. Healey, S. Bass, H.-Y. Yu
Proceedings Visualization and Data Analytics (VDA '16), vol. 1, 2016, pp. 1-10





Ensembles are large, multidimensional, multivariate datasets generated in areas like physical and natural science to study real-world phenomena. Simulations or experiments are run repeatedly with slightly different initial parameters, producing members of the ensemble. The need to compare data and spatial properties, both within an individual member and across multiple members, makes analysis challenging. Initial visualization techniques focused on ensembles with a limited number of members. Others generated overviews of larger ensembles, but at the expense of aggregating potentially important details. We propose an approach that combines these two directions by automatically clustering members in ways that help scientists locate interesting subsets, then visualize members within the subset. Our ensemble visualization technique includes: (1) octree comparison and clustering to generate a hierarchical level-of-detail overview of inter-member shape and data similarity; (2) a glyph-based visualization of an ensemble member; and (3) a method of combining multiple glyph visualizations to highlight similarities and differences in shape and data values across a subset of ensemble members. We apply our approach to a Relativistic Heavy Ion Collider ensemble collected by nuclear physics colleagues at Duke University studying quantum chromodynamics. Our system allows the physicists to interactively choose when to explore inter-member relationships, and when to visualize fine-grained details in individual member datasets.

Introduction

An ensemble is formed by executing a simulation or an experiment repeatedly, with slightly different initial conditions or parameterizations for each run. Data produced from a run forms one member of the ensemble.
Researchers from a wide range of disciplines are now using ensembles to investigate complex systems, explore a system’s sensitivity to its input parameters, measure uncertainty, and compare both spatial and data characteristics of the resulting models. Not surprisingly, ensembles are difficult to analyze due to their size and complexity. Wilson et al. compared ensembles to traditional scientific data and summarized the characteristics and challenges unique to ensemble visualization [25].

Different techniques have been developed for ensemble analysis. One approach creates concise overview visualizations, but these may hide potentially important details in the original data [3, 20]. Another method extends existing scientific visualization techniques to support comparison between members [1, 17]. This can offer an improved view of individual members, but often cannot scale beyond small member sets. This suggests the two main approaches to ensemble visualization are currently: (1) generate an overview that scales but may not maintain detail, or (2) present a visualization that maintains detail but can only analyze a small number of members at one time. More recent systems try to support interactive ensemble analysis at different levels of detail [12, 18]. These systems rely on the scientists to select a subset of members for detailed visualization, however. Currently, little work has investigated ways to automatically capture inter-member relationships.

We propose an approach that combines the two directions of ensemble analysis. A key strength of our method is the automatic construction of hierarchical representations of ensembles based on their shape and data similarity. The hierarchy is presented to the scientists, allowing them to use their current interests and domain expertise to control the trade-off between individual member detail and the number of members being visualized.
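
As a rough illustration of how a hierarchy of members could be built automatically from pairwise shape dissimilarity, the following Python sketch compares toy members cell by cell at several octree depths and then merges the most similar members first. It is not the measure used in this work: the dissimilarity (an average Jaccard distance over occupied octree cells), the average-linkage clustering, the function names, and the three toy members are all assumptions made for illustration only.

```python
from itertools import combinations


def occupied_cells(points, level):
    """Octree cells at the given depth that contain at least one point.

    Assumption: points are (x, y, z) tuples inside the unit cube [0, 1].
    """
    n = 2 ** level  # number of cells along each axis at this depth
    return {tuple(min(int(c * n), n - 1) for c in p) for p in points}


def shape_dissimilarity(a, b, max_level=3):
    """Hypothetical measure: mean Jaccard distance over depths 1..max_level."""
    total = 0.0
    for level in range(1, max_level + 1):
        cells_a = occupied_cells(a, level)
        cells_b = occupied_cells(b, level)
        union = cells_a | cells_b
        if union:
            total += 1.0 - len(cells_a & cells_b) / len(union)
    return total / max_level


def cluster_tree(members, max_level=3):
    """Average-linkage agglomerative clustering over named members.

    Returns (tree, merges): a nested-tuple tree and a list of
    (left_leaves, right_leaves, dissimilarity) in merge order.
    """
    names = list(members)
    dist = {
        frozenset((i, j)): shape_dissimilarity(members[i], members[j], max_level)
        for i, j in combinations(names, 2)
    }

    def linkage(x, y):
        # Mean pairwise dissimilarity between the leaves of two clusters.
        pairs = [(i, j) for i in x[1] for j in y[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    clusters = [(name, [name]) for name in names]  # (subtree, leaf names)
    merges = []
    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((x[1], y[1], linkage(x, y)))
        clusters.remove(x)
        clusters.remove(y)
        clusters.append(((x[0], y[0]), x[1] + y[1]))
    return clusters[0][0], merges


# Toy members (hypothetical data): two nearby shapes and one distant shape.
members = {
    "run_a": [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2)],
    "run_b": [(0.1, 0.1, 0.1), (0.2, 0.2, 0.25)],
    "run_c": [(0.9, 0.9, 0.9), (0.8, 0.8, 0.8)],
}
tree, merges = cluster_tree(members)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy example the two nearby runs are merged before the distant one, so a scientist could cut the resulting tree high to compare many loosely similar members or low to inspect a few closely matched ones. A real implementation would likely also compare data values stored in the cells, not only occupancy.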
Our technique reveals hierarchical inter-member relationships and supports visualization of both a single member and multiple member subsets. We use an octree representation to compress the data and extract shapes from the ensemble [9, 21]. The hierarchical structure of the octree naturally encodes shapes and variations between members at multiple levels of detail. We extend the similarity matching in [26] to mathematically measure shape dissimilarity between member pairs by comparing their octrees. Based on these estimates, we apply hierarchical clustering to collect similar members into common groups. The result is a level-of-detail cluster tree visualization that allows scientists to choose where to perform comparative analysis by interactively selecting individual member datasets or clusters of members with varying levels of similarity.

Next, we represent member and inter-member relationships with a visualization technique that displays the members within a cluster. We merge member data using statistical aggregation into a visual presentation that highlights shape and data differences through the use of size, colour, and motion. In this way, we extend traditional multivariate visualization to support general shape visualization and region-by-region comparative visualization across multiple ensemble members. This provides a detailed view of shape, data element distributions, and important attribute value differences across the members in a cluster.

Related Work

In the past decade, different visualization techniques have been proposed to facilitate interpretation and analysis of 2D or 3D ensemble data using volume rendering, multidimensional visualization, and comparative visualization [2, 10, 16]. Noodles is a visualization technique designed to analyze meteorological ensembles [22].
It includes statistical aggregation and uncertainty measurements, visualizing results with circular glyphs, ribbons, and spaghetti plots (a visualization method that uses contours to represent attribute value boundaries). EnsembleVis also focuses on statistical data visualization for analyzing weather forecast and climate model ensembles [19]. EnsembleVis presents data using a collection of visualizations connected through linked views. Data from multiple member sets are summarized with means and standard deviations, then visualized using colour maps, contours, height fields, trend charts, and spaghetti plots.

Follow-on research extends ensemble visualization to explicitly support member comparison. Ensemble Surface Slicing (ESS) compares surfaces extracted from n ensemble members in a single view by colour-coding the members, then slicing them into equal-width strips [1]. A combined representation is built by abutting strips member-by-member, where every n-th strip belongs to a common member, and visual discontinuities between strips highlight surface shape differences. Phadke et al. proposed: (1) pairwise sequential animation, and (2) screen door tinting for 3D ensemble visualization [17]. Pairwise sequential animation extracts data elements from a member, visualized as glyphs whose colour and shape represent attribute value and parent member, respectively. Screen door tinting divides a projected ensemble visualization into equal-sized cells whose colour and luminance identify a cell’s parent member and differences versus a user-defined reference member, respectively.

Recently, Matkovic et al. developed a visualization tool to interactively investigate ensembles as families of 2D data surfaces [12].
The system presents projections and aggregations of the data surfaces at three different levels: a parallel coordinates and scatterplot level to explore correlations and trends in data attributes; a parallel coordinates level to explore relationships across surfaces through aggregated profiles and function graphs; and 2.5D or 3D height fields to support in-depth analysis of a selected surface. Piringer et al. designed a system for comparative visual analysis of 2D function ensembles [18] using: (1) a domain-oriented overview that aggregates features across an ensemble using a heatmap; (2) a member-oriented overview that visualizes members as icons in a scatterplot; and (3) a detailed member view that presents small subsets of members in a 3D scatterplot.

Whitaker and Mirzargar developed specialized contour and curve boxplots to accurately visualize statistical properties, outliers, and variability in ensembles of contours or 2D and 3D curves [13, 24]. They statistically summarize the centrality of members in an ensemble, visualized using specialized boxplots. Demir developed a method of overlaying bar and line charts to present statistical summaries and variations in ensemble members [4]. Köthur focused on temporal aspects of ensembles, generating clusters from temporal profiles of different members to support feature identification and ensemble comparison [11].

Past research offers numerous examples of ensemble visualization built on established techniques like glyphs, comparative visualization, charts, and linked views. We adopt a similar approach in our work, which is perhaps most similar to the contour and curve boxplots of Whitaker and Mirzargar [13, 24]. Their goals differ from ours, however. Contour boxplots visualize contours and functional level sets within an ensemble.
We are focused on defining a hierarchical representation of 3D ensemble members that supports both shape and value comparison across multiple members.

Figure 1: A calculated transition from ordinary nuclei to free quarks and gluons, where protons and neutrons within the nuclei disintegrate at extremely high temperature or density

To achieve this goal, we focus on two critical issues in ensemble visualization: (1) scalability to larger member sets; and (2) visualizations that allow scientists to make informed decisions about how to trade off individual member detail against the number of members being compared. We measure shape dissimilarities between ensemble members, hierarchically combining members with similar shapes into clusters for more detailed exploration. Clustering uses an octree-based ensemble visualization framework that offers: (1) a mathematical measure of shape similarity between 3D spatial ensemble members; (2) a cluster tr