Sandia’s 'Ovis' now available as open-source software

The initial version of “OVIS,” a software tool developed by Sandia National Laboratories that provides intelligent, real-time monitoring of computational clusters, is now available for free download at its Web site.

OVIS, say Sandia researchers, offers a statistical approach to the problem of computational platform monitoring and analysis. Conventional monitoring, they say, can be inefficient and ineffective because of its traditional emphasis on manufacturer-specified, “absolute” thresholds. Instead, OVIS observes the overall statistical properties and environmental effects of a cluster, characterizing individual device behaviors and comparing them to a large number of statistically similar devices. Thus, individual node values that appear to deviate from the norm (given the current applicable model, as established by real-time analysis) are flagged as aberrant.

This technique, say Sandia’s OVIS developers, can accurately expose problems much earlier than the current practice of simply waiting for a pre-determined threshold (necessarily set high to preclude too many false alarms) to be crossed. OVIS not only addresses the issue of aberrant node detection but also allows the system builder to visualize the spatial distribution of a particular characteristic over the entire system.

Sandia is a National Nuclear Security Administration (NNSA) laboratory.

The baseline capabilities of OVIS currently available for download include:

- Visualization and correlation tools that display information about state variables (such as temperature, CPU utilization, and fan speed) and their aggregate statistics.
- Statistical tools that present the cluster as a comparative ensemble (rather than as individual nodes), a convenient and useful method for tuning cluster set-up and determining the effects of real-time changes in the cluster configuration and its environment.
- An XML-based cluster configuration information template.

Though not part of the current download distribution, OVIS also incorporates a novel Bayesian inference scheme to dynamically infer models for the normal behavior of a system and to determine bounds on the probability of values manifested in the system. (“Bayesian” analysis, according to the International Society for Bayesian Analysis, is a well-known approach to data analysis that casts statistical problems in the framework of decision making.) This and other advanced features will be available in future releases.
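
To illustrate the difference between an absolute threshold and a comparison against the ensemble, the following minimal Python sketch flags a node whose temperature is far from that of its peers even though it is well below a fixed limit. It is not OVIS code and does not reproduce OVIS's statistical models: the node names, the readings, the 70-degree limit, the `flag_aberrant` helper, and the choice of a median-based deviation score are all assumptions made for illustration.

```python
from statistics import median

def flag_aberrant(readings, cutoff=3.5):
    """Return names of nodes whose reading deviates strongly from the ensemble.

    Uses a robust score based on the median and the median absolute
    deviation (MAD). This is an illustrative stand-in, not OVIS's method.
    """
    values = list(readings.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All nodes (or a majority) report the same value; no scale to compare against.
        return []
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation for normal data
    return sorted(
        name for name, v in readings.items()
        if abs(v - center) / scale > cutoff
    )

# Hypothetical temperature readings (degrees C) from similar nodes.
temps = {
    "node01": 41.0, "node02": 42.5, "node03": 40.5,
    "node04": 41.5, "node05": 42.0, "node06": 55.0,
}

ABSOLUTE_LIMIT = 70.0  # hypothetical manufacturer-style threshold

over_limit = [n for n, t in temps.items() if t > ABSOLUTE_LIMIT]
aberrant = flag_aberrant(temps)

print("over absolute limit:", over_limit)
print("aberrant relative to ensemble:", aberrant)
```

In this made-up data set no node crosses the fixed limit, yet `node06` stands out from its peers, which is the kind of early warning the ensemble approach is meant to provide.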