Pervasive DataRush Puts the Power of Parallelism into the Hands of Big Data Developers to Tackle Big Data at Breakneck Speeds

Written by: Webmaster; Category: SCIENCE; Published: February 2, 2011, 3:30 am

Pervasive Software enables developers to quickly harness multi-core servers and clusters to tackle big data challenges with the release of Pervasive DataRush 5.0. The new version is being announced today by Pervasive executives at the European Data Innovation Summit in London and at the O’Reilly Strata Conference in Santa Clara. In addition to its inherent ability to automatically scale across all cores on mainstream and monster multi-core servers, Pervasive DataRush now also scales across clusters, including the ability to accelerate every node in a Hadoop cluster, offering unmatched speed and economics. Pervasive DataRush 5.0 also provides easy parallel development capabilities for programmers in any JVM language, notably Java, Python, JRuby and Scala.

“We’ve put Pervasive DataRush 5.0 to work inside wukong, our open source tool for executing Ruby tasks on Hadoop clusters, which we use to drive Infochimps’ production jobs on Amazon’s EC2,” said Philip Kromer, CTO at Infochimps. “The result is exactly on-target: improving computational efficiency means we can get the largest jobs done using fewer resources in the cloud, which translates to hard-dollar cost savings.”

“As more and more organizations adopt Hadoop to take on large scale data challenges, organizations risk wasting resources by scaling out with inefficient server workloads,” said David Menninger, VP and Research Director, Ventana Research. “Products like Pervasive DataRush can help increase the efficiency of each node in a Hadoop cluster to maximize throughput and help harness ever-growing data volumes.”

For organizations that want high-throughput data preparation and analytic applications that manipulate massive datasets, Pervasive DataRush enables rapid development of applications that scale up, out and over automatically. “The scalability of Pervasive DataRush applications, which we’ve demonstrated on a single box with up to 384 cores as well as across Hadoop clusters with more than 20 machines, delivers future-proofing and turns on its head the notion that ‘the free lunch is over,’” said Ray Newmark, vice president of sales and marketing for Pervasive DataRush. “By taking care of the daunting details of parallelism, Pervasive DataRush unlocks the explosive power of multi-core nodes for each compute task, slashing runtimes and leaving developers time to focus on their crucial application requirements.”

Using the Intel Concurrency Checker, designed to evaluate the performance scaling of applications on multi-core systems, Pervasive was able to gauge the degree of concurrency achieved and quantify the effectiveness of its tuning and optimization efforts. “Pervasive DataRush helps organizations unlock the powerful parallelism of Intel Xeon processors to take on big-data challenges,” said J. Scott Harrison, director, Developer and ISV Scale Programs, Intel Software and Services Group. “Together, Intel and Pervasive are teaming up to help deliver highly innovative, scalable solutions to meet customers’ growing needs as they tackle data processing and analytic bottlenecks with easily accessible hardware and software.”

Whether searching through logfiles to detect intrusion patterns, doing fuzzy matching across vast datasets, analyzing data streams from thousands of industrial sensors, processing patient healthcare and claims records, or powering a recommender system for ecommerce sites, Pervasive DataRush blows through performance bottlenecks in data preparation and analytics. Pervasive DataRush can access data in a variety of data stores including Hadoop HDFS, data warehouses, databases and flat files.

Free trial downloads of Pervasive DataRush 5.0 are currently available here.
