HP today announced the launch of Haven Predictive Analytics, a program used to speed up the statistical analysis and machine learning processes that allow companies to interpret vast sets of data.
Integral to the offering is Distributed R, HP's distributed extension of R, the open-source statistical programming language. Widely used to analyze and visualize data, R is one of the languages data scientists prefer. Practical applications include drug discovery and financial modeling. Haven Predictive Analytics uses Distributed R as its engine and assigns data-related tasks to multiple processing nodes, making large amounts of information more manageable. Distributed R will also integrate with HP Vertica, a relational, columnar database, and will streamline operations for developers who previously had to rewrite data in order to share it within their organizations.
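To illustrate the general idea of assigning data-related tasks to processing nodes, here is a minimal Python sketch. It is not Distributed R's API or HP's implementation: the function names are hypothetical, and worker threads on a single machine stand in for separate nodes. Each worker summarizes only its own partition, and the partial results are then combined into one answer.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(values, n_nodes):
    """Split a list into n_nodes roughly equal, contiguous partitions."""
    size, extra = divmod(len(values), n_nodes)
    partitions, start = [], 0
    for i in range(n_nodes):
        # The first `extra` partitions take one additional element each.
        end = start + size + (1 if i < extra else 0)
        partitions.append(values[start:end])
        start = end
    return partitions


def partial_stats(partition):
    """Work done on one 'node': summarize only the local partition."""
    return sum(partition), len(partition)


def distributed_mean(values, n_nodes=4):
    """Compute a mean by combining per-partition sums and counts.

    Assumes `values` is non-empty. Threads stand in for nodes here.
    """
    partitions = split_into_partitions(values, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count
```

Because no single worker has to hold the full data set, the same pattern can in principle scale to volumes that would overwhelm one process.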
Even though R has limited value when it comes to working with extremely large volumes of data, many analysts and data miners rely on it, points out Leslie Ament, senior vice president and principal analyst at Hypatia Research Group. Despite its weaknesses, she says, R lets users modify algorithm options to fine-tune analyses, offers a variety of algorithms, and can automate repetitive tasks. "HP's offering should enable fans of R to expand their use of a familiar tool to big data analytics projects without fear of crashing systems," Ament says.
Jeff Healey, director of product marketing at HP Vertica, says that the release of the product comes in response to customers' increasing need for quicker and more precise data interpretations. "We have quite a few customers within healthcare saying that they don't want to take a subset of the data to base it on their models and predictive algorithms and what have you. They need full data sets. Those volumes that in the past were in the gigabyte range, now we have customers that really need full corpus and terabytes of data."
Further, the release is in alignment with HP's larger plan to improve companies' use of big data analytics to increase their understanding of customers. "What we're finding is that companies within retail or online gaming [for instance] can really accurately predict now when a group of customers is going to churn, and they can do something about it—put an offer in place, discount the product—to create a closer relationship with their customers [and] drive increasing revenues," Healey adds. "That's the real power of predictive analytics."
The offering, available through the Vertica Web site, runs on an open-source platform, with enterprise support available from HP. The support is priced per node.
"In order to attain the promised boost in performance by splitting tasks between multiple processing nodes, which is necessary with large data sets," Ament says, "buyers will need to purchase enterprise support from HP, priced per node. This approach to big data analytics may be attractive to data scientists loyal to R, but will obviously come at a price which each organization will need to evaluate for themselves."