Is Hadoop Worth the Hype?

on cost "right away," says Marc Gallman, manager of global data architecture at Lenovo. According to Gallman, Lenovo saved roughly $140,000 on the initial migration alone.

Lenovo's primary goal was to "capture the end-to-end customer journey that spanned large volumes and stretched across a variety of data sources." While the task was manageable with one integration system running, having up to 20 integration systems running in parallel strained operations, Gallman says. After using Talend to move into the Hadoop environment, however, processing became 30 times more efficient.

"Talend makes Hadoop much more user-friendly. We can take someone right out of college and train them on Talend in a matter of weeks, but Hadoop is a different story. With Talend, you don't have to be an expert to move data in and out of the Hadoop environment," Gallman says.

While solutions such as those offered by Talend make the Hadoop migration more manageable for companies, vendors such as MapR tackle the batch-processing lag. To understand what MapR offers, you have to look at how data is traditionally processed, Jack Norris, chief marketing officer at MapR, explains. Traditionally, enterprise storage is in one location, or cluster, while computational and analytics operations take place in another. Because workload production is kept separate from the data warehouse, traditional data processing occurs in disparate environments. With Hadoop, the data lives in one unified environment, so users can access all of these distributed capabilities across their ecosystem.

MapR developed a solution that enhances the Hadoop data platform to make it behave like enterprise storage. "We basically rewrote the platform to make it enterprise grade. It's completely read/write, so there's no jumping back and forth from read-only versions," Norris says. Because users can work on the data directly rather than switching between read-only copies, they save time. "And it's got all the enterprise data protection," he adds.

What makes MapR's offering unique, according to Norris, is that it lets users access Hadoop as easily as they access network-attached storage through the Network File System (NFS); this means faster data management and system administration without having to move any data. In other words, the solution enables users to access and manipulate data directly on Hadoop without running into batch performance lag or other restrictions that prevent data architecture integration. "Rather than running in batch mode, we provide an enhanced data platform that runs analytics in real time," Norris explains. This is a key improvement on Hadoop's functionality, because companies that rely on real-time data, such as music-streaming service Beats Music, cannot afford to wait for delayed processing.

Before being acquired by Apple, Beats Music used MapR's technology to deliver real-time music recommendations to its customers. "Understanding our listeners is at the core of the Beats Music service, and the MapR Distribution for Hadoop [enabled] us to keep up with our listeners' preferences by processing events. With the reliability and performance capabilities of the MapR Distribution, we [were] able to analyze the high volume of data from our users and make music recommendations immediately personalized to them," Brian Rogosky, director of big data engineering at Beats Music, said in a company statement.

Veteran data solution vendors such as Oracle are innovating as well, developing platforms that make Hadoop easier to use and to incorporate into existing data infrastructures. Oracle's latest updates focus on letting users store and analyze structured and unstructured data together and on giving them a set of tools to visualize data and find patterns or problems in it.

At Oracle OpenWorld in September, the company debuted its Big Data SQL and Big Data Discovery tools, and Chris Lynskey, a vice president of product management, showed how the combination of tools can be used to generate a sophisticated set of retail point-of-sale analytics.

Oracle Big Data SQL enables users to keep data in both SQL storage and Hadoop and to analyze it in both places simultaneously, Lynskey demonstrated. "You've got the Oracle database and Hadoop running at the same time, and you only have to query once," he said. Oracle Big Data Discovery, the "visual face for Hadoop," complements the Big Data SQL functionality and allows users to visualize data for prediction and correlation purposes.

Using the retail point-of-sale example, Lynskey illustrated how the tools enable users to determine not only who's shopping most frequently and most recently using data found in Hadoop, but also who's spending the most, information available through traditional 

