Native Apps At The Client & Cloud

Srinivasan Sundara Rajan




Data Lake Phenomenon

Is the Data Lake an effective catchment for all of the enterprise data?

Data Lake Phenomenon Among Enterprises

Over the past few years, there has been an explosion in the volume of data. To tackle this big data explosion, enterprises have launched a growing number of successful Hadoop projects. The large volumes of data, the emergence of Hadoop technology, and the need to store all siloed data in one place have prompted a phenomenon among enterprises called the data lake.

Is the Data Lake an effective catchment for all of the enterprise data?

Yes and no. Data lakes are good for housing current, inter-related data, but they don’t address the need for an enterprise-wide data management system:

  • Since the data lake holds raw data of different types, the business user cannot have controlled access to risk-free, secure, governed, and curated data with semantic consistency, as is the case with an enterprise data warehouse.
  • Enterprise data today is heterogeneous and locked in disparate data sources, and data from these systems is in conflict.
  • A data lake is agnostic to the type of data it receives. Without governance, descriptive metadata, and a mechanism to maintain that metadata, the data lake can easily turn into a data swamp with too much data.
  • Hadoop and related technologies are still nascent even among early adopters, who are mostly conversant with SQL for data discovery and require training in Pig and MapReduce for data access. This slows down time-to-value for enterprises.
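
To make the point about descriptive metadata concrete, here is a minimal sketch of a catalog that refuses to accept a dataset unless its descriptive fields are filled in. It is an illustration only, not how any particular product works: the class names, field names, and file paths below are all hypothetical, and a real lake would persist the catalog rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetMetadata:
    """Descriptive metadata required before a dataset enters the lake (hypothetical fields)."""
    name: str
    owner: str
    source_system: str
    description: str


@dataclass
class LakeCatalog:
    """Hypothetical in-memory catalog mapping lake paths to their metadata."""
    entries: dict = field(default_factory=dict)

    def register(self, path: str, meta: DatasetMetadata) -> None:
        # Reject any dataset whose descriptive fields are blank.
        missing = [key for key, value in vars(meta).items() if not value.strip()]
        if missing:
            raise ValueError(f"missing metadata for {path}: {missing}")
        self.entries[path] = meta

    def find_by_source(self, source_system: str) -> list:
        # Discover datasets through metadata instead of scanning raw files.
        return sorted(
            path for path, meta in self.entries.items()
            if meta.source_system == source_system
        )


# Example paths and values are made up for illustration.
catalog = LakeCatalog()
catalog.register(
    "/raw/suppliers/2016.csv",
    DatasetMetadata("suppliers", "procurement", "erp", "Supplier master records"),
)
catalog.register(
    "/raw/members/2016.csv",
    DatasetMetadata("members", "membership", "crm", "Member organizations"),
)
print(catalog.find_by_source("erp"))
```

The design choice is simply to fail at ingest time: a dataset with no owner or description never lands in the lake, which is one way to keep undocumented data from accumulating.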

Hortonworks has helped with the data lake phenomenon. One example is VHA, the largest member-owned healthcare company in the US, which delivers industry-leading supply chain management and clinical improvement services to its members. The company had its product, supplier, and member information, along with other data, spread across multiple sources and residing in silos. VHA used Hortonworks Data Platform to enable business users to discover the related data and provide services to their members. Because of its previous success with data virtualization using the Denodo Platform, VHA decided to use data virtualization so that business users could discover data using familiar SQL, abstracting away direct access to Hadoop.



More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.