Virtustream Blog

Orchestrating Enterprise Data with Data Hub Service


In recent years, we have seen an explosion of data. Social media, the Internet, IoT and the cloud are some of the sources of the massive and exponentially growing set of data that exists today. According to a recent IDC whitepaper on data growth, total worldwide data is expected to reach approximately 160 Zettabytes (ZB) by 2025. That’s 160 trillion Gigabytes, approximately a 10-fold increase over the worldwide data that exists today.
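The conversion above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal (SI) prefixes, where one zettabyte is 10^21 bytes and one gigabyte is 10^9 bytes. The variable names are illustrative only.

```python
# Decimal (SI) byte units; binary units (ZiB, GiB) would give different figures.
GIGABYTE = 10**9
ZETTABYTE = 10**21

projected_zb = 160   # IDC projection for 2025 cited above
growth_factor = 10   # "approximately a 10-fold increase"

# Convert the projection to gigabytes using exact integer arithmetic.
projected_gb = projected_zb * ZETTABYTE // GIGABYTE

# Today's volume implied by the 10-fold figure (an inference, not an IDC number).
implied_today_zb = projected_zb / growth_factor

print(f"{projected_gb:,} GB")          # 160,000,000,000,000 GB (160 trillion)
print(f"{implied_today_zb} ZB today")  # 16.0 ZB today
```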

Businesses have been making significant investments in their Big Data initiatives for several years. Yet, they are struggling to extract real value from their increasingly complex data landscape. To be a truly data-driven enterprise, your business must be able to monetize its data assets regardless of the type of data and where it resides. Unbelievable but true, less than 0.5% of all accessible data is being analyzed and used. Imagine the opportunity! There is an ocean of data that is waiting to be explored.

Data Management Challenge

The business data landscape is getting increasingly complex and diverse. Data assets are massive, come from various sources, exist in various forms and are everywhere. Data resides on-premises, in the cloud, in enterprise data warehouses, in data lakes and so on. Data is constantly being generated from ERP apps, mobile apps, the Internet and IoT, to name just a few. These data assets comprise both structured and unstructured data, historical and real-time data. Because of this diverse data landscape, enterprises are having a difficult time integrating the data and classifying its use and relevance. To holistically solve these data management challenges, different data processes must be orchestrated to collect, manage and integrate all of the data assets.

SAP Data Hub Vision

The true value of your data assets will be realized when you can bring together data from various locations and sources, then combine, analyze and integrate it alongside your existing business processes in SAP or other mission-critical systems. SAP Data Hub acts as the centerpiece for integrating data-driven processes across enterprise services, processing and orchestrating data across the overall data landscape. It manages data processes and shares them across the enterprise with seamless and unified data operation capabilities. These capabilities include:

  • Data Discovery – Catalogs and profiles data, allowing data scientists to better understand data uses, relationships and quality by providing a detailed and easily understood view of how to integrate data across the data landscape
  • Data Governance – Provides metadata management of all forms of data, resulting in management of the availability, usability, integrity and security of data across the data landscape
  • Data Refinery – Transforms and enriches data natively without ever leaving the data source
  • Tool Orchestration – Orchestrates and integrates existing data integration tools such as SAP Data Services
  • Data Orchestration – Orchestrates the flow of data across data landscapes while meeting data governance requirements and isolating use cases for resource optimization. This allows data to be processed close to its source first and orchestrated afterward.
  • Data Ingestion – Processes data for immediate use or stores data in a database. This means streaming data in real time or in batches and connecting it to data integration tools.
  • SAP Vora – Acts as the runtime and leverages massively distributed processing while bridging various analytics environments such as Hadoop and SAP HANA

Whether you are utilizing SAP HANA for in-memory processes, Hadoop and object storage for persistent data or both, SAP Data Hub enables you to integrate these environments and data integration tools to create powerful data pipelines, connecting data in various formats and making it accessible to numerous analytics algorithms.
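To make the idea of a data pipeline more concrete, here is a minimal sketch in plain Python. It does not use the SAP Data Hub API or reflect how SAP Data Hub is implemented. The stage names (`ingest`, `refine`, `run_pipeline`) and the sample records are hypothetical and exist only to illustrate ingesting data from several sources, refining it and handing it to an analysis step.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical types for illustration; not part of any SAP product API.
Record = dict
Stage = Callable[[Iterable[Record]], Iterator[Record]]


def ingest(sources: dict[str, list[Record]]) -> Iterator[Record]:
    """Pull records from every source and tag each with its origin."""
    for name, records in sources.items():
        for record in records:
            yield {**record, "source": name}


def refine(records: Iterable[Record]) -> Iterator[Record]:
    """Drop records without an amount and normalize the amount to a float."""
    for record in records:
        if record.get("amount") is not None:
            yield {**record, "amount": float(record["amount"])}


def run_pipeline(records: Iterable[Record], stages: list[Stage]) -> list[Record]:
    """Apply each stage in order and collect the final records."""
    for stage in stages:
        records = stage(records)
    return list(records)


# Made-up sample data standing in for an ERP system and an IoT feed.
sources = {
    "erp": [{"order": 1, "amount": "20.50"}, {"order": 2, "amount": None}],
    "iot": [{"order": 3, "amount": 5}],
}

result = run_pipeline(ingest(sources), [refine])
total = sum(record["amount"] for record in result)  # a stand-in "analytics" step

print(result)
print(total)
```

Each stage is a generator, so records flow through one at a time instead of being copied between steps. That loosely mirrors the idea of connecting sources in different formats to a shared analysis step.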

Data Hub Service

Virtustream has a long-running and deep partnership with SAP. We were the first cloud provider to have a production SAP customer in a multi-tenant cloud, the first to run production SAP ECC on HANA in the cloud and the first to run production SAP S/4HANA in the cloud. Recently, we partnered with SAP to offer Data Hub Service.

Virtustream’s SAP Data Hub Service is designed to access, harmonize, transform and process your enterprise data from disparate sources, allowing your organization to leverage existing investments in HANA, Hadoop/Spark and other data management solutions. Major benefits of the service include:

  • Full management below the application container including patch, capacity and upgrade management
  • Integration with SAP environments to enable SAP HANA and BW data integration
  • Support for multi-tenancy with data pipeline isolation
  • SAP Data Hub pipeline deployment and management through Kubernetes
  • Container-based architecture for flexibility, deployment, orchestration and version control

The service also supports a hybrid Data Hub deployment, enabling your enterprise to build applications that harness data where it currently resides: in on-premises enterprise sources, data lakes, data warehouses and the cloud. Overall, our solution off-loads data operation complexities, simplifying management and control of your enterprise-wide data assets to help you monetize your data.

It's About Value Creation

Enterprises’ digital transformation is being fueled by data. At the end of the day, you want to leverage all of your data assets to better understand your customers, aid in innovation and, ultimately, drive revenue growth. By providing a better understanding of data usability, relevance and quality, and by creating powerful data pipelines between various data sources, Virtustream’s SAP Data Hub Service makes your data assets accessible and gives your data scientists greater confidence in the data. In turn, it helps them build data-driven applications that analyze the data for deep and actionable insights. Our goal is to truly empower you with the ability to access and analyze all of your enterprise data, not just the 0.5%.