How open data structures can drive business transformation

by Arun Shankar   6 December, 2016
Peter Sjoberg, Vice President and Chief Technology Officer, Hitachi Data Systems.

The process of business transformation begins with data. The journey usually starts in the datacentre, where storing, preserving and guaranteeing the availability of data is central. The challenge businesses then face is transforming that data into something more meaningful, usually referred to as a business outcome.

“Our mission at Hitachi Data Systems is to turn data into value. We take the ones and zeros that we have been storing for so many years and turn it into something meaningful. Financial services companies are now calling themselves IT companies that deliver financial services. It is information and their use of financial information that sets them apart,” explains Peter Sjoberg, Vice President and Chief Technology Officer, Hitachi Data Systems.

Hitachi as an industrial business has been in existence for more than a hundred years, since the advent of electric motors. A core competency has been its ability to integrate industrial and operational systems and data into one solution. Control systems within Hitachi’s industrial solutions produce vast quantities of data. Now, through its new One Hitachi group approach, it is integrating a third component into its legacy core strength: information technology.

“We have created the Hitachi Insight Group specifically focused on gaining control of data, allowing analysis and insight to be done against the data to produce an outcome, and that is what we really see coming together,” says Sjoberg. “The ability to create and leverage data is not why they exist, but to provide better outcomes. The industrial and operational side is important to us and drives our company.”

Going forward, Hitachi is offering railway as a service as a business and sales model, aimed at delivering successful outcomes for its future railway customers. Under this forward-looking service, Hitachi takes over the operation of a railway system with the goal of delivering passengers on time. The ability to control and manage a railway system using a combination of operational and information technology allows Hitachi to deliver better service and results. “Turning ones and zeros of data into value is everything, leading to better outcomes and better railway systems.”

The process of generating business outcomes from data starts by getting control over data wherever it is being generated. By default, data inside a datacentre is siloed by application or by database, which separates it and limits its ability to be used. Once the data has been brought under control by moving it out of the silos, it needs to be ingested so that it can be used, transformed and leveraged. This is the first and most crucial step, points out Sjoberg. “People do not recognise that.”

There are further steps that complete the cycle. The data needs to be governed so that there is predictable control. It needs to be augmented with metadata, enriching the core objects and allowing comparison and the generation of analytics. It needs to be monetised by developing suitable algorithms that generate the required outcomes, driving business benefits and return on investment. A final touch is the ability to apply rules to dispose of the data, recently termed the right to be forgotten.

“For us it is key to make sure we mobilise that data, which is making it available and useful for us.”

An important requirement of digital business transformation is to guarantee data protection across the entire data lifecycle; in other words, to guarantee that the data has never been changed. This implies that the original data is preserved as it was created and that changes are recorded in the metadata. This allows organisations to host any type of data in their systems, preserve digital data assets as required by specific retention regulations, and guarantee the disposition of the data at the end of that lifecycle.

Hitachi Content Platform from Hitachi Data Systems is one answer to this requirement. Hitachi Content Platform is a software layer that sits on top of traditional storage technology, whether physical or virtual. It decouples the data from the application and helps make it accessible from any application and any location. It also ingests, retains, governs, augments and disposes of the data according to user-defined guidelines.

While the application can still see and access the data, its siloed connections are replaced by a more open architecture. This type of open, networked solution is common in private datacentres. “This technology is the foundation of every storage cloud. Every storage cloud that can be envisioned is built upon this type of architecture,” remarks Sjoberg.

Organisations usually start by governing email and documents, but the requirements for business digital transformation go far beyond that. Hitachi Content Platform helps organisations to begin their journey of digital transformation by allowing them to extend the reach of their data governance. Self-driven cars are being facilitated by years of data collection by car manufacturers and the development of algorithms from the underlying analytics. Financial institutions can meet the requirements and reduce the costs of data compliance and data governance by investing in solutions like Hitachi Content Platform.

While data inside datacentres can be secured, the challenges of protecting data at rest and in flight can be met by encryption, which is built into Hitachi Content Platform. The challenges of managing and storing the exponentially growing volume of content data can also be met by tape drives and optical drives.

Going by Hitachi’s bold move of offering railway as a service, by combining its core competence across industrial, operational, and information technologies, and its spin-off of Hitachi Insight Group, we may see the same innovation appearing soon in other real life use cases as well.


Key takeaways

  • A requirement of business transformation is to guarantee data protection across the entire data lifecycle
  • By default, data inside a datacentre is siloed by application or by database, limiting its ability to be used
  • The process of business transformation begins with data
  • The process of generating business outcomes from data starts by getting control over data wherever it is being generated
  • Turning ones and zeros of data into value is everything, leading to better outcomes
  • While the application can still see and access the data, its siloed connections are replaced by a more open architecture

Description of Hitachi Content Platform

One of IT’s greatest challenges today is an explosive, uncontrolled growth of unstructured data. The vast quantity of data being created, the difficulties in management and proper handling of unstructured content, and the complexity of supporting more users and applications pose significant challenges to IT departments. Hitachi Data Systems provides an alternative solution to these challenges through Hitachi Content Platform. This single object storage platform can be divided into virtual storage systems, each configured for the desired level of service. Hitachi Content Platform assists with management of distributed IT environments and control of the flood of storage requirements for unstructured content, and it addresses a variety of workloads.

Hitachi Content Platform is a multipurpose distributed object-based storage system designed to support large-scale repositories of unstructured data. Hitachi Content Platform enables IT organisations and cloud service providers to store, protect, preserve and retrieve unstructured content with a single storage platform. It supports multiple levels of service and readily evolves with technology and scale changes.

Hitachi Content Platform obviates the need for a siloed approach to storing unstructured content. Massive scale, multiple storage tiers, non-disruptive hardware and software updates, multitenancy and configurable attributes for each tenant allow the platform to support a wide range of applications on a single physical Hitachi Content Platform instance.

By dividing the physical system into multiple, uniquely configured tenants, administrators create virtual content platforms that can be subdivided into namespaces for further organisation of content, policies and access. With support for thousands of tenants, tens of thousands of namespaces, and petabytes of capacity in one system, Hitachi Content Platform is cloud-ready.

One infrastructure is far easier to manage than disparate silos of technology for each application or set of users. By integrating many key technologies in a single storage platform, Hitachi Data Systems object storage solutions provide a path to short-term return on investment and significant long-term efficiency improvements.

Hitachi Content Platform, as a general-purpose object store, allows unstructured data files to be stored as objects. An object is essentially a container that includes both file data and associated metadata that describes the data. The objects are stored in a repository. Each object is treated within Hitachi Content Platform as a single unit for all intents and purposes. The metadata is used to define the structure and administration of the data. Hitachi Content Platform can also leverage object metadata to apply specific management functions, such as storage tiering, to each object. The objects have intelligence that enables them to automatically take advantage of advanced storage and data management features to ensure proper placement and distribution of content.
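The idea of an object as a container for file data plus descriptive metadata, with the metadata driving a management function such as tiering, can be illustrated with a minimal Python sketch. The class, the `days_since_access` field, the 90-day threshold and the tier names are all hypothetical and chosen only for illustration; they are not taken from Hitachi Content Platform.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical model: file data plus the metadata that describes it."""
    name: str
    data: bytes
    metadata: dict = field(default_factory=dict)


def choose_tier(obj: StoredObject) -> str:
    """Pick a storage tier from metadata alone (illustrative rule only)."""
    days_idle = obj.metadata.get("days_since_access", 0)
    return "archive" if days_idle > 90 else "primary"


scan = StoredObject("scan-001.pdf", b"%PDF-1.4 ...", {"days_since_access": 400})
report = StoredObject("report.docx", b"PK...", {"days_since_access": 2})
print(choose_tier(scan), choose_tier(report))
```

The point of the sketch is that the tiering decision never inspects the file contents; it reads only the metadata carried with the object.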

Hitachi Content Platform architecture isolates stored data from the hardware layer. Internally, ingested files are represented as objects that encapsulate both the data and metadata required to support applications. Externally, Hitachi Content Platform presents each object either as a set of files in a standard directory structure or as a uniform resource locator accessible by users and applications via HTTP or HTTPS.

Hitachi Content Platform’s repository object is composed of fixed-content data and the associated metadata, which in turn consists of system metadata and, optionally, custom metadata and an access control list. Fixed-content data is an exact digital copy of the actual file contents at the time of its ingestion. It becomes immutable after the file is successfully stored in the repository. If the object is under retention, it cannot be deleted before the expiration of its retention period, except when using a special privileged operation.
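The immutability and retention rules described above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions: the `Repository` class, its method names and the `privileged` flag are hypothetical and do not represent the product's actual interface.

```python
from datetime import datetime, timezone


class RetentionError(Exception):
    """Raised when a delete is attempted while retention is still in force."""


class Repository:
    """Toy write-once store: content is fixed at ingest and guarded by retention."""

    def __init__(self):
        self._objects = {}  # name -> (data, retain_until or None)

    def ingest(self, name, data, retain_until=None):
        # Fixed content cannot be overwritten once it has been stored.
        if name in self._objects:
            raise ValueError(f"{name} already stored; fixed content is immutable")
        self._objects[name] = (bytes(data), retain_until)

    def read(self, name):
        return self._objects[name][0]

    def delete(self, name, now=None, privileged=False):
        now = now or datetime.now(timezone.utc)
        _, retain_until = self._objects[name]
        # Deletion is refused until retention expires, unless privileged.
        if retain_until and now < retain_until and not privileged:
            raise RetentionError(f"{name} is under retention until {retain_until}")
        del self._objects[name]


repo = Repository()
expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
repo.ingest("contract.pdf", b"signed copy", retain_until=expiry)
```

With this setup, an ordinary `repo.delete("contract.pdf")` before 2030 raises `RetentionError`, while passing `privileged=True` stands in for the special privileged operation.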

Metadata is system-generated or user-generated data that describes the fixed-content data of an object and defines the object’s properties. System metadata, the system-managed properties of the object, includes Hitachi Content Platform specific metadata and POSIX metadata. Hitachi Content Platform specific metadata includes the date and time the object was added to the namespace, the date and time the object was last changed, the cryptographic hash value of the object along with the namespace hash algorithm used to generate that value, and the protocol through which the object was ingested. It also includes the object’s policy settings such as data protection level (DPL), retention, shredding, indexing and versioning.
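As a rough illustration of the system metadata listed above, the following Python sketch records ingest time, hash algorithm, hash value and ingest protocol for a piece of content, and later re-checks the content against the stored hash. The field names and the choice of SHA-256 are assumptions made for the example, not the platform's actual schema or default.

```python
import hashlib
from datetime import datetime, timezone


def build_system_metadata(data: bytes, protocol: str, algorithm: str = "sha256") -> dict:
    """Return a hypothetical system-metadata record for newly ingested content."""
    now = datetime.now(timezone.utc).isoformat()
    return {
        "ingest_time": now,
        "change_time": now,
        "hash_algorithm": algorithm,
        "hash_value": hashlib.new(algorithm, data).hexdigest(),
        "ingest_protocol": protocol,
    }


def verify(data: bytes, metadata: dict) -> bool:
    """Recompute the hash with the recorded algorithm and compare."""
    digest = hashlib.new(metadata["hash_algorithm"], data).hexdigest()
    return digest == metadata["hash_value"]


meta = build_system_metadata(b"invoice 2016-12", protocol="HTTPS")
print(verify(b"invoice 2016-12", meta), verify(b"invoice 2016-13", meta))
```

Storing the algorithm name next to the hash value is what allows the content to be re-verified later without guessing how the value was produced.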

The access control list is optional, user-provided metadata containing a set of permissions granted to users or user groups to perform operations on an object. Access control lists control data access at an individual object level and are the most granular data access mechanism. In addition to data objects, Hitachi Content Platform also stores directories and symbolic links in the repository. Only POSIX metadata is maintained for directories and symbolic links; they have no fixed-content data, custom metadata or access control lists. All the metadata for an object is viewable; only some of it can be modified. The way metadata can be viewed and modified depends on the namespace configuration, the data access protocol and the type of metadata.
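A per-object access control list of the kind described here can be pictured as a mapping from users or groups to permitted operations. The sketch below is a simplified illustration; the principal names, the operation names and the `is_allowed` helper are hypothetical.

```python
# Hypothetical ACL attached to one object: principal -> permitted operations.
acl = {
    "alice": {"read", "write"},
    "auditors": {"read"},
}


def is_allowed(acl: dict, principals: list, operation: str) -> bool:
    """Allow the operation if the user or any of their groups has been granted it."""
    return any(operation in acl.get(p, set()) for p in principals)


# A user is checked under their own name plus the groups they belong to.
print(is_allowed(acl, ["bob", "auditors"], "read"))
print(is_allowed(acl, ["bob", "auditors"], "write"))
```

Because the list travels with the individual object, two objects in the same namespace can grant entirely different permissions.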

A single Hitachi Content Platform consists of both hardware and software. It is composed of many different components that are connected together to form a scalable architecture for object-based storage. Hitachi Content Platform runs on an array of servers, or nodes, that are networked together to form a single physical instance. Each node stores data objects and can also store the search index. All runtime operations and physical storage, including data, metadata and index, are distributed among the system nodes. All objects in the repository are distributed across all available storage space but are still presented as files in a standard directory structure. Objects that are physically stored on any particular node are available from all other nodes.

Hitachi Content Platform has an open architecture that insulates stored data from technology changes and from changes in Hitachi Content Platform itself due to product enhancements. This open architecture ensures that users will have access to the data long after it has been added to the repository. Hitachi Content Platform acts both as a repository that can store customer data and as an online portal. As a portal, it enables access to that data by means of several industry-standard interfaces, as well as through an integrated search facility and Hitachi Data Discovery Suite.

Hitachi Content Platform implements the open, standards-based Internet Protocol version 6, the latest version of the Internet Protocol. This protocol allows Hitachi Content Platform to be deployed in very large scale networks and to comply with the requirements of a number of government agencies where IPv6 is mandatory. Hitachi Content Platform provides IPv6 dual-stack capability that enables coexistence of IPv4 and IPv6 protocols and corresponding applications. Hitachi Content Platform can be configured in native IPv4, native IPv6, or dual IPv4 and IPv6 modes, where each virtual network will support either or both IP versions.

The IPv4 and IPv6 dual-stack feature is indispensable in heterogeneous environments during transition to IPv6 infrastructure. Any network mode can be enabled when desired, and existing IPv4 applications can be upgraded to IPv6 independently and with minimal disruption in service.
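The three network modes can be illustrated with Python's standard `ipaddress` module. The mode names and the `accepts` helper below are hypothetical labels for the native IPv4, native IPv6 and dual configurations mentioned in the text; the sketch only shows which client address families each mode would admit.

```python
import ipaddress

# Hypothetical mode names mapped to the IP versions each one admits.
ALLOWED_VERSIONS = {"ipv4": {4}, "ipv6": {6}, "dual": {4, 6}}


def accepts(mode: str, address: str) -> bool:
    """Return True if an address of this family is usable in the given mode."""
    return ipaddress.ip_address(address).version in ALLOWED_VERSIONS[mode]


for mode in ("ipv4", "ipv6", "dual"):
    print(mode, accepts(mode, "192.0.2.10"), accepts(mode, "2001:db8::1"))
```

In dual mode both address families are admitted, which is what lets existing IPv4 applications keep working while others move to IPv6.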

Multitenancy support allows the repository in a single physical Hitachi Content Platform instance to be partitioned into multiple namespaces. A namespace is a logical partition that contains a collection of objects particular to one or more applications. Each namespace is a private object store that is represented by a separate directory structure and has a set of independently configured attributes. Namespaces provide segregation of data, while tenants, or groupings of namespaces, provide segregation of management. A Hitachi Content Platform system can have up to 1,000 tenants and 10,000 namespaces. Each tenant and its set of namespaces constitute a virtual Hitachi Content Platform system that can be accessed and managed independently by users and applications. This Hitachi Content Platform feature is essential in enterprise, cloud and service provider environments.
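The relationship between one physical system, its tenants and their namespaces can be modelled as nested mappings. The sketch below is a toy model: the class and method names are hypothetical, and only the limits of 1,000 tenants and 10,000 namespaces come from the description above.

```python
MAX_TENANTS = 1_000       # system-wide limit quoted in the text
MAX_NAMESPACES = 10_000   # system-wide limit quoted in the text


class ContentPlatform:
    """Toy model: one physical system -> tenants -> namespaces -> objects."""

    def __init__(self):
        self.tenants = {}  # tenant -> {namespace -> {object name -> data}}

    def add_tenant(self, tenant):
        if tenant not in self.tenants and len(self.tenants) >= MAX_TENANTS:
            raise RuntimeError("tenant limit reached")
        self.tenants.setdefault(tenant, {})

    def add_namespace(self, tenant, namespace):
        total = sum(len(spaces) for spaces in self.tenants.values())
        if namespace not in self.tenants[tenant] and total >= MAX_NAMESPACES:
            raise RuntimeError("namespace limit reached")
        self.tenants[tenant].setdefault(namespace, {})

    def put(self, tenant, namespace, name, data):
        self.tenants[tenant][namespace][name] = data

    def list_objects(self, tenant, namespace):
        return sorted(self.tenants[tenant][namespace])


hcp = ContentPlatform()
for tenant in ("finance", "legal"):
    hcp.add_tenant(tenant)
    hcp.add_namespace(tenant, "mail")
hcp.put("finance", "mail", "q4-report.eml", b"...")
hcp.put("legal", "mail", "nda.eml", b"...")
print(hcp.list_objects("finance", "mail"), hcp.list_objects("legal", "mail"))
```

Both tenants use a namespace called `mail`, yet each sees only its own objects, which is the data segregation the paragraph describes.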

Adaptive cloud tiering expands Hitachi Content Platform capacity to any storage device or cloud service. It enables hybrid cloud configurations to scale and share resources between public and private clouds. It also allows Hitachi Content Platform to be used to build custom, evolving service level agreements for specific data sets using enhanced service plans.

The Hitachi Content Platform portfolio products integrate tightly to deliver powerful file sync and share capability, and elastic backup-free file services for remote and branch offices.

Hitachi Content Platform Anywhere provides safe, secure file sharing, collaboration and synchronisation. End users simply save a file to HCP Anywhere and it synchronises across their devices. These files and folders can then be shared via hyperlinks. Because HCP Anywhere stores data in Hitachi Content Platform, it is protected, compressed, single-instanced, encrypted, replicated and access-controlled.

Hitachi Data Ingestor (HDI) combines with Hitachi Content Platform to deliver elastic and backup-free file services beyond the datacentre. When a file is written to HDI, it is automatically replicated to Hitachi Content Platform. From there, it can be used by another HDI for efficient content distribution and in support of roaming home directories, where users’ permissions follow them to any HDI site. Files stay in the HDI file system until free space is needed. Then, HDI reduces any inactive files to pointers referencing the object on Hitachi Content Platform. HDI drastically simplifies deployment, provisioning and management by eliminating the need to constantly manage capacity, utilisation, protection, recovery and performance of the system.
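The write, replicate and stub behaviour described for HDI can be approximated with a small cache model in Python. This is a sketch under stated assumptions: the `EdgeCache` class is hypothetical, a plain dictionary stands in for the central object store, and "inactive" is interpreted here as least recently used, which the source does not specify.

```python
class EdgeCache:
    """Toy edge file system that keeps a full copy of every file centrally."""

    def __init__(self, capacity, central):
        self.capacity = capacity  # bytes of local space available
        self.central = central    # dict standing in for the central object store
        self.local = {}           # name -> bytes, or None once reduced to a pointer
        self.last_used = {}
        self.clock = 0

    def used(self):
        return sum(len(d) for d in self.local.values() if d is not None)

    def write(self, name, data):
        self.central[name] = data  # every write is replicated centrally
        self.local[name] = data
        self._touch(name)
        self._free_space(keep=name)

    def read(self, name):
        self._touch(name)
        if self.local[name] is None:  # pointer only: recall the content
            self.local[name] = self.central[name]
            self._free_space(keep=name)
        return self.local[name]

    def _touch(self, name):
        self.clock += 1
        self.last_used[name] = self.clock

    def _free_space(self, keep):
        # Reduce the least recently used files to pointers until space suffices.
        for name in sorted(self.local, key=self.last_used.get):
            if self.used() <= self.capacity:
                break
            if name != keep and self.local[name] is not None:
                self.local[name] = None


central_store = {}
cache = EdgeCache(capacity=10, central=central_store)
cache.write("a.txt", b"aaaaaa")
cache.write("b.txt", b"bbbbbb")  # exceeds capacity, so a.txt becomes a pointer
```

After the second write, `a.txt` is held locally only as a pointer, and reading it again recalls the content from the central store while the now less active `b.txt` is reduced in turn.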
