What is it and how to manage it
When it comes to enterprise applications, access to data – and plenty of it – is usually a good thing. And the greater the volume of required data held locally to where it is processed, the better for the business, its applications, its decision-making and, in some cases, compliance.
But the need to store and manage data brings its own problems, including higher costs, lower system performance and management overheads. Here we’re dealing with the idea of data gravity.
There is growing evidence that data-rich systems attract more data. This, in turn, attracts even more data-dependent applications, which then bring in yet more data.
The term data gravity was coined by IT researcher Dave McCrory in 2010. He argued that as organisations gather data in one place, it “builds mass”. That mass attracts services and applications, because the closer they are to the data, the better the latency and throughput.
As more data comes together, the process accelerates. Eventually, you arrive at a situation where it becomes difficult or impossible to move data and applications elsewhere to meet the business’s workflow needs.
As a result, costs rise, workflows become less effective, and companies can run into compliance problems. McCrory, now at Digital Realty, publishes a data gravity index. He expects data gravity, measured in gigabytes per second, to grow by 139% between 2020 and 2024, putting strain on IT infrastructure, he says.
At Forrester, researchers describe data gravity as a “chicken and egg” phenomenon. A recent report on datacentre trends sets out the problem.
“The concept states that as data grows at a specific location, it is inevitable that additional services and applications will be attracted to the data due to latency and throughput requirements,” it says. “This, in effect, grows the mass of data at the original location.”
Harder to scale
Examples of data gravity include applications and datasets moving to be closer to a central data store, which could be on-premise or co-located. This makes best use of available bandwidth and reduces latency. But it also starts to limit flexibility, and can make it harder to scale to deal with new datasets or adopt new applications.
Data gravity occurs in the cloud, too. As cloud data stores grow in size, analytics and other applications move towards them. This takes advantage of the cloud’s ability to scale quickly, and minimises performance problems.
But it perpetuates the data gravity issue. Cloud storage egress fees are often high, and the more data an organisation stores, the more expensive it becomes to move – to the point where moving between platforms is uneconomical.
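A back-of-envelope calculation shows why. The sketch below – in which the flat $0.09/GB rate is an assumed figure for illustration, not any provider’s actual tariff – estimates what it would cost to move a data store out of a cloud platform as it grows:

```python
# Rough estimate of cloud egress cost as a data store grows.
# The $0.09/GB rate is an assumption for illustration; real pricing
# is tiered and varies by provider, region and destination.
EGRESS_RATE_PER_GB = 0.09

def egress_cost(volume_tb: float) -> float:
    """Dollars to move volume_tb out of the platform at a flat rate."""
    return volume_tb * 1024 * EGRESS_RATE_PER_GB

for volume_tb in (10, 100, 1000):
    print(f"{volume_tb:>5} TB -> ${egress_cost(volume_tb):,.0f} to move out")

#    10 TB -> $922 to move out
#   100 TB -> $9,216 to move out
#  1000 TB -> $92,160 to move out
```

The bill scales linearly with the data’s “mass”, which is exactly why a large store becomes ever harder to shift.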
McCrory refers to this as “artificial” data gravity, caused by cloud providers’ financial models rather than by the technology itself.
Forrester points out that new sources and applications, including machine learning/artificial intelligence (AI), edge devices and the internet of things (IoT), risk creating their own data gravity, especially if organisations fail to plan for data growth.
The growth of data at the enterprise edge poses a challenge when locating services and applications, unless companies can filter or analyse data in situ (or possibly in transit). Centralising that data is likely to be expensive, and wasteful if much of it is not needed.
Impact on storage
The impact of data gravity on storage is largely twofold – it drives up costs and makes management harder. Costs will increase with capacity requirements, but for on-premise systems the increase is unlikely to be linear.
In practice, companies will find they need to invest in new storage arrays as they reach capacity limits, potentially requiring significant capital expenditure. But there is a strong chance they will also have to invest in other areas to improve utilisation and performance.
This might involve more solid-state storage, tiering to move less-used data off the highest-performance systems, redundant systems to ensure availability, and storage management tools to control the whole process.
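A minimal sketch of the tiering idea, assuming a simple age-based policy – the tier names and thresholds below are illustrative, not any product’s defaults:

```python
from datetime import datetime, timedelta

# Age-based tiering: data untouched for 90 days leaves the performance
# tier. Tier names and thresholds are illustrative assumptions.
HOT_WINDOW = timedelta(days=90)
ARCHIVE_WINDOW = timedelta(days=365)

def pick_tier(last_accessed: datetime) -> str:
    """Return the target tier for a dataset based on last access time."""
    age = datetime.now() - last_accessed
    if age < HOT_WINDOW:
        return "flash"      # keep on high-performance solid-state storage
    if age < ARCHIVE_WINDOW:
        return "capacity"   # cheaper disk or object storage
    return "archive"        # cold tier: tape or archival object storage

print(pick_tier(datetime.now() - timedelta(days=10)))   # flash
print(pick_tier(datetime.now() - timedelta(days=200)))  # capacity
```

Real storage management tools apply far richer policies, but the principle – demote data by access pattern to control cost – is the same.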
Some suppliers report that companies are turning to hyperconverged systems – which combine storage, processing and networking in one box – to deal with growing storage demands while balancing performance. By bringing processing and data closer together, hyperconverged systems deliver proximity and cut latency. But again, these systems are harder to scale smoothly.
In the cloud, capacity scales more smoothly, so CIOs should be able to match data storage more closely to data volumes.
However, not all businesses can put all their data into the cloud, and even those whose regulatory and customer requirements allow it will need to look at the cost, and the time it takes, to move data.
Proximity of data to processing is not guaranteed, so firms need cloud architects who can match compute and storage capacity, and ensure cloud storage works with their existing analytics applications. They also need to be careful to avoid data egress costs, especially for data that moves frequently to business intelligence and other tools.
Cloud-native applications, such as Amazon QuickSight, are one option. Another is to use cloud gateways and cloud-native technologies, such as object storage, to optimise data between on-premise and cloud locations. For example, Forrester sees companies co-locating critical applications in datacentres with direct access to cloud storage.
At the same time, CIOs need to be rigorous about cost management, and ensure that “credit-card cloud” purchases do not create data gravity hotspots of their own. Technologist Chris Swan has developed a cost model of data gravity for cloud storage, which can give quite a granular picture.
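To show the kind of question such a model answers, here is a deliberately simplified “stay versus move” comparison. This is not Swan’s model, and every rate in it is an assumed figure:

```python
# Simplified "stay vs move" comparison for a dataset in one cloud.
# All rates are assumptions for illustration; real pricing is tiered
# and differs by provider, region, storage class and access pattern.

def monthly_cost(volume_gb: float, storage_rate: float,
                 egress_gb: float, egress_rate: float) -> float:
    """Monthly spend: storage at rest plus data leaving the platform."""
    return volume_gb * storage_rate + egress_gb * egress_rate

volume_gb = 50_000                      # a 50 TB dataset
stay = monthly_cost(volume_gb, storage_rate=0.023,
                    egress_gb=5_000, egress_rate=0.09)  # 10% read out monthly
one_off_migration = volume_gb * 0.09    # egress fee to leave the platform

print(f"Current platform: ${stay:,.0f}/month")                 # $1,600/month
print(f"One-off cost to move out: ${one_off_migration:,.0f}")  # $4,500
```

Even in this toy version, the one-off migration fee dwarfs any plausible monthly saving – the financial expression of the gravity McCrory calls “artificial”.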
Dealing with data gravity
CIOs, analysts and suppliers agree that data gravity cannot be eliminated, so it needs to be managed.
For enterprise CIOs and chief data officers, this means striking a balance between too much and too little data. They should challenge the business on the data it collects, and the data it holds. Is all that data needed? Could some be analysed closer to the edge?
Tackling data gravity also means having robust data management and data governance strategies. This should extend to deleting unneeded data, and applying effective tiering and archiving to cut costs.
Cloud will play its part, but costs need to be controlled. Firms are likely to use multiple clouds, and data gravity can cause costly data movement if application and storage architectures are not well designed. Analytics applications, in particular, can create silos. Firms need to look at the datasets they hold and ask which are prone to data gravity. These are the applications that need to be hosted where storage can be designed to scale.
Tools that can analyse data in situ, removing the need to move large volumes, can reduce the impact of data gravity and some of the cost disadvantages of the cloud. This comes into its own where organisations need to look at datasets across multiple cloud regions, software-as-a-service (SaaS) applications, or even cloud providers.
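As a sketch of the in-situ approach, the example below uses DuckDB’s httpfs extension to aggregate a Parquet file where it sits, so only the small result crosses the network. The URL and column names are hypothetical:

```python
import duckdb

# Query the data where it lives: the httpfs extension reads the remote
# Parquet file in ranges, so only the columns and row groups needed for
# the aggregate are fetched - the bulk data never moves.
duckdb.sql("INSTALL httpfs")
duckdb.sql("LOAD httpfs")

# Hypothetical dataset location and schema, for illustration only.
duckdb.sql("""
    SELECT region, count(*) AS orders, sum(value) AS revenue
    FROM read_parquet('https://data.example.com/sales/2024.parquet')
    GROUP BY region
""").show()
```

The same pattern – push the query to the data, pull back only results – underlies federated query engines and SaaS analytics connectors.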
Organisations should also look at the network edge to see whether they can reduce the volumes of data moving to the centre, and use real-time analytics on data flows instead.
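A minimal sketch of that edge pattern, assuming a sensor stream where only a summary and out-of-range readings are worth forwarding – the threshold and payload fields are illustrative:

```python
from statistics import mean

# Edge-side reduction: summarise each batch of readings locally and
# forward only the aggregate plus out-of-range values. The threshold
# and payload fields are illustrative assumptions.
ALERT_THRESHOLD = 90.0

def reduce_batch(readings: list[float]) -> dict:
    """Collapse a raw batch into the small payload sent to the centre."""
    return {
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
        "alerts": [r for r in readings if r > ALERT_THRESHOLD],
    }

batch = [71.2, 70.8, 93.5, 69.9, 72.1]
print(reduce_batch(batch))
# {'count': 5, 'mean': 75.5, 'max': 93.5, 'alerts': [93.5]}
```

Five readings shrink to one small payload; at IoT scale, that is the difference between a trickle and a flood heading for the central store.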
With ever-growing demand for business data and analytics, CIOs and CDOs are unlikely to be able to eliminate data gravity. But with new and growing data sources such as AI and IoT, they at least have the chance to design an architecture that can control it.