In a previous article, we discussed the ways in which hybrid cloud storage equips storage administrators with a powerful tool to help manage data more efficiently, making it fast and easy to “cloud out” cold data or move performance-intensive data sets from the cloud to the highest tier of on-premises storage. In this article, we take a closer look at how hybrid can help administrators get the upper hand on a specific type of data that is proliferating within IT systems and business intelligence and analytics applications alike — unstructured data.
Even a cursory rundown of the most common types of unstructured data, including Word documents, email messages, presentations, images, video files, and log files, will serve to demonstrate just how prevalent this information can be. However, as in many areas of life, too much of a good thing isn’t always good. An abundance of unstructured data can bog organizations down in major ways. If storage capacity reaches its limit, additional, expensive storage needs to be purchased. Additionally, bulky storage can slow down access to the data, causing application performance to suffer.
A hybrid cloud storage model is ideally suited to help restore order to unstructured data that may have been built on file and folder hierarchies within block storage but needs someplace else to go. This is because the platform gives administrators the ability to view and manage on-premises and cloud storage as a single, unified pool. To make it work, administrators essentially have to make two important decisions.
Decide What Stays and What Goes
So, what do you do when you realize that you have more unstructured data than you know what to do with? If you have a hybrid cloud or hybrid public cloud storage array, the first step is to use a modern data archiving solution to identify which data stays and which data goes. To simplify things, establish the criteria that best suit your organization’s needs. What you view as meaningful criteria will vary from industry to industry and from company to company, but a basic list includes variables such as size of file, type of file, date of creation, and last date of access.
Once you’ve identified data by priority or value, you have the option to automatically, or via policy, move data to free up capacity. As an example, let’s say you define cold data as information created more than two years ago or not accessed in the last 12 months. With hybrid cloud storage, you can set policies to automatically send that data to long-term storage – without losing management control over it. In this case, when a piece of cold data is accessed, it might no longer be considered cold, so it may get moved back onto primary storage if policy dictates.
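The cold-data rule in this example can be sketched in a few lines of Python. This is only an illustration of the policy logic, not the configuration syntax of any particular archiving product: the `FileRecord` type, the tier names, and the helper functions are hypothetical, and the thresholds simply restate the two-year and 12-month figures used above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds from the example in the text (approximated in days).
MAX_AGE = timedelta(days=2 * 365)   # created more than two years ago
MAX_IDLE = timedelta(days=365)      # not accessed in the last 12 months


@dataclass
class FileRecord:
    """Hypothetical metadata record for one file or object."""
    path: str
    size_bytes: int
    created: datetime
    last_accessed: datetime


def is_cold(record: FileRecord, now: datetime) -> bool:
    """Cold if the file is old OR has not been read recently."""
    too_old = now - record.created > MAX_AGE
    too_idle = now - record.last_accessed > MAX_IDLE
    return too_old or too_idle


def target_tier(record: FileRecord, now: datetime) -> str:
    """Return the (hypothetical) tier name this record belongs on."""
    return "archive" if is_cold(record, now) else "primary"


def plan_moves(records, current_tiers, now):
    """List (path, from_tier, to_tier) for records on the wrong tier."""
    moves = []
    for record in records:
        wanted = target_tier(record, now)
        current = current_tiers.get(record.path, "primary")
        if wanted != current:
            moves.append((record.path, current, wanted))
    return moves
```

Because the rule is re-evaluated against current metadata, a file that was archived for being idle is planned for a move back to "primary" once its last-access date is refreshed, provided it is still under the age threshold. Under an "or" rule like this one, a file older than two years stays cold no matter how often it is read, so an organization that wants recently accessed data recalled regardless of age would need a different rule.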
Decide Where to Put Your Data
On premises or in the cloud: where is the best place to move unstructured data when it pushes the capacity limits of your block- or file-based storage arrays? Before we make that call, it’s important to note that no matter where it’s located, object storage may be an appropriate medium because it can handle large amounts of unstructured data at a reasonable price. In addition, object storage allows an administrator to use metadata to manage data on a per-object basis.
Where the data should physically reside depends largely on how often it is accessed, because public cloud providers typically charge bandwidth egress fees. Public cloud storage is often cheaper per gigabyte than on-premises object storage, but it sometimes demands something of a premium when it comes to retrieving your data. For this reason, a long-term archive that is rarely or never accessed (except in an emergency, during disaster recovery, or by special request) may be better served by the public cloud from a cost perspective. Conversely, the public cloud is almost certainly not something you’d depend on for primary storage, again because of usage and egress charges.
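The trade-off between cheaper storage and retrieval fees can be made concrete with a rough cost model. The sketch below is a simplification under stated assumptions: all prices are hypothetical placeholders rather than any provider's actual rates, on-premises retrieval is treated as free, and request fees, minimum-retention charges, and hardware amortization are ignored.

```python
def monthly_cost(size_gb, storage_price_per_gb,
                 egress_price_per_gb=0.0, fraction_read_per_month=0.0):
    """Monthly cost = storage charge + charge for data read back out."""
    storage = size_gb * storage_price_per_gb
    egress = size_gb * fraction_read_per_month * egress_price_per_gb
    return storage + egress


def break_even_read_fraction(cloud_storage_price, cloud_egress_price,
                             onprem_storage_price):
    """Fraction of the data set read per month at which cloud and
    on-premises costs are equal. Below it, cloud is cheaper."""
    if cloud_egress_price <= 0:
        raise ValueError("egress price must be positive")
    saving = onprem_storage_price - cloud_storage_price
    return max(0.0, saving / cloud_egress_price)


def cheaper_location(size_gb, fraction_read_per_month,
                     cloud_storage_price, cloud_egress_price,
                     onprem_storage_price):
    """Compare the two options for one data set."""
    cloud = monthly_cost(size_gb, cloud_storage_price,
                         cloud_egress_price, fraction_read_per_month)
    onprem = monthly_cost(size_gb, onprem_storage_price)
    return "public cloud" if cloud < onprem else "on-premises"


# Hypothetical prices in dollars, for illustration only.
CLOUD_STORAGE = 0.01   # per GB per month
CLOUD_EGRESS = 0.09    # per GB retrieved
ONPREM_STORAGE = 0.03  # per GB per month

if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        print(fraction, cheaper_location(1000, fraction, CLOUD_STORAGE,
                                         CLOUD_EGRESS, ONPREM_STORAGE))
```

With these placeholder prices, an archive that is never read back costs less in the public cloud, while a data set that has half its contents retrieved every month costs less on premises. The useful output is the break-even read fraction, which you can recompute with your own quoted prices.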
Interestingly, if you have multiple physical locations and are simply trying to contain costs associated with primary storage growth, data can be archived or moved from primary storage systems to object-based systems within your own data centers. This gives you the ability to select the storage platform best suited to tier appropriately for the normal course of business.
Of course, “normal course of business” is a relative term. A medical technology company or healthcare provider may have to meet compliance standards that require confidential patient records to be made available for immediate use for a given time period. A graphic design firm, on the other hand, may be well within its compliance comfort zone in defining “cold” data under a different set of variables.
Conclusion: Flexibility Is Key
As you can see, when it comes to storing unstructured data you’re usually dealing with rather dynamic and fluid “if this, then that” scenarios that are subject to change. A hybrid cloud system can furnish you with a cost-effective solution to these changing scenarios over time, but only if you manage the platform and unstructured data based on a few best practices. To sum up, these include:
- carefully identifying unstructured data that you want to keep handy versus data that’s ready for long-term storage
- moving data that cannot be handled by your file and block platforms to a scalable and easily managed object-based platform
- using an on-premises array for tier-2 data you want to access on a regular or semi-regular basis
- sending stale and/or cold data to the cloud for long-term, low-use or no-use storage
To learn more about hybrid cloud storage, please visit https://veristor.com/datacenter/enterprise-storage/cloud-storage.