This post summarizes Cleversafe’s whitepaper “Why RAID is Dead for Big Data Storage”. Find Cleversafe on the web, here. You can find the paper here: http://info.cleversafe.com/rs/cleversafe/images/Cleversafe%20-%20Why%20RAID%20is%20Dead%20for%20Big%20Data%20Storage.pdf?mkt_tok=3RkMMJWWfF9wsRonuqjMZKXonjHpfsX56u0qWKeylMI%2F0ER3fOvrPUfGjI4ATMFhI%2FqLAzICFpZo2FFbFuWDeZJT%2B%2FNY
IT professionals are beginning to consider different approaches to storing and maintaining terabytes of data. Digital content such as video, audio, and images requires massive amounts of storage space. Over the years, most organizations’ data storage option of choice has been the Redundant Array of Independent Disks, or RAID. RAID was developed to store redundant data across multiple disks. According to the whitepaper, RAID can only manage up to 64 terabytes of data before an organization begins to experience data loss and bit errors.
SATA drives experience a bit error, referred to here as a bit rate error (BRE), roughly once every 10^14 (100,000,000,000,000) bits read. Multiple simultaneous drive failures in the same RAID group can cause a data loss condition. In essence, an IT professional will need to rebuild a drive with missing data. This is time-consuming and can be very detrimental when restoring terabytes or petabytes of big data. RAID deployments protect big data by copying information multiple times, known as data replication. As the demand for storage grows, so does the amount of data that must be replicated. Over time, replicating data for large amounts of storage becomes expensive.
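To put the 10^14 figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes bit errors are independent and equally likely for every bit read, which is a simplification; the function name and the example volumes are illustrative and do not come from the whitepaper.

```python
import math

# One bit error per 10^14 bits read, the figure quoted above.
BITS_PER_ERROR = 10**14


def prob_at_least_one_error(terabytes_read: float) -> float:
    """Probability of hitting at least one bit error while reading this much data.

    Assumes independent errors (a simplification) and decimal terabytes
    (1 TB = 10^12 bytes).
    """
    bits_read = terabytes_read * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 so the tiny p does not round away.
    return -math.expm1(bits_read * math.log1p(-1.0 / BITS_PER_ERROR))


# How much data is 10^14 bits? 12.5 decimal terabytes.
tb_per_error = BITS_PER_ERROR / 8 / 1e12

if __name__ == "__main__":
    print(f"10^14 bits is {tb_per_error} TB")
    for tb in (1, 4, 12.5, 64):  # illustrative read volumes
        p = prob_at_least_one_error(tb)
        print(f"{tb:>5} TB read -> P(at least one bit error) = {p:.3f}")
```

Under these assumptions, reading 12.5 TB (the amount that corresponds to 10^14 bits) carries roughly a 63% chance of at least one bit error, which is why rebuilding a multi-terabyte drive is risky.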
Information Dispersal is a new technology designed to overcome the shortcomings of using RAID 5 or RAID 6 to store big data. The core objective of Information Dispersal is to provide accessible data with minimal BRE. Unrecognizable pieces of data are distributed over multiple storage devices and locations. These locations can be local or spread around the globe. Additional bits are stored with each piece of data, making it easier to reassemble the pieces. Resilience is a key benefit of this technique: drive failures or system crashes are unlikely to affect performance.
Information Dispersal is considerably less expensive than RAID combined with replication for storing big data. For example, Information Dispersal could slice data into 16 individual pieces (this number is configurable) and store them on 16 storage nodes located in one data center or dispersed across multiple data centers. In this example, only 10 slices are required to recreate the stored information. An organization can therefore experience 6 simultaneous outages (drive failures, storage node failures, or an entire site failure) or BREs before data is lost, compared with two simultaneous outages for RAID 5/6. Using Information Dispersal can cost an organization 80% less than using RAID 6 with replication, depending on the number of copies made.
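The 16-slice, 10-threshold example can be worked through with a little arithmetic. The Python sketch below is only illustrative: it assumes each slice is one tenth the size of the original data (typical of dispersal schemes, but not stated in this post), that slices fail independently, and the per-slice failure probabilities are made-up values. It does not attempt to reproduce the whitepaper's 80% cost figure.

```python
from math import comb


def failures_tolerated(total_slices: int, threshold: int) -> int:
    """Number of slices that can be lost while the data stays readable."""
    return total_slices - threshold


def storage_expansion(total_slices: int, threshold: int) -> float:
    """Raw storage used per unit of data, assuming each slice is 1/threshold of it."""
    return total_slices / threshold


def prob_readable(total_slices: int, threshold: int, p_fail: float) -> float:
    """Probability that at least `threshold` slices are available.

    Assumes every slice is unavailable independently with probability p_fail
    (a simplification; real outages are often correlated).
    """
    p_ok = 1.0 - p_fail
    return sum(
        comb(total_slices, k) * p_ok**k * p_fail ** (total_slices - k)
        for k in range(threshold, total_slices + 1)
    )


if __name__ == "__main__":
    n, k = 16, 10  # the configuration from the example above
    print("failures tolerated:", failures_tolerated(n, k))
    print("storage expansion:", storage_expansion(n, k))
    for p in (0.01, 0.05, 0.10):  # hypothetical per-slice failure probabilities
        print(f"p_fail={p:.2f} -> P(data readable) = {prob_readable(n, k, p):.10f}")
```

With 16 slices and a threshold of 10, the data survives any 6 losses while using 1.6 times the raw capacity of the original data under the slice-size assumption above.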
As data storage capacity increases, the data protection that a RAID configuration provides decreases. A study was conducted to determine how much data is lost over time. Over an eight-year period, with storage capacity topping out at roughly 524 terabytes, RAID 6 without replication showed a sharp increase in data loss at 512 terabytes within the first year. The probability of data loss decreased as the demand for storage space decreased. Adding replication incurs additional cost in big data environments.
Cleversafe is a data storage company that has taken innovative steps to meet the growing demand for digital storage. Cleversafe’s Dispersed Storage Platform can provide digital storage beyond the limits of RAID 6 with replication. Data transferred from the client to the storage locations is transmitted securely. IT professionals at Cleversafe have taken away the fear of a “resume-generating” event, that is, having to admit that a company’s data has been poorly managed. Cleversafe provides a solution that will not increase cost as the demand for digital storage space increases.