Amazon Outage: Why Cloud Storage Requires Decentralization

Amazon’s recent outage (details here) points to why a centralized storage model won’t cut it in the long-term vision of the cloud. Although Amazon customers realize the economy of scale promised by the cloud, they also give up control over where their data resides in the cloud, leaving it subject to centralized failures.

As a clear result, cloud storage will need to be designed on one key principle – it needs to shift to a decentralized model. So what does it mean to decentralize storage?

When thinking of decentralization, the Internet is the first place one should look since it was designed to be fault tolerant. Storage will need to learn some lessons from the network and behave similarly in order to be successful. TCP/IP decentralizes packet routing, enabling packets to take different paths through the network, so traffic can route around a failed link rather than depend on any single path.

To borrow from the network and to prevent bottlenecks of scale, the data itself will need to take on a “packetization” architecture. This means that besides using packets to traverse the network, the data itself should be packetized and stored as packets.

Rather than saving data to centralized servers in a single data center that’s at risk for local outages, data will need to be virtualized into packets and stored across multiple servers in multiple data centers. When data is requested back, the system is smart enough to gather enough packets to reconstruct the data. An outage in the US Eastern Region data center? No problem.
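To make the idea concrete, here is a minimal sketch of splitting data into pieces so that a lost piece can be rebuilt from the survivors. It uses simple XOR parity as a stand-in, which tolerates only one missing piece; it is not Cleversafe's actual dispersal algorithm, and the names disperse and reconstruct are hypothetical.

```python
from __future__ import annotations

from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size slices plus one XOR parity slice."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    slices = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, slices)
    return slices + [parity]


def reconstruct(slices: list[bytes | None], length: int) -> bytes:
    """Rebuild the data when at most one slice is missing (None)."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only recover one lost slice")
    if missing:
        present = [s for s in slices if s is not None]
        slices = list(slices)
        # The XOR of all surviving slices equals the missing one.
        slices[missing[0]] = reduce(xor_bytes, present)
    return b"".join(slices[:-1])[:length]


data = b"cloud storage needs decentralization"
stored = disperse(data, k=4)  # 5 slices, one per (hypothetical) data center
stored[2] = None              # simulate one data center being down
assert reconstruct(stored, len(data)) == data
```

A production system would use an erasure code that survives several simultaneous losses, but the principle is the same: no single location holds something the system cannot live without.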

But wait – you must be screaming about the latency… First off, networks are still increasing in speed, so by the time the cloud is truly adopted by the masses, the network isn’t going to be the bottleneck. Second, for the early adopters out there, you could configure such a system to have enough packets at local data centers to address latency concerns while still having some packets spread out. That way, if the local data center is down, you can still access data seamlessly.
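The trade-off in that second point can be checked with a few lines of arithmetic. The sketch below assumes a threshold scheme in which any `threshold` packets are enough to reconstruct the data; the site names, packet counts and function names are all hypothetical.

```python
def survives_site_loss(placement: dict[str, int], threshold: int) -> bool:
    """True if losing any single site still leaves at least `threshold` packets."""
    total = sum(placement.values())
    return all(total - count >= threshold for count in placement.values())


def local_read_possible(placement: dict[str, int], local_site: str, threshold: int) -> bool:
    """True if the local site alone holds enough packets to reconstruct."""
    return placement.get(local_site, 0) >= threshold


# Four packets are needed; four are kept locally and four are spread out.
good = {"local": 4, "remote-a": 2, "remote-b": 2}
assert local_read_possible(good, "local", threshold=4)
assert survives_site_loss(good, threshold=4)

# Too few remote packets: fast local reads, but a local outage loses access.
risky = {"local": 4, "remote-a": 2}
assert not survives_site_loss(risky, threshold=4)
```

Under this assumption, serving reads locally and surviving a local outage together require at least as many packets outside the local data center as the threshold itself.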

Once we can all embrace a decentralized storage architecture, a whole slew of design requirements comes into play. For example, how does the storage system optimize reading and writing of the packets knowing that not all of them may be available? How can such a decentralized system be used for content distribution to the masses?

This naturally leads to Cleversafe, where we are already packetizing data into something we call slices and already thinking of how a decentralized world of storage will work.

So how could the Amazon outage be avoided and what will the cloud storage model need to evolve to in order to be stable? Move to a decentralized platform.

Iron Mountain shutters – data security needed for public cloud adoption

I, like many, was surprised by Gartner’s note regarding Iron Mountain’s decision to close its cloud storage service. It reminded me of recent conversations with Terri McClure of ESG, where I mentioned that one of the fundamental problems with cloud storage is that service providers are essentially asking customers to trust them with their data, and most enterprises don’t feel comfortable moving data outside their four walls. Supporting this line of thought, Terri pointed out that security and control still remain top inhibitors for cloud adoption.

Cloud services that are succeeding have added value beyond cost per GB. Iron Mountain is still running their File System Archiving (FSA) business which offers policies, indexing and classification.

It’s time for service providers to focus on adding value with data security. It would go a long way to aiding public cloud adoption.

So what does it take to achieve data security?

If service providers couldn’t actually access the data, would that make customers more comfortable with adopting public cloud services?

Suppose customers were in control of virtualizing their data to an unrecognizable format before sending it to the public cloud. And only they had control to put their data back together again.
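One simple way to picture this is XOR secret splitting, sketched below. It is an illustration under stated assumptions rather than a description of how Cleversafe works: the customer turns the data into several pieces, any one of which is indistinguishable from random bytes, and only someone holding every piece can put the data back together. The function names split and combine are hypothetical.

```python
import secrets


def split(data: bytes, shares: int = 2) -> list[bytes]:
    """Split data into `shares` pieces; all of them are needed to recover it."""
    if shares < 2:
        raise ValueError("need at least two shares")
    # Every piece but the last is pure randomness of the same length as the data.
    pads = [secrets.token_bytes(len(data)) for _ in range(shares - 1)]
    last = data
    for pad in pads:
        last = bytes(a ^ b for a, b in zip(last, pad))
    return pads + [last]


def combine(pieces: list[bytes]) -> bytes:
    """XOR all pieces together to recover the original data."""
    result = bytes(len(pieces[0]))  # all-zero starting value
    for piece in pieces:
        result = bytes(a ^ b for a, b in zip(result, piece))
    return result


record = b"customer record"
pieces = split(record, shares=3)  # e.g. one piece per provider
assert combine(pieces) == record
```

The catch with this particular scheme is that losing any one piece loses the data, so a real deployment would pair this kind of confidentiality with redundancy.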

In this view of the future, service providers would stop asking customers to trust them with their data, and instead position themselves as operators who can deliver on SLAs.

And BTW – such a solution is possible with Cleversafe. Today.

SNW – Cloud Storage has Arrived

SNW Spring 2011 – Cloud Standards Have Arrived

This is my fifth year attending Storage Networking World (SNW), and each year I continue to expect new and interesting developments in storage; once again I was not disappointed. The theme of Cloud Storage and its role in the enterprise persisted at this year’s conference, along with Solid State Drives, virtualization and the role of replication/deduplication for data protection. However, of particular interest was the elevated discussion around security in the cloud and how encryption and other security methods are taking front row prominence when discussing Cloud Storage.

I had the pleasure of talking with various industry analysts, including Robin Harris of Storage Mojo (one of my favorite storage sites, by the way) and David Floyer of Wikibon. Interestingly, both commented on the lack of any real “noteworthy” announcements at this year’s SNW spring conference, with no high-profile vendor announcing any breakthrough technology or product offering. Contrast this with the fall 2009 SNW conference, in which Xiotech announced the launch of their Emprise product, complete with a magician who handed out dollars like they were candy and about 30 people manning their booth. And while this seemed overly extravagant at the time, looking back it was highly entertaining and conveyed a sense of excitement and growth within the storage industry.

Additionally, this year I began a stint as co-chair of a new Storage Networking Industry Association (SNIA) group titled “Cloud Storage Initiative Cloud Archive and Long Term Preservation.”  Our focus is on Cloud Storage Archiving and the promotion of archiving and preservation within the cloud to take advantage of the cloud’s cost efficiencies and projected high availability benefits.  As a participant in the SNW Cloud Pavilion, we had the opportunity to participate in the inaugural Cloud Storage Theatre Presentation which allowed participants 5 minutes to describe their company and a topic around cloud storage. I took this opportunity to expand upon the growing challenges and costs associated with ensuring high-availability for petabyte scale storage systems; it’s an area where Cleversafe excels as a result of its information dispersal algorithm technology, which eliminates the need for RAID and replication.  I was actually surprised at how well attended the Theatre presentation was, with about 50 people staying to watch all of the Cloud Pavilion participants.  I am already anxiously awaiting SNW’s fall show with the expectation that Cleversafe will have an even stronger presence.

Epsilon catastrophe points to need for better data protection

Last week, what is thought to be one of the biggest data breaches in U.S. history occurred when a hacker got into online marketer Epsilon’s system that houses customer emails for a wide range of companies.

I personally received a notice from Chase, and a colleague forwarded breach notices from Target, Verizon, and World Financial Network National Bank, which owns Ann Taylor, Victoria’s Secret and Best Buy store cards.

Epsilon has said around 2% of its 2500 customers were affected. It’s highly likely that you are on one of these folks’ lists and received a notification too. Here’s a partial list based on the companies who have reached out to their customers:

Ameriprise Financial, Barclays Bank of Delaware, Bebe, Best Buy, Brookstone, Capital One Bank, Citi, City Market, The College Board, Dillons, Disney Destinations, Eddie Bauer, Ethan Allen, Food 4 Less, Fred Meyer, Fry’s, Hilton Hotels, Home Shopping Network, JPMorgan Chase, King Soopers, Kroger, Lacoste, LL Bean Visa Card, British retailer Marks and Spencer, Marriott Rewards, McKinsey & Co., Moneygram, New York & Company, Ralphs, Red Roof Inns, Ritz-Carlton Rewards, Target, TD Ameritrade, TiVo, U.S. Bank, Verizon and Walgreens.

Epsilon is trying to downplay the breach by saying it is just email addresses, but there are many non-tech-savvy users out there who will fall victim to email campaigns from people posing as legitimate businesses to gain further personal information.

The Epsilon catastrophe points to a need for better information security for Data at Rest. Storing data as actual data leaves it vulnerable to security breaches. Although Cleversafe hasn’t optimized information dispersal for transactional systems, it is only a matter of time until the fundamentals of dispersal, where information isn’t stored as actual data but as slices or fragments that are not centralized, become a dominant architecture for protecting data. Once information is split and decentralized, breach exposure decreases because the cost of breaching the data is high: hackers would need to break into multiple decentralized systems rather than a single centralized system.
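A rough back-of-the-envelope model shows why the cost goes up. The sketch below assumes each system is breached independently with the same probability p, and that an attacker needs slices from at least k of the n systems to recover anything; both assumptions and all the numbers are illustrative, not measurements.

```python
from math import comb


def breach_probability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent systems are breached,
    where each system is breached with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


centralized = breach_probability(n=1, k=1, p=0.1)   # one system holds everything
dispersed = breach_probability(n=16, k=10, p=0.1)   # need 10 of 16 systems
assert dispersed < centralized
```

Real breaches are rarely independent, since shared credentials or a common software flaw can expose many systems at once, so the model overstates the benefit when the decentralized systems are not truly separate.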

Considering Storage as a Service

I recently wrote an article for Computer Technology Review on the topic of Storage as a Service, and in particular, how one should weigh the cost and scaling benefits against security.  The conclusion I present in the article is that the idea of Secure Storage as a Service is not oxymoronic, but it is elusive.  Some investigation and research is required to determine whether the underlying storage mechanisms of a provider align with the security policies of the organization using that service.  In the article, I present some of the advantages storage as a service offers, and I wrap the article up with a set of questions that companies should consider to begin the process of investigation and evaluation of storage provider candidates.  Read the full article here.