Response Part 2: Complexities of Key Management

In a recent post we discussed some drawbacks of encryption.  We received a lot of feedback, both negative and positive. Now we’d like to explore some of the math behind that claim and clarify some of the points we made.

This is the second post in a 3 part series. We will discuss the following points in these three posts:

  1. Today’s encryption systems are designed to provide a sufficient level of security for protecting against today’s threats. These systems are not designed to provide security for long term storage over decades. The security benefits of Dispersal are more resistant to advances in computing power and mathematical discoveries.
  2. Today’s storage systems generally make a trade-off between confidentiality and availability. Systems which accomplish both do exist, but often at the expense of low storage efficiency. Dispersal allows the user to achieve high levels of confidentiality and availability while minimizing storage inefficiency.
  3. Given current laws, companies must disclose data loss regardless of whether data was encrypted or not. Dispersal changes the conversation because a typical loss event will result in the original data being mathematically impossible to recover from any lost components.

Confidentiality vs. Availability


When storing important confidential data, one has two goals:

  • Availability: The data should always be available to the authorized entities, and never be lost
  • Confidentiality: The data should be kept private and inaccessible to unauthorized entities

So simultaneously we must keep the data as available as possible to the right individuals while also keeping it as unavailable as possible to the wrong individuals. Traditionally, confidentiality is achieved using encryption. By encrypting the data all efforts toward confidentiality can focus on the key, because encrypted data remains private to someone who doesn’t have the key.

One could stop here, having achieved a decent level of confidentiality but only if one doesn’t care about losing the data. In the current state, such data would be highly vulnerable to loss. If the hard drive, CD-ROM, thumb drive or whatever media storing the key is lost, breaks or becomes corrupted then the encrypted data will remain forever irretrievable. Likewise any similar problem with the media storing the encrypted data will cause a loss of data.

One way to deal with the threat of loss in this situation is to make multiple copies. Backup the encrypted data to tape, or to another off-site location. Likewise store multiple copies of the key in different locations to prevent the failure or loss of any one device from causing a loss of data. In doing so one will have achieved a decent level of availability.

In making all these copies to maintain availability, what have we done to our confidentiality? We now have multiple copies of encrypted data, and multiple copies of the key. Each copy represents another attack vector for an adversary, and another thing which attention must be focused on protecting. The more we try to enhance availability the worse things become for confidentiality.

If only we could have the best of both worlds: a high level of availability AND confidentiality.

Secret Sharing


Secret Sharing schemes promise just that: high availability and confidentiality. They have been known for a long time, having been invented independently by Adi Shamir and George Blakley in 1979, and some key management systems use it for storing keys.

The basic operation of a secret sharing scheme is to take a secret, in this case a key, and create some number of shares from it. To recover the secret, some user-defined minimum number of shares must be used in the calculation. This provides high availability; a number of shares can be lost and so long as you can obtain the minimum number required, you can get the key back. For example, lets say we created 5 shares for a key, and made the minimum number required for recovery 3. In such a case we can tolerate the loss of any two shares, much as if we had made 3 copies but without the associated loss in confidentiality.

In fact, the security of storing the key as shares greatly enhances the confidentiality, beyond that of storing a single copy of a key. Again using the 5/3 (5 shares, 3 needed) example, if a single share is exposed, stolen, or otherwise compromised the privacy of the key is maintained. Even if two shares are simultaneously revealed, the confidentiality of the data is maintained, because the threshold of 3 is not met. The practicality for an adversary to locate and steal multiple shares from potentially different locations is much harder to pull off than gaining access to a single key at a single location.

Information Theoretic Security

Another benefit of secret sharing schemes is that they provide information theoretic security. This means that with less than the threshold number of shares, there simply isn’t enough information available to figure out what the secret is, even if one had infinite computing power or time to try to crack it. To provide this level of security, however, comes at a cost: Every share needs to be the same size as the original secret. This is generally not a problem when storing something small like a key, which is usually less than a Kilobyte in size, however it makes secret sharing schemes impractical for bulk data storage.

To see why, imagine you are designing a secure storage system which needs to store 500 GB of data. You learned of secret sharing systems and want to use one to securely store that 500 GB of data. Lets say you decide to create 5 shares of which any 3 are needed to recover the data. To do this will require buying 2,500 GB of storage, since each of the 5 shares will be of equal size to the data being stored. Therefore you have a 5-fold increase in storage requirements and therefore a 5-fold increase in the cost of the system.

Wouldn’t it be great if there were a way to store information efficiently while keeping the availability and confidentiality of secret sharing schemes?

Dispersed StorageTM Technology


Dispersal, when combined with an All-or-Nothing Transform (AONT) makes such a thing possible. To achieve this level of efficiency requires that we abandon information theoretic security, but the practical benefits of information theoretic security are minor.  One-Time-Pad encryption is a type of encryption which is provably unbreakable, because it provides information theoretic security, but like secret sharing, its theoretical benefits simply don’t outweigh its practical costs when used for bulk data encryption.  Given that the level of security provided by the AONT can be set arbitrarily high (there is no limit to the length of key it uses for the transformation), information theoretic security is not necessary as one can simply use a key so long that it could not be cracked before the stars burn out.

By dropping the requirement for information theoretic security we can achieve the highest theoretically possible efficiency for the given level of fault tolerance. Cleversafe is unique in providing information dispersal with an AONT and was the first to make secret-sharing-like systems for efficient bulk data storage. Unlike shares, slices are only a fraction of the size of the original data, specifically they are 1/threshold the size. If we again look at the 5/3 example, instead of storing 5 shares each of which is 500 GB, we would instead store 5 slices, each of which is (500/3) or 167 GB. Therefore the total storage requirements would be 5*167 or 835 GB, this is 1/3rd the cost of the 2,500 GB system.

The availability and confidentiality benefits of dispersal become even greater the wider the dsNet. A common configuration we recommend is 16/10, that is 16 slices, with a threshold of 10. This setup has greater availability than say the 5/3 example because we can tolerate the loss of 6 slices simultaneously, compared with 2 for the 5/3 setup. Furthermore it becomes more confidential because an attacker would need to compromise 10 slices, not just 3. Surprisingly, the system is even more efficient, a 16/10 configuration would need 16 50 GB slices, so it needs only 800 GB not 835. Note that this is less costly than creating just a single copy which would require a total of 1,000 GB.

Fewer Headaches

With Dispersed StorageTM Technology there are fewer headaches than with existing key management systems. This is true for a number of reasons. The first is that one no longer has to worry about having to trade availability for confidentiality, or vice versa. One need not choose which is more important; both are important, and the storage system should reflect that fact. Dispersal provides the high availability and confidentiality of secret sharing schemes, and it can do it for your data directly. No separate system for storing keys is needed, it would be superfluous and only hurt availability.

Availability will always be detrimentally affected by adding a key management system, because it is one more thing which can fail. If the key management system fails and you lose your keys, then you’ve also lost all your encrypted data.  Short of replicating the key many times or storing shares in many locations, existing key management systems cannot match the availability that information dispersal provides.  Therefore the key management system will be the weakest link in the chain and the more likely path to data loss.

The All-or-Nothing Transform merges the key and the data such that only one storage system is required, it meets both the confidentiality and availability requirements.  The AONT is also not vulnerable to advances in math or quantum computers as RSA and ECC are, nor does it at any point rely on passwords which can be easily cracked or forgotten.

Diagrams

Encode Steps:

aont-diagram1

All-Or-Nothing Transform

  1. Generate random symmetric key: R
  2. Encrypt data
  3. Calculate hash of encrypted data: H
  4. Append (H XOR R) to the encrypted data

Information Dispersal (configured with N/K)

  1. Add padding to data up to next multiple of K bytes (using PKCS5)
  2. Divide padded data into K equal pieces, these are the first K slices
  3. Using Reed-Solomon codes, compute (N-K) additional slices to provide forward error correction
  4. Send each of the N slices to a different storage node

Decode Steps:

aont-diagram-decode1

Information Dispersal (configured with N/K)

  1. Request slices from any K of N storage nodes
  2. Using Reed-Solomon codes, compute any of the first K slices that are missing
  3. Concatenate the first K slices in the correct order
  4. Remove padding from the concatenated data

All-or-Nothing Transform

  1. Strip the appended masked key from the end of the data
  2. Calculate the hash of the encrypted data: H
  3. XOR the hash with the masked key to recover the random key: R
  4. Use the key to decrypt the data

Explanation

Ordinarily the availability world be hurt by using an All-or-Nothing transform and storing just the divided pieces in different locations, this is where the IDA comes to the rescue.  By creating additional code slices we can recover from situations in which some of the data slices are missing or otherwise not available.  Just how many are required is determined by the user-configurable threshold, which in the above diagrams is 5.  So so long as any 5 are still recoverable then the entirety of the All-Or-Nothing Transformed data can be recovered, and thus the original data can be as well.

Looking at how the decode operation works, one can see the criticality in having ALL of the data.  Without all of the data in its entirety one cannot compute the digest of the encrypted data, and therefore one cannot un-mask the random key.  Without knowledge of the random key NOTHING of the original data can be decrypted.

Costs

It may seem magical that all these goals can be achieved, but the benefits are not without cost.  Dispersed StorageTM Systems have requirements beyond those of most existing storage systems.  To achieve the full benefits of dispersal requires infrastructure: multiple sites, high bandwidth connections, at least as many computers as the IDA’s configured width, memory for managing TCP connections, and CPU power for encoding and decoding slices.  For these reasons dispersal may not be an option for everyone, however if one has massive storage requirements, desires the utmost reliability or security, or already has the existing infrastructure then dispersal may be the ideal storage solution.

Please stay tuned for the final blog posts in this series where we will explore disclosure laws and their impact on businesses.

4 Comments

  1. Posted June 26, 2009 at 8:31 am | Permalink

    Hi Jason,

    As a follow-up of my questions on the previous article, i have some comments:

    1. As stated in the Ron Rivest paper about AONT, “We note that the all-or-nothing transformation is not itself “encryption”, since it makes no use of any secret key information”. The interesting property is that if you do not have ALL the data, you cannot decrypt it.

    2. At the end, there is no secret sharing scheme, such as Shamir or Blakley. The IDA scheme is the application of Reed-Solomon codes.

    3 I’m not sure but i think that because you can reconstruct the data without having it all (but having several blocks on different copies) that collides with the AONT.

    May you please give me your analysis on these 3 points?
    Thanks

  2. jresch
    Posted June 26, 2009 at 9:07 am | Permalink

    Sergio,

    In point one, I think what Rivest was saying is that the AONT state is easily reversible, so long as no data is missing. In that sense the data isn’t even really encrypted, since there is no additional key involved. The security in our system comes from the way that without a threshold number of slices one cannot get all the data.

    Right we do not use a secret sharing scheme like one proposed by Shamir or Blakley. The secret schemes they proposed are threshold based schemes which rely on information theoretic security to ensure that with less than a threshold number of shares no information can be obtained. Instead we use Reed-Solomon codes to provide the threshold system, and the AONT to ensure that with less than a threshold number of slices it is very difficult to get any information about the original data. By difficult, I mean as hard as brute forcing the random symmetric key used in the transform.

    Regarding your final point you are right. We start with the All-Or-Nothing Transformed data, but the IDA makes it such that the data becomes a “Some-Or-Nothing” scheme, in that some threshold number of slices is needed to get all of the original All-Or-Nothing transformed data. The number of bytes needed to get to that point is still the same as the the length of the transformed data, we just have more options as to where we may retrieve slices from. Without the IDA, splitting the transformed data would always be an N of N system, the IDA allows it to become a K of N system much like the secret sharing schemes proposed by Shamir and Blakley, but without the massive overhead those schemes have.

    I hope this clarifies things for you, please let me know if you still have any lingering questions.

    Jason

  3. Posted July 25, 2009 at 10:28 pm | Permalink

    I agree with many of your points. I would like to make a few of my own.

    1) If you are already paying the large penalty to Reed Solomon the encrypted data, the cost of your secret sharing scheme is a small additional cost to bear, agreed. Using the hash to “prove” you have all the pieces is cute and does turn Reed Solomon into an AONT. I will argue that if you were to do a Blakley key split of a random key, and append each portion to each portion of the encrypted file you would get similar performance results. I will give you that your scheme is simpler to describe.

    2) In my opinion, key management is more about process than cryptography. The whole premise of Shamir and Blakley is that each share is independently managed. In your case, they are not. All of the pieces are managed by the same people, possibly in the same data center, etc. Because of this, some could argue that the encryption has little value, not because it is bad crypto, but because the shares are not independently controlled. I agree that if someone steals one piece, they have nothing. They will argue, that if someone can steal one piece, it is feasible to steal the rest.

    3) Unless broken drives are degaussed, if they are discarded, they can be considered lost. Because of this, there will be drive loss all the time (3% per year according to several papers). As long as all N pieces are not on the same media, you can actually lose the media, no problem. This can be expanded to a loss of a server, raid controllers, NAS box, etc. without problem as long as there is only N-1 pieces, no problem. What if you lose N chunks (drives, systems, etc.) over time, are you sure you have not lost control of someone’s data? Have you tracked what was on each and every lost drive? What is your process when you do a technology refresh and retire a complete configuration? If media destruction is still necessary, will resulting operational process really any easier or safer than if the data were just split?

    4) What do you do if you believe your system has been compromised by a hacker? Could they have read N pieces? Could they have erased the logs?

    5) I also suggest that there is other prior art out there for this kind of storage system. I suggest the paper “POTSHARDS: Secure Long-Term Storage Without Encryption” by Storer, Greenan, and Miller at http://www.ssrc.ucsc.edu/Papers/storer-usenix07.pdf because it covers the same space, and has a good set of references to other systems.

    My final comment is that you raised the bar, yes. I will argue that you did not make the case that key management is not needed. Secrets are still needed to get past the residual problems described in these comments. Keys are small secrets that can be secured at lower cost that securing the entire bulk of the data. Your system requires the bulk of the data to to be protected, and thus in the long run, does not offer operational efficiency that simple bulk encryption with a traditional key management provides.

    Jim

  4. jresch
    Posted May 17, 2010 at 1:53 pm | Permalink

    James,

    Thank you for your comment, it is very insightful. I will attempt to address the five points you raised.

    1) What you describe would indeed offer similar benefits and performance, however we considered it to be architecturally more complicated to integrate, since the pre-IDA data transformation would have to affect the state of the post-IDA transformation of individual slices. Using the AONT there is a single location which we can apply the transformation on the entire data source.

    2) One of the key pieces of our technology is the concept of Information Dispersal, which ideally occurs over geographic distances. Before we integrated the AONT we still advocated geographic separation between sites for improved independence of failures and outages. With AONT is offers a very strong security story, as storage of slices is split between distinct sites, and the hardware is managed by different people. Under this configuration, the process is likely better than than typical keyed-encryption deployments.

    3) One technology we are looking into to deal with lost disks is harddisk encryption based on a key stored in a TPM chip. This greatly simplifies the process for RMA’ing defective disks, the disks store no useful information and even with all N disks, one would learn nothing. Further when it comes to retiring an entire set of old machines, the key can be quickly erased. Even short of this, AONT offers a lot of advantages, especially in a geographically dispersed configuration. If an attacker was dumpster diving for discarded drives, they would have to be lucky enough to find a threshold number of drives which belonged to the same set of N disks. While certainly feasible for a well funded adversary, it is unlikely that a casual dumpster diver would have such luck.

    4) Regarding the detection of compromise, when it comes to what an attacker could compel a machine to do anything is possible. However, to foil the plans of an attacker one may institute a data refreshing period. For example, the scheduled might be to read and re-write all data in the system every 6 months. This process of refreshing the data makes old slices an attacker may have compromised obsolete. Therefore it limits the window of time an attacker has to get a threshold number of machines to six months. If one believes a machine has been compromised, that machine ought to be re-imaged, and all peer-slices to the ones stored on that machine should be refreshed as soon as possible.

    5) We’ve become aware of the POTSHARDS paper since announcing our technique. While POTSHARDS achieves a similar goal for a key-less encryption system, it was based on information theoretically secure secret sharing schemes. The advantage of information theoretic security is it can never be brute forced. However, the downside is every “share” is going to be the same size as the original data. Therefore in a 10-of-15 configuration, the data size will be 15X the original. Using Reed-Solomon with AONT forms a computationally secure secret sharing scheme. These slices could be brute forced but the difficulty would be equivalent to brute forcing the underlying cipher. The advantage is that each “slice” is only 1/Threshold the size of the original data. So a 10-of-15 configuration would only require 1.5X the storage of the original data. Therefore the level of performance and storage efficiency for our approach is significantly higher than POTSHARDS.

    Regarding your final comment, I believe the POTSHARDS paper makes a reasonable case for why key management isn’t strictly required to build a secure system. What you say, however, is that costs are lower to protect a key than to protect data, because keys are smaller in size. I’m not convinced of this, however. Keys are typically smaller than data to make them more manageable, but I don’t think this lowers the cost of securing them. In a typical keyed-encryption system, there will be at least two systems, one for storing keys and another for storing data. However, just because one location is storing encrypted data doesn’t mean it doesn’t have to be secured. The storage location still needs to be secured for availability reasons. If anyone can access the storage location, they could corrupt or destroy that data, making it permanently unavailable.

    In a dispersed storage system, there are typically more than two locations that need to be secured, however the more locations the higher the security guarantees for the system. For example, one might choose a 2-of-2 configuration for dispersal, and the security properties and storage overhead would be identical to a typical keyed-encryption system. 2 locations need to be compromised to yield the data, the loss of one location causes the data to be unavailable, etc. However by using a 10-of-10 configurations, while you need to secure 10 locations instead of 2, for an attacker to break confidentiality they would need to compromise 10 locations instead of 2. This affords users more flexibility regarding the level of security they achieve for their data.

    Jason

2 Trackbacks

  1. [...] in defense of the claims made in this post, we have written three follow up responses: Part 1, Part 2, and Part 3 which we invite you to [...]

  2. [...] approach is not only clearly explained, but relies on well known and analyzed techniques for achieving data security.  Moreover, it [...]

Post a Comment

Your email is never shared. Required fields are marked *

*
*