HPE are a bit late with this release; it's normally January that people want to start losing weight. Well, not 3PAR: its gut-busting, data-crunching release comes in the form of 3PAR OS 3.3.1, which combines existing data reduction technologies with new ones, including compression. To see what else is new in the 3PAR OS 3.3.1 release check out this post.

The new data reduction stack in 3PAR OS 3.3.1, and the order in which the technologies are applied, is shown in the following graphic. Data Packing and compression are new technologies. There are no changes to Zero Detect, but dedupe receives a code update. Zero Detect is one of the original thin technologies and removes zeros from writes. Most of you are probably already familiar with it, so let's focus on the new and updated technologies, stepping through each in turn.

Dedupe continues to operate as before: incoming writes are analysed in 16K pages, a hash is assigned to each page, and the hash is checked to see whether it is unique to the system. This is all done inline, before the data is written to disk. What does change is how deduped data is stored on disk: it is now written to a shared area within the CPG, whereas previously it was written to a private space in the volume. How data is now stored on disk is shown graphically below.

There are two elements to how deduped data is stored in a system: a private space that exists inside every volume, and a shared space, of which there is one per CPG. When a write comes into the system and has a unique hash (i.e. it has never been seen before) it is placed in the private area of the volume. When a write comes into the system and has a known hash (i.e. it is a duplicate) it is placed in the shared space. This amendment effectively determines where unique and deduped data is stored.

Given that the average dedupe level seen on a 3PAR is 2:1, it would be logical to assume that half the blocks are deduped and half unique, so why would we care where the blocks are stored if the number of each is equal? Further analysis has shown that it is not half of the data that is deduped; rather, about 10% of data is deduped several times over, giving an overall dedupe ratio of 2:1. In summary, a significantly greater proportion of data will be unique than deduped.

We know that only approximately 10% of data will be deduped, so the net result of this design is that the majority of data will sit in the private area. This is advantageous because when a block that exists in the shared area is deleted, several checks need to be made to confirm whether the block is still required. These sanity checks are too resource intensive to perform in real time, so they must be performed as a post-process.
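
To make the private versus shared placement a bit more concrete, here is a minimal Python sketch of the rule described in this post: a page with a never-seen hash lands in the volume's private area, and a page with a known hash lands in the CPG's shared area. The `Cpg` class, the use of SHA-256, and the choice to move the first private copy into the shared area when a duplicate turns up are all assumptions made for illustration; none of this is taken from the 3PAR code.

```python
import hashlib

PAGE_SIZE = 16 * 1024  # dedupe works on 16K pages, as described in the post


class Cpg:
    """Toy model: one shared area per CPG, one private area per volume."""

    def __init__(self):
        self.private = {}  # volume name -> {hash: page}
        self.shared = {}   # hash -> page (the per-CPG shared area)
        self.owner = {}    # hash -> volume holding the only private copy

    def write(self, volume, page):
        # Assumption: SHA-256 stands in for whatever hash the array really uses.
        digest = hashlib.sha256(page).hexdigest()
        if digest in self.shared:
            return "shared"  # known hash, page already lives in the shared area
        if digest in self.owner:
            # Known hash, first duplicate. Assumption: the existing private
            # copy is moved into the shared area at this point.
            first = self.owner.pop(digest)
            self.shared[digest] = self.private[first].pop(digest)
            return "shared"
        # Unique hash: keep the page in this volume's private area.
        self.private.setdefault(volume, {})[digest] = page
        self.owner[digest] = volume
        return "private"
```

Writing the same page to a second volume returns `"shared"`, while a page the system has never seen returns `"private"`.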
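
The 2:1 versus 10% point can be checked with a little arithmetic. The sketch below assumes one particular reading of the numbers: that 10% of the blocks stored on disk are shared blocks, each referenced several times, and the other 90% are unique blocks referenced once. Under that assumption (which is an interpretation, not something HPE has published in this form), the function works out how many references each shared block needs on average to reach a given overall ratio.

```python
def references_per_shared_block(ratio, shared_fraction):
    """Average references per shared block needed to reach `ratio`.

    With f = shared_fraction of stored blocks and k references each:
        written / stored = (1 - f) * 1 + f * k
    so  k = (ratio - (1 - f)) / f
    """
    return (ratio - (1 - shared_fraction)) / shared_fraction


# 2:1 overall with 10% of stored blocks shared
k = references_per_shared_block(2.0, 0.10)
```

Here `k` comes out at roughly 11, so a small set of heavily duplicated blocks is enough to produce a 2:1 ratio while about 90% of stored blocks remain unique.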
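
The idea of deferring the "is this shared block still needed?" checks can be sketched as a simple queue plus a later sweep. The post does not describe what the real checks are, so the reference sets, the `SharedArea` class, and its method names are hypothetical; the only point being illustrated is that the delete path stays cheap and the expensive check happens afterwards.

```python
from collections import deque


class SharedArea:
    """Toy shared area with deferred (post-process) cleanup."""

    def __init__(self):
        self.pages = {}         # hash -> page data
        self.references = {}    # hash -> set of volumes pointing at the page
        self.pending = deque()  # hashes to re-check in the next sweep

    def add_reference(self, digest, volume, page):
        self.pages.setdefault(digest, page)
        self.references.setdefault(digest, set()).add(volume)

    def delete(self, digest, volume):
        """Inline path: drop the reference and defer the expensive check."""
        self.references.get(digest, set()).discard(volume)
        self.pending.append(digest)

    def sweep(self):
        """Post-process path: free pages that no volume references any more."""
        freed = []
        while self.pending:
            digest = self.pending.popleft()
            if digest in self.pages and not self.references.get(digest):
                del self.pages[digest]
                self.references.pop(digest, None)
                freed.append(digest)
        return freed
```

Because most data sits in private areas, relatively few deletes would ever reach a queue like this, which is the advantage the post is describing.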