Data Security and Backups

Friday 16 September 2011

Understanding Data De-duplication

What is De-duplication

In simple terms, de-duplication works like this: the managing software examines incoming data and compares it with data that is already stored. If the new data is identical to an existing copy, the software does not store a second copy; it stores a link to the original instead. The link takes far less storage space than the full file.

Imagine you needed 100 TB of storage, but one provider could meet that need with only 50 TB of physical disk because it used this technology. That provider would clearly have a significant advantage in winning your business.

How it works

For example, suppose a company has 100 employees, each with their own mailbox. If a 5 MB email is sent to every employee, 500 MB of data is stored. With de-duplication, only the first instance is stored in full, as 5 MB; every other instance is stored as a link to that first instance.

In the example above, 500 MB of storage is reduced to roughly 5 MB, plus the small overhead of the links. Data de-duplication ensures that only unique data is saved to disk. Subsequent copies of the data are saved only as references that point to the stored copy, so end users still see their own files in place.
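The mailbox example can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the `DedupStore` class, its method names, and the mailbox paths are all hypothetical, and the sketch assumes that a SHA-256 hash is used to decide whether two pieces of data are identical. It also ignores the small amount of space the links themselves take.

```python
import hashlib


class DedupStore:
    """Toy single-instance store (hypothetical): unique content is kept once."""

    def __init__(self):
        self.blobs = {}  # content hash -> the one stored copy of the bytes
        self.links = {}  # item name -> content hash (the "link")

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this content has not been seen before.
        self.blobs.setdefault(digest, data)
        # Every item, first or duplicate, is recorded as a link to that copy.
        self.links[name] = digest

    def get(self, name):
        # Follow the link, so each user still sees a complete file.
        return self.blobs[self.links[name]]

    def logical_size(self):
        # Size the data would occupy without de-duplication.
        return sum(len(self.blobs[d]) for d in self.links.values())

    def stored_size(self):
        # Size of the unique data actually kept.
        return sum(len(b) for b in self.blobs.values())


MB = 1024 * 1024
email = b"x" * (5 * MB)  # stand-in for a 5 MB email

store = DedupStore()
for employee in range(100):
    store.put(f"mailbox-{employee}/message-1", email)

print(store.logical_size() // MB)  # 500
print(store.stored_size() // MB)   # 5
```

Each of the 100 mailboxes can still read the full message through `get`, even though only one copy of the bytes is held.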

Why de-duplicate - Benefits

As the term suggests, de-duplication removes duplicate data that would otherwise be stored over and over again, consuming unnecessary storage space, power, and network bandwidth, and increasing the cost of the resources needed to maintain it.

Some of the benefits are listed below:

§  Lowers storage costs by eliminating redundant data.

§  Ideal for organizations that want to back up and consolidate data and
   improve performance during backups.

§  Where the same data is backed up or archived over and over again, the
   realized storage savings keep improving, and ratios of 20:1 can be achieved.

§  Eliminating redundant data can significantly shrink storage requirements
   and improve bandwidth efficiency.
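To see why the savings improve with repeated backups, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not from any vendor: every backup is a full backup of the same data set, a fixed fraction of the data (`change_rate`, here 2%) is new in each backup, and link overhead is ignored. The function name `dedup_ratio` is made up for this illustration.

```python
def dedup_ratio(backups, change_rate):
    """Logical data protected divided by data physically stored.

    Sizes are measured in units of one full backup.
    """
    logical = backups                         # every backup counts in full
    stored = 1 + (backups - 1) * change_rate  # first copy plus later changes
    return logical / stored


for n in (1, 5, 20, 60):
    print(n, round(dedup_ratio(n, 0.02), 1))
# 1 1.0
# 5 4.6
# 20 14.5
# 60 27.5
```

Under these assumptions the ratio passes 20:1 at the 33rd backup. Real ratios depend heavily on how much of the data actually changes between backups.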


Three kinds of de-duplication technology

§  File de-duplication. Only one copy of each identical file is stored. This
   technology is also known as single-instance storage.

§  Block-level de-duplication. The information is divided into blocks, and
   only one copy of each identical block is stored.

§  Byte-level de-duplication. The content of the information is analyzed at
   byte level, and only the unique data is stored. Of the three, this is the
   only technology that guarantees complete elimination of redundant data.
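The difference between the first two approaches can be shown with a small Python sketch. It is a simplified illustration under stated assumptions: blocks have a fixed size (4 bytes here, far smaller than anything practical), identical content is detected with a SHA-256 hash, and the function names are hypothetical. Byte-level de-duplication is not sketched.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


def file_level_size(files):
    """Bytes stored when only one copy of each identical file is kept."""
    unique = {hashlib.sha256(f).hexdigest(): len(f) for f in files}
    return sum(unique.values())


def block_level_size(files, block_size=BLOCK_SIZE):
    """Bytes stored when only one copy of each identical block is kept."""
    unique = {}
    for f in files:
        # Cut each file into fixed-size blocks and keep each distinct block once.
        for i in range(0, len(f), block_size):
            block = f[i:i + block_size]
            unique[hashlib.sha256(block).hexdigest()] = len(block)
    return sum(unique.values())


original = b"AAAABBBBCCCCDDDD"
edited = b"AAAABBBBXXXXDDDD"  # same file with one block changed
files = [original, original, edited]

print(sum(len(f) for f in files))  # 48 bytes without de-duplication
print(file_level_size(files))      # 32
print(block_level_size(files))     # 20
```

File-level de-duplication removes the exact duplicate but has to store the edited file in full. Block-level de-duplication stores only the one block that changed.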
