At work, we've got a Synology NAS running at the office. This NAS is used by a few designers who work directly from it. They have projects running which consist of high-resolution stock photos, large PSDs, PDFs and whatnot, roughly 430GB in size, covering only the currently running projects. This folder is supposed to be backed up to a datacenter, weekly, through our internet connection.

All of our IT is handled by a third party, which claims that our backup is reaching a size ("100GB+") at which the default rsync implementation in DSM (4.3) is unable to handle the amount of data going to the online backup (on one of their machines in their datacenter). They say the backup amounts to about 10TB of data because rsync has problems with "versioning / de-duplication" (retention: 30 days) and goes haywire. Because of this, they suggest using a "professional online backup service", which cranks up our cost per GB for the online backup significantly.

Is it true that Synology DSM 4.3's default rsync implementation is not able to handle "vast" amounts of data and could mess up versioning / deduplication? Could it be that any of the variables (see detailed info below) make this so much more difficult?

Edit: I'm looking for nothing more than an answer as to whether the above claims are nonsense or could be true.

Rsync in and of itself doesn't choke on large file sizes or "too many" files. Depending on your situation, it could be (but is unlikely) that the weekly rsync job takes more than a week to complete, causing a new rsync job to begin before the previous one has finished.

It is common knowledge among IT folks that transferring tons of little files takes a whole lot more time than transferring a few very large files, all else being equal (same internet speed, same amount of data, etc.). Take a look at this ("Transferring millions of images") as an example discussion on Stack Overflow, as well as this ("Which is faster, and why: transferring several small files or few large files?") as an example discussion here on Server Fault.

So the issue may be that you should compress the files/folders before running rsync, and then copy the compressed archive to your off-site data center. That would save you in off-site data storage costs anyway, although it does open up another can of worms.
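To make the compress-then-transfer idea concrete, here is a minimal sketch, not part of the original answer, of what such a weekly job could look like in Python. The share path, archive location, and remote destination are placeholder assumptions; the script simply tars and gzips the project folder, pushes the single archive with rsync, and prints how long the whole run took.

```python
#!/usr/bin/env python3
"""Hedged sketch: compress the project share into one archive, then push it with rsync.

The paths and remote target are placeholder assumptions, not details from the
original post; rsync and SSH access to the destination must already be set up.
"""
import subprocess
import tarfile
import time
from datetime import date
from pathlib import Path

SOURCE = Path("/volume1/projects")                        # hypothetical NAS share
ARCHIVE = Path(f"/volume1/backup/projects-{date.today()}.tar.gz")
REMOTE = "backupuser@datacenter.example.com:/backups/"    # placeholder destination


def build_archive(source: Path, archive: Path) -> None:
    """Pack the whole folder into a single gzip-compressed tarball."""
    archive.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(str(source), arcname=source.name)


def push_archive(archive: Path, remote: str) -> None:
    """Transfer one large file instead of hundreds of thousands of small ones."""
    subprocess.run(["rsync", "-av", "--partial", str(archive), remote], check=True)


if __name__ == "__main__":
    started = time.monotonic()
    build_archive(SOURCE, ARCHIVE)
    push_archive(ARCHIVE, REMOTE)
    print(f"weekly backup finished in {time.monotonic() - started:.0f}s")
```

The timing printed at the end also gives you the measurement suggested as the first step below: how long the backup job actually takes from start to finish.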
Your first step would be, of course, to figure out how long the rsync job actually takes to run. Then figure out whether you need to change your backup methodology, by either compressing the data beforehand or moving to an alternative backup solution. By the way, as of this posting, Synology DSM 5.1 is the latest version, and 5.2 is in beta. You should update to DSM 5.1 if you haven't already; this would surely not hurt your situation.

The increasing popularity of online data backup has served to bring into common usage many innovative new technologies, one of which is data deduplication. Simply put, data deduplication is the process in which redundant (duplicated) data is removed from disk storage, with only one copy of the previously duplicated data remaining. This lone copy is then given references to all the files it pertains to, so that all of those files remain functionally intact. This method of data compression helps save disk space and also speeds up the transfer times needed for backing up. Technology experts might be quick to point out that data deduplication is not literally a technology unto itself, since it can be performed in different ways using different types of technologies. There are two main types of data deduplication: "in-line" and "post-process".
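To illustrate the "single copy plus references" idea in the paragraph above, here is a small hypothetical sketch of file-level deduplication. It is not how DSM or any particular backup product implements it, and the scanned path is a placeholder: identical file contents are stored once, keyed by a SHA-256 hash, and every original path merely keeps a reference to that one stored copy.

```python
"""Illustrative sketch of file-level deduplication (not any vendor's implementation)."""
import hashlib
from pathlib import Path


def dedupe(source_dir: str) -> tuple[dict[str, bytes], dict[str, str]]:
    store: dict[str, bytes] = {}        # content hash -> the single stored copy
    references: dict[str, str] = {}     # original path -> hash of its content
    for path in Path(source_dir).rglob("*"):
        if not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy of identical content
        references[str(path)] = digest  # the file stays reachable via its reference
    return store, references


if __name__ == "__main__":
    copies, refs = dedupe("/volume1/projects")   # placeholder path
    print(f"{len(refs)} files map onto {len(copies)} unique copies "
          f"({len(refs) - len(copies)} duplicates removed)")
```

Real products usually apply the same idea at the block level rather than per file, and the "in-line" versus "post-process" distinction comes down to whether the deduplication happens as the data is written or afterwards, once it already sits on disk.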