Beware of dedupe appliances
I recently ran into a situation on a Veeam backup server where a bunch of jobs were stuck at 99%.
Digging into the stats, it turned out the culprit was Synthetic Full creation — when a job is waiting for that process to complete, it sits at 99%, and the only way to monitor progress is to check each VM individually.


In this case, the backup repository was a deduplication appliance (a Dell EMC Data Domain), and some of those Synthetic Fulls were sitting at 30% after 24 hours or even longer. Most likely the operations piled up and competed for the same appliance, creating a bottleneck that froze everything in place.
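To put that 30%-after-24-hours figure in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes progress stays linear at the average rate seen so far, which is optimistic on a congested appliance, and the function name is just something I made up for illustration, not anything from Veeam.

```python
def estimate_remaining_hours(percent_done: float, elapsed_hours: float) -> float:
    """Estimate the hours left for a task by linear extrapolation.

    Assumption: the task keeps progressing at the average rate observed
    so far. Real synthetic fulls on a busy appliance may be slower.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in the range (0, 100]")
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must not be negative")
    # Projected total duration at the observed average rate.
    total_hours = elapsed_hours * 100.0 / percent_done
    return total_hours - elapsed_hours


if __name__ == "__main__":
    # The case from this post: 30% done after 24 hours.
    remaining = estimate_remaining_hours(30, 24)
    print(f"Roughly {remaining:.0f} more hours to go")  # Roughly 56 more hours to go
```

In other words, even in the best case that Synthetic Full would need about 80 hours in total, which is well over three days for a single VM.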
One thing I noticed: the recommended configuration (1 Synthetic Full per week + 1 Active Full per month) wasn’t being followed exactly. As we all know, dedup appliances are optimized for write operations, but reading from them is a whole other story, because the data has to be rehydrated from deduplicated blocks first. And creating a synthetic full involves a lot of read I/O, since the new full is built from backup data already sitting in the repository. So sometimes you’re forced to skip synthetic fulls altogether and just go with periodic active fulls.
Maybe it’s just me, but I’ve never been a fan of deduplication appliances. Sure, they save disk space, but between the obvious costs (high license prices) and the hidden ones (operational complexity, updates, patching, weird backup server configs), you’re often better off buying a “normal” server with internal storage.
No deduplication, sure — but with significantly lower costs, you can get hardware that gives you equal or even better usable capacity than a dedup appliance.
And let’s face it: a Linux or Windows box is way easier to manage than a traditional dedup appliance. Restores are usually faster too. And don’t even get me started on how painful it can be to pull data from a dedup appliance when writing to tape…
Let me be impolite… don’t be a masochist! Save money, time, and sleep! Life is too short to pay for and deal with that kind of thing :-D
P.S.: dedicated Object Storage Appliances are a whole different story.