Sync, Security and Backups

When you have only one computer, you don't really think about data. You dump everything you have on your hard disk and call it a day. After a few years, either the hard disk is full or it has become corrupt, and that is when you start thinking about your data, storage needs and backups. I have lost my data a few times, and none of those losses were due to hardware failure. During my first years with computers, I experimented on my personal data, which was mostly images (I warn against doing this).

However, I did not realize the need for backups even after that incident because no hardware had failed me. I now have two laptops, one of which is provided by my employer, so my data is divided across two devices. I want all my data to be in sync so that it is readily available to me at all times, and I don't want to worry about losing it to theft, hardware failure or my exit from the company. All my data is encrypted on disk, so there is no need to worry about it being accessible to anyone unless there is a direct threat from state-level actors.

As for backups, I have two laptops, and when they are synced perfectly I won't lose much data if either device fails. Now I need to find a good solution for the sync.

Available Sync Solutions

Syncthing

Syncthing is a wonderful solution which I have used for more than six months. It works fine unless I give it an unreasonable amount of data, meaning hundreds of GBs. Then it starts consuming gigabytes of memory and takes resources away from the rest of the system. I wouldn't mind giving it hundreds of MBs of memory, but GBs is asking too much. I don't fault Syncthing for it, even though it could do better. Any part of the filesystem could change at any time, and Syncthing needs to watch for changes in the whole filesystem. Some kind of append-only log of the events that have happened in a file tree would greatly reduce the complexity and resource requirements of the project.
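To make the append-only log idea a bit more concrete, here is a minimal sketch of what I have in mind. It is not how Syncthing works; the `ChangeLog` class, the `Event` fields and the operation names `"write"` and `"delete"` are all made up for illustration. Each change to the tree is appended as one JSON line, and a peer only asks for the events after the last sequence number it has applied, instead of rescanning the whole tree.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Event:
    seq: int   # position of the event in the log, starting at 0
    op: str    # hypothetical operation name: "write" or "delete"
    path: str  # path relative to the root of the synced tree


class ChangeLog:
    """Append-only log of file-tree events, one JSON object per line."""

    def __init__(self, log_path):
        self.log_path = log_path
        # Resume numbering from whatever is already stored in the log.
        self.next_seq = sum(1 for _ in self._read_all())

    def _read_all(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield Event(**json.loads(line))

    def append(self, op, path):
        """Record one event; existing lines are never rewritten."""
        event = Event(self.next_seq, op, path)
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(event)) + "\n")
        self.next_seq += 1
        return event

    def since(self, seq):
        """Events a peer still needs, given the last seq it applied."""
        return [e for e in self._read_all() if e.seq > seq]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = ChangeLog(os.path.join(tmp, "changes.log"))
        log.append("write", "photos/a.jpg")
        log.append("write", "notes.txt")
        log.append("delete", "photos/a.jpg")
        # A peer that already applied event 0 only needs events 1 and 2.
        for event in log.since(0):
            print(event.seq, event.op, event.path)
```

A peer that has seen nothing would call `since(-1)`. The hard part, which this sketch skips entirely, is getting the filesystem to produce such events reliably in the first place.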

Rsync

Rsync is suggested frequently in the forums I visit, but I have not tried it yet. It seems to be a proper solution for backups and is said to be very efficient according to the testimonials. However, rsync needs one end to be reachable from the other (typically over SSH), which in practice means a server with a publicly accessible IP address. Both my devices would be behind NAT, so I can't really expect rsync to work between them.

WireGuard VPN

WireGuard is the next-gen VPN driver that is being included in the Linux kernel. It is said to be the best piece of VPN code available and I am hopeful about it. When WireGuard comes around, it will let me create a virtual private network of devices that are accessible to each other as if they were on the same LAN. That will simplify my needs drastically. However, it suffers from the same problem as rsync: when every peer is behind NAT, it needs a publicly reachable server so that the peers can find each other. While I understand the need for such a signalling server, it might be better if the endpoint could be a URI, more specifically a decentralized identifier like a Bitcoin address or anything like that. Such protocols would be driven by a distributed hash table and hence would not need a rendezvous server for any network, except of course the bootstrapping servers for the DHT. But we could use any public bootstrapping servers, so a dedicated VPN server would not be required for my use case.
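Here is a toy sketch of the DHT idea, assuming a Kademlia-style XOR distance between identifiers. Nothing here is part of WireGuard: `ToyDHT`, `publish` and `lookup` are hypothetical names, all the "nodes" live in one process, and the endpoint below is a documentation-only address. A peer publishes its current endpoint under the hash of its public key, and anyone who knows that key can look the endpoint up without a dedicated rendezvous server.

```python
import hashlib


def node_id(name):
    """Derive a 160-bit integer identifier by hashing a string with SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def closest_node(key, nodes):
    """Pick the node responsible for a key: smallest XOR distance wins."""
    return min(nodes, key=lambda n: n ^ key)


class ToyDHT:
    """In-process stand-in for a DHT; a real one routes over the network."""

    def __init__(self, node_names):
        # Each node keeps its own small key/value store.
        self.storage = {node_id(name): {} for name in node_names}

    def publish(self, peer_public_key, endpoint):
        """Store a peer's endpoint on the node closest to its key hash."""
        key = node_id(peer_public_key)
        self.storage[closest_node(key, list(self.storage))][key] = endpoint

    def lookup(self, peer_public_key):
        """Return the published endpoint, or None if nothing was published."""
        key = node_id(peer_public_key)
        return self.storage[closest_node(key, list(self.storage))].get(key)


if __name__ == "__main__":
    dht = ToyDHT(["bootstrap-1", "bootstrap-2", "bootstrap-3"])
    dht.publish("laptop-public-key", "203.0.113.7:51820")
    print(dht.lookup("laptop-public-key"))
```

What this leaves out is everything that makes a real DHT hard: routing between nodes, churn, NAT traversal and proving that whoever publishes an endpoint actually owns the key.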

There are some projects which try to add peer-to-peer capability to the WireGuard interface. I can only hope for the best 🤞.

IPFS

In my personal opinion, IPFS is a broad project that has failed to move beyond proof of concept. Routing in a large network is still a problem, and the IPFS node that I run on each device takes too much memory (hundreds of MBs) even when it is sitting idle. I don't think it is viable yet.

Dat Project

Similar in goals to IPFS, the Dat project has modular libraries that can be used independently of each other. It has successfully developed the Beaker browser, which works in terms of dats (archives). The project has moved beyond proof of concept, and I think it is going to win over IPFS. Dat also has a concept of history.

Creating a syncing library on top of Dat seems to be a good project for a weekend or two.

Security with LUKS

I think full disk encryption is the best solution available, and it is also the easiest one. However, in my current setup, my root filesystem is not encrypted (even if it is read-only due to it being an rpm-ostree root). I will move forward with full disk encryption for all the data in the near future when I go down for system maintenance, which can be as late as the next release of Fedora but not later. I think I will upgrade my current systems to LUKS2 too.

There is another project in the works: systemd-homed. It seems to solve many issues with data encryption. If I can find a proper solution for my software (which could mean allowing only properly signed software in the system root), I will go forward with systemd-homed.

Update 2020-01-31

I have now settled on Tailscale due to its easy UX and my low requirements. I will keep an eye on wirehub, as it seems to be the best alternative should Tailscale fail. However, I don't see that day coming yet. It seems like a good alternative to ZeroTier too.