Automatic periodic docker commit as snapshot #1
Comments
So, I have also felt the pain of having to reindex nodes and it taking forever. However, I have been handling my backups at the volume level. For instance, the GCE Kubernetes configuration uses a data volume, and Google provides an easy way to do disk-level snapshots. Users can also use file systems that have native snapshot support. There is a gotcha, though: sometimes the snapshots are bad, as you can't guarantee a good snapshot unless you stop the node! This would also happen with the docker commit approach.

So, I propose something slightly different: we add a way to build snapshot images with a pre-synced blockchain baked in, so they are ready to run as soon as they are pulled, and publish them to Docker Hub. If the images don't fit there, then we would need to set up our own Docker registry for people to grab the larger, ready-to-run images. However, that means this project would have some level of operating costs. Not sure how to best deal with that part.
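For what it's worth, the "stop the node first" rule for a consistent volume-level snapshot on GCE could look roughly like this (a minimal sketch; the container, disk, and zone names are assumptions, not from this thread):

```bash
# Stop the node so bitcoind flushes its state and the disk is consistent.
docker stop bitcoind

# Snapshot the persistent disk backing the data volume (names are placeholders).
gcloud compute disks snapshot bitcoin-data \
    --zone us-central1-a \
    --snapshot-names "bitcoin-data-$(date +%Y%m%d)"

# Bring the node back up.
docker start bitcoind
```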
You're right. I haven't had a chance to try this solution myself yet and was hoping it would be easily possible somehow, but if the snapshotting process risks corrupting or losing data, it may be worse than not having it at all. Maybe there's a clever way to run a round-robin of containers to pull this off; I'll think about it further.
Yes, but most people don't have access to that type of sophisticated filesystem, nor the skills to set one up. My idea was that the Docker approach lowers the barrier significantly, so that even people who just run a node on their laptop or their own device (I imagine that in the future, point-of-sale systems may ship with a Bitcoin node built in) can take advantage of this feature with a single command, and with no need to sign up for Google Cloud, etc. To be honest, when it comes to building Bitcoin-related infrastructure, I think we should assume that cloud providers like Amazon or Google will someday have their economic incentives misaligned with those who run Bitcoin nodes and may ban them.
Yes, I wasn't looking to go that far; I was just looking for a solution that anyone with a node can use on their own. Even if a hosted service were provided, there would still need to be a way to verify the authenticity of the image, which introduces trust issues and is too much of a headache. And I don't think anyone is going to pay for it either :)
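One building block here would be Docker's content digests, though they only shift the question to where the trusted digest comes from (a sketch; the repository name is hypothetical):

```bash
# An image digest is a hash of the image contents, so pulling by digest
# fails if the image was tampered with. The repository is a placeholder.
docker images --digests example/bitcoind-snapshot
docker pull example/bitcoind-snapshot@sha256:<digest-from-a-trusted-source>
```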
That sounds like a great idea.
I have a plan to allow ready-to-run snapshot images with a pre-synced blockchain to be built and published. This will probably take me a little while, but the plan has started. =)
Brilliant! That sounds almost too good to be true, and it's exactly what I was looking for. Very much looking forward to it. Docker + Bitcoin = Future :)
The project recommends the use of volumes as a way to persist the blockchain. This is great and is usually considered "best practice" in the Docker community.
Problem
However, a problem arises when the blockchain becomes corrupt for some reason. The only way to recover is to either start fresh or reindex from the beginning, which takes anywhere from a couple of days to a week. (Last time this happened to me, I tried both methods, and surprisingly starting from scratch was much faster than re-indexing, because re-indexing is bottlenecked by file read throughput.)
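For reference, the two recovery paths look roughly like this (a sketch assuming bitcoind's default data directory):

```bash
# Option 1: rebuild the indexes from the blocks already on disk.
bitcoind -reindex

# Option 2: start fresh -- delete the block and chainstate data and
# resync everything from the network (wallet files are left untouched).
rm -rf ~/.bitcoin/blocks ~/.bitcoin/chainstate
bitcoind
```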
Solution Proposal
One solution I've been thinking about is using docker commit to periodically create a snapshot image from the current state. Users could have a windowing policy that keeps only the last couple of snapshots for safety, and if a node goes down, they could simply restart it from a snapshot image WITHOUT having to re-index from the beginning, which takes a couple of days (up to a week in some cases).
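A minimal sketch of such a snapshot script, assuming a container named bitcoind and a local image repository named bitcoind-snapshot (both hypothetical names), suitable for running from cron:

```bash
#!/bin/sh
# Periodic snapshot via docker commit, keeping only the most recent few.
CONTAINER=bitcoind
REPO=bitcoind-snapshot
KEEP=3

# Tag each snapshot with a sortable timestamp, e.g. 20170131-0400.
TAG=$(date +%Y%m%d-%H%M)

# Note: docker commit pauses the container by default, but that still
# doesn't guarantee bitcoind has flushed a fully consistent state; for
# extra safety, stop the container around the commit (see the gotcha above).
docker commit "$CONTAINER" "$REPO:$TAG"

# Windowing policy: delete all but the $KEEP newest snapshot tags.
docker images "$REPO" --format '{{.Tag}}' | sort -r | tail -n +$((KEEP + 1)) |
while read -r OLD; do
    docker rmi "$REPO:$OLD"
done
```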
This also means NOT using volumes. Everything would be self-contained within the Docker container, which could be spun up instantly, like a DigitalOcean droplet.
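Recovery would then be a couple of commands (again with the hypothetical names from the sketch above):

```bash
# Boot a replacement node from the newest snapshot image. The chain data
# lives inside the image itself, so no volume mounts are needed.
LATEST=$(docker images bitcoind-snapshot --format '{{.Tag}}' | sort -r | head -n 1)
docker rm -f bitcoind
docker run -d --name bitcoind "bitcoind-snapshot:$LATEST"
```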
What the problem means in terms of decentralization
I've been thinking about this problem since the last time it happened to me: every service I was running was down for two entire days while I re-indexed the blockchain, and there was nothing I could do about it; the only way to recover was through "proof of work". Since then, I run multiple servers with multiple Bitcoin nodes, and if one goes down, I point my services at another node's JSON-RPC while I reindex the corrupted one for a day or two.
However, this is not ideal, and it hurts decentralization. Running a full node is definitely not for everyone, but it is still doable for those who have an incentive to do so. Nobody, though, wants to run multiple nodes simultaneously when they only use one of them at any given time. This will eventually result in most users relying on a trusted third-party node for reliability.
With a completely containerized Bitcoin node that can be redeployed instantly from a snapshot, I believe decentralization would improve significantly: recovery would take minutes instead of days, so operators would no longer need to run redundant nodes or fall back on a trusted third party.
What do you think?