During Chaos Unicorn Day (observations, retrospective) we noticed that the Status app was unusable by default. While there are many components of making Status usable (several outlined in the retrospective), it didn’t take too much work to get the app into a somewhat usable state for chatting.
The ultimate goal of Chaos Unicorn Day is for it to make no difference whether it is happening or not. One small step we can take towards this is to increase our bootnode diversity.
What is a bootnode?
A bootnode is the first thing you connect to in order to discover the rest of the network.
In non-technical terms, using a slightly stretched metaphor, we can imagine a huge city with lots of unmarked streets. To navigate it, you need an entry point, or several. This can be a person that you know, who can tell you who to ask to find your local yoga studio, butcher or cryogenic institute. If they don’t know personally, they probably know someone or some tool you can use that’ll get you closer (some friend that exercises, or how to get to the local library). If you don’t know anyone (and have no other tools, such as a map, dictionary, or - god forbid - search engine), you’ll have no other option but to wander millions of streets (IP addresses) and will likely give up. In practice you are also blind, so you need very precise instructions.
At Status, we are currently hosting these Whisper network bootnodes.
What is bootnode diversity?
There are several ways of looking at it, but essentially it is about avoiding a monoculture. We are already geographically diverse, in that our cluster has nodes in the US, EU and Asia.
However, they are all controlled and paid for by Status LLC (equivalent). This means they are a single point of failure, and as we saw during Chaos Unicorn Day, Status could force the network to go down. This is undesirable.
(Additionally, more geographical diversity and people using different sets of bootnodes is also desirable, as it helps avoid censorship attacks, whether direct or indirect.)
The general principle here is one of redundancy: anything that can fail will.
How can we increase it?
This is what is to be discussed. There are many approaches, though we should start by having the basics done: Status LLC should not be able to forcefully bring down the network.
How do other projects do it?
Ethereum
- Ethereum appears to have the same issue, in that it relies on EF bootnode servers (see bootnodes.go). Goerli has at least two sources of nodes.
- The Parity client seems to have more (see foundation.json), though it isn’t clear what the source of all of them is (some hints in PRs).
Note that both rely on specific high-uptime IP addresses.
Note: this is just a cursory investigation done by OP and it might very well be wrong, i.e. there could be other sources for bootstrapping a connection to the network. If you think this is incorrect, please correct it.
You can also add your own bootnodes with the --bootnodes flag, and I imagine most serious Ethereum deployments leverage this (Infura, exchanges, etc.).
Also note that these bootnodes are for Eth protocol and not for Whisper.
Bitcoin
Bitcoin uses a different method, where it leverages DNS lookups to connect to the network. It also has some hardcoded addresses as a fallback.
Read more here.
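To make the DNS approach concrete, here is a minimal sketch of "ask DNS seeds first, fall back to hardcoded addresses". It is not Bitcoin's actual implementation; the function names, the injectable resolver and any hostnames you pass in are assumptions made for illustration.

```python
import socket
from typing import Callable, Iterable, List


def resolve_seed(hostname: str) -> List[str]:
    """Return the IP addresses that a DNS seed currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def discover_peers(
    seeds: Iterable[str],
    hardcoded: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve_seed,
) -> List[str]:
    """Try every DNS seed; fall back to the hardcoded list if none answers."""
    peers: List[str] = []
    for seed in seeds:
        try:
            peers.extend(resolver(seed))
        except OSError:
            # socket.gaierror is a subclass of OSError: skip seeds that fail.
            continue
    unique = list(dict.fromkeys(peers))  # drop duplicates, keep order
    return unique or list(hardcoded)
```

The point of the design is that no single seed operator is required: a seed that is down or blocked is skipped, and the hardcoded list is only used when every seed fails.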
Tox
Tox has a plethora of bootstrap nodes run by community volunteers. They also have a mailing list for coordination, as well as tests to see if the bootstrap node is good or not.
Questions and further work
More projects?
There are a lot of other P2P projects out there, both current and historical. What have they used in the past? What has worked and what has not?
Security considerations (byzantine case)
If we allow other nodes to be added as bootstrap nodes to our code base, what are we exposing our users to, in terms of eclipse attacks, etc.?
(Note that this is in some way the wrong question: if an attack works for untrusted nodes then that same attack works for our so called “trusted” nodes, which is what we want to avoid.)
Availability considerations (friendly bad case)
Assuming we let some semi-available nodes into our releases, what are the consequences? I imagine it’ll just be a loop with pings and then marking that bootnode as bad, then taking it out of rotation in the next release.
Unlike (current structure with) mailservers, bootnodes don’t necessarily need to be high uptime.
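The "loop with pings" idea could look roughly like the sketch below. It assumes a plain TCP connect is an acceptable stand-in for a real discovery ping (the actual protocol check would differ), and the names `tcp_reachable`, `update_health` and `max_failures` are made up for this example.

```python
import socket
from typing import Callable, Dict, List, Tuple

Address = Tuple[str, int]  # (host, port)


def tcp_reachable(addr: Address, timeout: float = 2.0) -> bool:
    """Assumed stand-in for a real ping: can we open a TCP connection?"""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def update_health(
    bootnodes: List[Address],
    failures: Dict[Address, int],
    probe: Callable[[Address], bool] = tcp_reachable,
    max_failures: int = 3,
) -> List[Address]:
    """Probe each bootnode once and return those still considered good.

    `failures` is updated in place: a success resets the counter, a miss
    increments it. A node is marked bad after `max_failures` misses in a row.
    """
    healthy: List[Address] = []
    for node in bootnodes:
        if probe(node):
            failures[node] = 0
        else:
            failures[node] = failures.get(node, 0) + 1
        if failures[node] < max_failures:
            healthy.append(node)
    return healthy
```

Counting consecutive misses rather than dropping a node on the first failure is what lets semi-available nodes stay in rotation.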
Dynamically changing nodes?
What are the best mechanisms for dynamically changing a list of bootnodes? E.g. based on availability etc. See Status.app for some ideas in this direction.
Avoiding a monoculture
How do we avoid everyone using the same bootnodes? To what extent does this matter? This is one worry with everyone relying on a single “smart” contract - it is another single point of failure.
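One simple way to avoid everyone hitting the same nodes, sketched here under the assumption that each client has a larger candidate list available, is to have every client pick its own random subset. The function name and parameters are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def pick_bootnodes(
    candidates: Sequence[str],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick up to k distinct bootnodes at random from the candidate list."""
    rng = rng or random.Random()
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    return rng.sample(unique, min(k, len(unique)))
```

Different clients then end up with different entry points, so losing or censoring a few nodes affects only some of them.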
How can we get the community involved?
Fewer people are running Whisper nodes than Ethereum nodes. What simple things can we do to get people to run nodes? E.g. Gitcoin badges, etc.
I don’t want (monetary) incentivization to be an excuse here, because it leads to lazy thinking and punting on the problem. Of course it’d be better (and it is necessary long term), but it’s not the only way to do this, as history proves. In this thread the premium is on KISS solutions.
It’s also worth noting this is a general problem in the Ethereum space, and lots of projects are doing a great job here, like Dappnode etc.
This reduces to the “easier” question of how we get core contributors involved.
How can we get core contributors involved?
Instead of talking about something abstract, let’s start with ourselves. I’d like to see all Status contributors run their own nodes, just like we are all using Status. This is something it’d be great to see People Ops involved in, as we need to make it easier for non-technical people to run nodes and understand why this matters. Bruno has a lot of thoughts on this too. So far the people who have engaged most with this are engineers, but we need to change this. Ideas for making it more inclusive are welcome. There’s also a huge difference between people at Status running their own nodes and the Status cluster.
Perhaps as a start we can do something similar to “Slack deactive” and principles signing? Gitcoin/Kudos badges or whatever.
Most important - what is the minimal thing we can do to ensure next Chaos Unicorn Day things still work?
I’ll start with the most basic action I can imagine:
Include in Status the bootnodes that Jacques, Ricardo and Bruno (and possibly more) hosted, assuming they want to keep them up. Then give them unique flair as NODE OPERATORS (informally in a thread, through Kudos or whatever people think is appropriate).
Additionally, explore ways to get and use a mutable pointer to multiple bootnodes, e.g. a DNS/ENS/Swarm Feeds/smart contract pointer to a set of bootnodes. The selection of this set can differ, and there’s no need to have a one-size-fits-all approach. E.g. it can be voted on (see Dean’s proposal), decided by core contributors (current but static) or whatever.
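As a rough sketch of how a client might combine the static list with one or more mutable pointers: each source below is just a function returning a list of enode URLs. How a source actually talks to DNS, ENS, Swarm Feeds or a contract is left out, and all names here are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A source is any callable that returns enode URLs (DNS, ENS, contract, ...).
BootnodeSource = Callable[[], List[str]]


def collect_bootnodes(
    sources: Dict[str, BootnodeSource],
    static: List[str],
) -> Tuple[List[str], List[str]]:
    """Merge the static list with every source that answers.

    Returns (bootnodes, failed_source_names). A failing source is skipped,
    so no single pointer can take the whole list down.
    """
    merged: List[str] = list(static)
    failed: List[str] = []
    for name, fetch in sources.items():
        try:
            merged.extend(fetch())
        except Exception:
            failed.append(name)
    # Keep only entries that look like enode URLs, without duplicates.
    valid = [n for n in merged if n.startswith("enode://")]
    return list(dict.fromkeys(valid)), failed
```

Keeping the static list as a base and treating every pointer as optional means the mutable sources add diversity without becoming a new single point of failure.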
What else?