We got hit by a traffic usage spike, roughly described here: [FIXED] Suspicious mailserver traffic spike - CodiMD
One of the main reasons is that we use a single topic for all 1-to-1 communication (and group chats are based on 1-to-1 communication). So essentially all communication except public chats uses a single topic.
What is good about it and why is it done this way?
Darkness. If everyone sends messages to the same topic, it is impossible to figure out who is talking to whom.
What is bad about it and why do we need to find an alternative?
Since all communication happens in the same topic, each client has to download all the messages for this topic, no matter who sent them. That means any active 1-to-1 chat or group chat increases traffic for everyone on the network. As the Status audience grows, this topic will become busier and busier. Today (20181207), we got > 100Mb of messages in this topic. That slows down every client and leads to huge traffic usage.
Hence, we need a way to split this topic into multiple topics, hopefully preserving as much darkness as possible. That might not be possible, and we might have to trade off some darkness for less traffic usage. Otherwise, very soon, each client will end up having to download gigabytes of data each day just because of the activity of users they don’t know anything about.
I want to have a call mid-next-week to discuss it, because it needs to be addressed.
Questions that need to be answered:
(1) what is the best way to go multi-topic without trading off too much darkness?
(2) how much darkness can we trade off for that?
Any ideas are appreciated in the comments or on the call.
Ping me here if you want to participate.
I would like to participate. I had also been working on this before, but there was very little appetite for it, and the PR received no reviews after quite a few solicitations: https://github.com/status-im/status-react/pull/5004 and [DO NOT MERGE FOR NOW] Dont use discovery topic by cammellos · Pull Request #4135 · status-im/status-mobile · GitHub
I would like to participate.
I think it is possible to keep almost the same darkness by making a dynamic list of topics used for private messaging, where clients would shuffle through these “pseudo-random” topics over the course of private conversations.
Darkness adjustable on demand could be measured by how many “not for you” messages are downloaded, which implies more darkness = more traffic.
Clients would automatically select an amount of darkness based on an auto-adjusted (or user) setting, and might be able to disable it. Clients should not accept messaging at a darkness level too far from the one they accept. A warning could be displayed to the user if they are communicating at a lower darkness setting.
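To put rough numbers on "more darkness = more traffic", here is a back-of-the-envelope sketch. It assumes messages spread evenly over all topics and that a client listens on a single topic; both are simplifications, and the `expected_download` helper is purely illustrative. The 100 figure is the daily volume mentioned in the first post, used as a round number.

```python
def expected_download(total_traffic: float, topic_count: int) -> float:
    """Traffic seen on one topic if messages are spread evenly (assumption)."""
    if topic_count < 1:
        raise ValueError("topic_count must be at least 1")
    return total_traffic / topic_count


total = 100.0  # daily volume on the shared topic, same unit as the first post

for prefix_bytes in (0, 1, 2):
    # Each extra byte of topic multiplies the number of topics by 256 and
    # divides both the traffic per topic and the crowd you hide in by 256.
    topics = 256 ** prefix_bytes
    print(prefix_bytes, topics, expected_download(total, topics))
```

Each extra byte of topic cuts traffic by a factor of 256, but it shrinks the set of users sharing that topic by the same factor, which is exactly the tradeoff being discussed.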
A broadcast topic is used for the handshake, which defines the topic to be used for each message nonce in the conversation.
The broadcast topic could be a single one where everyone listens for new “connections”. The darkness setting could also be used as the broadcast topic, which would make it easier for other nodes to learn how busy the topics are.
The handshake could also use a topic derived from the identity public key (used for public chat) of the receiver.
If a topic becomes too busy, clients could handshake again and select a broader list of topics.
The function that generates the topic list can be defined in many ways, but it can simply borrow architectural elements from forward secrecy. The simplest would be to use the first bytes of the receiver's message public key (which changes on every message) as the topic. The number of bytes used as the hash would be at least one or two, and is limited by the size of the public key. This might not apply to group chats, which would probably use the sender key to vary what they should listen to.
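A minimal sketch of the "first bytes of the public key as topic" idea, just to make it concrete. The function name and the `prefix_len` parameter are made up for illustration, and the key below is placeholder bytes rather than a real key. One caveat worth checking: if keys are serialized with a fixed encoding prefix byte, that byte would have to be stripped first, otherwise the first byte of every topic is the same.

```python
def topic_from_pubkey(receiver_message_pubkey: bytes, prefix_len: int = 2) -> bytes:
    """Use the first prefix_len bytes of the per-message public key as the topic.

    Assumes the key bytes passed in carry no fixed encoding prefix.
    """
    if not 1 <= prefix_len <= len(receiver_message_pubkey):
        raise ValueError("prefix_len must be between 1 and the key size")
    return receiver_message_pubkey[:prefix_len]


# Placeholder bytes standing in for a fresh per-message public key.
key = bytes.fromhex("a1b2c3d4e5f60718")

print(topic_from_pubkey(key, 1).hex())
print(topic_from_pubkey(key, 2).hex())
```

A shorter prefix means fewer, busier topics (more darkness, more traffic); a longer prefix means the opposite.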
status-react PR 5004 is going in a good direction. @cammellos do you think what I described above makes sense? It seems like you were going in a similar (or the same) direction there?
@igor if the outage was a spike, and not organic growth, then I think group chats or status-js might be related to the incident.
@ricardo3 topic ratcheting makes sense, it is something we discussed before.
The main difficulty with that approach is wallet recovery: if Bob loses all his data on one device, Alice will try to contact Bob on a different topic (Bob does not know that he was talking to Alice), so we need a way for Alice to check that Bob knows about the topics (basically falling back to the broadcast/handshake topic).
We talked about heartbeats or any form of storage (swarm/ipfs/contracts) so that at least some state can be recovered and another handshake can be performed.
But all the solutions come with tradeoffs (darkness, bandwidth, etc.), so I would first get the broadcast/handshake topic right and then move on to ratcheting/multiple topics.
Another issue in general is how to keep backward compatibility without having to publish multiple messages. Probably some form of protocol negotiation is necessary, upgrading to the new topic on confirmation.
Every client would be scanning the broadcast topics for first handshakes with new contacts, so a client that lost its data would simply request a new handshake, as a new contact would.
hash(hash(Bob private key + Alice public key) + message nonce) could be used to build a predictable pseudo-random topic.
hash(Bob private key + Alice public key) would be given to Alice in the first message, so Alice knows which topic to send to.
In that case, during the new handshake, Alice would notice the unexpected handshake and send the nonce, which would help Bob work out the topics of recent messages and continue from there.
So the loss of messages would only happen if both Alice and Bob lost all their data.
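Roughly, the recovery flow could look like the sketch below. SHA-256, the 8-byte nonce encoding, and all the names are assumptions for illustration only; the keys are placeholder byte strings, and the full digest is used as the topic with no truncation.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash of the concatenated parts (SHA-256 is an assumption here)."""
    return hashlib.sha256(b"".join(parts)).digest()


def conversation_seed(own_private_key: bytes, peer_public_key: bytes) -> bytes:
    # hash(Bob private key + Alice public key)
    return h(own_private_key, peer_public_key)


def message_topic(seed: bytes, nonce: int) -> bytes:
    # hash(seed + message nonce); the nonce encoding is an assumption
    return h(seed, nonce.to_bytes(8, "big"))


# Placeholder key material, not real keys.
bob_private = b"bob-private-key-placeholder"
alice_public = b"alice-public-key-placeholder"

# Before the data loss: Bob gave Alice the seed in the first message,
# and Alice keeps it together with the last nonce she used.
alice_state = {"seed": conversation_seed(bob_private, alice_public), "nonce": 41}

# After recovery Bob only has his private key. The new handshake tells him
# Alice's public key, and Alice answers with the current nonce.
recovered_seed = conversation_seed(bob_private, alice_public)
nonce_from_alice = alice_state["nonce"]

next_for_bob = message_topic(recovered_seed, nonce_from_alice + 1)
next_for_alice = message_topic(alice_state["seed"], alice_state["nonce"] + 1)
print(next_for_bob == next_for_alice)
```

Bob never needs stored state for this: the seed is recomputable from his private key once he learns who is on the other side, and the nonce comes from Alice.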
Compatibility is certainly important, especially so that multiple forks can communicate and fall back to “status default” if their special communication is not supported.
The pseudo-random topic generation works by hashing the sender's private key with the receiver's public key and answering with that hash in the handshake, together with the nonce to be used, and vice versa. The public key used is the one from the username.
This hash would then be hashed with the current message nonce, which increases each time a message is delivered. This nonce is given in the handshake.
Scalability comes from how many bytes of the hash result are used, e.g.:
Hash(Hash(Sender PrivKey + Receiver PubKey), Nonce) = 0x1122334455667788990011223344556677889900112233445566778899001122
The topic to be sent depends on the “exposure” level:
Exposure 0 = 0x0000001122
Exposure 1 = 0x0000112233
Exposure 2 = 0x0011223344
Exposure 3 = 0x1122334455
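The exposure table above can be reproduced with a small sketch: take the first `exposure + 2` bytes of the hash and left-pad with zeros to a fixed width. The 5-byte width is only inferred from the examples, and the function name is made up.

```python
TOPIC_LEN = 5  # bytes; inferred from the width of the examples above


def topic_for_exposure(digest: bytes, exposure: int) -> bytes:
    """Keep the first (exposure + 2) bytes of the hash, zero-padded on the left."""
    if not 0 <= exposure <= 3:
        raise ValueError("exposure must be between 0 and 3")
    used = exposure + 2
    return digest[:used].rjust(TOPIC_LEN, b"\x00")


# The example hash result from the post above.
digest = bytes.fromhex(
    "1122334455667788990011223344556677889900112233445566778899001122"
)

for exposure in range(4):
    print(f"Exposure {exposure} = 0x{topic_for_exposure(digest, exposure).hex()}")
```

Lower exposure keeps fewer bytes of the hash, so more conversations collide on the same topic.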
This exposure level would be agreed in the handshake, and clients would choose it by observing the handshakes, so if many users in lower topics then
The handshakes would happen in fixed topics:
0x0000: exposure 0 handshake topic
0x0001: exposure 1 handshake topic
0x0002: exposure 2 handshake topic
0x0003: exposure 3 handshake topic
0x0004: exposure 4 handshake topic
Or the handshake could happen in a topic derived from the pubkey of the “first message receiver”. This would make observing the handshakes harder, and the exposure level would then have to be set when clients notice too many messages in the private messaging topics being used.