Thanks for posting this, amazing work!
A couple of questions:
Status Mobile is initiating a "communication" with the desktop, not another way around
This is interesting, we had a time when desktop was widely used among core contributors, and from my own personal experience this would be fairly inconvenient for me.
Desktop installations are inherently more "permanent" than mobile ones: we uninstall and reinstall apps more frequently on mobile devices, and we lose or break them more easily.
Effectively, desktop was the installation I kept throughout, and every now and then I would sync a mobile phone (by entering the seed phrase, as before).
If we only allow one way, it means that if I first install desktop and later install on mobile, my option would be to restore the account through a seed phrase and then initiate communication with desktop from mobile? That would be fine by me. As long as there's one option and we don't prevent this, it would work fine for my use case.
Another use case is to link multiple mobile devices (such as my phone and tablet), is that an option?
One thing that I believe is important to keep in mind when designing the flows is that (contrary to other apps like WhatsApp, Signal, etc.) we have no source of authority on how many devices there are, which one is authoritative, and so on.
I'll try to explain that.
Let's take Signal as an example (WhatsApp has limited support for multiple devices).
Basically in Signal, you first register a device associated with a phone number.
Signal will treat that device as authoritative: it's associated with your phone number, and that's how they verify your identity.
In this way, desktop apps are seen as subordinates of this device, and that information is communicated with anyone wanting to send messages to you.
So actions such as "Unlink device" are very clear. Once you unlink desktop, desktop won't receive messages anymore. And "unlinking" your mobile phone from desktop is not possible (i.e. having desktop receive your messages only, and not the mobile phone), as there's a hierarchy that is clear and can be enforced (https://support.signal.org/hc/en-us/articles/360007321111-Unlinking-devices).
In our world we don't really have that option. First, there's no central authority (Signal's server in the previous example) to check which device is "authoritative". Second, anyone can enter a seed phrase on multiple devices and access their account.
If two devices have the same seed phrase, who to trust? (This can be a user having lost their phone and restored an account, a user having two devices and having restored the seed phrase, or a malicious user having stolen a seed phrase).
So practically speaking, no device should be considered above others, and imposing a hierarchy might lead to a convoluted technical solution that will likely be hard to maintain (not just from a technical point of view, but from a UX one as well).
In terms of options for how this technically is achievable, 3 options come to mind:
WhatsApp
WhatsApp provides a web interface, https://web.whatsapp.com/ (I know it's web, but desktop could potentially use a similar mechanism).
Here the phone basically acts as a relayer for the desktop app. No key material is shared: your identity key is always on your phone, and given that you have authorized desktop, your phone will encrypt messages to desktop as they are received (we can go into what the QR code would contain, but basically it would be just a key exchange, for example an X3DH bundle).
The benefit is that no key material is shared. The drawbacks are that your phone always needs to be connected for this to work (same as WhatsApp), and that it still does not address the case where, for example, a user enters the same seed phrase on 2 devices, bypassing this mechanism.
This can potentially be applied both ways, so your desktop might act as a relayer (but that only makes sense if you entered your seed phrase on it first, and then synced your mobile phone).
Overall this does not look very appealing, as keeping a device always on is fairly inconvenient. It suits a web interface (not surprisingly), but it is nonetheless an option.
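To make the relayer idea a bit more concrete, here is a minimal sketch of who holds what. Everything in it (the `Phone` and `Desktop` classes, the `receive` method) is made up for illustration and is not how WhatsApp or Status implement it. Encryption is deliberately left out, so the only thing it shows is that the identity key stays on the phone and that desktops get nothing while the phone is offline.

```python
from dataclasses import dataclass, field


@dataclass
class Desktop:
    name: str
    inbox: list = field(default_factory=list)


@dataclass
class Phone:
    """Hypothetical relayer: it holds the identity key and never hands it out."""
    identity_key: bytes
    online: bool = True
    authorized: list = field(default_factory=list)

    def authorize(self, desktop: Desktop) -> None:
        # In a real flow this step would follow the QR-code key exchange.
        self.authorized.append(desktop)

    def receive(self, message: str) -> int:
        """Relay an incoming message; return how many desktops it reached."""
        if not self.online:
            return 0  # nothing reaches the desktops while the phone is offline
        for desktop in self.authorized:
            # Placeholder: a real relayer would encrypt for this desktop's session.
            desktop.inbox.append(message)
        return len(self.authorized)
```

Setting `phone.online = False` and calling `receive` again leaves every desktop inbox unchanged, which is exactly the inconvenience mentioned above.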
Signal
Signal uses a different mechanism: when syncing, it will actually share your private identity key with your desktop (sending it through an end-to-end encrypted channel, which is what the QR code helps you set up).
This is the equivalent of restoring a seed phrase on both devices (there are variations on this: for example, you don't have to send the root key, and could send only the chat key if you don't want to share the wallet, but the concept is identical).
Currently this is basically how multi-device works (minus the fact that we donāt send the key over the wire, only through entering the seed phrase).
The benefit is that it applies most cleanly to our setting, it's already implemented in the protocol, and if it's good enough for Signal, that gives us some degree of confidence. The drawback is that key material is shared across devices, so it is inherently less secure than the solution above.
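As a rough illustration of the "send only the chat key" variation, the sketch below derives a chat key from a root key and decides what would be sent to the new device. The HMAC-based derivation and the names `derive_subkey` and `keys_to_share` are assumptions made for this example only; they are not the app's actual key derivation, and the encrypted channel itself is not shown.

```python
import hashlib
import hmac


def derive_subkey(root_key: bytes, purpose: str) -> bytes:
    # Hypothetical derivation: HMAC-SHA256 keyed by the root, labelled by purpose.
    return hmac.new(root_key, purpose.encode(), hashlib.sha256).digest()


def keys_to_share(root_key: bytes, share_wallet: bool) -> dict:
    """Return the key material that would be sent to the new device."""
    if share_wallet:
        # Equivalent to entering the seed phrase on the other device.
        return {"root": root_key}
    # Only the chat key leaves the device; the root (and wallet) stay behind.
    return {"chat": derive_subkey(root_key, "chat")}
```

Either way some key material leaves the first device, which is the drawback noted above; the second branch only limits how much.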
Hierarchical key scheme
This is another option, where your root key would delegate other keys and instruct other peers to send to them. So effectively no key material is shared (you only sign some authorization that will be presented to other peers).
The benefit is that this is more secure than Signal's way (though less so than WhatsApp's), as no key material is exchanged.
The drawback is that it is more complex and fraught with technical issues, as it does not fit very well with a decentralized system, and it does not solve all the use cases. For example, what happens if a user restores a root key on multiple devices (in this case you get the extra complexity, and you still might have to maintain syncing like in Signal's way)? What happens if a key that is a delegate delegates further?
I would not go for this option to be honest, as the complexity seems not to be justified.
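To give a feel for the complexity, here is a small sketch of what a peer would have to check before sending to a delegated key. The `Delegation` record, the `verify` callback (standing in for real signature verification) and the `max_depth` rule are all assumptions for illustration; nothing like this exists in the protocol as far as this post is concerned.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Delegation:
    issuer: str       # key that signed this authorization
    delegate: str     # key being authorized
    signature: bytes  # checked by the caller-supplied `verify`


def is_authorized(
    root: str,
    device: str,
    chain: List[Delegation],
    verify: Callable[[Delegation], bool],
    max_depth: int = 1,
) -> bool:
    """Check that `chain` links `root` to `device` in at most `max_depth` hops."""
    if device == root:
        return True  # the root key needs no delegation
    if not chain or len(chain) > max_depth:
        return False  # with max_depth=1, a delegate cannot delegate further
    expected_issuer = root
    for link in chain:
        if link.issuer != expected_issuer or not verify(link):
            return False
        expected_issuer = link.delegate
    return expected_issuer == device
```

Even this toy version forces a policy decision (`max_depth`) on every peer, and it says nothing about revocation or about a root key restored on two devices.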
I have only considered options that do not require any on-chain action, to best fit with current usage of the app, which does not require any interaction with the blockchain for its basic functioning.
I favor Signal's way, as it seems to be the best compromise between security and ease of use, and it covers all the use cases (including a user restoring a seed phrase on multiple devices).
Questions
To answer your questions:
- What happens on the background when you scan (QR) - Is it desktops temporary public key?
That's really dependent on what the underlying mechanism is. It is likely a key exchange of some sort to build an encrypted session between the two devices that can be used for further communication (it could be a public key or an X3DH bundle, but there's no reason to settle on a strategy for now).
- Can the user initiate and scan the desktop's public chat key (QR) from the mobile wallet or another place or should this be initiated as a new 1:1 chat?
I don't have a strong opinion on it myself; technically both are feasible.
- Do we need to send verification code (OTP) after the user scans QR? If yes, what device should display it and which should enter it?
- Can synching start automatically after you authorize desktop client (mobile can revoke synching anytime)? Previously, you had to allow syncing on both sides (not user friendly). Is there a reason we want to continue doing it this way?
Yes, it can start automatically. We previously required it on both sides because the process was non-interactive, and you definitely want both devices to take some sort of explicit action to enable syncing (so that whoever is operating either device is aware that syncing has taken place, to prevent ghost devices).
If it's interactive (so an action is required on both devices), syncing can start automatically.
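The rule in the last answer can be sketched as a tiny state check: syncing is on only once both devices have taken an explicit action. The `PairingSession` class and its fields are hypothetical names for illustration, not existing code.

```python
from dataclasses import dataclass, field


@dataclass
class PairingSession:
    device_a: str
    device_b: str
    confirmed: set = field(default_factory=set)

    def confirm(self, device: str) -> None:
        """Record an explicit user action (scan, tap, etc.) on one device."""
        if device not in (self.device_a, self.device_b):
            raise ValueError(f"unknown device: {device}")
        self.confirmed.add(device)

    @property
    def syncing(self) -> bool:
        # Syncing starts only when both sides have acted, so neither
        # operator can be paired without knowing about it.
        return self.confirmed == {self.device_a, self.device_b}
```

In an interactive flow, scanning the QR code on one device and showing it on the other would count as the two confirmations, so no extra "allow syncing" step is needed.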
Overall my advice is to move a bit away from mobile and desktop having different roles, and to consider them as equal as much as you can in terms of the role they have in syncing (of course sometimes that's not possible; for example, we can't ask a desktop user to scan a QR code).
There's much more to say about syncing, unfortunately, as it's quite complex given the landscape we have, but let me know if something is not clear. There might be other options that I didn't consider, of course, and I'll be glad to hear them.