NIP-2: Pseudonymous Contact Discovery

Proposal Information

Proposal Name: Pseudonymous Contact Discovery

Author(s): Status Core Contributors

Date of Submission: Nov 7 2023

External Discussions: N/A

What is the problem this project is trying to solve?

As a messaging network, Status is most useful when users can reach the people they already want to talk to, such as their existing contacts on less private platforms like Twitter or SMS. Discovering which of those contacts already use Status is desirable, but doing so requires users to reveal themselves as Status users, sacrificing a degree of privacy. This proposal provides an opt-in way for users to discover which of their existing contacts from other communication platforms are also on Status, with limited sacrifices to privacy.

Who is this problem being solved for?

This problem is relevant to Status users who wish to match with other users they know, and who are comfortable with potentially leaking the fact that they have signed up for Status to external parties. More details on the privacy tradeoffs involved are discussed below.

Requirements / Constraints

In order to preserve user privacy, this proposal must strictly be mutually opt-in. In other words, users do not have to add any external account details, and even after doing so, do not have to pair with any particular contact unless both parties consent to doing so.

Furthermore, this proposal must in general maximize privacy as much as possible while allowing for Status users who are mutual contacts on other platforms like SMS or Twitter to discover each other.

Key Concepts

This proposal introduces a pseudonymous contact discovery feature to the Status app:

  • Verified 3rd Party Account Identifier
    • In order to be matched with mutual contacts on other platforms such as SMS or Twitter, Status users can opt in to linking one or more 3rd party platform identifiers to their Status profile.
    • Crucially, this 3rd party platform identifier must be verified in order to prevent identity theft.
    • Even when verified, linking this 3rd party platform identifier introduces some privacy tradeoffs, as outlined in the Risks section of this proposal.
  • Contact Discovery Address (CDA)
    • In order to retain pseudonymity, each Status profile needs to have a separate Waku address (distinct from its primary chat key) that is only used for contact discovery. As with a public/private key pair, it must not be possible to derive a user’s primary chat key merely from knowing that user’s CDA.
    • The CDA prevents leaking a user’s primary Status chat key and identity to any party before both parties have mutually consented to become contacts.
  • Contact Matching Channel
    • If a user opts in, a cryptographic hash of their verified 3rd party account identifier is posted to a matching channel at regular intervals.
    • Each Status user subscribes to this channel, and cross-references posted CDAs and their associated hashed identifiers against locally stored hashes of their own contacts.
    • If a match is found, the match finder sends a message containing their own hash from their own CDA to the receiver’s CDA. If the receiver recognizes that hash as a mutual contact, they can either accept or ignore the request.
    • This channel must be permissioned, in order to only permit users to send the verified hash associated with their own CDA. Otherwise, identity theft is possible.
    • This channel is not visible to the user in the app UI.
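
The hashed identifier and the record posted to the matching channel could be sketched as follows. This is only an illustration: the proposal does not specify a hash function, normalization rules, or a message format, so SHA-256, the `platform:identifier` encoding, and the names `normalize_identifier`, `identifier_hash`, and `Announcement` are all assumptions. The CDA is treated as an opaque string.

```python
import hashlib
from dataclasses import dataclass


def normalize_identifier(platform: str, raw: str) -> str:
    """Canonicalize an identifier so both sides hash the same bytes (assumed rules)."""
    if platform == "sms":
        # Keep digits only, e.g. "+1 (555) 010-9999" becomes "15550109999".
        return "".join(ch for ch in raw if ch.isdigit())
    if platform == "twitter":
        # Handles are treated as case-insensitive, without the leading "@".
        return raw.strip().lstrip("@").lower()
    raise ValueError(f"unsupported platform: {platform}")


def identifier_hash(platform: str, raw: str) -> str:
    """Hex SHA-256 over 'platform:identifier' (assumed encoding)."""
    canonical = f"{platform}:{normalize_identifier(platform, raw)}"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Announcement:
    """What a user would post to the matching channel (hypothetical shape)."""
    cda: str       # contact discovery address, never the primary chat key
    id_hash: str   # hash of the verified 3rd party identifier


def build_announcement(cda: str, platform: str, raw: str) -> Announcement:
    return Announcement(cda=cda, id_hash=identifier_hash(platform, raw))
```

Normalizing before hashing matters because two users will only match if they hash byte-identical strings for the same phone number or handle.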

User Journey

  1. Sign up for Status
    • The user signs up for Status as normal and is assigned a CDA.
    • Upon signing up, the user is prompted to optionally connect one or more external accounts such as SMS, email address, or Twitter.
    • If they opt out, the flow ends here and does not continue to Step 2.
  2. Verify External Account
    • Once a user has opted in, they connect the external account and complete verification (e.g. Twitter OAuth, SMS 6-digit code, etc).
    • A hash of the external identifier is then associated with the user’s CDA.
  3. Hash Local Contacts
    • In addition to generating a hash of their own identifier, the user generates a hash of each of their contacts’ identifiers (e.g. phone numbers, Twitter handles, etc) and stores these locally.
    • This local hashing should be repeated periodically to account for contacts added since the user first signed up for Status.
  4. Broadcast CDA and Hash
    • The user automatically sends their hash from their CDA to a contact discovery channel at regular intervals.
    • Additionally, the user subscribes to updates from this channel.
  5. Local Matching
    • Upon receiving an update from the contact discovery channel, a user compares the received hash to each of their locally stored hashes.
    • If a match is detected, the party that has detected the match knows which contact has joined Status, but does not know their primary chat key.
    • Optionally, the party that has detected the match can elect to send a mutual contact match request (very similar to the current ‘mutual contact requests’ and including a message, but sent from the user’s CDA and with the additional reciprocal checking of local hashes as described below) to the matching user, revealing their own CDA and hash without revealing their own primary chat key.
  6. Match Request Responses
    • If a user receives a mutual contact match request, the received hash is compared to their local hashes, and if the match is reciprocated, the user reciprocating the match knows which of their contacts is requesting to match, but does not know their primary Status chat key.
    • The mutual contact match request recipient can either choose to deny/ignore the request or to accept the request. If the request is accepted, the users reveal their primary Status chat keys to each other, and they become mutual contacts.
    • Note that it would also be possible to support a folder for unreciprocated match requests that include an optional message, but high degrees of spam would be expected.
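
Steps 3 through 6 above can be sketched as a small in-memory model. It is a minimal illustration under stated assumptions: SHA-256 as the hash, plain strings for CDAs, and a hypothetical `DiscoveryClient` class with made-up method names. Networking, verification, and channel permissions are left out entirely.

```python
import hashlib
from typing import Dict, Iterable, Optional, Tuple


def h(identifier: str) -> str:
    """Hex SHA-256 of an identifier (the hash function is an assumption)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


class DiscoveryClient:
    """Hypothetical per-user state for contact discovery."""

    def __init__(self, cda: str, own_identifier: str, contacts: Iterable[str]):
        self.cda = cda
        self.own_hash = h(own_identifier)
        # Step 3: hash local contacts; this map never leaves the device.
        self.contact_by_hash: Dict[str, str] = {h(c): c for c in contacts}

    def announcement(self) -> Tuple[str, str]:
        # Step 4: what gets broadcast to the discovery channel.
        return (self.cda, self.own_hash)

    def on_announcement(self, sender_cda: str, sender_hash: str) -> Optional[str]:
        # Step 5: return the matching local contact, or None if unknown.
        # sender_cda is where a match request would be addressed.
        return self.contact_by_hash.get(sender_hash)

    def match_request(self) -> Dict[str, str]:
        # Sent from the CDA; the primary chat key is not included.
        return {"from_cda": self.cda, "hash": self.own_hash}

    def on_match_request(self, request: Dict[str, str]) -> Optional[str]:
        # Step 6: reciprocal check against the receiver's own local hashes.
        return self.contact_by_hash.get(request["hash"])


alice = DiscoveryClient("cda-alice", "15550100001", ["15550100002"])
bob = DiscoveryClient("cda-bob", "15550100002", ["15550100001"])

# Bob sees Alice's broadcast, recognizes her, and sends a match request.
matched = bob.on_announcement(*alice.announcement())
reciprocated = alice.on_match_request(bob.match_request())
```

In this model a request is only surfaced when the lookup in `on_match_request` succeeds, which is the reciprocal check described in Step 6. Primary chat keys would be exchanged only after the recipient accepts.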

Recommended Initial Parameters and Analysis [Optional]

  • Broadcast frequency: 14 days (how frequently each CDA posts its own hash in the contact discovery channel)
  • Local hash frequency: 90 days (how frequently each user is prompted to create local hashes of their contacts in connected 3rd party platforms)
  • Supported 3rd Party Platforms (note the tradeoffs with each in the Risks section below)
    • Twitter
    • SMS
    • Email?
    • Discord?
    • Telegram?

Benefits

  • Help new users “pre-populate” their contacts so Status chat functionality is drastically more useful for new users.
  • Increase connections between existing users of Status who are already connected on other platforms.

Risks

While this proposal remains purely opt-in, meaning users can use Status without making any tradeoffs to their privacy at all, for users that do opt-in the following privacy tradeoffs are incurred:

  • Any and all Status users can see the cryptographic hash of the 3rd party platform identifier. With time and computing power, they can determine which identifiers have signed up for Status (but not their identity on Status or usage of the app unless a mutual contact match request is accepted).

    For instance, in the United States, phone numbers follow a 3-digit area code and 7-digit number format, meaning an upper bound of 10 billion unique numbers (though obviously far fewer exist in practice). Computing a rainbow table of SHA-256 hashes of every possible phone number would be trivial, meaning that in practice posting the hash of a phone number is not materially different than posting the phone number itself.

    Twitter usernames would be more difficult in theory, but again in practice by filtering to accounts in a given region, accounts that follow (or are followed by) a certain account, or accounts that have discussed certain topics, the full list of usernames could be drastically reduced, and a rainbow table created relatively cheaply.

Hash uniqueness could be increased by incorporating a salt, but as this salt would need to be public/shared in order to enable matching, it would merely require recalculating the rainbow table once.
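
The enumeration risk for phone numbers can be made concrete with a small sketch. It searches only a 4-digit suffix so that it runs instantly, then does the arithmetic for the full 10-digit space. The hash rate is a purely illustrative assumption rather than a measurement, and the function names are hypothetical.

```python
import hashlib
from typing import Optional


def sha256_hex(text: str, salt: str = "") -> str:
    return hashlib.sha256((salt + text).encode("utf-8")).hexdigest()


def recover_number(target_hash: str, prefix: str, suffix_digits: int,
                   salt: str = "") -> Optional[str]:
    """Try every number starting with `prefix` until one hashes to the target."""
    for n in range(10 ** suffix_digits):
        candidate = f"{prefix}{n:0{suffix_digits}d}"
        if sha256_hex(candidate, salt) == target_hash:
            return candidate
    return None


# A posted hash, with and without a public salt.
posted = sha256_hex("5550104242")
posted_salted = sha256_hex("5550104242", salt="public-salt")

# A public salt changes the hashes but not the search: the attacker
# simply includes the salt when hashing each candidate.
found = recover_number(posted, "555010", 4)
found_salted = recover_number(posted_salted, "555010", 4, salt="public-salt")

# Arithmetic for the full US number space described above.
FULL_SPACE = 10 ** 10                  # 3-digit area code + 7-digit number
TABLE_BYTES = FULL_SPACE * 32          # one raw 32-byte digest per number
ASSUMED_HASHES_PER_SECOND = 10 ** 8    # assumption, for illustration only
SECONDS_FOR_FULL_SPACE = FULL_SPACE / ASSUMED_HASHES_PER_SECOND
```

Under these assumptions a full table of raw digests is 320 GB, and a full pass takes on the order of minutes, which is consistent with the point that the hash offers little protection for a small identifier space.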

  • A party that is able to hack, exploit, censor, or otherwise compromise verification of the 3rd party platform would be able to appear to be someone they are not. Thus contacts matched via discovery are only as trustworthy as the verification process.

    For example, a hacker that can spoof Party A’s SMS can pretend to be Party A and then match with their contact Party B on Status. If Party B mutually opts in to match, they would believe they are privately communicating with Party A on Status, when in reality they are communicating with the hacker.

A security-conscious Status user can partially mitigate this risk by using Status’ existing “mutual contact identity verification” functionality, which makes it easy for Status users to manually verify the identity of their mutual contacts. The mitigation is only partial because the user would have already revealed their Status chat key.

Furthermore, another class of risk to the feasibility of this proposal as a whole relates to the constraints of verifying a user’s connected 3rd party account. As mentioned, this verification is necessary as the primary mitigation against identity fraud; however, it introduces these risks:

  • The cost of verification of external phone numbers (via the delivery and confirmation of a code by an automated SMS service) must be borne by some party:
    • Paid for by the user themselves: while the cost itself per user is expected to be de minimis (analogous to the cost of a gas fee on an L2), requiring users to pay adds implementation challenges and friction, which may significantly deter its usage.
    • Paid by the Status app: for Status itself to bear this cost would introduce a centralized entity as a failure point. Furthermore, Status would have to generate sufficient fees elsewhere to cover the aggregate cost of verifications.
  • The server running the software responsible for the verification process (for example, calling Twitter’s OAuth API and updating a user’s Status profile accordingly) must be hosted somewhere, and the host(s) must be trusted. If hosted by an arbitrary Status user, verification can be compromised at the step at which a Status user’s profile is updated. Yet if run by Status itself, this is a centralized solution that cannot scale to future decentralized approaches such as a DAO. Potential inspirations for a solution include keybase.io.

Relevant Links

(Small correction: proposal name should be fixed) Thanks for the write-up. I think this is a good topic to think about. However, I’m not in favor of implementing this feature as proposed. You observe that for anyone participating there is basically no privacy against any (even slightly?) motivated adversary. Offering this feature could, however, signal to users that it is somewhat safe from a privacy perspective, especially to less technical users (who might still not want their contacts to be public).

I did check out the added links and there do seem to be some interesting solutions out there based on Private Set Intersection, but from some quick reading on it I don’t see an obvious way to overcome the privacy concerns. Maybe some cryptographers who have published or coded work on PSI could be approached to conceptualize something that could work in a p2p/waku setting for Status?

Until a solid solution is found at a conceptual level, I’d rather have the resources spent on the other NIPs and other ongoing dev work in Status/Logos.

Maybe a dumb question, but since one of the privacy concerns stems from sending the user’s external identifiers over the network, could this be alleviated by instead sending a cryptographic accumulator of the user’s external contacts? This way, other users could check their own external identifiers for membership against the received accumulators without ever exposing their own external identifiers.

Wondering if this was considered or if it’s just outright wrong, and also what would be the shortcomings to this method?

Thanks for your question.

I’m not sure that approach would work in the context of something like contacts, which would not be a static list, and even existing contacts might change.

That said I think there are some very promising zk options just coming out now that could preserve privacy while also providing the necessary proof. For instance zkMe or zkPass could work well for this use case. Further research is required.
