Should we remove public chats?

nbold · December 10, 2021, 3:45pm

Status has a problem with spam. Spam in public chats.

When a new user creates a Status account and goes into any of the 9 suggested public chats here:

This is what they will see:

(well, without the “SPAM” tags)

There are several more public channels suggested via “Join public chat”. I looked at ten picked at random, and they all had similar issues.

An important question originally posed by Carl:

how can I inspire myself to use Status?

I do not find value in the Status public chats, as they are now. Our existing countermeasures (blocking users) require continuous effort from me, and still lead to me seeing more spam messages whenever a new account is spamming.

Spam is driving people away from Status.

We are going to remove public chats when the communities features are released. Communities will resolve problems of spam and moderation internally, but will not address the problems of spam in public chats.

Spam has been an issue with Status for a long time, and it seems like we haven’t done much to deal with it. From a user’s perspective, we have been ignoring this. While this isn’t true - particularly given all the work going into communities - it’s a bad sign that users think we are ignoring their input, especially for such a longstanding issue. Many solutions have been discussed and presented, but we have not made any visible progress toward reducing spam.

I think we may as well deal with it now, instead of waiting for communities and letting the current landscape of public chats drive away users in the interim.

I propose that our public chats are a detriment to the user experience, and that removing public chats sooner than the release of communities will increase retention and provide a better experience for our users.

Some arguments against this:

some users may be using “hidden” public chats that are not currently afflicted with spam
removing a feature without anything given to users in exchange may be negatively received
there are other solutions to this problem that do not involve removing a large feature

Please join this discussion about what to do with public chats. Should we remove them now? Are there other solutions to the issue of spam that could be applied in a short timeframe?

[Edit:] Twitter poll results - “Should we remove public chats?” 57.7% yes, 42.3% no

Here is an outline of prior literature on the subject, with approximate summaries of some of the material.

Blog: Spam mitigation efforts

Status is committed to privacy, censorship resistance, and decentralization.
Privacy means an aversion to in-app tracking and analytics, so automated analysis of user activity is relatively infeasible.
Censorship resistance - Status wants to delegate moderation and censorship to communities; ideally, Status itself does not censor anything.
Decentralization - solutions to spam should not introduce centralized dependencies

The community is collaborating to find solutions to this problem.
The short term solution introduced around the time of this post was to add a size constraint to messages. This was considered a hotfix.
Long term solutions should benefit the user and live up to Status values, delivering a privacy first, decentralized experience. Several ideas from the community are listed in the blog post.

Blog: Can we can spam?

spam is difficult to objectively define, in practice “whatever content a typical receiver may complain about” is usually good enough
spam is an asymmetric challenge: it is far easier to spam than to stop spam
many platforms resolve this via Sybil resistant personal info collection (e.g. requiring a valid phone number to sign up)
proof-of-work at the sender’s end is an interesting idea (and was the initial motivation for PoW)
Challenges of spam for Status:
- blocking users is ineffective as it is cheap to create new accounts
- Status is open source, so proprietary filtering rules won’t work
- Status runs on a P2P network so any spam policy will have to be implemented in messenger clients or at relay/history nodes. There is no centralized server to do spam filtering.
- Spam should be defined, monitored, and mitigated in a decentralized way
- Is spam free speech?
Current mitigations (Oct 2020):
- proof-of-work: doesn’t really help; a difficulty high enough to meaningfully inconvenience spammers has too great a UX impact on non-malicious users
- rate limiting: users cannot send more than 5 messages per second. This limit does not prevent spamming.
- messages longer than 4096 characters are automatically ignored when both sending and receiving.
- block users: requires perpetual effort from users
several possible solutions are outlined with pros and cons
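
To make the rate-limit and size-limit mitigations concrete, here is a minimal sketch of how a client might apply both checks. The two constants come from the summary above; the class name `BasicMessageGate`, the per-sender sliding window, and the injectable `clock` are assumptions made for illustration, not Status's actual implementation.

```python
import time
from collections import defaultdict, deque

MAX_MESSAGE_CHARS = 4096      # size limit quoted in the summary above
MAX_MESSAGES_PER_SECOND = 5   # rate limit quoted in the summary above


class BasicMessageGate:
    """Hypothetical client-side gate combining the size and rate checks."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # sender key -> timestamps of messages accepted within the last second
        self._recent = defaultdict(deque)

    def accept(self, sender: str, text: str) -> bool:
        # Oversized messages are ignored outright.
        if len(text) > MAX_MESSAGE_CHARS:
            return False
        now = self._clock()
        window = self._recent[sender]
        # Forget timestamps that are at least one second old.
        while window and now - window[0] >= 1.0:
            window.popleft()
        if len(window) >= MAX_MESSAGES_PER_SECOND:
            return False
        window.append(now)
        return True
```

Because the window is keyed by sender, a spammer who rotates to a fresh key gets a fresh budget, which is one way to read the remark that this limit does not prevent spamming.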

Vitalik: Longer form thoughts on DoS / spam prevention

Solutions fit into the following categories:

Chats with whitelisted participation
Economic rate-limiting
Non-economic rate limiting
Shared block lists
Chats with cryptoeconomic collective moderation

Possible solutions:

subscribe to other people’s blocklists
cryptoeconomics as moderation (details)
implement a cost for posting rights (sign with a key that has >100 SNT locked up, or ZK-prove a brightid)
view only messages from accounts that have an ENS name

content-based filtering is not ideal because it has to be closed-source

Possible solution: keyword filters so users can block messages containing user-specified keywords

Complaints about spam:

https://twitter.com/InsideCryptoLan/status/1462027929131511817
https://twitter.com/FostersLaw/status/1453385028596076546
https://twitter.com/__padpad/status/1443257181823016963
https://twitter.com/CryptoUnknown/status/1391812665719222275
https://twitter.com/Glimpseheaven7/status/1380665216040542209
https://twitter.com/vanparalyk/status/877198560659472384

https://www.reddit.com/r/statusim/comments/rar8m4/something_needs_to_be_done_about_spam_public/

vbuterin · December 10, 2021, 9:32pm

How far away is Status from being able to support publicly-accessible-in-practice chat rooms using any of these ideas (I’m assuming the communities feature does this?)? I’ve been a proponent of these kinds of techniques for spam mitigation for a long time. I definitely think that public chats are near-unusable in their current form, and Status needs some kind of alternative.

I wonder if there is some middle ground that is very easy to implement and can be used in the interim. Perhaps add a “only see messages in this chat from accounts with ENS names” feature, and the next time a user tries to look at any chat they get a prompt asking them if they want to turn the restriction on or off for that particular chat?

ducheng · December 11, 2021, 2:57am

Really what I think needs to happen is we offer the people a way to automatically block users who say specific keywords that the user decides. This should be off by default, entirely up to the user, and only client-side, but it would be awesome to block anyone who says “contact me on WhatsApp” since I see a LOT of that.

I think we should remove the suggested public chats feature, but I use public chats for my own services and 3rd party software, and removing it would be bad for many groups of people who rely on the feature.

nbold · December 13, 2021, 3:54pm

It looks like communities will be late 2022. Communities will have token-based roles (for an entire community or for individual channels), and moderation privileges attached to some of these roles. This will include chats with whitelisted participation–both communities (analogous to discord server) and community channels (analogous to discord channels) can be token-gated, or certain channels within a community and not others (or none at all). And then an admin could create a whitelist via e.g. an airdrop. This will also support chat rooms where a smart contract controls moderation/roles, where any smart contract that has some sort of desirable moderating behavior can be “wrapped” with functionality to manipulate tokens (that would be measured and determine roles, e.g. “users with 10 SNT and 20 DAI or 2 ETH can talk” etc).
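
As a rough illustration of the token-measured role rule quoted above, the sketch below composes small predicates over a balance mapping. Everything here is hypothetical: the helper names (`holds`, `all_of`, `any_of`), the plain dictionary standing in for on-chain balances, and the reading of the example as "(10 SNT and 20 DAI) or 2 ETH".

```python
from typing import Callable, Mapping

Balances = Mapping[str, float]          # token symbol -> amount held (assumed input)
Rule = Callable[[Balances], bool]


def holds(token: str, minimum: float) -> Rule:
    """Rule that passes when the account holds at least `minimum` of `token`."""
    return lambda balances: balances.get(token, 0) >= minimum


def all_of(*rules: Rule) -> Rule:
    return lambda balances: all(rule(balances) for rule in rules)


def any_of(*rules: Rule) -> Rule:
    return lambda balances: any(rule(balances) for rule in rules)


# Assumed grouping of "10 SNT and 20 DAI or 2 ETH".
can_talk = any_of(all_of(holds("SNT", 10), holds("DAI", 20)), holds("ETH", 2))
```

For example, `can_talk({"ETH": 2})` passes while `can_talk({"SNT": 10, "DAI": 5})` does not. In a real deployment the balances would be read from a contract rather than passed in.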

I don’t think shared block lists are a planned feature right now but it seems pretty feasible to include. Moderation/token-based roles will not be mandatory for each community, but they are the only currently planned prevention mechanism for spam. Eventually deposits of $X value will be optionally required (“optionally required” heh, i.e. it will be optional to add a requirement to a community) to join a community and will be refunded if and only if the user voluntarily leaves the community (burned otherwise), so spammers will have to pay a linear cost to continue spamming any such community.
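
The refundable-deposit idea can be sketched as simple bookkeeping. This is only an in-memory model under stated assumptions: the class `DepositGate` and its method names are invented for illustration, and an actual version would presumably live in a smart contract.

```python
class DepositGate:
    """Hypothetical bookkeeping for a refundable join deposit."""

    def __init__(self, deposit: int):
        self.deposit = deposit
        self.held = {}     # member -> deposit currently locked
        self.burned = 0    # total forfeited by removed members

    def join(self, member: str) -> None:
        if member in self.held:
            raise ValueError("already a member")
        self.held[member] = self.deposit

    def leave(self, member: str) -> int:
        """Voluntary exit: the deposit is returned to the member."""
        return self.held.pop(member)

    def ban(self, member: str) -> None:
        """Removal by moderators: the deposit is burned."""
        self.burned += self.held.pop(member)
```

Each banned account forfeits one deposit, so a spammer who needs k fresh accounts pays k times the deposit, which is the linear cost mentioned above.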

I wonder if there is some middle ground that is very easy to implement and can be used in the interim. Perhaps add a “only see messages in this chat from accounts with ENS names” feature, and the next time a user tries to look at any chat they get a prompt asking them if they want to turn the restriction on or off for that particular chat?

I think an easily-implemented anti-spam mechanism would be ideal (and is my preference over removing public chats), but Status is focusing resources on delivering tokenized communities as fast as we can and people are hesitant to divert resources in the meantime.

An ENS-name filter is easily implemented, but it’s only a one-time cost to spammers who are okay with missing some users that block them, and L1 gas prices make things difficult for non-malicious users. Once L2 integrations are done we could probably do something like this.

nbold · December 13, 2021, 4:14pm

Such filters would definitely be a neat addition, as well as a better interim solution. This is perhaps something that could be included in the meantime if any resources can be spared.

Do you use public chats in a way that a group chat could not work as a suitable replacement? If so, what functionality do group chats lack that prevents this? Feel free to answer with as much or little detail as you are comfortable with

Alex · December 13, 2021, 5:07pm

I’m in the camp of not entirely removing public channels - for now, that is. It’s such a large feature and, as you say, removing it without replacing it with a similarly weighted feature may not sit well with users, amongst the other reasons mentioned.

So that leaves us with a dilemma: keep public chats and find a way of mitigating the effects of spammers with a relatively easy-to-implement fix, either by ourselves or through an action from the user (e.g. toggling on/off messages from ENS names only). The keyword idea is interesting, but, without having extensive knowledge here, how easy is that to actually implement? Deciding on the parameters would also be key to it being effective.

I get the impression that for now it’s a necessary evil. Spam in public chats has been an issue for such a long time that it seems as though we are trying our best to hold out for communities. Having said that, hopefully in 2022 we may be able to divert some more resources to address it.

nbold · December 13, 2021, 6:51pm

I think we need to find a solution that makes the public channels usable (at least in a relatively short timeframe) and is a serious hindrance to spam, or we should get rid of them. I don’t think we should be holding onto a feature in Status that is unusable without taking measures to essentially subvert the feature (e.g. using obscurely named/hidden public chats that have not yet been discovered by spammers).

If Status had a larger userbase, the current level of spam would be unacceptable and demand immediate action to fix. I think the current retention issues are in part a symptom of this problem, and the longer we go without some sort of reasonably functional solution, the harder it will be to attract users for us in the future as well.

If we end up not removing public chats when the communities features are released, they will still have these problems with spam (and still exacerbate retention issues and damage the reputation of Status IMO). We need a specific method to address them.

hester · December 13, 2021, 6:54pm

I’m not fully up to speed on the progress of communities beyond phase 1 being in alpha. @vbuterin you can enable it under Advanced. Although I don’t know all details, I’d say it includes whitelisted participation. Creator of the community holds a key and can accept members, similar to a private Telegram group. I’ll leave further updates for those better informed.

@nbold What I understand is that while development for communities is in progress we still need to address the issue that public chats currently are damaging retention. I see the issue of resolving the impact current public chats have on retention as a slightly different and more attainable issue to resolve than finding an intermediate anti-spam mechanism. Also, I agree with your take that Status is focusing resources on tokenized communities first and foremost. Meaning that any other actions should require as little effort as possible and be actionable immediately. Below are the most low hanging fruits the pragmatist in me can imagine. Both focus on reducing exposure to frequently spammed public chats for everyday users rather than removing public chats.

Remove hard coded list of chats
This list appears when you open the view to Join a public chat. It was added to improve discoverability in lieu of more organic topic discovery. It actually makes people start off their experience by entering the most spammed chats.

Use more chat (referral) deeplinks
Deeplinks to chats already exist, e.g. https://join.status.im/offscript, and are underused. (IMO it’s actually pretty cool that appending the url coordinates joining the same channel). A deeplink can be shared in semi-private groups to keep them sort of shielded from spam by virtue of the channel not being widely known. I.e. those relying on public chat can still use it. The caveat is that these deeplinks only work for existing users. To invite people who don’t already use Status, a registration is needed to pass through Google’s install referrer (Android only). Not ideal, but it can work for targeted campaigns.

stefantalpalaru · December 13, 2021, 6:59pm

I like this, combined with bayesian spam filtering: https://bogofilter.sourceforge.io/

People could enable one or more public databases of manually classified messages (privacy concerns here, so you can only train them on public channels) while maintaining their own private db.

It’s the same problem domain as email spam or blog comment spam and it’s mostly solved.
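
For readers unfamiliar with the technique, here is a minimal sketch of Bayesian spam classification. It is a plain multinomial naive Bayes with Laplace smoothing, not bogofilter's own algorithm, and the class name, tokenizer, and training examples are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical classifier trained on manually labelled messages."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Smoothed per-word likelihood; unseen words get a small nonzero weight.
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = log_p
        # Normalise the two log scores into a probability of "spam".
        top = max(log_score.values())
        spam = math.exp(log_score["spam"] - top)
        ham = math.exp(log_score["ham"] - top)
        return spam / (spam + ham)
```

Since the model is just word and message counts, a public database and a private one could in principle be combined by adding their counters, which fits the "public databases plus own private db" idea above.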

ducheng · December 13, 2021, 7:25pm

@nbold I really don’t want to interact with Ethereum and tokens unless I absolutely have to. Communities do NOT need tokens to be integrated to have full feature parity with Discord. I would like to keep crypto and communities/social separate.

@hester I think we should simply remove the suggested public chats, since those are the only ones being attacked by spammers. Keep the public chats feature, just remove the suggested ones so the userbase is forced to create their own public chats instead of all being centralized on a few dozen public chats that are easy to spam.

ducheng · December 13, 2021, 7:32pm

All I want is a peer-to-peer encrypted WeChat alternative, I want the crypto stuff, but I’d prefer it to be separate from communication and social.

nbold · December 13, 2021, 7:52pm

@stefantalpalaru I had basically discounted this as a centralizing solution but I really like your idea of having public databases, making it not “Status’s” spam filter, but rather publicly available spam filters (and perhaps users could import their own). Reminds me a bit of uBlock Origin.

@hester Definitely agree about removing the hardcoded lists of chats, they seem to be a point of concentration for spam. I suspect the obvious chats (e.g. general, #status, etc.) will still end up bombarded with spam–I don’t mean to only suggest that we remove things, but if these channels will end up spammed anyway it might behoove us to remove (most of?) them for now, at least until we have other preventative measures in place

stefantalpalaru · December 13, 2021, 7:58pm

Exactly. Allow loading these spam-classification databases from any URLs, and it’s decentralised.

nbold · December 13, 2021, 9:15pm

And if we can support this, a per-user keyword filter seems a fairly trivial addition, since we’d already have to equip clients with the tools to filter incoming messages based on the spam classification dbs, per-user filter would amount to basically or any(kw in message for kw in bad_kws) (plus local storage of user-defined keyword blacklist)
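
Spelled out, the per-user keyword check described above might look like the following sketch. The name `bad_kws` is from the post; the function names, the case-insensitive matching, and the `classifier_says_spam` flag standing in for the database-based filter are assumptions.

```python
from typing import Iterable


def contains_blocked_keyword(message: str, bad_kws: Iterable[str]) -> bool:
    """True if any user-defined keyword occurs in the message (case-insensitive)."""
    text = message.casefold()
    # Skip empty keywords, which would otherwise match every message.
    return any(kw.casefold() in text for kw in bad_kws if kw)


def should_hide(message: str, bad_kws: Iterable[str],
                classifier_says_spam: bool = False) -> bool:
    """Hide a message if either the (assumed) classifier or the keyword list flags it."""
    return classifier_says_spam or contains_blocked_keyword(message, bad_kws)
```

With `bad_kws = {"contact me on WhatsApp"}`, a message such as "Hi, CONTACT ME ON WHATSAPP" would be hidden, while the keyword list itself never leaves the user's device.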

frank · December 14, 2021, 12:21am

I like that Status interacts with Ethereum and tokens. Otherwise it would be like a traditional privacy app. Maybe you can fork a new Status.

fryorcraken · December 14, 2021, 4:28am

I think this would be a good first step so that it removes this first experience of spam chat for new users.

I also like @vbuterin’s idea of whitelisting ENS names. I’d see that as a per-chat setting.
So if I join a public chat, I could enable it, and maybe in the activity centre I could see the first new message from a non-ENS user (similarly to someone not in my contacts sending me a 1:1 chat).
If I join a chat via an invite (e.g. public chat) then I’d want the ENS whitelist feature to be off.
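
A per-chat ENS-only setting along these lines could be sketched as below. All names (`Message`, `ChatView`, `ens_only`, `activity_centre`) are hypothetical, ENS verification is assumed to have happened elsewhere, and "the first new message from a non-ENS user" is read here as one notification per sender.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Message:
    sender_key: str
    text: str
    ens_name: Optional[str] = None   # None when the sender has no verified ENS name


@dataclass
class ChatView:
    ens_only: bool = False           # off by default, e.g. for chats joined via invite
    visible: List[Message] = field(default_factory=list)
    activity_centre: List[Message] = field(default_factory=list)
    _notified: Set[str] = field(default_factory=set)

    def receive(self, msg: Message) -> None:
        if not self.ens_only or msg.ens_name is not None:
            self.visible.append(msg)
        elif msg.sender_key not in self._notified:
            # Surface only the first filtered message per sender for review.
            self._notified.add(msg.sender_key)
            self.activity_centre.append(msg)
```

One design caveat: with one notification per sender, a spammer rotating accounts could still fill the activity centre, so a real version might cap or batch those entries.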

vbuterin · December 16, 2021, 3:17pm

The problem with block lists and pretty much anything else that’s not cryptoeconomic is that malicious users really can just keep creating a near-unlimited number of accounts and automating the process.

In fact, if I were a spammer, I would literally run a script to make a new account for every single new message I send. This would make anything blocking-based completely useless. And I think preventing this kind of attacker should be the main goal.

I’m also quite bearish on keyword filters and Bayesian anything. The problem with pretty much anything in that category is that in order to be effective, it has to be closed source (which spam filters generally are). If it’s open source, then attackers can reverse-engineer and easily figure out how to beat it.

Fundamentally, the solutions to spam that work in the centralized/closed world and the solutions that work in the decentralized/open world are just incredibly different.

This is why I do think that biting the bullet and embracing something cryptoeconomic and on-chain is going to be necessary. Sure, if it’s only $0.20 to create an account, then attackers can create an account and spam and lots of people will see messages, but at least it will be possible to block, and then shared block lists, even automatically subscribing to block lists of people you reply to enough times, etc, will be able to do the rest. And if it’s $20 to create an account, then attackers are just not going to be willing to do that.
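
The arithmetic behind this argument can be made explicit with a small sketch. The function name and the `messages_before_blocked` parameter are assumptions; the $0.20 and $20 figures are the ones used in the post.

```python
def spam_campaign_cost(messages: int, account_cost_usd: float,
                       messages_before_blocked: int = 1) -> float:
    """Cost of sending `messages` if each account is blocked after a few messages."""
    # Ceiling division: every started batch of messages needs its own account.
    accounts_needed = -(-messages // messages_before_blocked)
    return accounts_needed * account_cost_usd
```

Under the worst case of one account per message, 1,000 messages cost nothing when accounts are free, about $200 at $0.20 per account, and $20,000 at $20 per account.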

(BTW blocking should be easier than today. “Block this user” should be a UI option right under “Reply” and “Copy”)

In the short term, I think a solution that excludes most users from public chats is okay. The status quo excludes pretty much all users. In the medium term, L2 integration can help, and in the longer term communities could experiment with forms of whitelisting that are non-crypto-based for the users who need that (eg. ZK-SNARK proof of twitter account without revealing which one).

oskarth · December 17, 2021, 4:04am

Fundamentally, the solutions to spam that work in the centralized/closed world and the solutions that work in the decentralized/open world are just incredibly different.

I definitely agree with this. I think what has happened is that there’s been a bit of a bifurcation in terms of priorities, both when it comes to timelines as well as what different teams focus on.

Status Communities Moderation

On the one hand, there’s the Status app/product POV which is 100% focused on Communities. The solution for spam there is to have moderation, so it acts more like a decentralized form of Discord. People in the Status product team (mobile/desktop) can speak more to this - my understanding is that it is at an advanced dogfooding stage but still needs a few kinks to be worked out.

Cryptoeconomic private economic spam protection with RLN Relay

On the other side, there’s the cryptoeconomic design path. From a Vac/applied research POV, this is the RLN Relay effort (https://vac.dev/rln-relay). There’s still a bit more work to do here, but we are close to being able to do internal dogfooding through nim-waku (not with Status chat protocol though). Here are some bullets on current state of things:

Latest changelog for nim-waku: nwaku/CHANGELOG.md at b2273fff9a98a8918e0e8b5012aee1754ba3e943 (waku-org/nwaku on GitHub); also see e.g. “RLN-Relay suggested next steps” (Issue #72, vacp2p/research on GitHub)
There’s also experimental work by tcb to get this supported in js-waku
Currently RLN works slightly differently in nim and js land (Bellman vs Circom); WIP to reconcile and use Circom 2.0 for both, see “Evaluate different code bases for ZKPs” (Issue #93, vacp2p/research on GitHub) and https://github.com/gakonst/ark-circom/issues/8
Smart contract integration/reconciliation with nim-waku needs more work
Still some work to do re actual L2 deployment etc re gas costs, see https://forum.vac.dev/t/on-zk-gas-costs-and-scaling-solutions-today-tomorrow-and-beyond/104
Draft paper with some open problems (not all have to be addressed for initial work): research/rln-research/Waku_RLN_Relay.pdf at master (vacp2p/research on GitHub)

We also have a #rln channel in the Vac Discord for people who are interested. We should be able to start rough dogfooding early next year after the holidays.

Integration effort and incidental complexity

The path from Status app to even experimental RLN is quite long. First, Waku v2 has to be integrated into the app. Waku v2 integration is well underway and almost done, at least for Desktop. This is a requirement for Communities to function as intended at any reasonable scale (see https://vac.dev/waku-v1-v2-bandwidth-comparison for benchmarks). This is currently being done using go-waku, which is a minimal subset of Waku v2 to get integrated into the Status app.

In addition to above, there’s (or was) also a separate effort in terms of integrating nim-waku into the Status app. nim-waku is where all of this applied research work is happening, which is in line with the general strategic plan of building a decentralized tech stack together with Nimbus, nim-libp2p, Dagger, Portal etc. This effort, for various reasons, has currently been deprioritized by Status product teams. This is a form of incidental complexity and mismatch that makes the integration process and feedback loop less than ideal.

Status product POV

From a Status product POV, the number 1 and 2 top priority is to fix the retention problems, and the operating hypothesis is that the Communities feature will do this (including with moderation).

While I’d love to see something like RLN being prioritized, I also recognize why there hasn’t been bandwidth to do so yet, especially from an engineering POV.

Status design?

One thing I’d love to see here re RLN is more exploration from a design/UX POV. I think there’s a lot of interesting things that can be done here, and it requires a bit of work to get it right. So far there hasn’t been interest/bandwidth for this, but whenever product/design is ready we should pick this up. Perhaps making it slightly more concrete would get people excited and might influence priorities, at least long term.

In the meantime, with Vac we’ll keep building out RLN Relay and make it useful, including internal dogfooding and looking at other deploy targets (other nim-waku users, js-waku?).

Status economics?

In line with above, being able to lock up SNT for RLN registration would be a great use of the token and a nice way to onboard people into the Status ecosystem as a form of decentralized identity (more like a right-to-use). There’s a lot more to talk about here, but perhaps for another topic.

A middle way?

Considering the above, I do think there are some reasonably quick wins. Being able to filter a public chat by ENS names seems like a KISS solution that’d get you 95% of the way there. I can’t speak for Status product teams here, but to me this seems like something that could be done by Status mobile/desktop without too much effort? A bit similar to how you can select audience with Twitter.

nbold · December 17, 2021, 3:47pm

Edit: also blocking should absolutely be easier than it is today

I’m also quite bearish on keyword filters and Bayesian anything. . . . in order to be effective, it has to be closed source (which spam filters generally are). If it’s open source, then attackers can reverse-engineer . . .

If users can decide the parameters of such filters, e.g. if in your Status settings you can say “I don’t want to see messages that contain any of the following keywords: . . .” (where you decide what the keywords are) or “I want to use X method of filtering spam, and I want to upload this database of spam classification data” then attackers can’t see the mechanisms they are trying to defeat, and no closed-source code is added to the Status client. I don’t see such a solution as a compromise on decentralization either.

Using such a user-tuned system, even a keyword filter alone would be a significant improvement over the current situation. And borrowing from the domain of centralized spam-filtering solutions in a way that doesn’t compromise on the decentralization of Status leads to easily implemented and better solutions than a keyword filter or an ENS name filter IMO.

In the short term, I think a solution that excludes most users from public chats is okay. The status quo excludes pretty much all users.

There are viable short term solutions that don’t exclude most users from public chats. Even letting users decide their own keyword filters seems better than a solution that only benefits people with ENS handles. We could do both, of course, but I don’t think it makes sense to give up so easily on traditional solutions.

blagoj · December 17, 2021, 4:47pm

We’ve been working on several ideas whose main goal is enabling spam protection in private environments. They all use the RLN construct, and generally enable the properties that @oskarth described above.

Basically, to solve the spam problem in private decentralised environments we need to enable some kind of rate limiting but also to make Sybil attacks harder.

Sybil attacks can be prevented to a certain degree by increasing the barriers for entry as you’ve described, by either an economic or other form of stake (e.g. social stake). We’ve researched both ways and, depending on the properties of the application and its requirements, both could make sense.

For the economic stake, a smart contract can be used that takes a registry fee. After users are registered in the smart contract, they can chat in the app by providing, with each message, a zk-proof that they are a valid member of the group without revealing who they are. The RLN ZK construct also enables rate limiting: users who send X messages per epoch risk having their “identity” revealed and their registry fee slashed from the smart contract.
This is basically how RLN Relay should work, as @oskarth described above (I also look forward to having it in prod soon).
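
The "identity revealed when the limit is exceeded" property can be illustrated with a toy version of the secret-sharing step that RLN builds on. This is a simplified sketch, assuming a limit of one message per epoch; it uses SHA-256 and an arbitrary prime modulus in place of the SNARK-friendly hash and field, and it omits the zero-knowledge membership proof and the smart contract entirely. All function names are hypothetical.

```python
import hashlib
from typing import Tuple

# Prime modulus for the toy field (an assumption; chosen only because it is a known prime).
P = 2**255 - 19


def h(*parts) -> int:
    """Hash arbitrary values into the toy field."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def make_share(secret: int, epoch: int, message: str) -> Tuple[int, int]:
    """Return one point (x, y) on the line y = secret + a1 * x for this epoch."""
    a1 = h(secret, epoch)      # slope is fixed per (secret, epoch)
    x = h(message)             # x-coordinate depends on the message
    y = (secret + a1 * x) % P
    return x, y


def recover_secret(share_a: Tuple[int, int], share_b: Tuple[int, int]) -> int:
    """Two distinct points from the same epoch determine the line, hence the secret."""
    (x1, y1), (x2, y2) = share_a, share_b
    slope = (y1 - y2) * pow((x1 - x2) % P, -1, P) % P
    return (y1 - slope * x1) % P
```

A single message per epoch reveals only one point on the line, which says nothing about the secret. Two different messages in the same epoch yield two points, so anyone can call `recover_secret` and obtain the key needed to slash the sender's stake.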

For the social stake approach, InterRep can be used to harden against Sybil attacks. InterRep is a registry which enables users to link their social media accounts (Twitter, GitHub, etc.) with their Ethereum address or Semaphore id commitment in an anonymous manner, and according to certain rank criteria they are categorised into a group. Basically it enables you to prove that you’re part of a certain reputable group without revealing your identity.
This can also be used with RLN to enable rate limiting, with InterRep taking the place of the cryptoeconomic registry fee. Once you’re part of a reputable InterRep group you can chat in the app, but if you spam you risk your identity commitment being revealed and being banned from the chat app permanently (your social identity profile, i.e. your Twitter account, won’t be revealed, but you won’t be able to register again with the same identity commitment associated with that Twitter account).
The slashing conditions here are less severe.

We’re working on such an anonymous chat application, which uses RLN and InterRep for spam protection.
Although the architecture and purposes of Status are a bit different, I think that the same concepts can apply to Status.

In my opinion, cryptoeconomic stake or social stake approaches can be used much sooner. There is a lot of effort put into these technologies, and I think the Status app can greatly benefit from using some of them.