"Waku-based" Analytics

Half the money I spend on advertising is wasted; the trouble is I don’t know which half. - John Wanamaker

Some say that nowadays even more than 50% of the advertising spend is wasted. Is that the case with Status? I don’t think so. I would say we are relatively frugal and all our paid efforts seem to be very cost-effective… but there are still some challenges we need to overcome. One of these challenges was the inspiration for this thread.

Anyone attending our bi-weekly Metrics Review meeting (every other Friday) has seen me presenting charts like this one:

What it shows is: volume of New Peers (orange/bottom column) in proportion (purple double-line) to the volume of Mobile App Installs (the whole column - the top part of it represents App Installs that did not manifest as New Peers).

(Red area is when the Spam Attack happened)

What does it mean? It means that (in the last 30 days), every day, only about 25% of the new Mobile App Installs manifested as New Peers. It means that we “lose” up to 75% of potential New Users… every day.

How? We don’t actually know (although we could!). Some of them install the app, but never open it - that’s fine, it happens. Some open the app and go into the on-boarding process. Many don’t get through. How many? Where (what step) is the highest drop-off?

We don’t know… but I think we could. I think we could get that data for 100% of our users, anonymously => without even asking for their consent.

How?

Temporary (Analytics) Account

What if we sent requests (or even messages to a data-gathering peer) during the on-boarding process using a temporary account (created upon the very first app launch, with a random password, and deleted/replaced with the account actually created by the user in the onboarding process)?

With messages sent for the 1st/2nd/3rd on-boarding step - we could easily calculate % of New Peers at each step => drop-off for each step => we would know where to start on-boarding optimization.
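As a rough illustration of that calculation, here is a minimal Python sketch. It assumes we already have one aggregate counter per on-boarding step; the step names and the counts are made up for the example and are not real Status data.

```python
def funnel_report(step_counts):
    """Return (step, count, share_of_first_step, drop_off_vs_previous_step).

    step_counts is an ordered list of (step_name, count) pairs,
    starting with the top of the funnel.
    """
    report = []
    first = step_counts[0][1]
    previous = first
    for name, count in step_counts:
        share = count / first if first else 0.0
        drop_off = 1 - count / previous if previous else 0.0
        report.append((name, count, share, drop_off))
        previous = count
    return report


# Hypothetical numbers, for illustration only.
example_counts = [
    ("app installed", 1000),
    ("onboarding step 1", 800),
    ("onboarding step 2", 500),
    ("onboarding finished", 250),
]

report = funnel_report(example_counts)
for name, count, share, drop_off in report:
    print(f"{name}: {count} ({share:.0%} of installs, {drop_off:.1%} drop-off)")
```

The step with the largest drop-off value is the natural place to start optimizing.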

There should be no privacy concerns: such an account would be temporary (and deleted) = not tied in any way to the device or the user…
…which should allow us to gather data for every user (no risk of gathering data only for biased user segments with need for opt-in).

It would enable us to do proper on-boarding A/B testing… and grow even faster:

if they are anonymous messages in the network, not tied to the user or device, and we cannot track data such as IP that could be used to track the user in the future, and the network cannot distinguish these messages from other messages, it’s acceptable.

good thinking :slight_smile:

My point exactly :slight_smile:

@cammellos, @hester - wdyt?

Yes, that’s what we envisioned for the waku based analytics. @shivekkhurana initially suggested the idea, and it is captured here: Add reporting of anonymous metrics · Issue #16 · status-im/status · GitHub. I believe there are some designs already that @hester worked on.

Basically it would be opt-in, and we’d generate a keypair per installation, from which we’d send anonymous metrics; which ones are yet to be defined.

This is definitely something we badly need to understand the impact of features on our user base, and it trades little in terms of privacy, being opt-in and anonymous.

Here is the catch :slight_smile: I created this thread to discuss this particular assumption - why opt-in? :slight_smile: It’s not only unnecessary, but it will cause extra problems.

  1. Since we are unable (and won’t be ever able) to identify the user there is no need for the user’s consent.

  2. What’s blocking our growth right now is the on-boarding completion rate (as shown above).

Here is the deal:

  • Only a fraction of users manage to get through the on-boarding process => The vast majority of users don’t get through the on-boarding process.

  • Only a fraction of users will agree to get tracked (even though it isn’t tracking really - which I’ll get to later)

=> The vast majority of users won’t accept to get tracked

I bet that the group with the “tracking consent” will mostly consist of users having no problems in getting through the on-boarding process - it’s going to be a biased group.

In that sense, the data (based on their behavior) will be utterly useless.

Not to mention it might be difficult to hit any kind of statistical significance with data collected for a fraction of a fraction of users (even though it’s the top of the app’s funnel).

What we need to understand is the behavior (and most importantly conversion blockers) for the other group (that ~ 70% of users who find our onboarding process too problematic). With opt-in we most likely won’t get any data for that group.

Now, why this isn’t tracking, i.e. why in my opinion there is no need for consent (in fact, I think that asking for it might be unnecessarily confusing to people - making everything worse):

We are not creating users’ profiles containing data on their in-app behavior. We are not secretly spying on anyone. It’s not user tracking because the subject of this process isn’t the user… but steps, on-boarding process steps. It’s about making measurements per step, not per user.

If in the above-the-fold part of the app we had a different image for every step of the on-boarding process (an image that would have to be downloaded from our server) - we wouldn’t call it tracking, right? Yet, the results (in terms of the numbers we need) would be the same.

Imagine it’s not an app, but a tourist guide that gets paid only for the tourists that get through the whole tour (consisting of a few steps).

What this tourist guide knows already is:

a) how many tourists begin the tour (Google Play App Install data)

b) how many get to the end of the tour (New Peers)

c) the ratio of the two, equal to 25%

My argument is: we don’t have to ask for permission to count tourists on each step… as we don’t wish to process this data as:

  • John Doe (Step 1, Step 2, Step 3, bounce)

but as:

  • Step 1 (100 Users), Step 2 (89 Users), Step 3 (30 Users)
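To make the per-step (not per-user) framing concrete, here is a minimal Python sketch. The event shape is a hypothetical assumption: each event carries only the name of a step, so there is nothing to group by except the step itself. The counts reuse the example figures above.

```python
from collections import Counter


def count_steps(events):
    """Aggregate events per step. Events have no user, device, or session field."""
    return Counter(event["step"] for event in events)


# Hypothetical events matching the example figures above.
events = (
    [{"step": "Step 1"}] * 100
    + [{"step": "Step 2"}] * 89
    + [{"step": "Step 3"}] * 30
)

totals = count_steps(events)
print(dict(totals))
```

With this shape it is not possible to reconstruct a sequence like "John Doe (Step 1, Step 2, Step 3, bounce)", because no field links one event to another.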

In case of doubt, a soft opt-in could be to only include anonymous metrics in a build in Open testing on the Play store, i.e. people opt in prior to install, as Beta testers for the app on the Play store. I’d promote this route in spite of some noted downsides below, as I believe we can get sufficient results faster by avoiding potentially lengthy discussions on what’s acceptable in the main production release.

Downsides:

  • There will be a cohort bias in people who decide to join the Beta program
  • There will be a bias in prior knowledge as we’d need to run campaigns to invite people into the Beta program (they’d go through something like this) https://play.google.com/apps/testing/im.status.ethereum

A beta track with metrics can run alongside a qualitative study (moderated and unmoderated) to understand why people drop off and point to how the flow can be improved (as opposed to the when, where, and how many that we can learn from metrics).

I agree re:Onboarding, we should make conversion metrics mandatory, because:
1- It’s a critical path (most money spent) and
2- The user is not a customer yet (in the real world, it’s like having a camera on our front door. We record everyone regardless of their permission, because we need to safeguard our entrance)

But I suggest showing a pop-up after on-boarding is complete. Continuing the real life analogy, this is akin to letting users know that this property records everything, but your faces will be irreversibly blurred on the tape, so your actions cannot be traced back to you.

The user should know that their actions are being recorded anonymously, perhaps even have a public dashboard where everyone can see anonymous stats. I agree with @AK47’s reasoning that the metrics only measure funnel conversion and cannot be traced back to the user. But the term “metrics” is viewed negatively in general.

=> The vast majority of users won’t accept to get tracked

@hester mentioned somewhere on Discord that 80% of Metamask users opt-in to analytics (which are not even private). If we get a similar opt-in ratio, we will have enough data to figure out drop-offs.

If a majority of users deny tracking, we can make it mandatory in the subsequent iterations.

I’d prefer to inform users that we have anonymous tracking to improve the app. The alternative is tracking them anonymously but without their knowledge, which makes it sound shady.

I personally find requests to opt-in to some form of data collection/analytics to be a bit off-putting. I don’t think it vibes with Status and it feels very weird for a privacy-focused messenger to even consider that. Imagine the blowback Moxie would get on Twitter if Signal added some dialogue option that asked people to opt-in to some analytics?

In this case, the proposed analytics themselves are non-invasive and not much of a concern from a privacy perspective, but simply the act of asking that question would likely have a negative impact.

Relevant:

My 2p is that we don’t have to settle or commit to a strategy at this stage.
There’s broad agreement that we need to have metrics, and it’s already actionable.

We can start from the safest approach (opt-in) or beta only, and take it from there (opt-out), and eventually make it mandatory if we think it is a good idea.

We can always evaluate how much data we get through and see how many people are opting in (roughly) and make only the necessary concessions.

@cyanlemons’ point about the image of collecting metrics is valid, and should be considered.
I would not be concerned myself if it were explained that the metrics are anonymous and use exactly the same system we use to protect user metadata when sending messages, but I am not a representative user, and marketing will have a better understanding of the impact on our brand if we do so.

Having discussed this topic with Alek before, seems like there is a bit of confusion regarding his goal/aim, and AFAIU two different initiatives are being mixed in this discussion. I’ll try to summarize:

A. On-boarding flow completion rate (what Alek wants to understand & improve):

App installed → onboarding step 1 → onboarding step n → on-boarding finished

Goal:

  1. Understand, in aggregate, how many people that download the app finish the on-boarding flow (on a % basis)
  2. Come up with ideas/initiatives to improve the on-boarding flow completion rate

Since an account has not been created at this stage yet, no PII data is required to gather this information whatsoever, and the process is 100% private & anonymous, Alek argues that no opt-in is needed.

B. Post-onboarding metrics (what Cammellos & Hester have already been working on):

User finishes on-boarding → user is asked to opt-in to sharing anonymous usage data → anonymous in-app user actions are collected

Goal:

  1. Understand, in aggregate, how users use our app (and/or take specific actions) after they have finished on-boarding
  2. Identify problems/opportunities in terms of usage patterns & come up with solutions to increase activity/retention

Gathering/collecting user in-app actions would indeed require the user’s explicit permission (the opt-in).

@AK47: did I get it right?

In my view there’s no distinction between the two (there’s only a technical distinction which we won’t go through at this stage).

Metrics would be collected from the onboarding stage and during account usage, not only after account creation (though we might start with after account creation for simplicity).

@hester

We are getting approximately 6000+ App Installations a day (from all over the world). We had 198 994 (“unbiased”) App Installations last month.

Since onboarding is a process every user has to go through, can we optimize it based on data gathered for a small, biased cohort? Isn’t there a huge risk of unjustified extrapolation?

@cyanlemons - 100%. Not even for a split second should we make people wonder (doubt) whether Status is really as private and secure as we say.

@cammellos I expect it to have a negative impact on our brand. We won’t be able to explain all the nuances to hundreds of thousands of users (which is how many app installs we got this year).

@cammellos - opt-in doesn’t seem to be safe (previous quote) and only beta may be insufficient (in which case we will fall back to the initial idea)

@simonam - exactly… but in fact I would go even a step further:

With fully-anonymized (which is the case) counting (I don’t call it tracking, because it’s user-agnostic data collection) there is no need for consent = no need to ask for permission, as it wouldn’t be connected to the user in any way.

@simonam yes sir, well put :slight_smile:

@cammellos So it seems that onboarding metrics (before account creation) aren’t the main priority right now. I’m sorry to bug you, but could you help me understand the logic behind it? :slight_smile: I don’t get what I’m missing here.

  1. DAUs is our main goal right now (I think we all agree on that)

  2. 70-75% drop-off rate is what we are seeing on the onboarding process. Nowhere else (in the app) do we lose as many users. No other optimization or fix will increase DAUs as quickly and effectively as fixing the onboarding process.
    [We do have issues with retention, but that’s another story. With the intensity of our campaigns as they are right now - we will still be getting as many app installs, but with a higher conversion rate to users => DAU increase].

This Sunday we had 5967 App Installations on Android yet only 1265 New Peers (~21%*). Why not start gathering data for onboarding since this is such a pressing problem:

  1. It increases the cost of the on-boarded user.
    The average cost of installation (CPI) in Google Ads hovers around 0.20 USD, but with a 20-25% conversion rate (on the on-boarding process) that is 0.80-1.00 USD per onboarded user.
    If the conversion rate was 50% - we could afford two users for the price of one. It would automatically enable us to spend even more on our campaigns (so not only would we get more users for the same budget, but we could also spend more).

  2. More on-boarded users = more DAUs = more data for after account creation metrics.
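The arithmetic behind point 1 can be sketched in a few lines of Python: the cost per onboarded user is the CPI divided by the onboarding conversion rate. The 0.20 USD CPI comes from the figures above; the 1000 USD budget is an arbitrary number chosen only for illustration.

```python
def cost_per_onboarded_user(cpi_usd, conversion_rate):
    """Cost of one user who completes onboarding, given cost per install."""
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    return cpi_usd / conversion_rate


def onboarded_users(budget_usd, cpi_usd, conversion_rate):
    """Number of onboarded users a given ad budget buys."""
    return budget_usd / cpi_usd * conversion_rate


CPI_USD = 0.20  # approximate Google Ads CPI quoted above

for rate in (0.20, 0.25, 0.50):
    cost = cost_per_onboarded_user(CPI_USD, rate)
    users = onboarded_users(1000, CPI_USD, rate)  # 1000 USD is a made-up budget
    print(f"conversion {rate:.0%}: {cost:.2f} USD per onboarded user, "
          f"{users:.0f} users per 1000 USD")
```

At a 0.20 USD CPI this gives 1.00 USD per onboarded user at 20% conversion, 0.80 USD at 25%, and 0.40 USD at 50%, so doubling the conversion rate halves the cost per onboarded user.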

I cannot see anything more important (or urgent) than fixing onboarding (which requires having reliable data).

What am I not seeing here? :slight_smile:

@cammellos So it seems that onboarding metrics (before account creation) aren’t the main priority right now. I’m sorry to bug you, but could you help me understand the logic behind it? :slight_smile: I don’t get what I’m missing here.

You are missing that you are reading things I have not said :slight_smile:

“Metrics would be collected from the onboarding stage and during account usage, not only after account creation (though we might start with after account creation for simplicity).”

This does not mean that one is less important than the other; it just means that it is much simpler to implement one than the other. Implementing B will get 80% of A implemented with little extra effort, and it reduces the high unknowns of implementing A, which is a significant departure, from a technical perspective, from how we do things (a node is only started after login; in case A it needs to be started immediately).

This is just the way we reduce risk when implementing things: we break things down into meaningful chunks if they are too big (A) so that we can get a solid foundation to build on. It’s not a reflection of priorities.

opt-in doesn’t seem to be safe (previous quote) and only beta may be insufficient (in which case we will fall back to the initial idea)

I strongly disagree. As a user of a privacy focused app, I would not be very happy that it shares information about my usage (albeit anonymously) without my consent. The fact that the user is not even informed about this seems like a really bad image.
My point of reference is Signal, if Signal does not do it, I don’t see why we should, we should be at least as good as them (i.e ask users).

Although I entirely sympathize with @AK47’s desire to improve our marketing decisions, and in turn increase Status’ adoption, I do agree with @cammellos here.

Pretty much every corporation in existence has beaten the “anonymized analytics” term to death. Those words do not mean what they are supposed to mean. All sorts of privacy violations have occurred at the hands of supposedly anonymized analytics.

In our case, to the best of my knowledge, these analytics are actually anonymized as they directly depend upon Waku’s privacy guarantees. But that’s too much technical detail to try and convince any average user. I think we’d need to tell a user if we do any type of tracking just on principle, even if the tracking provides genuine, provable anonymity.

But if we do ask for consent, it’s bad optics. I don’t think there is a way to do this without it having a negative impact on the Status brand. The best bet at onboarding analytics we can get is just measuring the number of people that respond to the Status Bot, and at which steps. For instance, if the average response rate is 30%, but then suddenly drops to 5% at a specific question, we know that people aren’t responding well to that Status Bot question and can act accordingly. Other than that, I don’t think we can do this type of anonymized analytics.
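A minimal Python sketch of that Status Bot idea: compare each question's response rate with the typical rate and flag the ones that fall far below it. The question names, the rates, and the "less than half of the median" rule are all assumptions made for the example.

```python
from statistics import median


def flag_weak_questions(response_rates, ratio=0.5):
    """Return questions whose response rate is below ratio * median rate."""
    baseline = median(response_rates.values())
    return [
        question
        for question, rate in response_rates.items()
        if rate < baseline * ratio
    ]


# Hypothetical aggregate response rates per bot question.
rates = {
    "question 1": 0.31,
    "question 2": 0.29,
    "question 3": 0.05,
    "question 4": 0.30,
}

weak = flag_weak_questions(rates)
print(weak)
```

The median is used as the baseline so that one very weak question does not drag the reference rate down with it.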

I wonder whether we could do something about it (in terms of how we are going to communicate it), because it’s not really tracking per se.

As you said, “anonymized analytics” has been beaten to death, but the point I’m trying to make is this: it’s not the user that is the subject of this process (which we call tracking here) but the (onboarding) step(s).

Let’s imagine a staircase, with each step connected through a wire to one unique bulb. Whenever you stand on a step, the bulb (for that step) lights up. Counting the number of times each bulb lights up isn’t tracking. It’s just counting (and “Waku-based” Analytics is just a set of wires and bulbs).

I agree that the meaning of tracking / analytics and especially anonymous analytics has been distorted. The problem is that it’s seriously limiting our growth.

I started this thread hoping we could find some clever workaround (with this thread also becoming a public record of our thinking process, our motivations and doubts).

I suspect that the solution would require collaboration between marketing (communication, selling the idea) and product (technical side).

Oh, that’s a given, but unfortunately it won’t solve (it won’t even slightly touch) the main problem.

The main problem (as mentioned above) is the onboarding process - these few steps before you get to interact with anyone (including bots).

That’s where we lose 70-75% of users. ChatBOT ( + bots in general + several other tactics) are about retention.

I see it this way:

  1. We invite our users with our ads.
  2. They accept the invitation by installing the app.
  3. They enter the community by going through the onboarding (corridor).
  4. They decide to stay (that’s retention).

70-75% of users get stuck in the corridor. They never get to see what’s inside (what’s past the corridor = they don’t get the chance to interact with anyone or anything).

Definitely it’s bad optics. Still, I don’t think we should give up.

What we know is that:
a) people do not understand the word “tracking”
b) people do not trust anonymous data collection / anonymous analytics
c) we need data to improve our product (in terms of activation and retention)
d) we don’t let ourselves collect any valuable data [in the name of something we aren’t really sure about - our conviction that the crypto community (our current main target) cares about privacy]

We are playing by rules which are highly disadvantageous to us.

Why not work around it (or at least challenge the assumptions we have been making)?

What if we made these records (step counters) publicly available, by analogy to metrics.status.im, to let users see what kind of data we collect? (Btw, from the user’s perspective, how different really is Peers counting from Steps “counting”?)

After all, our users have no problem whatsoever with their activity being tracked (and their addresses exposed) on blockchain(s). Why would they oppose us counting user-agnostic light flashes?

What if we made Status users perceive our steps counting similarly?

Similarly not literally. I don’t exactly know how, that’s why we’ve got this thread.

Twitter poll on whether or not Status should collect opt-in anonymized usage data: https://twitter.com/ethstatus/status/1339245853798768642.

Twitter poll on whether or not Status should collect opt-in anonymized usage data: https://twitter.com/ethstatus/status/1339245853798768642.

So, less than 50% of people were confident in saying “Yes, that’s fine.” This is a bit biased, because “show results” isn’t exactly the same as “undecided,” but it’s close enough that I don’t think it would be a good move to add an option to provide usage data. I think it would damage our credibility in terms of our security and privacy as both a messenger and wallet if we were to move forward on this.

Instead, I think the best route for acquiring this very important data will be a separate beta build of Status in both the App Store and Play Store. It could be titled “Status BETA with usage tracking” and the description would say it’s a beta build that includes usage tracking to assist our developers in making more informed decisions. This way, we can entirely compile out any tracking code in the default builds (not with a runtime option, only a compile-time option), while keeping it enabled in the beta builds.

The users of beta builds will inherently be more technically inclined, so to account for this, there could be a self-reported metric for how technically apt a user is from 1 to 10. This way we could, perhaps, exclude users that are an 8 and higher from our analysis of the data.

it’s acceptable, no opt-in, no beta, pls

i don’t think this poll has much meaning, it has few participants and we can’t guarantee its accuracy. btw, it’s a wonderful discussion, i’m curious who the judge is that can make the final decision? :thinking:

This is exactly it: in web 2.0 you have “private” opt-in analytics, but the reason they are asking is that there is still a trust issue; the user is consenting to them collecting identifying information.

What is “private” in web 2.0 is actually a dark anti-pattern for us, that we can run the risk of feeding into and making more socially acceptable if we don’t simultaneously re-educate the market when asking them.

Take the Vivaldi Browser, for example: they claim they have 0 tracking and are “private”. They even say "We Don’t Track You". But then, if you read their Privacy Policy:

each installation profile is assigned a unique user ID that is stored on your computer. Vivaldi will send a message using HTTPS directly to our servers located in Iceland every 24 hours containing this ID, version, cpu architecture, screen resolution and time since last message. We anonymize the IP address of Vivaldi users by removing the last octet of the IP address from your Vivaldi client then we store the resolved approximate location after using a local geoip lookup. The purpose of this collection is to determine the total number of active users and their geographical distribution.

What’s worse is that they go on to say their revenue partners make use of cookies on you.

In other words, “we collect your identifying data, but don’t worry, trust us, we’ll take care of it”. In our case, using Waku, we have substantially higher, independently verifiable assurances that we cannot track the user.
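For reference, removing the last octet of an IPv4 address, as described in the quoted policy, can be sketched in Python with the standard `ipaddress` module. This is only an illustration of the general technique, not Vivaldi's actual code, and the address used is a reserved documentation address.

```python
import ipaddress


def drop_last_octet(ip):
    """Zero the last octet of an IPv4 address (i.e. keep only the /24 network)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def remaining_candidates(ip):
    """How many addresses share the same truncated value."""
    return ipaddress.ip_network(f"{ip}/24", strict=False).num_addresses


example_ip = "203.0.113.57"  # reserved documentation range, not a real user
print(drop_last_octet(example_ip))
print(remaining_candidates(example_ip))
```

The truncated value still narrows the sender down to one of 256 addresses, and it is sent alongside a stable per-installation ID.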
