Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

R30 stats #3041

Merged
merged 12 commits into from
Apr 5, 2018
Merged

R30 stats #3041

merged 12 commits into from
Apr 5, 2018

Conversation

neilisfragile
Copy link
Contributor

create phone r30 stats

Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I forgot to mention you'll need to bump the schema number in synapse/storage/prepare_database.py

Otherwise, looks good

return count

ret = yield self.runInteraction("count_r30_users", _count_r30_users)
defer.returnValue(ret)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, you can just return self.runInteraction(..) and not annotate with @defer.inlineCallbacks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@neilisfragile
Copy link
Contributor Author

@erikjohnston I added in support for multi client - the filters were taken from our existing HA proxy scripts, though if you have any thoughts on how to improve much appreciated - also I ducked the issue for sqlite and just return the total

@erikjohnston
Copy link
Member

A query that works across both sqlite and postgres is:

SELECT platform, COALESCE(count(*), 0) FROM (
    SELECT users.name, platform, users.creation_ts * 1000, MAX(uip.last_seen)
    FROM users
    INNER JOIN (
        SELECT 
        user_id,
        last_seen,
        CASE
            WHEN user_agent LIKE '%Android%' THEN 'Android'
            WHEN user_agent LIKE '%iOS%' THEN 'iOS'
            WHEN user_agent LIKE '%Electron%' THEN 'Electron'
            WHEN user_agent LIKE '%Mozilla%' THEN 'Web'
            WHEN user_agent LIKE '%Gecko%' THEN 'Web'
            ELSE 'Unkown'
        END
        AS platform
        FROM user_ips
    ) uip
    ON users.name = uip.user_id
    AND appservice_id is NULL
    AND users.creation_ts < 1519638838
    AND uip.last_seen/1000 > 1519638838
    AND (uip.last_seen/1000) - users.creation_ts > 86400 * 30
    GROUP BY users.name, platform, users.creation_ts
) u GROUP BY platform;

Note that this doesn't search for Riot

Use iteritems over item to loop over dict
formatting
@erikjohnston
Copy link
Member

(Sytest failure looks unrelated)

@neilisfragile neilisfragile merged commit 68b0ee4 into develop Apr 5, 2018
Counts the number of 30 day retained users, defined as:-
* Users who have created their accounts more than 30 days
* Where last seen at most 30 days ago
* Where account creation and last_seen are > 30 days
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should that be "> 30 days apart"?

def count_r30_users(self):
"""
Counts the number of 30 day retained users, defined as:-
* Users who have created their accounts more than 30 days
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"more than 30 days ago"?

@ara4n
Copy link
Member

ara4n commented Apr 9, 2018

I missed this at the time (sorry), but have highlighted some typos (i think?) in the comments.

This is certainly a major improvement, although I wonder if we want to get synapses to maintain aggregate stats themselves: i.e. have a table which tracks the number of requests a given access_token does on a given day. This would then let us perform much more accurate retention analyses - i.e. by idling out users who haven't used the app after N days, and avoiding The Kink that caused all the confusion back in March (which turned out to be due to christmas being averaged out over 30 days, combined with weird artefacts on not idling out users)

@neilisfragile
Copy link
Contributor Author

Absolutely, this was just to get instantaneous R30 out quickly, since I can avoid the kink problem if the stat is generated there and then.

@neilisfragile neilisfragile mentioned this pull request Apr 10, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants