-
-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Group submission access logs #5113
Conversation
5cc2fbc
to
039b7ee
Compare
643d2c2
to
8483c46
Compare
BTW the formatting should pass once #5119 is merged |
# for submissions, the group key is hour created + user_uid | ||
# this enables us to group submissions by user by hour | ||
When( | ||
metadata__auth_type=ACCESS_LOG_SUBMISSION_AUTH_TYPE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, we may need an index metadata__auth_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not doing any filtering by auth type with this query. It ends up in postgres as
SELECT "auth_user"."username",
kpi-1 | "audit_log_auditlog"."object_id",
kpi-1 | "audit_log_auditlog"."user_uid",
kpi-1 | CASE WHEN ("audit_log_auditlog"."metadata" -> 'auth_type') = '"submission"'::jsonb THEN CONCAT(((DATE_TRUNC('hour', "audit_log_auditlog"."date_created" AT TIME ZONE 'UTC'))::varchar)::text, ("audit_log_auditlog"."user_uid")::text)
kpi-1 | ELSE ("audit_log_auditlog"."id")::varchar
kpi-1 | END AS "group_key",
kpi-1 | COUNT("audit_log_auditlog"."id") AS "count",
kpi-1 | CASE WHEN ("audit_log_auditlog"."metadata" -> 'auth_type') = '"submission"'::jsonb THEN '{"auth_type": "submission-group"}'::jsonb
kpi-1 | ELSE "audit_log_auditlog"."metadata"
kpi-1 | END AS "metadata",
kpi-1 | MIN("audit_log_auditlog"."date_created") AS "date_created"
kpi-1 | FROM "audit_log_auditlog"
kpi-1 | LEFT OUTER JOIN "auth_user"
kpi-1 | ON ("audit_log_auditlog"."user_id" = "auth_user"."id")
kpi-1 | WHERE ("audit_log_auditlog"."log_type" = 'access' AND "audit_log_auditlog"."user_uid" = 'ubVyXzUy7S9VdR2PgKkbdR')
kpi-1 | GROUP BY "auth_user"."username",
kpi-1 | "audit_log_auditlog"."object_id",
kpi-1 | "audit_log_auditlog"."user_uid",
kpi-1 | 4,
kpi-1 | 6
kpi-1 | ORDER BY 7 DESC
kpi-1 | LIMIT 100
I don't think an index will do anything for us.
kpi/fields/username_hyperlinked.py
Outdated
from rest_framework import serializers | ||
|
||
|
||
class UsernameHyperlinkField(serializers.HyperlinkedRelatedField): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename the class (and the file) to make it obvious it is expecting a dict.
Otherwise, we are using RelativePrefixHyperlinkedRelatedField
everywhere else.
What about changing the its get_url
method to be more permissive with obj
. E.g.:
class RelativePrefixHyperlinkedRelatedField(HyperlinkedRelatedField):
...
def get_url(self, obj, view_name, request, format):
if isinstance(obj, str):
value = obj
else:
value = getattr(obj, self.lookup_field, None)
return self.reverse(
view_name, kwargs={self.lookup_url_kwarg: value}, request=request, format=format
)
In the serializer, use RelativePrefixHyperlinkedRelatedField
instead of UsernameHyperlinkField
user = RelativePrefixHyperlinkedRelatedField(
view_name='user-kpi-detail',
source='user__username',
queryset=get_user_model().objects.all(),
style={'base_template': 'input.html'} # Render as a simple text box
)
Last option, keep UsernameHyperlinkField
as-is (no rename), but still adapt get_url
to support User object as described ⬆️
Serializers could call user = UsernameHyperlinkField(...)
all time.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of updating RelativePrefix... to allow it to take strings. Updating
AccessLog.objects.create(user=user1, date_created=today) | ||
AccessLog.objects.create(user=user1, date_created=yesterday) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about switching both to be sure they are created in the different order we are expecting at the end?
'kpi.authentication.OPOAuth2Authentication.authenticate', | ||
'kobo.apps.openrosa.libs.authentication.BasicAuthentication.authenticate', | ||
) | ||
def test_any_auth_for_submissions(self, authetication_method): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo? authentication_method
?
'kobo.apps.openrosa.apps.api.viewsets.xform_submission_api.XFormSubmissionApi.create', # noqa | ||
return_value=HttpResponse(status=200), | ||
): | ||
# try both v1 and v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenRosa and v1 API ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? Isn't v1 just the openrosa endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rgraber nope, they are two different things.
OpenRosa
API implements three endpoints:
/formList
/manifest
/submission
That's what Enketo and Collect are using to POST
data. They are expecting these routes.
/api/v1/submissions
is a DRF endpoint that users could use to POST
data via API.
IIRC, the former was calling a legacy django view before. It was two different part of code (which achieved the same thing). But now, they do call the same code
/<username>/submission kobo.apps.openrosa.apps.api.viewsets.xform_submission_api.XFormSubmissionApi submissions
/api/v1/submissions kobo.apps.openrosa.apps.api.viewsets.xform_submission_api.XFormSubmissionApi submissions-list
/submission kobo.apps.openrosa.apps.api.viewsets.xform_submission_api.XFormSubmissionApi submissions
Having that said, none of them are v2
. You cannot POST
to v2
.
return_value=HttpResponse(status=200), | ||
): | ||
# check v1 and v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenRosa and v1 API ;-)
Checklist
📣Description
Updates the access logs endpoints to return submissions grouped together by hour to avoid floods of individual submission logs. Also fixes a bug where submissions to the v1 endpoint were not being counted as submissions.
Notes
Adds new queries to the AccessLogManager for returning access logs with submissions grouped together. The submissions are grouped by truncating the date created to the hour and combining it with the user_uid and using it as a grouping key. For non submissions, the group key is just the submission id, which is unique, and therefore none of them will be grouped together. This method results in a new group of submissions for every user every hour. For example, if user1 makes 3 submissions at 11:59 for and 3 submissions at 12:01, it will be reported as 2 different submission groups. 3 normal logins between 11:00 and 11:15 will still be reported as 3 separate events.
This requires changes to the serializer. It will now only return the minimum required fields (leaving out log_type, action, model_type, and app_name, all of which are the same for every access log).
Also fixes a bug where submissions to the v1 endpoint (XFormSubmissionApi) were not being recorded as such. To do this, it was necessary to replace the default token and basic authentication classes with the wrapped versions from kpi.authentication that have the call to create a new audit log.
Testing
Use mockobo to submit multiple submissions at once. Check
api/v2/access-logs
andapi/v2/access-logs/all
to make sure the submissions are grouped together in the response.You can also create multiple submissions in the UI. If you create them within the same hour of the day, they should be grouped together.
Related issues
Replaces #5094