Specify the audit log CSV format #234
Thank you!
Unless you are really eager, I think the 3rd sub-spec would be a good time! :)
I came to an unfortunate realization at getodk/collect#3126. The way the XPath paths are written out by Collect is idiosyncratic: group names are always followed by a position predicate ( Those paths are not incorrect so it doesn't matter for analysis tools that actually speak XPath. We do know that folks are writing their own analysis tools or doing quick visual checks, though, and for them it's important that the exact node names be consistent, I think. And ideally, that would include consistency between clients. I see a couple of options:
@MartijnR, do you have a strong preference one way or the other? A different option I'm missing?
That sounds intriguing and surprising. Maybe I don't understand. But in any case, my preference would be to not use position predicates except for repeated groups. For repeated groups it might be helpful to always include positions (even if there is only 1 instance).
Actually, this brings up another point that we had to resolve in OpenClinica's fork (though not related to logs for them). You can remove a repeat instance (e.g. the second) and then later create a new second instance, or the already existing third instance becomes the second. To deal with this (in audit log data) we added an immutable
Indeed. Imagine that you have a method that outputs paths with position predicates for every node and then that you have a client that does navigation with page flips. It looks like the position predicates were stripped off nodes that correspond to pages -- questions or field lists. I'm kicking myself for somehow not noticing.
Yes, I agree.
Agreed.
I think we had considered the shifting repeat positions not worth a special case but it's true that it's now much more difficult to do things like calculate how much time was spent in a specific repeat instance or track answer changes in a specific repeat instance.
Can you say a bit more about this? Do you mean Enketo core keeps counters of repeat instances over the lifetime of a record and includes an ID from that count in the log?
It's like this:
<repeat enk:last-used-ordinal="4" enk:ordinal="1">
...
</repeat>
<repeat enk:ordinal="3">
...
</repeat>
The ordinals are 1-based and are stored in the model/submission, so loading a submitted record for editing maintains the state. In the above example 4
P.S. OC has a higher-than-normal need for this, because each new value or new repeat is immediately submitted to the server as a fragment (and not the final complete record). I wonder if that is also meant to be trackable in the audit logs (all the intermediate states before submission).
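For anyone following along, here is a rough sketch of how I read the ordinal scheme described above. It is not Enketo's actual implementation: the class name `RepeatSeries` and its methods are hypothetical, and it assumes that an ordinal is never reused once assigned, which is my interpretation of the `enk:last-used-ordinal` attribute.

```python
class RepeatSeries:
    """Hypothetical model of one repeat series with immutable, 1-based ordinals."""

    def __init__(self):
        # Highest ordinal ever handed out (assumed meaning of enk:last-used-ordinal).
        self.last_used_ordinal = 0
        # Ordinals of the instances that currently exist, in document order.
        self.ordinals = []

    def add(self):
        """Create a new instance at the end and return its ordinal."""
        self.last_used_ordinal += 1
        self.ordinals.append(self.last_used_ordinal)
        return self.last_used_ordinal

    def remove(self, ordinal):
        """Remove the instance with the given ordinal; other ordinals are unchanged."""
        self.ordinals.remove(ordinal)

    def position_of(self, ordinal):
        """Return the current 1-based position, which can shift after removals."""
        return self.ordinals.index(ordinal) + 1


series = RepeatSeries()
for _ in range(4):
    series.add()
series.remove(2)
series.remove(4)
# The state now matches the XML example: ordinals 1 and 3 remain, last used is 4.
```

The point of the sketch is that the position of the instance with ordinal 3 changes from 3 to 2 after the removals, while its ordinal stays the same, so a log keyed on ordinals could still attribute events to a specific instance.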
@MartijnR and I discussed this briefly in Slack and agreed to always include positions for repeats. This avoids the case where the first repeat instance is initially logged as
Filed #248
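To make the agreed convention concrete (no position predicates on ordinary groups or questions, always a position on repeats), here is a minimal sketch of how a downstream tool might normalize paths from different clients. The function `normalize_path` and the `repeat_names` argument are hypothetical: the sketch assumes the tool already knows which element names are repeats (for example from the form definition) and that paths contain only simple numeric predicates.

```python
import re

# One path step: an element name optionally followed by a numeric position predicate.
_STEP = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<pos>\d+)\])?$")


def normalize_path(path, repeat_names):
    """Strip positions from non-repeat steps and ensure every repeat step has one.

    `repeat_names` is an assumed, caller-supplied set of element names that are repeats.
    """
    normalized = []
    for step in (s for s in path.split("/") if s):
        match = _STEP.match(step)
        if match is None:
            raise ValueError(f"unsupported path step: {step!r}")
        name = match.group("name")
        if name in repeat_names:
            # Repeats always carry a position; a missing one means the first instance.
            normalized.append(f"{name}[{match.group('pos') or '1'}]")
        else:
            # Groups and questions never carry a position.
            normalized.append(name)
    return "/" + "/".join(normalized)
```

For example, `normalize_path("/data/group[1]/people/name[1]", {"people"})` gives `/data/group/people[1]/name`, so logs from a client that writes predicates everywhere and one that writes none can be compared by plain string matching.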
This spec intends to provide sufficient information for downstream tools to process client audit logs and for additional clients to create compatible logs.
I considered linking to https://tools.ietf.org/html/rfc4180 but it prescribes CRLF linebreaks and is more verbose than I think we need. I also ended up omitting "each line should contain the same number of fields throughout the file" because it seems easy enough for downstream tools to end a row on linebreak. Should it be included?
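As a sanity check on the "end a row on linebreak" reasoning, here is a minimal sketch of a tolerant reader using Python's standard `csv` module. The helper `read_audit_rows` and the sample column names are illustrative assumptions, not taken from the spec text; the sketch pads short rows with empty strings so rows with fewer fields than the header are still usable.

```python
import csv
import io


def read_audit_rows(text):
    """Parse audit log CSV text into dicts, tolerating rows shorter than the header."""
    # newline="" leaves line endings untouched so the csv module can handle LF or CRLF.
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # Pad missing trailing fields with empty strings.
        padded = fields + [""] * (len(header) - len(fields))
        rows.append(dict(zip(header, padded)))
    return rows


# Hypothetical sample: the last row omits its trailing empty field.
sample = (
    "event,node,start,end\n"
    "form start,,1550615022663,\n"
    "question,/data/name,1550615022671,1550615093327\n"
    "form exit,,1550615096925\n"
)
rows = read_audit_rows(sample)
```

If a reader like this is acceptable for downstream tools, the "same number of fields" requirement seems safe to leave out; a stricter parser would need it.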
I verified this by running
bundle exec jekyll serve
and looking at the generated site in Chrome.

Perhaps I should consolidate the templates for sub specs now that we have two. Or maybe we wait for one more? 😬
CC @grzesiek2010