-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate patient id is a relative reference #117
Conversation
var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient); | ||
|
||
// only allow unreserved URI characters | ||
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than throwing an error, I think a better approach might be to make our GetId logic more robust.
We might want to consider updating the code in Microsoft.Health.Extensions.Fhir.ModelExtensions class, mainly because we currently support only one reference format (e.g. Patient/{FHIR_RESOURCE_ID}). It's possible that a reference could be a full URL (e.g. https://fhirserver.com/Patient/{FHIR_RESOURCE_ID}) or contain a '/' prior to the Resource type name.
We could potently use regex to capture the Id and ignore the rest of the string. We could still throw an error or just return null like we currently do, if no valid Id could be extracted from the string. However, if we do throw an error, we should be very explicit with what the customer can do to fix the issue.
// Should contain 2 groups, the second group should just be the id.
var match = Regex.Match(reference, typeof(TResource).Name + @"\/([A-Za-z0-9\-\.]{1,64})");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point I think we want to just enforce relative references for purposes in this PR, https://www.hl7.org/fhir/references.html#2.3.0.
Which are in the pattern ResourceName/identifier.
We currently make assumptions that the references can be resolved on the current FHIR service. Literal, logical (which is what is causing the error), or absolute open us up to other issues that aren't being properly accounted for.
Agreed this should be moved to the GetId method since it applies equally to any id we are trying to use today. As we support more reference types we can add support there.
In general I think it would be valuable to build checks off the URI (Device.Patient.Url or Reference.Url ) so we can validate it is relative and doesn't contain parameters. Hopefully there is exiting code for URIs that you can leverage to make these distinctions. We would be looking for a valid relative URI with no parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest commit I've moved the code to GetId and made it a bit more robust. It now locates a matching ResourceName/identifier if it happens to exist anywhere as long as it does not contain any reserved URI characters. We return null if we do not find a match.
Unfortunately it seems like regex is the best option. The Uri classes do have some useful methods for parsing the Uri but they unfortunately only work for absolute Uris. The Uri methods throw exceptions if the Uri is relative and are unusable. There is a workaround to just turn the Uri into an absolute Uri using some fake host name like http://foo/ResouceType/id but but that seemed ugly.
var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient); | ||
|
||
// only allow unreserved URI characters | ||
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there might be options to validate the reference in the FHIR library that I am double checking but if we keep this as is it would be good to have this as private static compiled Regex to avoid the extra perf hit to compile the expression per execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a compiled regex caching mechanism for GetId. I did try and look through the firely .net library but didn't find anything that we could easily re-use for our use case
var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient); | ||
|
||
// only allow unreserved URI characters | ||
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point I think we want to just enforce relative references for purposes in this PR, https://www.hl7.org/fhir/references.html#2.3.0.
Which are in the pattern ResourceName/identifier.
We currently make assumptions that the references can be resolved on the current FHIR service. Literal, logical (which is what is causing the error), or absolute open us up to other issues that aren't being properly accounted for.
Agreed this should be moved to the GetId method since it applies equally to any id we are trying to use today. As we support more reference types we can add support there.
In general I think it would be valuable to build checks off the URI (Device.Patient.Url or Reference.Url ) so we can validate it is relative and doesn't contain parameters. Hopefully there is exiting code for URIs that you can leverage to make these distinctions. We would be looking for a valid relative URI with no parameters.
|
||
namespace Microsoft.Health.Extensions.Fhir | ||
{ | ||
public static class ModelExtensions | ||
{ | ||
private static ConcurrentDictionary<string, Regex> _idMatcherRegexCache = new ConcurrentDictionary<string, Regex>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than define these dynamically and a regex per resource type can we do some simple string manipulation as a preprocessing. I.E. do a .StartsWith("ResourceType") and if true do a substring on the remaining reference . This will allow us to just have one compiled Regex statically and improve processing throughput and memory utilization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
@@ -39,9 +42,28 @@ public static Hl7.Fhir.Model.ResourceReference ToReference<TResource>(this strin | |||
/// <returns>The id for the specified resource type if it exists, null otherwise.</returns> | |||
public static string GetId<TResource>(this Hl7.Fhir.Model.ResourceReference reference) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have some tests for just the GetId extension method? Would be good to verify for the reference types we support today (Device & Patient)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added ModelExtensionsTests
Validates that the relative reference patient identifier does not contain any URI restricted characters.
If the patient identifier contains restricted URI characters, then we log a
NotSupportedException
. We want to to this validation up front because the subsequent calls to FHIR will fail and we want to make it clear that the failure occurred due to certain characters contained in the identifier.