Skip to content
This repository has been archived by the owner on May 12, 2020. It is now read-only.

Fix domain matching logic #162

Merged
merged 4 commits into from
Jan 23, 2019
Merged

Fix domain matching logic #162

merged 4 commits into from
Jan 23, 2019

Conversation

bbondy
Copy link
Member

@bbondy bbondy commented Jan 23, 2019

Domain matching logic was complex and didn't work in all cases. This caused us to miss an exception allow rule from: brave/brave-browser#2843

This rewrites the domain matching logic to be much faster on checks, but use a bit more memory.

Note that domain matching code is well tested already before this PR, so there's a lot of tests which ensure the safety of this change. New tests were also added to cover the new cases.

}
for (int i = 0; i < numNoFingerprintAntiDomainOnlyExceptionFilters; i++) {
newNoFingerprintAntiDomainOnlyExceptionFilters[i].swapData(&(noFingerprintAntiDomainOnlyExceptionFilters[i]));
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the old way of simply memcpy over here no longer works because Filter's store pointers now, and they manage their own memory. So when a Filter was destroyed, it would free the pointers twice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The perf here doesn't matter btw because this is only for parsing logic. We use deserialized dat files in production.

@bbondy bbondy force-pushed the domain-matching-hashset branch 2 times, most recently from c4ac3f7 to 3c7bc90 Compare January 23, 2019 16:21
* This Source Code Form is subject to the terms of the Mozilla Public
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is used in HashSet's which store the domains and antiDomains for each Filter.

}
if (data) {
delete[] data;
if (domains) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

domains and antiDomains are lazily created, so check if we need to delete them.

}
if (host) {
delete[] host;
if (!borrowed_data) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a refactoring of an early return to not have an early return. Code in this block is the same as before.

@bbondy bbondy force-pushed the domain-matching-hashset branch from 3c7bc90 to c7f2bfa Compare January 23, 2019 16:29
const char *p = input;
if (anti) {
if (len >= 1 && p[0] != '~') {
bool Filter::containsDomain(const char* domain, size_t domainLen,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the header file for a description of this. Mainly it just checks if the specified domain is inside the filter's list of domains (or anti domains).

@@ -298,7 +270,8 @@ void Filter::parseOption(const char *input, int len) {
*pFilterOption = static_cast<FilterOption>(*pFilterOption | FOThirdParty);
} else if (!strncmp(pStart, "first-party", len)) {
// Same as ~third-party
*pFilterOption = static_cast<FilterOption>(*pFilterOption | FONotThirdParty);
*pFilterOption = static_cast<FilterOption>(
*pFilterOption | FONotThirdParty);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a line length thing

@@ -387,6 +360,44 @@ bool Filter::hasUnsupportedOptions() const {
return (filterOption & FOUnsupportedSoSkipCheck) != 0;
}

bool Filter::contextDomainMatchesFilter(const char *contextDomain) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks if a context domain is applicable to this filter.
For example if foo.example.com is the context domain then it will check if this filter is applicable to both foo.example.com and example.com.

Copy link
Contributor

@SergeyZhukovsky SergeyZhukovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing major that caught my eyes. but needs to tested well before production. Hopefully we will see if something wrong on dev builds.

@bbondy bbondy merged commit aa0485f into master Jan 23, 2019
@bsclifton bsclifton deleted the domain-matching-hashset branch April 12, 2019 06:06
@bsclifton bsclifton restored the domain-matching-hashset branch April 12, 2019 06:06
@bsclifton bsclifton deleted the domain-matching-hashset branch April 12, 2019 06:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants