[PAID] [$500] Country search does not consider Latin or Spanish letters in search #29826

Closed
6 tasks done
m-natarajan opened this issue Oct 17, 2023 · 57 comments
Assignees
Labels
  • Awaiting Payment: Auto-added when associated PR is deployed to production
  • Bug: Something is broken. Auto assigns a BugZero manager.
  • Daily: KSv2
  • Engineering
  • External: Added to denote the issue can be worked on by a contributor

Comments

@m-natarajan

m-natarajan commented Oct 17, 2023

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


Version Number: 1.3.85-1
Reproducible in staging?: y
Reproducible in production?: y
If this was caught during regression testing, add the test name, ID and link from TestRail:
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Expensify/Expensify Issue URL:
Issue reported by: @dhanashree-sawant
Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1697544511147309

Action Performed:

  1. Open the app
  2. Open settings->profile->personal details->address->country
  3. Search with the letter 'Å' and observe that, even though the list contains a country with that letter, the app doesn't show it at the top of the results, e.g. 'Åland Islands'
  4. Similarly, search with the letter 'é': multiple countries contain that letter, but the search doesn't show them at the top, e.g. 'São Tomé & Príncipe', 'Saint Barthélemy'

Expected Result:

App should consider Latin or Spanish letters in country search and display matching results at the top, as we generally do on the user search page and the send message page

Actual Result:

App does not consider Latin or Spanish letters in country search and does not display matching results at the top

Workaround:

unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android: Native
  • Android: mWeb Chrome
  • iOS: Native
  • iOS: mWeb Safari
  • MacOS: Chrome / Safari
  • MacOS: Desktop

Screenshots/Videos

Android: Native
Android.native.country.search.not.specific.variation.letters.mp4
Android: mWeb Chrome
Android.chrome.search.not.change.by.variation.letters.mp4
iOS: Native
ios.native.country.search.not.specific.variation.letters.mov
iOS: mWeb Safari
ios.safari.country.search.not.specific.variation.letters.mov
MacOS: Chrome / Safari
mac.chrome.country.search.not.specific.variation.letters.mov
MacOS: Desktop
mac.desktop.country.search.not.specific.variation.letters.mov


Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01c62c8f44bacecbe2
  • Upwork Job ID: 1714381091711156224
  • Last Price Increase: 2023-10-31
  • Automatic offers:
    • situchan | Reviewer | 27762279
    • graylewis | Contributor | 27762281
    • dhanashree-sawant | Reporter | 27762282
@m-natarajan m-natarajan added External Added to denote the issue can be worked on by a contributor Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Oct 17, 2023
@melvin-bot melvin-bot bot changed the title Country search does not consider Latin or Spanish letters in search [$500] Country search does not consider Latin or Spanish letters in search Oct 17, 2023
@melvin-bot

melvin-bot bot commented Oct 17, 2023

Job added to Upwork: https://www.upwork.com/jobs/~01c62c8f44bacecbe2

@melvin-bot

melvin-bot bot commented Oct 17, 2023

Triggered auto assignment to @strepanier03 (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Oct 17, 2023
@melvin-bot

melvin-bot bot commented Oct 17, 2023

Bug0 Triage Checklist (Main S/O)

  • This "bug" occurs on a supported platform (ensure Platforms in OP are ✅)
  • This bug is not a duplicate report (check E/App issues and #expensify-bugs)
    • If it is, comment with a link to the original report, close the issue and add any novel details to the original issue instead
  • This bug is reproducible using the reproduction steps in the OP. S/O
    • If the reproduction steps are clear and you're unable to reproduce the bug, check with the reporter and QA first, then close the issue.
    • If the reproduction steps aren't clear and you determine the correct steps, please update the OP.
  • This issue is filled out as thoroughly and clearly as possible
    • Pay special attention to the title, results, platforms where the bug occurs, and if the bug happens on staging/production.
  • I have reviewed and subscribed to the linked Slack conversation to ensure Slack/Github stay in sync

@melvin-bot

melvin-bot bot commented Oct 17, 2023

Triggered auto assignment to Contributor-plus team member for initial proposal review - @situchan (External)

@graylewis
Contributor

graylewis commented Oct 17, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue.

The country search page doesn't prioritize search results that contain the same diacritical marks as the search term over normalized matches

What is the root of the issue

Items in the array of countries being searched get sanitized with the StringUtils.sanitizeString() function which:
Removes diacritical marks and non-alphabetic and non-latin characters from a string.

By running this on the array of countries to be searched and the search term, both searches with diacritical marks and searches without are treated the same. This is almost ideal behavior, but prioritizing exact matches with diacritical marks would be even better.
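To illustrate why accented and unaccented searches collapse into the same result, here is a minimal sketch. `sanitizeSketch` is a hypothetical stand-in for `StringUtils.sanitizeString()`: it only strips combining marks and lowercases, and it leaves out the non-alphabetic filtering because that regex is not shown in this thread.

```typescript
// Hypothetical stand-in for StringUtils.sanitizeString(); not the app's actual code.
function sanitizeSketch(str: string): string {
    return str
        .normalize('NFD') // split "Å" into "A" plus a combining ring
        .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
        .toLowerCase();
}

const names = ['Albania', 'Saint Barthélemy', 'Åland Islands'];
const query = 'Å';

// Both sides are sanitized, so "Å" behaves exactly like "a":
// every name containing an "a" matches, in list order.
const matches = names.filter((name) => sanitizeSketch(name).includes(sanitizeSketch(query)));

console.log(matches);
```

In this sketch 'Åland Islands' still comes last, because after sanitizing nothing distinguishes it from the other names that contain an "a".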

What changes do you think we should make in order to solve the problem?

In the searchCountryOptions file, where the results are generated by matching the search term against items of the countriesData array, we can add a sorting mechanism to prioritize exact matches including diacritical marks. In addition to this, we'll have to maintain the diacritical marks for the array of countries to be searched and then normalize the values as needed, rather than normalizing them all at once up front.

In code (this code is obviously sloppy, it's just for illustration purposes):

    const filteredData = countriesData.filter((country) => StringUtils.sanitizeString(country.searchValue).includes(trimmedSearchValue));

    let halfSorted = filteredData.sort((a, b) => {
        if (a.searchValue.toLowerCase().substring(2, trimmedSearchValue.length + 2) === trimmedSearchValue.toLowerCase()) {
            return -1;
        }
        if (b.searchValue.toLowerCase().substring(2, trimmedSearchValue.length + 2) === trimmedSearchValue.toLowerCase()) {
            return 1;
        }
        return 0;
    });

    let fullSorted;
    if (trimmedSearchValue !== searchValue) {
        // Diacritic detected, prioritize diacritic matches
        fullSorted = halfSorted.sort((a, b) => {
            if (a.text.toLowerCase().includes(searchValue.toLowerCase())) {
                return -1;
            }
            if (b.text.toLowerCase().includes(searchValue.toLowerCase())) {
                return 1;
            }
            return 0;
        });
    } else {
        fullSorted = halfSorted.sort((a, b) => {
            if (a.value.toLowerCase() === trimmedSearchValue) {
                return -1;
            }
            if (b.value.toLowerCase() === trimmedSearchValue) {
                return 1;
            }
            return 0;
        });
    }

Updated logic to reflect updated requirements:

  1. Data is filtered based on the original search term comparison. This includes all possible matches.
  2. A sort is run to prioritize exact matches from the beginning of the country name, as requested in the updated requirements. E.g. "Bar" returns Barbados first since it's the beginning of the country name.
  3. The sanitized search term is compared to the raw search term. If they're different, then diacritics/non-latin chars are present in the search term. In this case, exact matches including diacritics are pushed to the top.
  4. If diacritics aren't present, then, as in the existing code, we prioritize exact country code matches. E.g. "US" returns the United States before anything else. (Since country codes never contain diacritics, there's no need to run this step if diacritics are in the search term.)

The conditional sort based on the presence of diacritics ensures we only run two sorts regardless of the search term.
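The four steps above could also be expressed as a single ranked sort. The following is only a sketch under stated assumptions, not the code from the eventual PR: the `Country` shape (`value` as country code, `text` as display name), `stripDiacritics`, `rank`, and `searchCountriesSketch` are all hypothetical names introduced here.

```typescript
// Assumed shape: value = country code, text = display name.
type Country = {value: string; text: string};

// Hypothetical helper: remove combining marks and lowercase.
function stripDiacritics(str: string): string {
    return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// Lower score sorts first.
function rank(country: Country, rawQuery: string): number {
    const raw = rawQuery.trim().normalize('NFC').toLowerCase();
    const plain = stripDiacritics(raw);
    const hasDiacritics = raw !== plain;
    const name = country.text.normalize('NFC').toLowerCase();

    if (hasDiacritics && name.includes(raw)) return 0; // step 3: exact match including diacritics
    if (!hasDiacritics && country.value.toLowerCase() === plain) return 0; // step 4: exact country code
    if (stripDiacritics(name).startsWith(plain)) return 1; // step 2: match at the start of the name
    return 2; // step 1: any other sanitized match
}

function searchCountriesSketch(countries: Country[], rawQuery: string): Country[] {
    const plain = stripDiacritics(rawQuery.trim());
    if (plain === '') return [...countries];
    return countries
        .filter((c) => stripDiacritics(c.text).includes(plain) || c.value.toLowerCase() === plain)
        .map((country, index) => ({country, index, score: rank(country, rawQuery)}))
        .sort((a, b) => a.score - b.score || a.index - b.index) // ties keep list order
        .map((entry) => entry.country);
}

const sampleCountries: Country[] = [
    {value: 'AL', text: 'Albania'},
    {value: 'AX', text: 'Åland Islands'},
    {value: 'BL', text: 'Saint Barthélemy'},
    {value: 'BB', text: 'Barbados'},
    {value: 'CI', text: "Côte d'Ivoire"},
    {value: 'RU', text: 'Russia'},
    {value: 'US', text: 'United States'},
];

console.log(searchCountriesSketch(sampleCountries, 'Å').map((c) => c.text));
```

Computing one score per item keeps the comparator consistent when both items match, which the chained `-1`/`1` comparators above do not guarantee.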

As a side note, the call to the searchCountryOptions function should be memoized, as it currently runs every frame unnecessarily. This will more than offset any potential performance costs of running an additional sort on the search results array.
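As a rough illustration of the memoization point, here is a framework-free sketch that caches results per search term. `memoizeByQuery` and `slowSearch` are hypothetical names, not functions from the codebase; inside a React component the same effect would more likely come from `useMemo` keyed on the search value.

```typescript
// Hypothetical helper: caches the result of a search function per query string.
function memoizeByQuery<T>(search: (query: string) => T): (query: string) => T {
    const cache = new Map<string, T>();
    return (query: string): T => {
        if (cache.has(query)) {
            return cache.get(query) as T;
        }
        const result = search(query);
        cache.set(query, result);
        return result;
    };
}

// Stand-in for the real search, counting how often it actually runs.
let calls = 0;
const slowSearch = (query: string): string[] => {
    calls += 1;
    return ['Åland Islands', 'Albania'].filter((name) => name.toLowerCase().includes(query.toLowerCase()));
};

const cachedSearch = memoizeByQuery(slowSearch);
const first = cachedSearch('al');
const second = cachedSearch('al'); // served from the cache, slowSearch is not called again

console.log(calls, first === second);
```

The cache in this sketch is unbounded, which is acceptable for a short list of countries but would need an eviction policy for larger inputs.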

My solution in action:
Screenshot 2023-10-24 at 3 09 19 PM

Screenshot 2023-10-24 at 3 23 21 PM Screenshot 2023-10-24 at 3 23 36 PM image

What alternative solutions did you explore? (Optional)
NA

@Victor-Nyagudi
Contributor

It seems this method strips diacritics from letters, e.g. it turns an accented letter like é into a plain e, so that could be affecting country search results.

    /**
     * Removes diacritical marks and non-alphabetic and non-latin characters from a string.
     * @param str - The input string to be sanitized.
     * @returns The sanitized string
     */
    function sanitizeString(str: string): string {
        return _.deburr(str).toLowerCase().replaceAll(CONST.REGEX.NON_ALPHABETIC_AND_NON_LATIN_CHARS, '');
    }

I'm not sure if this is the intended behavior or a regression, so I asked about it in the pull request that introduced this method.

@melvin-bot melvin-bot bot added the Overdue label Oct 20, 2023
@strepanier03
Contributor

@situchan - Friendly bump on the proposal above.

@situchan
Contributor

@graylewis your solution will cause a regression - #24344
Please try searching for cote d'Ivoire

@melvin-bot melvin-bot bot removed the Overdue label Oct 24, 2023
@graylewis
Contributor

@situchan Thanks for the heads up! Editing my original proposal now with a more robust solution

@melvin-bot

melvin-bot bot commented Oct 24, 2023

📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸

@melvin-bot melvin-bot bot added the Overdue label Oct 26, 2023
@strepanier03
Contributor

Friendly bump @situchan for the updated proposal from @graylewis. Thanks!

@situchan
Contributor

Looking for a better solution

@melvin-bot melvin-bot bot removed the Overdue label Oct 26, 2023
@graylewis
Contributor

@situchan Can I ask what you feel is lacking in my proposed solution?

@melvin-bot melvin-bot bot added the Overdue label Oct 30, 2023
@situchan
Contributor

@graylewis
Contributor

Hi @francoisl, I'm done with the actual code and I have an almost complete draft PR. I've just been trying to get the screen recordings done for all the different platforms, and I had to do some reconfiguring to get the iOS virtual device running properly. I'm expecting to submit the PR tomorrow.

@francoisl
Contributor

Sounds good, thanks for the update!

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 and removed Daily KSv2 labels Dec 3, 2023
@melvin-bot melvin-bot bot added Monthly KSv2 and removed Weekly KSv2 labels Dec 27, 2023

melvin-bot bot commented Dec 27, 2023

This issue has not been updated in over 15 days. @francoisl, @graylewis, @strepanier03, @situchan eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

@situchan
Contributor

@graylewis can we get an update on the PR?

@graylewis
Contributor

@situchan sorry for the wait, didn't see that you'd reviewed the PR. made the requested changes to the PR

@situchan
Contributor

@situchan sorry for the wait, didn't see that you'd reviewed the PR. made the requested changes to the PR

@graylewis #32151 (comment)

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Monthly KSv2 labels Jan 30, 2024
@melvin-bot melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Weekly KSv2 labels Feb 19, 2024
@melvin-bot melvin-bot bot changed the title [$500] Country search does not consider Latin or Spanish letters in search [HOLD for payment 2024-02-26] [$500] Country search does not consider Latin or Spanish letters in search Feb 19, 2024

melvin-bot bot commented Feb 19, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Feb 19, 2024

melvin-bot bot commented Feb 19, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.42-5 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-02-26. 🎊

For reference, here are some details about the assignees on this issue:


melvin-bot bot commented Feb 19, 2024

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@situchan] The PR that introduced the bug has been identified. Link to the PR:
  • [@situchan] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:
  • [@situchan] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:
  • [@situchan] Determine if we should create a regression test for this bug.
  • [@situchan] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
  • [@strepanier03] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Feb 26, 2024
@strepanier03
Contributor

@graylewis and @dhanashree-sawant - I've paid you both via Upwork and closed the contracts.

@situchan - I'll check in later today to see if the checklist is done and if so move forward then. If not, I'll check tomorrow morning.

@situchan
Contributor

This is not really a bug but an improvement. We added enough automated tests, so there's no need for a regression test.

@strepanier03
Contributor

Got it thanks for the context @situchan - I'll handle now.

@strepanier03
Contributor

Okay, payment sent and contract closed.

Thanks again everyone 👏

@strepanier03 strepanier03 changed the title [HOLD for payment 2024-02-26] [$500] Country search does not consider Latin or Spanish letters in search [PAID] [$500] Country search does not consider Latin or Spanish letters in search Feb 26, 2024
6 participants