Improve Metadata initialization time, add benchmarking, check regexps during build & general housekeeping #27

Closed
wants to merge 6 commits into from

Conversation

yannleretaille
Contributor

@yannleretaille yannleretaille commented Oct 15, 2020

Please note that these changes depend on the not yet released version 0.2.1 of regex-cache. This should fix #26
@meh @rustonaut would be great if you guys could have a look!

@@ -126,6 +126,9 @@ pub enum LoadMetadata {

/// Malformed Regex in Metadata XML database
#[error("Malformed Regex: {0}")]
Regex(#[from] regex::Error),
Regex(#[from] regex::Error),
Collaborator

Something funky going on here with spaces/tabs.

Contributor Author

@yannleretaille yannleretaille Oct 15, 2020

that was weird, fixed

src/metadata/loader.rs (resolved thread)
@yannleretaille
Contributor Author

changed to semantic commit messages

…g Metadata from the Database

This has two benefits:
- it results in a ~97% reduction of initialization time the first time
  a phone number is validated. This is crucial for client-side
  applications.
- it prevents the crate from compiling in the unlikely case that Google's
  Metadata should ever contain invalid regexps (this was tested)
In the future it should be investigated whether there is other data that could be
pre-validated during the build phase

fixes whisperfish#26
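
A minimal sketch of the idea in the commit message above, using only the standard library. `LazyPattern` and `validate` are hypothetical names, and the balanced-parentheses check is only a stand-in for real regex compilation; none of this is the crate's actual API. It illustrates how a loader that trusts a build-time check can skip validation up front and defer it to first use.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for an expensive regex validation step.
/// It only checks that parentheses are balanced (escapes are ignored).
fn validate(pattern: &str) -> Result<(), String> {
    let mut depth = 0i32;
    for c in pattern.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return Err(format!("unbalanced ')' in {pattern:?}"));
                }
            }
            _ => {}
        }
    }
    if depth == 0 {
        Ok(())
    } else {
        Err(format!("unclosed '(' in {pattern:?}"))
    }
}

/// A pattern whose validation can be deferred until first use.
pub struct LazyPattern {
    source: String,
    checked: OnceLock<Result<(), String>>,
}

impl LazyPattern {
    /// `check_now = false` models patterns assumed to be validated at build time.
    pub fn new(source: &str, check_now: bool) -> Result<Self, String> {
        if check_now {
            validate(source)?;
        }
        Ok(Self {
            source: source.to_owned(),
            checked: OnceLock::new(),
        })
    }

    /// Validates on the first call and returns the cached result afterwards.
    pub fn check(&self) -> Result<(), String> {
        self.checked.get_or_init(|| validate(&self.source)).clone()
    }
}
```

With `check_now = false`, construction does no validation work at all; an invalid pattern would only surface on the first `check()` call, which is why the build-time check matters.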
Collaborator

@rustonaut rustonaut left a comment

I could merge it the way it is and do the noted changes later before publishing a new version. (@meh)

@@ -70,7 +71,7 @@ pub use crate::carrier::Carrier;
mod phone_number;
pub use crate::phone_number::{PhoneNumber, Type};

mod parser;
pub mod parser;
Collaborator

I guess you made this public because of the benchmark.

But it's not supposed to be part of the public API, so it should probably be marked #[doc(hidden)] or similar.
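
A small sketch of what that suggestion could look like. The module body and the `digits` helper are hypothetical stand-ins, not the crate's real parser. Note that `#[doc(hidden)]` only removes the item from rustdoc output; the module stays reachable, which is what a benchmark would need.

```rust
// Hypothetical stand-in for the crate's `parser` module.
// `#[doc(hidden)]` hides it from generated documentation while leaving it
// `pub`, so external benchmarks can still call into it.
#[doc(hidden)]
pub mod parser {
    /// Keeps only the ASCII digits of the input (illustrative helper).
    pub fn digits(input: &str) -> String {
        input.chars().filter(|c| c.is_ascii_digit()).collect()
    }
}
```

Because the attribute does not restrict access, anything exposed this way can still be depended on by downstream code, so it is a convention rather than an enforcement mechanism.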

}

/// Create a database from a loaded database.
pub fn from(meta: Vec<loader::Metadata>) -> Result<Self, error::LoadMetadata> {
pub fn from(meta: Vec<loader::Metadata>, check_regex: bool) -> Result<Self, error::LoadMetadata> {
Collaborator

This is a breaking change.

Furthermore, the API between from, load and parse is now inconsistent.

If we are making a breaking change anyway, we probably want an opaque Options type that we can pass into all of from, load and parse.
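
One possible shape for such an options type, as a sketch only. `LoadOptions`, its default, the simplified `Database`, and the "empty entry is invalid" rule are all assumptions made for illustration; the real `Database` takes `Vec<loader::Metadata>` and returns `error::LoadMetadata`.

```rust
/// Hypothetical opaque options type. Fields are private, so further
/// options can be added later without another breaking change.
#[derive(Debug, Clone)]
pub struct LoadOptions {
    check_regex: bool,
}

impl Default for LoadOptions {
    fn default() -> Self {
        // Assumed default for this sketch: validate unless told otherwise.
        Self { check_regex: true }
    }
}

impl LoadOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter for the flag the PR passes as a bare `bool`.
    pub fn check_regex(mut self, yes: bool) -> Self {
        self.check_regex = yes;
        self
    }
}

/// Simplified stand-in for the crate's `Database`.
#[derive(Debug)]
pub struct Database {
    pub entries: usize,
    pub validated: bool,
}

impl Database {
    /// Builds a database from already-loaded entries.
    pub fn from(meta: Vec<String>, options: &LoadOptions) -> Result<Self, String> {
        // Stand-in validation: an empty entry counts as malformed.
        if options.check_regex && meta.iter().any(|m| m.is_empty()) {
            return Err("malformed entry".to_owned());
        }
        Ok(Self {
            entries: meta.len(),
            validated: options.check_regex,
        })
    }

    /// Parses one entry per line, then defers to `from` with the same options.
    pub fn parse(content: &str, options: &LoadOptions) -> Result<Self, String> {
        let meta = content.lines().map(str::to_owned).collect();
        Self::from(meta, options)
    }
}
```

Passing the same `&LoadOptions` through every entry point keeps the signatures consistent, and callers that never touch the options can rely on `LoadOptions::default()`.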

@yannleretaille
Contributor Author

yannleretaille commented Oct 18, 2020 via email

@fabricedesre
Contributor

@yannleretaille do you think you will have time to resume work on that PR?

@atymic

atymic commented Oct 6, 2021

Any updates on this pr?

@rubdos rubdos changed the base branch from master to main March 31, 2023 08:29
@rustonaut
Collaborator

I'm closing this as I can't hide it in my Git overview, it likely won't make any progress, and it has been superseded by the fork AFAICT.

@rustonaut rustonaut closed this Jun 1, 2023
@rubdos
Member

rubdos commented Jun 5, 2023

@rustonaut, we don't maintain a fork, you are writing in the live repository.

The easiest way to get it out of your dashboard is probably to reduce your access rights in the repository (if you don't intend to exercise those rights any more), or to unassign issues and PRs.

We initially wanted to keep this open, but @gferon will open an issue to track the follow-up here.

@rustonaut rustonaut requested review from rustonaut and removed request for rustonaut June 5, 2023 18:49
@rustonaut
Collaborator

rustonaut commented Jun 5, 2023

Oh, sorry, I missed that this is now a PR on whisperfish:main instead of meh:main 🙈 The problem was the reviewed-by filter (so none of the proposed steps would have helped; I would love to not use that filter, but due to limitations of other filters and workflows I always end up coming back to it).

Successfully merging this pull request may close these issues.

Loading default DATABASE is extremely slow in WASM
6 participants