Default regular expressions #5

Open
psoholt opened this issue Aug 19, 2013 · 13 comments

Comments

@psoholt
Member

psoholt commented Aug 19, 2013

In the CSharp port we have added some common regular expressions, such as e-mail and URL.
So one can, for example, write a verbal expression like this:

verbEx.StartOfLine().Then(CommonRegex.Email);

I think this should be implemented similarly across the different languages.

If we add default regex words like e-mail and URL, it is important that the underlying regex is identical across the different language ports.

Other examples of common regexes that could be implemented:

email
phone
url
date
ip address
rgb color hex value
decimal number
time format
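
One way to keep the underlying patterns identical between ports would be to treat each common expression as a plain constant string that every port copies verbatim. The following is a minimal C# sketch under that assumption. The `CommonPatterns` class, its member names, and both patterns are hypothetical illustrations, not the ones used in CSharpVerbalExpressions. The patterns use `[0-9]` rather than `\d` because `\d` does not match the same character set in every regex engine.

```csharp
using System.Text.RegularExpressions;

// Hypothetical holder for shared patterns; names and patterns are illustrative assumptions.
public static class CommonPatterns
{
    // RGB hex color such as "#ff8800" or "f80" (leading '#' optional).
    public const string HexColor = "#?(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})";

    // Decimal number such as "42", "-3.14" or "+0.5".
    public const string DecimalNumber = @"[+-]?[0-9]+(?:\.[0-9]+)?";

    // Returns true when the whole input matches the given pattern.
    public static bool IsWholeMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }
}
```

Because the constants are ordinary strings, a port in another language could reuse them unchanged, as long as it sticks to syntax that all target regex engines share.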

See original issue in CSharp:
VerbalExpressions/CSharpVerbalExpressions#4

@brudgers
Member

A thought provoking post, Peder.

For me, this sort of gets at the heart of the question, "What is the
purpose of VerbalExpressions?"

And I see two reasonably valid answers.

The first is that VerbalExpressions are supposed to be a collection of
prepackaged regular expressions - e.g. phone-number, etc. There's a
practical case for this and an endless number of tasks to implement.

The second answer is that VerbalExpressions are intended to be a more
readable way of writing regular expressions, e.g. "withAnyCase" instead of
"i" and modifier syntax. In this case, there are a finite number of tasks
to implement - only those which make VerbalExpressions isomorphic with
regular expressions.
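
To make the "isomorphic" reading concrete, here is a minimal C# sketch in which each readable method stands for exactly one regex construct. `MiniVerbEx` is a hypothetical class written for this illustration and is not the API of any existing port.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Minimal sketch of a readable builder; not the API of any existing port.
public class MiniVerbEx
{
    private readonly StringBuilder _source = new StringBuilder();
    private RegexOptions _options = RegexOptions.None;

    // Readable equivalent of the "^" anchor.
    public MiniVerbEx StartOfLine()
    {
        _source.Append("^");
        return this;
    }

    // Appends a literal string, escaping regex metacharacters.
    public MiniVerbEx Then(string literal)
    {
        _source.Append("(?:").Append(Regex.Escape(literal)).Append(")");
        return this;
    }

    // Readable equivalent of the "i" modifier.
    public MiniVerbEx WithAnyCase()
    {
        _options |= RegexOptions.IgnoreCase;
        return this;
    }

    // Builds the regex from the accumulated pattern and options.
    public Regex ToRegex()
    {
        return new Regex(_source.ToString(), _options);
    }
}
```

A call such as `new MiniVerbEx().StartOfLine().Then("http").WithAnyCase().ToRegex()` then reads like a sentence while adding nothing that plain regex syntax cannot express.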

For me, the more interesting approach is the second. That doesn't make it
better. It doesn't necessarily make it appropriate for the name
"VerbalExpressions" either.

All it means is that, that is the direction I'm going with it, and if I
need to change the name of my implementation to "VerboseExpressions" that's
ok - a rose is a rose.

Ben


@Foxboron

I disagree with calling it "Verbose". We are essentially just making regex verbal, so that you can "speak" or pronounce a regex without sounding weird. That obviously means it will be verbose, but there is no need to even consider a name change. Even if we go down the path of adding predefined regexes, there is no valid reason to change the name.

@brudgers
Member

Sorry for not being clear, Morten.

What I meant was that if the direction of Verbal Expressions was toward
having phoneNumber and URL as primitives, then I was fine with dropping the
name "VerbalExpressions" from whatever work I did in a different direction.
That direction is toward implementing regular expressions with a more
readable isomorphic language, with primitives such as matchAtLeastOnce or
beginClass.

In language terms, what are the atoms of the standard implementation? Is it
going to be like Scheme or ANSI Common Lisp?

And I am not saying that there is anything wrong with an implementation
that is different from the one I am interested in.

Ben


@psoholt
Member Author

psoholt commented Aug 20, 2013

I just thought predefined regexes would be a good idea for people who are not that used to regex. What we do is make regex verbal, and I thought this would be the next step.

I agree it would be better if one could write something like:

  • AnythingButEmail()
  • MaybeUrl()

Which is more verbal.

But if each predefined expression got its own methods instead of being passed as a parameter, it would create a great many different methods. That is why one has to write something with CommonRegex or similar (there might be a better name):

  • Then(CommonRegex.Email)

@metal3d
Member

metal3d commented Aug 20, 2013

Note that Find(), Then(), etc. accept strings that are quoted when appended, so a standard regexp string passed to those methods will be escaped.
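
The effect of that quoting can be sketched in C# with `Regex.Escape`, assuming the port sanitizes its arguments that way. `EscapeDemo` and its method names are hypothetical and exist only for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Mimics a sanitizing Then(): the value is escaped before it becomes a pattern.
    public static bool MatchesAfterEscaping(string value, string input)
    {
        string quoted = Regex.Escape(value);
        return Regex.IsMatch(input, quoted);
    }

    // Mimics a raw append: the value is used as a pattern unchanged.
    public static bool MatchesRaw(string value, string input)
    {
        return Regex.IsMatch(input, value);
    }
}
```

With the value `[0-9]+`, `MatchesRaw` finds the digits in `abc123`, while `MatchesAfterEscaping` only succeeds on input that contains the literal text `[0-9]+`.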

Allowing specific objects as parameters adds complexity and hurts performance, because each method has to check whether the given parameter is a value or a component.

Please consider participating in the other questions that are (IMHO) priorities, such as the Not() implementation and the Or() enhancement, before extending functionality. I reviewed the implementations, and I see that only 4 languages have implemented "captures". I really think that the discussions on standardisation must come first.
But I can be wrong ;)

@psoholt
Member Author

psoholt commented Aug 27, 2013

@metal3d In CSharp there won't be any performance penalty, as these are method overloads, but that might be the case in JavaScript and in other languages?

If you look at the CSharp code, we don't have the problem of values being escaped when using these default enums, as these are overloaded methods that tell the builder not to escape.

I agree that discussing the Not() implementation, the Or() implementation and captures should be prioritized, but that doesn't mean I can't come up with a feature suggestion. It could always stay here in the issues list for a while (instead of being forgotten ;)

I agree with @Foxboron's answer to your question, @brudgers. We are essentially just making regex verbal, where you can "speak" or pronounce a regex without sounding weird. So this is just a suggestion to make it easier in some cases. But of course that could be created as a separate project, and then it would be possible to combine default regexes from that project with the VerbalExpressions project if one would like to use both.

I just thought it would make it even easier to write regex in a verbal and easy way.

@metal3d
Member

metal3d commented Aug 28, 2013

@psoholt That was not exactly what I meant :) I said that testing whether the argument is a generic rule or a value to append adds operations and is not truly optimal. The second problem is that appending this generic rule requires testing whether we have to "clean" the string to append or not, because "string values" are "quoted" on insert.

I really don't know if it's a good idea to add that kind of complexity.

@psoholt
Member Author

psoholt commented Aug 28, 2013

@metal3d It does make it more complex of course, but not overwhelmingly complex. This is the Maybe() implementation, for example:

    public VerbalExpressions Maybe(string value, bool sanitize = true)
    {
        value = sanitize ? Sanitize(value) : value;
        value = string.Format("({0})?", value);
        return Add(value, false);
    }

    public VerbalExpressions Maybe(CommonRegex commonRegex)
    {
        return Maybe(commonRegex.Name, sanitize: false);
    }

@metal3d
Member

metal3d commented Aug 28, 2013

As you can see, you are using a new argument. Some languages cannot use default values for arguments (Cpp, Go, ...) and would need extra work to support this.
You must add the "sanitize" argument to all of the methods, and each call will make a test to know whether sanitize has to be applied.
I think that a subclass of VerbalExpression is better. We should keep VerbalExpression as simple as possible so that subclasses can be efficient. This is my humble opinion.
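
The subclass idea could look roughly like the C# sketch below. It assumes a base class that always escapes and exposes a protected raw append. `SimpleVerbEx`, `CommonVerbEx`, `ThenHexColor`, and the pattern are hypothetical names and values chosen for illustration.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Deliberately small base class: it only knows how to append literals.
public class SimpleVerbEx
{
    private readonly StringBuilder _source = new StringBuilder();

    // Appends a literal, always escaped; no sanitize flag is needed.
    public SimpleVerbEx Then(string literal)
    {
        return Add(Regex.Escape(literal));
    }

    // Appends a raw fragment; only subclasses may skip escaping.
    protected SimpleVerbEx Add(string rawFragment)
    {
        _source.Append("(?:").Append(rawFragment).Append(")");
        return this;
    }

    // Builds the regex from the accumulated pattern.
    public Regex ToRegex()
    {
        return new Regex(_source.ToString());
    }
}

// Hypothetical extension living outside the core library.
public class CommonVerbEx : SimpleVerbEx
{
    // Illustrative pattern only.
    private const string HexColor = "#?(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})";

    // Appends the predefined pattern without escaping it.
    public CommonVerbEx ThenHexColor()
    {
        Add(HexColor);
        return this;
    }
}
```

One trade-off of this sketch is that the base methods return `SimpleVerbEx`, so a chain cannot go straight from `Then(...)` to `ThenHexColor()` without separate statements or a cast.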

@metal3d
Member

metal3d commented Aug 28, 2013

Note that if we make the "Add" method public, we can call it directly:

ve.Find("name").Add(CommonRegex.Email)

This is not "verbally" correct... but it's very simple.
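
A minimal C# sketch of that option, assuming `Add` appends a raw fragment without escaping. `RawAddVerbEx` and `CommonFragments` are hypothetical names, and the fragment is a plain string here rather than the `CommonRegex` type from the CSharp port.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Minimal builder with a public raw Add(); not the API of any existing port.
public class RawAddVerbEx
{
    private readonly StringBuilder _source = new StringBuilder();

    // Appends a literal, escaped.
    public RawAddVerbEx Find(string literal)
    {
        return Add(Regex.Escape(literal));
    }

    // Appends a raw regex fragment without escaping.
    public RawAddVerbEx Add(string rawFragment)
    {
        _source.Append("(?:").Append(rawFragment).Append(")");
        return this;
    }

    // Returns the accumulated pattern.
    public override string ToString() => _source.ToString();
}

// Hypothetical constants kept outside the builder.
public static class CommonFragments
{
    public const string Digits = "[0-9]+";
}
```

Here `new RawAddVerbEx().Find("name").Add(CommonFragments.Digits)` needs no type check and no sanitize flag, at the cost of trusting callers to pass a valid fragment.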

@metal3d
Member

metal3d commented Aug 28, 2013

I'm sorry to send 3 comments, but I just realized that I'm not as clear as I want to be.

What I mean is that PCRE and other regexp implementations have no "common" list of expressions. I think that VerbalExpression should be at the same abstraction level as PCRE. Then developers can extend the library to add common methods.

@psoholt
Member Author

psoholt commented Aug 28, 2013

@metal3d Ok, I see your point. I will create a different common project for some default expressions, and we will keep Verbal Expression simple.

@metal3d
Member

metal3d commented Aug 28, 2013

Oh... You can wait for other opinions. I'm not the boss :)


4 participants