Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Champion "Query-foreach (do query termination)" #101

Open
5 tasks
gafter opened this issue Feb 14, 2017 · 31 comments
Open
5 tasks

Champion "Query-foreach (do query termination)" #101

gafter opened this issue Feb 14, 2017 · 31 comments
Assignees
Milestone

Comments

@gafter
Copy link
Member

gafter commented Feb 14, 2017

  • Proposal added
  • Discussed in LDM
  • Decision in LDM
  • Finalized (done, rejected, inactive)
  • Spec'ed

This is for a new statement form something like

iteration_statement
    : query_do_statement
    ;
query_do_statement
    : query_expression 'do' embedded_statement
    ;

For example

from student in students
where student.Year > 1
let gpa = gradesDb.Lookup(student).Gpa
where gpa >= 3.0
do
{
    Console.WriteLine($"{student.Name} ${gpa}");
}

See also dotnet/roslyn#1938

@gafter gafter self-assigned this Feb 14, 2017
@gafter gafter changed the title Query-foreach ("do" query termination) Champion Query-foreach ("do" query termination) Feb 15, 2017
@gafter gafter changed the title Champion Query-foreach ("do" query termination) Champion "Query-foreach (do query termination)" Feb 21, 2017
@gafter gafter added this to the X.X candidate milestone Feb 22, 2017
@gafter
Copy link
Member Author

gafter commented Feb 28, 2017

I've updated the issue, above, to give some context. I still need to check-in a proposal.

@alrz
Copy link
Member

alrz commented Mar 11, 2017

I think this #100 (comment) should be considered here since both aim to use the same keyword.

var x = 
    from item in collection
    where item.foo
    select item.bar
    do Single();

// or
var x = 
    from item in collection
    where item.foo
    do Skip(5)
    do ToDictionary(item.Key, item.Value);


// statement form?
from item in collection
where item.foo
do Distinct() into g // `do` the query operator (that proposal)
from item in g
do Console.WriteLine(item); // `do` the terminator (this proposal)

Would it be possible to do that without a query continuation?

Edit: from #793 (comment), a "do query termination" in the expression form can be useful as well,

private void Test(IEnumerable<int> numbers)
  => from number in numbers do Debug.WriteLine(number);

Edit 2: VB uses Aggregate Into clauses for passing range variable to a user defined method.

@yaakov-h
Copy link
Member

What does Console.WriteLine(${student.Name} ${gpa}) mean?

Did you mean Console.WriteLine($"{student.Name} {gpa}")?

@qrli
Copy link

qrli commented Mar 12, 2017

Yeah, do is better not only termination but also some general purpose Linq method invoker.

@weitzhandler
Copy link

weitzhandler commented Mar 20, 2017

@alrz's idea sounds great.

Here are some more discussions on the subject:

@yaakov-h

Console.WriteLine(${student.Name} ${gpa})

I hope @gafter fixes the first post, he surely referred to interpolated string and forgot to include the quotes. I'd say he should include @alrz's comment in the first post to make the post more productive.

@TonyValenti
Copy link

Hi All,
Related to this and #291 , I wanted to provide a few different ideas that I think can wrap two proposals into one. Perhaps we have two different versions of "do". A statement and an expression.

//The statement would be invoked like this and would iterate over each item in the list.
from x in Users
where x.Age >= 21
select x
do {
  Console.WriteLine(x.Name + " is an adult");
}

//The expression would be invoked like this:
var Adults =
    from x in Users
    where x.Age >= 21
    select x
    do ToList();

@Pzixel
Copy link

Pzixel commented Mar 28, 2017

I'm not sure if anyone uses this syntax instead of method chain. I think it isn't really needed feature.

@HaloFour
Copy link
Contributor

@Pzixel

That argument has already happened. Turns out, despite biases, lots of people use (and even prefer) the query syntax.

@Pzixel
Copy link

Pzixel commented Mar 28, 2017

@HaloFour it's very strange... I used to use them until I realized that they just doesn't provide enough agility: no Zip/Aggregate/SelectMany/..., and when you begin to mix them like (from x in ... ... select x).FirstOrDefault() it looks really ugly. And all colleges I ever worked with was agree with me for same reason. But, if you have an opposite statistics I just can't argue with it.

@jnm2
Copy link
Contributor

jnm2 commented Mar 28, 2017

@Pzixel That's why there are so many asks to expand LINQ syntax to cover all those missing use cases.

@Pzixel
Copy link

Pzixel commented Mar 28, 2017

@jnm2 but you are still in trouble when you have any custom extension methods over IEnumerable like ForEach (even when I read Eric's article it still pretty common to see it in virtually all projects), multiple EmptyIfNull or AssertIsNotEmpty and so on... LINQ syntax just won't know anything about them and thus we still have the same mix of query-syntax and methods calls.

@jnm2
Copy link
Contributor

jnm2 commented Mar 28, 2017

@Pzixel Actually, no, there's a proposal for custom extension methods to participate in LINQ syntax too.

@rexxiang
Copy link

rexxiang commented Mar 30, 2017

As far as I know, c# compiler handle LINQ keywords magically.

Just like this.

class Program {
  static void Main(string[] args) {
    var foo = new Foo() { A = 1, B = "foo" };

    var q =
      from f in foo // even we can use LINQ keyword for object is not inherited from IEnumerable
      where f.A > 0
      select f.B;

    System.Console.WriteLine(q);
  }
}

public class Foo {
  public int A { get; set; }
  public string B { get; set; }
}

public static class FooExtensions {
  public static TResult Select<TResult>(this Foo source, System.Func<Foo, TResult> selector) {
    if (source == null || selector == null) return default(TResult);
    return selector(source);
  }

  public static TResult SelectMany<TIntermediate, TResult>(this Foo source, System.Func<Foo, TIntermediate> selector, System.Func<Foo, TIntermediate, TResult> projector) {
    if (source == null || selector == null || projector == null) return default(TResult);

    var intermediate = selector(source);
    if (intermediate == null) return default(TResult);

    return projector(source, selector(source));
  }

  public static Foo Where(this Foo source, System.Func<Foo, bool> predicate) {
    if (source == null || predicate == null) return null;
    return predicate(source) ? source : null;
  }
}

The code just show the compiler actual know how to map the LINQ keyword to special extension method.

But it only works for built in LINQ keywords, if have a attribute for an extension method, to tell compiler what the keyword is and what type of extension method is(Select/SelectMany/Where/Join/Group...), then compiler can do something like do keyword and more.

BTW, f# already has a attribute CustomOperationAttribute to extend query expressions's keywords

[<AttributeUsage(AttributeTargets.Method, AllowMultiple = false)>]
[<Sealed>]
type 
CustomOperationAttribute
 =
 class
  new CustomOperationAttribute : string -> CustomOperationAttribute
  member this.AllowIntoPattern : bool with get, set
  member this.IsLikeGroupJoin : bool with get, set
  member this.IsLikeJoin : bool with get, set
  member this.IsLikeZip : bool with get, set
  member this.MaintainsVariableSpace : bool with get, set
  member this.MaintainsVariableSpaceUsingBind : bool with get, set
  member this.Name : string
  member this.IsLikeGroupJoin : bool with get, set
  member this.IsLikeJoin : bool with get, set
  member this.IsLikeZip : bool with get, set
  member this.JoinConditionWord : string with get, set
  member this.MaintainsVariableSpace : bool with get, set
  member this.MaintainsVariableSpaceUsingBind  : bool with get, set
 end

@tuespetre
Copy link

@gafter I have almost completed a prototype for #708 and I have been thinking about this one as well. The way I am going about #708 is to synthesize local methods during lowering (LocalRewriter.) Then the other lowering passes will lower the synthesized local methods so that kind of logic is not duplicated. I went for local methods because I wanted to avoid the overhead of closures and they seem more 'correct' in concept, if that makes any sense.

Looking at this issue now, the local method approach seems even more like a good idea because local methods can be iterators, i.e. one could write yield return within the do block (I do think it's important that it should be a block statement) thus allowing some nice composition between the query syntax and some custom iterator logic. What do you think about that? Will there be any confusion in allowing that?

@danielsig
Copy link

I believe I speak in behalf of a large volume of linq lovers when I say
Alt Text

@Pzixel
Copy link

Pzixel commented Sep 26, 2018

@danielsig it's a language garbage that could be easily resolved via introducing a variable. There is nothing that needs to be implemented. If you ever were deconstructing tons singleline expression that contained 100k SLOC of code then you probably know that introducing more variables is better for readability, debugability and other vital core properties.

@danielsig
Copy link

danielsig commented Sep 26, 2018

@Pzixel

  1. Since this new proposed addition to the query syntax will not impact the method syntax I assumed people who prefer the method syntax (e.g. you) would not think that I was speaking in their behalf.
  2. I do not write large linq queries, most of them are 3 - 6 short lines like those in the examples given by others on this thread. Whenever I have large queries, I break them down like you just said. Although I would like to do that without adding parentheses around every single one of them.
  3. A comment is more readable then a temp variable.
  4. You're the only one who downvoted this suggestion. Take a hint - most people here are not like you!

@Pzixel
Copy link

Pzixel commented Sep 26, 2018

  1. It's has nothing to deal with the query syntax. The old one "write a temp variable then use foreach" fits all covered scenarios. There is literately no benefit.
  2. It's always a good idea to save the query in the variable, at least for debugging reasons: it its viewable through watch window unlike the inline query
  3. No, it's actually the opposite, read Clean Code chapter 4 for further explanations. It's disputable, but I find it quite reasonable.
  4. So what? I just describe my opinion, and fact on which it is based. If everybody disagree with me - so be it. I'm fine to be able to not follow the majority just because it is.

@danielsig
Copy link

@Pzixel

"write a temp variable then use foreach" fits all covered scenarios.

Oh yeah, you're right, every possible scenario where linq is used, the programmer actually wants to iterate over every element in a foreach statement. Great observation genius.

You prefer a syntax that is not affected by this proposal.
Your coding style will not be affected by this proposal.
Why are you arguing about this?

@scalablecory
Copy link

I disagree with @Pzixel and think his communication style is a bit abrasive, but I don't think we should be discouraging anyone from presenting an opinion on a proposal.

@Pzixel
Copy link

Pzixel commented Sep 26, 2018

@scalablecory sorry for inconvenience if any. Well, this position have been explained clear enough so I won't bother you anymore.

Oh yeah, you're right, every possible scenario where linq is used, the programmer actually wants to iterate over every element in a foreach statement. Great observation genius.

You could provide any example when it's not enough... Oh wait, you can't, because there is no any.

You prefer a syntax that is not affected by this proposal.
Your coding style will not be affected by this proposal.
Why are you arguing about this?

Because I'm writing C# and I have to know all C# features in order to be not surprised when I change my project or my job. And duplicating of existing features makes me sad. Eric Lippert has written wonderful articles about duplicating functionallity, especially foreach vs ForEach, which looks pretty similar to this proposal. So if it gets introduces in C# then I have to learn it even if I never use it, because my colleagues may. And I say that I see no benefit here over having a local variable. If you can provide any - then I'd just change my mind. But instead I just get ad hominem arguments so quickly, in second response. That's not how you can convince anybody.

In current form kinda do-monad for LINQ, but pretty limiting and useless. I'd rather like to have full do-monad or HKT in the language, but team priorities are different right now.


Being said, I don't want to harsh anybody, so I'm done with this text. You're free contact me in gitter if you wish. I won't violate CoC or anything else, I think I have written enough, if I'm alone thinking that way then who am I to ask changing the course? :) I brought some arguments, if you think they do not deserve to be considered then they don't.

@danielsig
Copy link

@scalablecory I'm actually very happy about Pzixel second comment where he said

I used to use them until I realized that they just doesn't provide enough agility

Which shows how broken the query syntax actually is.
Since people abandon the query syntax because it's ugly, it should be improved, not abandoned.

I agree with you that we shouldn't discourage people from expressing their opinion. I believe that I have not discouraged Pzixel in any way. If anything he's very much encouraged right now ;)

@danielsig
Copy link

danielsig commented Sep 26, 2018

@Pzixel
With the new syntax

var hasEmployees = from employee in context.Employees
                   where employee.CompanyID == companyID
                   do Any();

Using foreach

var temp = from employee in context.Employees
           where employee.CompanyID == companyID
           select employee;
bool hasEmployees = false;
foreach(var employee in temp)
{
    hasEmployees = true;
    break;
}

You are right that foreach can fit all covered scenarios.
But isn't that like saying that goto fits all covered scenarios? Just saying.

@Pzixel
Copy link

Pzixel commented Sep 27, 2018

@danielsig I have said something different about foreach, so I'l try to repeat: foreach could replace do-blocks. The only benefit is you have let keyword instead of explicitely passing values via Select and anonymous classes. So it's a good thing.

But generally I was talking about introducing temp variable. It's not fair to compare builtin Any function with manual loop and saying "wooh, that's so verbose". You can write:

var employees = from employee in context.Employees
                where employee.CompanyID == companyID;
                select employee;
var hasEmployees = employees.Any();

Or just use method chain syntax

var hasEmployees = context.Employees.Where(x => x.CompanyID == companyID).Any();

@theunrepentantgeek
Copy link

theunrepentantgeek commented Sep 30, 2018

My inner pedant wants to come out and play ...

var hasEmployees = context.Employees.Where(x => x.CompanyID == companyID).Any();

"wooh, that's so verbose". You can write:

var hasEmployees = context.Employees.Any(x => x.CompanyID == companyID);

😀 😉

@Pzixel
Copy link

Pzixel commented Sep 30, 2018

@theunrepentantgeek yep, I didn't write it because R# always fix it for me 😄

@cihanyakar
Copy link

@Pzixel , @theunrepentantgeek

As a small reminder VB.net already has this feature:

Dim hasEmployees = Aggregate employee In context.Employees
			Where employee.CompanyID = companyID
			Into Any

https://docs.microsoft.com/en-us/dotnet/visual-basic/language-reference/queries/aggregate-clause

@gafter gafter removed their assignment Mar 13, 2020
@gafter gafter self-assigned this Apr 3, 2020
@333fred 333fred modified the milestones: X.X candidate, X.0 candidate Oct 14, 2020
@GSPP
Copy link

GSPP commented Dec 23, 2021

Any chance of revisiting this?

I find myself avoiding the LINQ query syntax because, very often, I need to wrap the whole query in a (query).ToList(). This makes that syntax uneconomical to use. Also, if you need to insert a special operator anywhere in the flow (such as Distinct or Skip) then suddenly the whole query breaks down and must be rewritten.

And this is a pity because the syntax is so nice to use and so powerful (for example, the let feature is tedious to emulate manually).

@AdamSobieski
Copy link

This issue is relevant to a discussion (#5900) about language-integrated rule definitions, where different kinds of rules could be defined in C# using Where, ForEach, and SelectMany.

During the course of the discussion, @HaloFour indicated that syntax like:

var rule1 = from c in customers
            where c.Name == "John Doe"
            do
            {
              DoSomething1();
              DoSomething2();
              DoSomething3();
            };

var rule2 = from c in customers
            from a in accounts
            where c.IsPreferred
            where a.IsActive
            where a.Owner == c
            do
            {
              c.DoSomething1();
              c.DoSomething2();
              c.DoSomething3();
            };

could map so as to utilize ForEach and could make use of statement/block lambda expressions.

As others have expressed in this issue, I'm excited about the possibility of adding do { ... } syntax to LINQ.

With respect to ways that ForEach could be implemented, here are my two cents:

public static IEnumerable<TSource> ForEach<TSource>(this IEnumerable<TSource> source, Action<TSource> action)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (action == null) throw new ArgumentNullException(nameof(action));

    foreach (var item in source)
    {
        action(item);
        yield return item;
    }
}

public static IQueryable<TSource> ForEach<TSource>(this IQueryable<TSource> source, Expression<Action<TSource>> action)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (action == null) throw new ArgumentNullException(nameof(action));

    return source.Provider.CreateQuery<TSource>(Expression.Call(null, GetMethodInfo(Queryable.ForEach, source, action), new Expression[] { source.Expression, Expression.Quote(action) }));
}

@Unknown6656
Copy link
Contributor

If possible, I'd like to see not only do, but also a whole lot more query ""keywords"" added to the LINQ Query syntax.
I often find myself having to write something like this:

var items = (from ... in ...
             where ...
             let... = (from ... in ...
                       form ... in ...
                       where ...
                       select ...).Average() // <-- 
             where ...
             group ... by ... into ...
             let... = (from ... in ...
                       where ...
                       select ...).Sum() // <-- 
             orderby ... descending
             select new
             {
                 ...
             }).Take(7).ToArray(); // <-- 

Where some expressions might be .ToArray/.ToList/.ToDictionary/.Take/.Skip/.FirstOrDefault/etc. (see the lines marked with <--).

I really love the LINQ query syntax, as it is immediately obvious what exactly it does -- as opposed to the fluent-style method-chaining syntax equivalent (Yes, I know that you can express nested LINQ queries using helper functions blah blah blah, but it is far less obvious to a reader what exactly you are doing). However, having to wrap everything in parenthesises breaks the reader's flow and overall readability.

I'd therefore much rather have the LDT to generalize this proposal and add a language construct to expand the current set of LINQ query contextual keywords. I imagine something of the following (only spitballing):

[LINQExtension(keyword = "every", extensionType = LINQExtensionType.QueryBodyClause)]
public static IEnumerable<T> EveryNth<T>(IEnumerable<T> collection, int stride) =>
    collection.Where((_, i) => (i % stride) == 0);

[LINQExtension(keyword = "median", extensionType = LINQExtensionType.QueryTerminator)]
public static T Median<T>(IEnumerable<T> collection)
    where T : IComparable<T>
{
    if (collection.ToList() is { Count: > 0 } list)
    {
        list.Sort();
        return list[(int)(list.Count * .5)];
    }
    else
        throw new InvalidOperationException("collection is empty");
}

[LINQExtension(keyword = "zipwith", extensionType = LINQExtensionType.QueryBodyClause)]
public static IEnumerable<(T, U)> Zip<T, U>(IEnumerable<T> collection, IEnumerable<U> other)
{
    using var t = first.GetEnumerator();
    using var u = second.GetEnumerator();

    while (t.MoveNext() && u.MoveNext())
        yield return (t.Current, u.Current);
}

[LINQExtension(keyword = "do", extensionType = LINQExtensionType.QueryTerminator)]
public static void Do<T>(IEnumerable<T> collection, Action<T> action)
{
    foreach (T item in collection)
        action(item);
}

Which could be used as follows:

var collection = ....;
var med = from x in collection
            every 3
            median;

from x in collection
zipwith my_other_collection
do x =>
{
    ....
};

My concept proposes that LINQ-contextual keywords may be registered using some kind of attribute, which also provides some information about the keyword itself (whether it is a LINQ query terminator, a LINQ query body clause, etc.). Whether the "custom" LINQ keyword accepts an additional parameter should be determined by the method signature itself. However, for a custom LINQ-contextual keyword to be used, the method must fulfil the following requirements:

  1. Marked as public and static and be in a publicly accessible class.
  2. Be generic with at least one generic type parameters. The first generic type parameter must be the parameter T, which matches the IEnumerable<T> first method parameter.
  3. Annotated with the [LINQExtension]-attribute.
  4. Have at least one, but no more than two method parameters. The first method parameter must be of the type IEnumerable<T>.
  5. The method must return a IEnumerable<> if the LINQ extension is of the type LINQExtensionType.QueryBodyClause.
  6. The method may return any value (or none at all) if the LINQ extension is of the type LINQExtensionType.QueryTerminator.

I understand that implementing such a system is a huge feat, however, one could restrict this feature to the .NET runtime at first, and then open the usage of [LINQExtension] to third-party developers and assemblies.

@cordasfilip
Copy link

@Unknown6656
Couldn't this be acceptable with source generators in some way? That probably won't require that crazy of a feature to implement?
Let's say that your generator method receives the lowered version of the query or something like that and you produce the output tree like with source generators.
Strangely enough with the best feature new feature of .net 7 StringSyntax that no one is talking about for some reason you could write a proper parser that will support this and could compile it to linq. Same way regex is doing it in .net 7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests