How to implement Full outer join using MoreLinq? #332

Closed
gdoron opened this issue Jul 11, 2017 · 5 comments

@gdoron

gdoron commented Jul 11, 2017

Hi,

First, thanks a lot for maintaining this amazingly helpful library!

How can I implement a full outer join of two unordered sequences using MoreLinq?
I saw you implemented FullGroupJoin, but I don't want the grouping.

I'm looking for something like this SO answer: https://stackoverflow.com/a/13503860/601179 , but with a better implementation (using yield return, for example).

I've tried OrderedMerge but had trouble using it, as the sequences aren't ordered (I don't care about the order) and the keys don't implement IComparable:

var panelAndScrapeCombined = panelDataWithChange.Values.OrderedMerge(
    scrapingRecords,
    panel => new {panel.Domain, panel.RefererFlag, panel.Referer},
    scrape => new {scrape.Domain, scrape.RefererFlag, scrape.Referer},
    panel => new KeywordAnalysisRecord
    {
        Domain = panel.Domain,
        Visits = panel.Visits,
        Change = panel.Change,
        DestUrl = "",
        Position = null,
        PreviousMonthVisits = panel.PreviousMonthVisits,
        SearchEngineFamily = panel.SearchEngineFamily,
        Referer = panel.Referer,
        RefererFlag = panel.RefererFlag
    },
    scrape => new KeywordAnalysisRecord
    {
        Domain = scrape.Domain,
        Visits = 0,
        Change = null,
        DestUrl = scrape.DestUrl,
        Position = scrape.Position,
        PreviousMonthVisits = null,
        SearchEngineFamily = scrape.SearchEngineFamily,
        Referer = scrape.Referer,
        RefererFlag = scrape.RefererFlag
    },
    (panel, scrape) => new KeywordAnalysisRecord
    {
        Domain = panel.Domain,
        Visits = panel.Visits,
        Change = panel.Change,
        DestUrl = scrape.DestUrl,
        Position = scrape.Position,
        PreviousMonthVisits = panel.PreviousMonthVisits,
        SearchEngineFamily = panel.SearchEngineFamily,
        Referer = panel.Referer,
        RefererFlag = panel.RefererFlag
    });

Is there a way of doing that?

Many thanks!

@fsateler
Member

You can do a full outer join by using FullGroupJoin:

var grouped = panelDataWithChange.Values.FullGroupJoin(
    scrapingRecords, 
    panel => new {panel.Domain, panel.RefererFlag, panel.Referer},
    scrape => new {scrape.Domain, scrape.RefererFlag, scrape.Referer}
    );

var joined = from g in grouped
             from panel in g.First.DefaultIfEmpty()
             from scrape in g.Second.DefaultIfEmpty()
             select new KeywordAnalysisRecord {
                 Domain = g.Key.Domain,
                 Referer = g.Key.Referer,
                 RefererFlag = g.Key.RefererFlag,
                 Visits = panel?.Visits ?? 0,
                 Change = panel?.Change,
                 DestUrl = scrape?.DestUrl ?? "",
                 Position = scrape?.Position,
                 PreviousMonthVisits = panel?.PreviousMonthVisits,
                 SearchEngineFamily = panel?.SearchEngineFamily ?? scrape?.SearchEngineFamily,
             };

Note the ?. operator: it is needed because either scrape or panel can be null when a key has no match on that side. I've simply used ?? to choose between the preferred options, but you could use more complicated logic if necessary.

@gdoron
Author

gdoron commented Jul 11, 2017

@fsateler
Thanks for the quick answer, Felipe!
I'm curious, is there a (significant?) overhead to using FullGroupJoin when only a FullJoin is needed?

Thanks a lot again!

@gdoron
Author

gdoron commented Jul 11, 2017

Also, just as a suggestion: regardless of whether there's a penalty, it would be really nice to have a FullOuterJoin that takes a result selector operating on individual records rather than on an IEnumerable, even if it's a simple wrapper around FullGroupJoin.

IMHO.
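
For illustration, the kind of helper suggested here could look roughly like the sketch below. It is not MoreLinq code: the name FullOuterJoin, its parameter names and the sample data are all hypothetical, and it uses only System.Linq (a lookup over the second sequence plus yield return) rather than wrapping FullGroupJoin. It assumes the second sequence fits in memory and that the default equality comparer is acceptable for the keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FullOuterJoinSketch
{
    // Hypothetical helper (not part of MoreLinq): a full outer join with
    // per-record result selectors. Argument validation is omitted for brevity.
    public static IEnumerable<TResult> FullOuterJoin<TFirst, TSecond, TKey, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TKey> firstKeySelector,
        Func<TSecond, TKey> secondKeySelector,
        Func<TFirst, TResult> firstOnly,
        Func<TSecond, TResult> secondOnly,
        Func<TFirst, TSecond, TResult> both)
    {
        // Index the second sequence by key; this buffers it in memory.
        var secondLookup = second.ToLookup(secondKeySelector);
        var matchedKeys = new HashSet<TKey>();

        // Stream the first sequence, pairing each record with every match.
        foreach (var f in first)
        {
            var key = firstKeySelector(f);
            if (secondLookup.Contains(key))
            {
                matchedKeys.Add(key);
                foreach (var s in secondLookup[key])
                    yield return both(f, s);
            }
            else
            {
                yield return firstOnly(f);
            }
        }

        // Emit records of the second sequence whose key never matched.
        foreach (var group in secondLookup)
        {
            if (matchedKeys.Contains(group.Key)) continue;
            foreach (var s in group)
                yield return secondOnly(s);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data, only to show the call shape.
        var panel = new[] { (Domain: "a.com", Visits: 10), (Domain: "b.com", Visits: 20) };
        var scrape = new[] { (Domain: "b.com", Position: 1), (Domain: "c.com", Position: 2) };

        var rows = panel.FullOuterJoin(
            scrape,
            p => p.Domain,
            s => s.Domain,
            p => p.Domain + ": visits=" + p.Visits + ", position=-",
            s => s.Domain + ": visits=0, position=" + s.Position,
            (p, s) => p.Domain + ": visits=" + p.Visits + ", position=" + s.Position);

        foreach (var row in rows)
            Console.WriteLine(row);
        // a.com: visits=10, position=-
        // b.com: visits=20, position=1
        // c.com: visits=0, position=2
    }
}
```

Records from the first sequence are streamed in their original order, while unmatched records from the second sequence come out last; that ordering is a choice of this sketch, not a documented guarantee of any library.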

@gdoron
Author

gdoron commented Jul 12, 2017

Wait @fsateler, won't your code using FullGroupJoin, apart from doing the join, also do a distinct based on the key?
That's not exactly how a FullOuterJoin should behave.
(It's a valid result for my use case, as I don't expect to have duplicates, but I guess that's not the case for many others.)
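
One way to check the duplicate question is a small experiment with plain System.Linq. In the sketch below, GroupBothSides is a hypothetical stand-in for the grouped result, assuming it yields one entry per distinct key with First and Second sequences (which is how the answer above uses g.Key, g.First and g.Second); it is not MoreLinq's implementation. Under that assumption, the nested from clauses expand each group into every pairing of its two sides, so records sharing a key are not collapsed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeyDemo
{
    // Hypothetical stand-in for a grouped full join: one entry per distinct key.
    public static IEnumerable<(TKey Key, IEnumerable<TFirst> First, IEnumerable<TSecond> Second)>
        GroupBothSides<TFirst, TSecond, TKey>(
            IEnumerable<TFirst> first,
            IEnumerable<TSecond> second,
            Func<TFirst, TKey> firstKey,
            Func<TSecond, TKey> secondKey)
    {
        var a = first.ToLookup(firstKey);
        var b = second.ToLookup(secondKey);

        // A lookup returns an empty sequence for a key it does not contain.
        foreach (var key in a.Select(g => g.Key).Union(b.Select(g => g.Key)))
            yield return (key, a[key], b[key]);
    }

    // Flattens the groups with the same nested-from pattern as the answer above.
    // The key of each sample string is its first character.
    public static List<string> Flatten(IEnumerable<string> first, IEnumerable<string> second)
    {
        var grouped = GroupBothSides(first, second, s => s[0], s => s[0]);

        var joined = from g in grouped
                     from l in g.First.DefaultIfEmpty()
                     from r in g.Second.DefaultIfEmpty()
                     select g.Key + ": " + (l ?? "-") + "/" + (r ?? "-");

        return joined.ToList();
    }

    public static void Main()
    {
        // "x1" and "x2" share the key 'x' on the first side.
        foreach (var row in Flatten(new[] { "x1", "x2", "y3" }, new[] { "x9", "z7" }))
            Console.WriteLine(row);
        // x: x1/x9
        // x: x2/x9
        // y: y3/-
        // z: -/z7
    }
}
```

Both x1 and x2 appear in the output, so in this sketch the flattening step behaves like a join rather than a distinct; only the intermediate grouping is per key.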

@atifaziz
Member

@gdoron Full outer join is being tracked as #238. An implementation is now proposed in #350 where your feedback and testing would be welcome.
