[ENH] doc whether renumber preserves edge order #922

lmeyerov · 2020-06-01T21:21:14Z

Describe the solution you'd like
Docs clarify whether renumbering preserves edge orders.

(Ideally they do, because that makes it simpler to plug in different edge weights afterwards without manually swizzling.)

Additional context
Encountered in various renumber calls in 0.13, e.g.:
https://docs.rapids.ai/api/cugraph/stable/api.html?highlight=renumber#cugraph.structure.renumber.renumber_from_cudf

ChuckHastings · 2020-06-22T13:40:52Z

FYI - it does preserve the ordering.

I am updating this code in 0.15 and will add this to the documentation.

ChuckHastings · 2020-07-16T21:24:03Z

Let me change my answer.

In version 0.14 and prior, single-column integer renumbering preserved ordering. Single-column non-integer and multi-column renumbering did not guarantee that the order was preserved.

The new renumbering in 0.15 does not guarantee that the ordering of the input is preserved. This is documented in the new Python API methods.

If ordering matters, you can add an extra column to the input DataFrame that numbers the rows. The output of the 0.15 renumbering is always a DataFrame, and it preserves the columns of the original DataFrame that aren't part of the source or destination definitions. You can then use sort_values on the extra column you added to reconstruct the original order.
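A minimal sketch of that workaround, for illustration only. It uses pandas as a CPU stand-in for cudf (the `sort_values` call has the same shape in both), and `fake_renumber` is a hypothetical placeholder that just remaps vertex IDs and reverses the rows to mimic a renumbering that does not keep input order. It is not the cugraph API; the `row_id` column name is also an assumption.

```python
import pandas as pd


def fake_renumber(df, src_col, dst_col):
    """Hypothetical stand-in for a renumbering step (NOT the cugraph API).

    Maps vertex IDs to 0..n-1, keeps all other columns, and returns the
    rows in a different order than the input.
    """
    vertices = pd.unique(pd.concat([df[src_col], df[dst_col]], ignore_index=True))
    ids = {v: i for i, v in enumerate(sorted(vertices))}
    out = df.copy()
    out[src_col] = out[src_col].map(ids)
    out[dst_col] = out[dst_col].map(ids)
    # Reverse the rows to mimic an implementation that does not preserve order.
    return out.iloc[::-1].reset_index(drop=True)


edges = pd.DataFrame(
    {"src": ["c", "a", "b"], "dst": ["a", "b", "c"], "weight": [0.3, 0.1, 0.2]}
)

# 1. Tag every input row with its position before renumbering.
edges["row_id"] = range(len(edges))

# 2. Renumber; the extra column rides along untouched.
renumbered = fake_renumber(edges, "src", "dst")

# 3. Sort on the tag to recover the original edge order.
restored = renumbered.sort_values("row_id").reset_index(drop=True)
```

After step 3, `restored` lines up row for row with the original edge list, so a separate column of edge weights in the original order can be attached directly.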

I considered making that part of the implementation, but sorting is expensive and in most cases unnecessary.

lmeyerov · 2020-07-16T21:51:54Z

Maybe make sorted=False a parameter?

cc @kkraus14 as you felt the pain of tickets stemming from similar design decisions here in cudf

My reasoning:

A lot of usage will be sorted, so make the typical case easy: have df, use cugraph to enrich, get back df, continue
Making it an arg makes it way easier to optimize later without breaking user code
I'd prefer a default of sorted=True: let people go faster via opt-in, but be safe by default, as what's annoying-level for us is horrible for others. But I'm not spending my request points on this :)
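To make the proposal concrete, here is a rough sketch of what an opt-out ordering flag could look like from the caller's side. Everything in it is hypothetical: `renumber_with_order`, the `preserve_order` keyword (standing in for the suggested `sorted`, which would shadow the Python builtin), the `_row_id` helper column, and `shuffle_rows` as a stand-in renumbering step. It uses pandas in place of cudf and is not how cugraph implements anything.

```python
import pandas as pd


def renumber_with_order(df, renumber_fn, preserve_order=True):
    """Hypothetical wrapper: run renumber_fn and optionally restore row order.

    preserve_order=True is the safe default; callers who do not care about
    ordering can pass False to skip the extra sort.
    """
    if not preserve_order:
        return renumber_fn(df)
    tagged = df.assign(_row_id=range(len(df)))  # remember input positions
    out = renumber_fn(tagged)
    return (
        out.sort_values("_row_id")
        .drop(columns="_row_id")
        .reset_index(drop=True)
    )


def shuffle_rows(df):
    # Stand-in for a renumbering step that returns rows in a different order.
    return df.iloc[::-1].reset_index(drop=True)


edges = pd.DataFrame({"src": [2, 0, 1], "dst": [0, 1, 2]})

fast = renumber_with_order(edges, shuffle_rows, preserve_order=False)  # order not guaranteed
safe = renumber_with_order(edges, shuffle_rows)  # input order restored
```

With the flag exposed, the sort cost is paid only by callers who need the input order back, and the default can be tuned later without changing call sites.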

ChuckHastings · 2020-07-16T22:01:20Z

Makes sense. I will evaluate the mechanics of doing this and update the PR accordingly.

lmeyerov added the ? - Needs Triage Need team to review and classify label Jun 1, 2020

BradReesWork self-assigned this Jun 16, 2020

BradReesWork added doc Documentation and removed doc Documentation labels Jun 16, 2020

BradReesWork added this to the 0.15 milestone Jun 16, 2020

BradReesWork added doc Documentation and removed ? - Needs Triage Need team to review and classify labels Jun 16, 2020

ChuckHastings mentioned this issue Jun 22, 2020

[REVIEW] Renumbering refactor, add multi GPU support #963

Merged

BradReesWork closed this as completed in #963 Jul 23, 2020
