Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matplotlib revamp (again) with proper offsetting and scaling with aspect ratios and zoom/pan #669

Merged
merged 25 commits into from
Jul 31, 2023

Conversation

iosonofabio
Copy link
Member

@iosonofabio iosonofabio commented May 18, 2023

There have been a few requests centered around the ability to combine data units and pixels/figure units in matplotlib plots.

tl;dr: I thought this would be more messy but once I started playing with matplotlib internals I found them amenable to reason - I think. Good news.

That is not needed in Cairo because it produces static images, therefore the conversion between figure and data units is fixed at the beginning. In matplotlib, however, plots are interactive, therefore the transformation needs callbacks.

This PR aims at extending matplotlib with a subclass of Collection, to be used for edges, that plots the lines/arrows with awareness of the vertex marker bounding boxes, both at the beginning and as they are updated later on.

At the moment, a working stub of the class is there. The logic is correct as verified on my laptop, but it does not work for directed edges (arrows), just undirected (lines). To include directed edges, we currently have separate Artists for the arrow shaft and head, which is not ideal. It would probably be wise to combine those into a single artist and go from there.

Next steps:

  • Figure out which Collection to subclass best
  • Keep in mind the vertex is an arbitrary Patch in general
  • Use sin/cos instead of the rectangular Bbox for the vertex marker
  • Adapt test result images after manually checking they look fine
  • Since we are here, also ensure the offsets for vertex labels are supported and working correctly
  • Bug with loops -> best to move the entire drawing code (curved etc.) into the EdgeCollection, to be triggered at each redraw

Bug/peculiarity note:

  • Changing vertex sizes by accessing PatchCollection.get_paths()[i].vertices directly leads to weird bugs in the edge drawer since the vertex builder has not changed -> edges do not know how big vertices have become. Changing the paths does trigger a "stale" on the PatchCollection artist, hence a redraw. Ideally, we would have a hook callback that then triggers a redraw of the EdgeCollection, but that sounds messy.
  • Keeping edge and drawer builder in sync with the Collection artists can be tricky, but it's now implemented via a callback mechanism.

NOTE: in a previous discussion #665, two solutions were mentioned. This is closer to solution #2, but we try hard to keep it as general as possible to enable later additions (e.g. pie charts in place etc). Pie charts would not work ATM in this PR, but a slight adaptation (Patches instead of straight Paths), which is planned for this PR itself, would make that quite easy to do.

This will still take a few weeks to sort out, as it needs to be done with care.

@iosonofabio
Copy link
Member Author

iosonofabio commented May 29, 2023

This is now working as expected and actually ended up being the dreaded option 3, i.e. extending matplotlib's plotting functionality. Briefly:

  • vertex layout is in data coordinates, but vertex/edge/arrowhead sizes are in figure points (similar to pixels). This means the size of the dots stays the same even though people zoom in.
  • edges start/end at vertex boundary, with some accuracy. It's not perfect (except for circles, in which case it is perfect!) - but good enough for now and a relatively straightforward improvement if users really need it in the future
  • text can be offset properly from the vertex location and this offset is now independent of zoom level, as it should be
  • vertices, edges, and groups are implemented as mpl.collections.Collection subclasses, meaning they should scale well to large graphs. That said, the whole thing needs recomputation for each new zoom, so don't zoom in/out too often for huge graphs ;-)
  • no more fixing aspect ratio to 1, and it now behaves well with any aspect ratio (i.e. circles stay circular, fixing matplotlib aspect ratio of vertices (e.g. circles) #665

Remaining items on the todo list:

  • swapping the correct test figures
  • refactoring bits and pieces of code, I am currently 90% there
  • checking that the doc examples build fine with this update

Overall I'm quite proud of this, it touches pretty deep into matplotlib and it was not trivial to achieve.

@iosonofabio
Copy link
Member Author

I now:

  • fixed a bug with loops and curved edges
  • fixed a bug with saving to PDF
  • simplified the code here and there
  • added a test for the loops/curved edges
  • adapted vertex sizes in the documentation to look decent in the new units
  • adapted visualization.rst in the docs to use the new structure

This is getting quite close to review I think.

@iosonofabio iosonofabio marked this pull request as ready for review June 3, 2023 09:57
@ntamas
Copy link
Member

ntamas commented Jun 4, 2023

Phew, lots of changes here, give me some time to review it :) Is it okay to start the review now or are you still working on it?

@iosonofabio
Copy link
Member Author

iosonofabio commented Jun 4, 2023 via email

class EdgeCollection(PatchCollection):
def __init__(self, *args, **kwargs):
kwargs["match_original"] = True
self._visual_vertices = kwargs.pop("visual_vertices", None)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and equivalent in the other collections, there is another way to do this, i.e. passing the parent GraphArtist straight as an attribute of this class, and grab individual properties when needed

@iosonofabio
Copy link
Member Author

Ping 😉 happy to do some explaining here or via email if you guys prefer, or I can merge and deal with the resulting mayhem afterwards 😉

@ntamas
Copy link
Member

ntamas commented Jul 9, 2023

Finally had some time to go through the PR this weekend. Fixed a typo, but otherwise it looks good to me based on my limited understanding of Matplotlib's internals.

The reference images need to be updated after I've merged main into this branch; I'd appreciate it if you could do it. Other than that, my only concern is that this is essentially a breaking change w.r.t. the usage of vertex_size so I'd rather leave it to 0.11. There will probably be a 0.10.6 soon as I've fixed a memory management issue recently that deserves a release. After 0.10.6 I don't know whether there will be a new patch release for the 0.10.x series or not, but still it would be better to merge this into a dev branch.

So, let me know when the reference images are updated, and if all tests pass, I'll open a dev branch and merge this there.

@ntamas ntamas self-assigned this Jul 9, 2023
@ntamas ntamas added this to the 0.11 milestone Jul 9, 2023
@iosonofabio
Copy link
Member Author

iosonofabio commented Jul 10, 2023

sounds good, thank you. I just switched to a new laptop and half the toolchain is missing :-P so it's going to take me a few days. I'll also try to integrate the avoiding loops if not too messy.

edit: and since this is for 0.11, I will also return the GraphArtist instead of the ax in the function.

@iosonofabio
Copy link
Member Author

I pushed a version of the loops logic in 97f120c, here's how it looks like:

loops_example

It uses 3-quarter circles for wedges > 90 degrees and cubic Bezier for smaller wedges. Probably a little overkill, but looks decent to me. ATM the loop size scales with vertex size, do you guys think that's ok? Alternatively, I could:

  • set a fixed loop size, or
  • scale the loop size with the overall graph layout dimensions, or
  • add a parameter to igraph.plot called e.g. edge_loop_size defaulting to 30 or so. Then we would need to document it.

Any preferences?

@szhorvat
Copy link
Member

Looks nice, but if these are directed graphs, the loops should have arrowheads as well.

@iosonofabio
Copy link
Member Author

I thought about it, but making the curved arc/Bezier look good despite a potentially big arrowhead is tricky:

  • if your derivative is flush with the arrowhead direction at the head base, the loop can look crooked
  • if you try to keep the shape somewhat circular, the arrow will not point towards the vertex center anymore

python-igraph-loops-issues

Since we don't support "partially directed" graphs, I thought this makes for a cleaner design without any ambiguity.

In terms of scaling, I'm leaning towards making an option with default around 20-30, that way users can tweak it if they need very specific situations.

@szhorvat
Copy link
Member

szhorvat commented Jul 12, 2023

ATM the loop size scales with vertex size, do you guys think that's ok?

I think that is a very good default. When the vertex size increases, the loop size must increase for loops to be visible.

It would be nice for loop sizes to be adjustable independently for each loop separately, but that's a nice to have, not a must. It has some applications, for example visualizing edge weights by thickness. It's nicer (though not a must) for very thick loops to be larger. This would e.g. improve Fig 1b here.

If you do implement per-loop size adjustment, it's still a question what the value should represent from the choices you listed (size relative to individual vertex, to figure size, or fixed unit). The applications I can think of would benefit from making this independent of vertex size. But when sizing is left to be computed automatically, it's really nice to make it scale with vertex size ... I don't have answer for you.

Regarding arrowheads

I have seen visualizations where arrows did not have arrows in directed graphs, but very rarely. In my experience it is much more common to add the arrows. I think that there should be a way to add the arrows even if you decide not to do it by default. Many people will insist that their publication figures should have arrows on the loops. This includes myself: I want those arrows for publication figures. For exploratory analysis of course it does not matter.

  • if you try to keep the shape somewhat circular

Personally I don't like the circular loops, but this is subjective, so I did not comment. With arrows, it's nice to have Bezier curves as the arrowhead should point more towards the vertex.

What I don't understand in your example images is why the loops with arrows are not symmetric. Is this a restriction of matplotlib? This is what it looks like with Mathematica:

image image

An interesting feature of Mathematica that makes this feasbile is arrowhead offsetting. The Bezier curves are defined starting at the centerpoint of each vertex, but arrowheads are offset by the vertex radius. I don't know if this is easy with matplotlib. Even if it is, it's not a full solution as vertex shapes may be different from circular. And then it gets really complicated.

image image

I don't think we can afford to make all this possible (??) but it's good to be aware of how far it's possible to go in principle ...

@iosonofabio
Copy link
Member Author

If everyone wants it, we should add them 😅

In your second, third and fourth figure look at the bottom left arrow: clearly not pointing towards the center of the vertex. Perhaps people will not be bothered by that?

For the scaling with vertex, what about we keep a special value, say the string "vertex", to mean scaling with vertex size?

@iosonofabio
Copy link
Member Author

I added arrows to directed loops in 7131d2e, result below:

loops_example

What do you think?

I tried to add the "vertex" special argument, but those attributes are type-checked so that failed. I therefore implemented an alternative logic: when edge_loop_size is negative, it means that number times the size of the corresponding vertex (e.g. edge_loop_size=-2 means the loop has twice the size than the corresponding vertex. Does that seem reasonable or just a mess?

@ntamas
Copy link
Member

ntamas commented Jul 13, 2023

Yay @iosonofabio , this looks awesome!

@iosonofabio
Copy link
Member Author

Thanks, I think we are slowly getting there. @ntamas do you think the negative number "magic" is ok?

@szhorvat
Copy link
Member

szhorvat commented Jul 13, 2023

What do you think?

I really like this version.

when edge_loop_size is negative, it means that number times the size of the corresponding vertex (e.g. edge_loop_size=-2 means the loop has twice the size than the corresponding vertex. Does that seem reasonable or just a mess?

No comment from me on the interface, but I think it's really great that both absolute and relative sizing are possible.

EDIT:

Just for comparison, whenever a size or coordinate is needed, in Mathematica I can write:

  • 0.23 (a number) for the default interpretation, usually plot coordinates;
  • Scaled[0.23] for relative sizing/coordinates, though what it's relative to may vary. Usually the plot size.
  • Offset[23] is basically 23 printer's points (or pixels) regardless of the plot size.

Is there something similar in matplotlib? Would it be Pythonic to define a wrapper class for integers? ig.Scaled(2) is twice the size while 2 is just two units? Since I rarely use Python, I can't say. It's just an idea.

@szhorvat
Copy link
Member

This is definitely not the subject of this PR, but after looking at too many network plots today, I wanted to make two suggestions:

IMO these should be low priority, but they're worth considering.

@ntamas
Copy link
Member

ntamas commented Jul 14, 2023

do you think the negative number "magic" is ok?

Not quite happy with it; I like the idea of a Scaled() wrapper that @szhorvat proposed better, but I don't know how well that fits into the existing codebase. If it's too complicated, let's just roll with the negative number hack.

@iosonofabio
Copy link
Member Author

I like the custom units, but matplotlib implements transformations in quite a different way so I'm thinking of a way to do it that does not involve rewriting half of mpl's code.

The current customization of container artists and subclassing collections is already out there in terms of patching basic functionality.

@iosonofabio
Copy link
Member Author

Hi guys, I've been thinking about this. At this point in time, I am swamped by other things so I don't have enough bandwidth to flesh out a new system of coordinates within igraph plotting. I guess option B (keep negative numbers for now) is the best I can do for now.

If any of you wants to take over I don't mind, otherwise I'll try to polish the few remaining things so tests pass and then we can merge into dev (?)

@iosonofabio
Copy link
Member Author

Ok, I merged again from main and made sure tests pass locally. I'll make some fixes if tests don't pass remotely.

I'd say it's time to merge this, because I won't have contiguous time in the coming months and might end up forgetting the logic before it's merged.

@ntamas: develop is currently 160 commits behind main, so not sure what you would like to do. Easiest would be to ffw develop into main and then merge this PR into that branch?

@ntamas
Copy link
Member

ntamas commented Jul 30, 2023

Fast-forwarded develop. I'll rebase this on develop tomorrow and then do the merge. Thanks for your work on this!

@iosonofabio
Copy link
Member Author

Awesome. I'll reach out to the matplotlib folks to see if the coordinate systems we mentioned might be something they are interested in from their end.

Thank you

@ntamas
Copy link
Member

ntamas commented Jul 31, 2023

@iosonofabio I'm going to merge this but can you also summarize the changes in the CHANGELOG.md file of the develop branch? I believe you know better what exactly was done.

@ntamas ntamas merged commit 9a65ad9 into igraph:main Jul 31, 2023
@iosonofabio
Copy link
Member Author

did this go into develop or main?

@ntamas
Copy link
Member

ntamas commented Jul 31, 2023

oooh dammit! thanks! I'll fix this

@iosonofabio
Copy link
Member Author

git commit --amend --force --force --forget --forgive --didntmean --instead --please --use --branch develop

@ntamas
Copy link
Member

ntamas commented Jul 31, 2023

Okay, mischief managed.

@iosonofabio
Copy link
Member Author

I edited the CHANGELOG. It's kind of a mess to list everything, feels like writing a will. Does it look reasonable?

@ntamas
Copy link
Member

ntamas commented Jul 31, 2023

Yes, seems good, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants