Implement lightweight, centralized object storage #1021

hannobraun · 2022-08-31T09:29:08Z

Not too long ago, we used to have the Shape data structure which represented a single shape. It stored the shape's objects and provided a whole lot of infrastructure to manage them. Shape was overly complex, and was removed after simpler ways to do the things it did were found (#697). Since then, we no longer have any form of centralized storage of objects.

Even back then, I suspected that we'd need to grow back some form of centralized object storage eventually. Although much lighter-weight, and without the problematic parts that Shape had. I've been thinking about this in the back of my mind, and this issue is the result of that.

Why do we need centralized object storage?

So why do we need it? Centralized object storage gives us one thing we can't get any other way: We'll be able to refer to existing objects using a handle, which is kind of a reference to the object in the store.

But how is that an advantage? There are two ways that I can see:

It's lighter-weight. In principle, objects reference other objects. In practice, right now, it's not a reference. Instead, the referencing object owns a copy of the referenced object, resulting in much duplication.
It allows to distinguish between object identity and object equality. This allows for more stringent data structures that could prevent some bugs.

Advantage 1. doesn't really matter right now. Yes, it would be nice to slim down the data structures somewhat (I don't have numbers, but I suspect that they are many times bigger than they'd need to be), but that doesn't cause any problems right now. Long-term, it might make a difference in performance.

Advantage 2. is a bit more substantial, in that it gives us back a capability that we lost when removing Shape. Back then, I said that object identity wasn't as important as I thought it would be, but I'm no longer sure. Accidentally creating an object that is identical to an already existing one, when you should just use the already existing one instead, might be a sign of sloppy code that could cause other problems.

Although, to be honest, I can't point to evidence of that right now. This is just an impression that has grown over time, as I'm working on the kernel code. But in any case, centralized object storage gives us the option of being strict about object identity, while still allowing for escape valves where necessary (in the form of a method like store.get_handle_for_or_insert(object)).

How is that going to be different from `Shape`?

How can we make sure not to repeat the same mistakes though? Here's where I see the differences between the new object store and Shape:

It's going to be a global store for all shapes, instead of a per-shape store. This avoids most of the redundancy that was inherent to Shape.
Most of Shape's problematic use cases have already been solved in other ways. It simply won't be necessary to rebuild them in the new API.
The new API will strictly be an append-only data structure.
It will use a lighter-weight handle type. Since there will be no mutation of objects, the handle type can just implement Deref, which will make it convenient to use.

Conclusion

Centralized object storage will allow us to de-duplicate the currently highly redundant data structures, likely reducing their size by a significant factor. It also gives us the option for being more strict about object identity, without forcing us into anything.

Neither of these solve any problems that are particularly pressing right now, making this issue a low priority. Other than that, it seems like a clear advantage though, so I'm sure it will happen eventually.

Progress

Work on this issue has started. Centralized object storage exists, but not all objects are being st. The following list reflects the current state of the implementation:

The text was updated successfully, but these errors were encountered:

hannobraun · 2022-09-13T14:46:53Z

This issue just entered the critical path towards #42 (via #1079 and #993). Adding to the active milestone and assigning myself.

@dwcarr: I know you expressed interest in working on this, and I don't mean to step on your toes. But considering that this issue is now blocking my main priority (#42), and that I'm probably in the best position to address it in a timely manner (Fornjot being my full-time job and all), I'm taking it. Please let me know, if you want to collaborate in any way.

dwcarr · 2022-09-15T13:37:52Z

@dwcarr: I know you expressed interest in working on this, and I don't mean to step on your toes.

No worries. I am having a down week (brain fog!!!), so don't wait on me. My thought on this are that we should have a centralized data store, along with a state manager that is the sole place where the data can be modified. This would make the future implementation of sketch constraints more straightforward as well, as constraints would be implemented as callbacks on data changes from within that state manager.

hannobraun · 2022-09-16T07:48:53Z

Thank you for the feedback!

I'll have to think about that state manager some more. Right now, I don't see how it fits into what we have, since objects are immutable and therefore don't change, and I meant to carry that forward by making the store append-only. (The old Shape storage was mutable, as I implied but didn't spell out in the issue description. That didn't work so well, in my opinion.)

This whole model kind of follows naturally from the code-based approach (for now), as the model is a program that needs to be evaluated when it changes, and from each evaluation you get a new model. I can see reasons why that wouldn't work well anymore going forward, once models become complex enough so re-evaluating them all at once becomes impractical. I have that filed away in my head under "not today's problem", like so many other things 😄

dwcarr · 2022-09-16T14:28:04Z

Right, I keep getting tripped up by the code-based approach and how that affects the design workflow. I do see a better way now, I think, though I'm still having trouble with how "removal" operations, boolean or otherwise, can work with an append-only data store. My fog is slowly lifting, so this stuff will get clearer for me.

hannobraun · 2022-09-16T14:54:35Z

Before I continue, let me emphasize that I don't want to come across as dictating how things must be done. As I wrote elsewhere, there is a limit to how far I can plan ahead without sabotaging progress on what's needed right now. And given the complexity of the subject matter (and my own limits of knowledge and experience), this planning horizon doesn't go very far.

So if I express an opinion about something, it should be understood within that context. I might strongly believe that something is the right choice right now, but I never expect anything to be the "final" design.

Having said that, let me try to address your confusion:

I do see a better way now, I think, though I'm still having trouble with how "removal" operations, boolean or otherwise, can work with an append-only data store. My fog is slowly lifting, so this stuff will get clearer for me.

To modify an object, you add a new version of it to the store, as well as new versions of all objects that reference it (and all objects that reference them, recursively). Basically, it's a copy-on-write situation.

The objects in the store form a graph, and a shape is just a pointer into that graph.

LegNeato · 2022-09-19T12:52:40Z

You might want to look into https://github.com/TimelyDataflow/differential-dataflow

hannobraun · 2022-09-19T13:12:14Z

@LegNeato Wow, that looks really interesting! I can't claim to understand what's going on there, but it's definitely worth a close look, once we have the need to incrementally update our data. I've added it to our list of useful resources.

hannobraun · 2022-09-19T13:32:07Z

I've implemented the infrastructure required for centralized object storage in #1108, and also added a store for GlobalCurves in that pull request. Since that should be enough to un-block #1079, I'm unassigning myself from this issue, and will remove it from the milestone I'm currently working on.

I've updated the issue description with a list that displays the current implementation status. We can continue work on this at our leisure.

hannobraun · 2022-09-30T12:16:40Z

My main priority (#42) is currently blocked by #1162. To help address this issue, I'm going to expand the scope of centralized object storage, by adding HalfEdge and all objects it references (directly or indirectly).

Assigning myself and adding to current milestone.

hannobraun · 2022-09-30T13:41:44Z

I've integrated Surface in #1163 and updated the list in the issue description.

So far surface equality is still used in validation code. Surface identity should be used instead, as that would make the validation stricter and prevent possibly bugs. I have that change in a local branch, but I'm getting test failures. I will look into that next week.

hannobraun · 2022-10-06T15:02:52Z

All validation code has been updated to rely on surface identity instead of equality. Moving on to integrating Curve into the centralized object storage now.

hannobraun · 2022-10-06T15:53:58Z

And Curves are done (#1176)! After spending days getting Surfaces fully integrated (including the stricter validation code), I expected more of a battle 😂

hannobraun · 2022-10-19T13:07:20Z

GlobalVertex and SurfaceVertex have been added to the centralized object storage. All validation code has been enhanced to make use of this, making it much stricter by making any comparisons of these objects using object identity instead of object equality.

I will now continue with adding Vertex or GlobalEdge to the centralized object storage.

hannobraun · 2022-10-20T15:21:51Z

Vertex, GlobalEdge, and HalfEdge are now fully integrated into the centralized object storage. Those went pretty quick! In the case of Vertex and HalfEdge, because those don't tend to get shared much between other objects, giving little opportunity for object duplication issues to spring up. In the case of GlobalEdge, I suspect it's rather the lack of validation code that could detect such issues (#1129).

This concludes my current round of work on this issue. We now have enough objects in storage for me to continue with my plan to address #1162.

This was some really interesting work. Putting all those different objects into centralized storage allowed the validation code that worked with those objects to become much more strict. As a result, I've fixed many potential and some confirmed bugs. This also made any code that's constructing objects much more complicated. I invented a whole new API to make it possible at all, but I have yet to figure out how to express these new concepts in a way, that doesn't turn a whole lot of object construction code into a huge mess.

I think this is fine for now. Despite being really messy and hard to understand, that code is not fragile. It's supported by the object validation code that uncovered all those issues in the first place. It will be interesting to see how things progress from here. I'm confident that it's possible to express that code in a much cleaner way, but the high-level concepts to do so have yet to be discovered.

hannobraun added type: development Work to ease development or maintenance, without direct effect on features or bugs topic: core Issues relating to core geometry, operations, algorithms labels Aug 31, 2022

hannobraun mentioned this issue Sep 13, 2022

Consider revamping GlobalCurve #1079

Closed

6 tasks

hannobraun added this to the Straight edges, flat faces, simple sketches, full CSG milestone Sep 13, 2022

hannobraun self-assigned this Sep 13, 2022

hannobraun mentioned this issue Sep 16, 2022

Un-derive Copy from various object types #1097

Merged

hannobraun mentioned this issue Sep 19, 2022

Implement centralized object storage #1108

Merged

hannobraun removed their assignment Sep 19, 2022

hannobraun removed this from the Straight edges, flat faces, simple sketches, full CSG milestone Sep 19, 2022

This was referenced Sep 23, 2022

Convert edge builder API into partial edge API #1135

Merged

Sweep code is generating duplicate GlobalCurves #1162

Closed

hannobraun self-assigned this Sep 30, 2022

hannobraun added this to the Straight edges, flat faces, simple sketches, full CSG milestone Sep 30, 2022

hannobraun mentioned this issue Sep 30, 2022

Integrate Surface into centralized storage #1163

Merged

hannobraun mentioned this issue Oct 6, 2022

Integrate Curve into centralized object storage #1176

Merged

This was referenced Oct 14, 2022

Fix another object duplication issue #1222

Merged

Fix last known SurfaceVertex duplication issue #1233

Merged

Fix all known GlobalVertex duplication issues #1238

Merged

This was referenced Oct 20, 2022

Integrate Vertex into centralized object storage #1246

Merged

Integrate GlobalEdge into centralized object storage #1247

Merged

Integrate HalfEdge into centralized object storage #1248

Merged

hannobraun removed their assignment Oct 20, 2022

hannobraun closed this as completed in #1255 Oct 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement lightweight, centralized object storage #1021

Implement lightweight, centralized object storage #1021

hannobraun commented Aug 31, 2022 •

edited

Loading

hannobraun commented Sep 13, 2022

dwcarr commented Sep 15, 2022

hannobraun commented Sep 16, 2022

dwcarr commented Sep 16, 2022

hannobraun commented Sep 16, 2022

LegNeato commented Sep 19, 2022

hannobraun commented Sep 19, 2022

hannobraun commented Sep 19, 2022

hannobraun commented Sep 30, 2022

hannobraun commented Sep 30, 2022

hannobraun commented Oct 6, 2022

hannobraun commented Oct 6, 2022

hannobraun commented Oct 19, 2022

hannobraun commented Oct 20, 2022

Implement lightweight, centralized object storage #1021

Implement lightweight, centralized object storage #1021

Comments

hannobraun commented Aug 31, 2022 • edited Loading

Why do we need centralized object storage?

How is that going to be different from Shape?

Conclusion

Progress

hannobraun commented Sep 13, 2022

dwcarr commented Sep 15, 2022

hannobraun commented Sep 16, 2022

dwcarr commented Sep 16, 2022

hannobraun commented Sep 16, 2022

LegNeato commented Sep 19, 2022

hannobraun commented Sep 19, 2022

hannobraun commented Sep 19, 2022

hannobraun commented Sep 30, 2022

hannobraun commented Sep 30, 2022

hannobraun commented Oct 6, 2022

hannobraun commented Oct 6, 2022

hannobraun commented Oct 19, 2022

hannobraun commented Oct 20, 2022

hannobraun commented Aug 31, 2022 •

edited

Loading

How is that going to be different from `Shape`?