Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index Actor Events by ID #11594

Open
Stebalien opened this issue Jan 23, 2024 · 9 comments
Open

Index Actor Events by ID #11594

Stebalien opened this issue Jan 23, 2024 · 9 comments

Comments

@Stebalien
Copy link
Member

Currently, we index actor events by the emitter's f4 address. This will have to change in #11540 to accommodate non-EVM actors, but really, we simply shouldn't be resolving addresses when we index events.

Instead:

  1. We should record the ID in the index (resolve nothing). The ActorID plus the tipset is the "canonical" identifier for the actor.
  2. When querying for events, resolve addresses to IDs using the latest tipset, then query. As long as we limit ourselves to querying for non-reverted events, there shouldn't be any issue of ID reuse.
  3. When subscribing, we'll have to resolve the filter's addresses to IDs at every tipset.

Unlike the current system, this will work correctly with every address type, will result in more performant indexing, and will reduce the size of the database.

Unfortunately, the correct way to do this involves a migration.

@masih
Copy link
Member

masih commented Mar 6, 2024

@rvagg if you haven't started working on this, I might as well take this one too considering I am working on #11640 ?

@rvagg
Copy link
Member

rvagg commented Mar 7, 2024

@masih I had a look at my code and it wasn't quite as terrible as I imagined. I've tidied it up a little and made it all pass with the latest v1.26 @ #11694. It's just a draft and shouldn't necessarily be the way this ticket gets done. I have some other things to get done in the meantime so I've removed myself from assignee here but could come back to this if you don't happen to get to it in the course of what you're working on. I started looking at the migration problem but didn't get too far. I really want good tests for migrations and there are none in there yet.

@masih
Copy link
Member

masih commented Mar 15, 2024

@Stebalien Question: What is the correct thing to do when migrating the reverted events emitter_addr? We can't resolve those address to actor IDs right? Which means some default value for emitter, e.g. 0 for events that cannot be migrated?

@Stebalien
Copy link
Member Author

We can't resolve those address to actor IDs right?

  1. Reverted events exist in reverted tipsets. If we still have the tipset, we should be able to resolve the addresses relative to those tipsets.
  2. It might just be faster to re-index from the events in the receipts.

But, honestly, I'd just delete the reverted events from the database in this case. I'm not quite sure why we record them anyways given that, unless I'm mistaken, we only return reverted events from the "subscribe" API. I guess we should add "remove reverted events from the database" to the TODO list...

@masih
Copy link
Member

masih commented Mar 19, 2024

remove reverted events from the database

Since we are iterating over the events as part of the migration for this issue anyway, I can add reverted event removal right there. It should be straightforward

@masih
Copy link
Member

masih commented Mar 19, 2024

On second thought, the event collection also needs to be updated to remove storing reverted events. This is a big-ish enough work to make a reasonable size PR on its own. Let's do that in a separate PR.

@jennijuju
Copy link
Member

@aarshkshah1992 was this address in the new chainindexer api by any chance?

@rvagg
Copy link
Member

rvagg commented Oct 8, 2024

Having some discussion in https://github.com/filecoin-project/lotus/pull/12421/files#r1756198386, but because GitHub's linking into inline comments in PRs is kind of impermanent and often doesn't take you to a discussion, here's a screenshot:

Screenshot 2024-10-08 at 1 01 15 PM

I would like a resolution on this one and am concerned about going forward without it in ChainIndexer if we believe this is the right way ahead. I'm just not sure now that it is the right thing to do.

@rvagg
Copy link
Member

rvagg commented Oct 14, 2024

Being closed finally in #12421 by storing both the actor id and the DelegatedAddress if there is one and then querying as such.

@masih masih removed their assignment Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 🔎 Awaiting review
Development

Successfully merging a pull request may close this issue.

5 participants