Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Query API #2272

Closed
holic opened this issue Feb 16, 2024 · 2 comments
Closed

Proposal: Query API #2272

holic opened this issue Feb 16, 2024 · 2 comments
Assignees

Comments

@holic
Copy link
Member

holic commented Feb 16, 2024

We're rethinking how to query for MUD data both in the client and from the indexer. It's designed in such a way to support incremental/streaming updates in an efficient way, so that we can offer a global automatic indexer with filtering and streaming.

Goals

  • expressive queries
    • filtering of keys/values, using operators like >, >=, in, etc.
    • joins across tables (relationships)
  • efficient streaming/incremental updates given above query constraints
  • can support existing use cases (table+key filters, RECS queries, etc.)

Reference

For all examples, let's assume a MUD config that looks like:

const tables = {
  Position: {
    keySchema: {
      player: "address"
    },
    valueSchema: {
      x: "int32",
      y: "int32"
    }
  },
  Health: {
    keySchema: {
      player: "address"
    },
    valueSchema: {
      health: "uint256"
    }
  },
  Inventory: {
    keySchema: {
      player: "address",
      item: "uint8"
    },
    valueSchema: {
      amount: "uint32"
    }
  },
  Score: {
    keySchema: {
      player: "address",
      match: "uint256"
    },
    valueSchema: {
      score: "uint256"
    }
  },
  Winner: {
    keySchema: {
      match: "uint256"
    },
    valueSchema: {
      player: "address"
    }
  },
  Terrain: {
    keySchema: {
      x: "int32",
      y: "int32"
    },
    valueSchema: {
      type: "uint8"
    }
  }
}

Proposal

I'm assuming the folks reading this understand TypeScript enough to internalize the constraints of the query API. It's designed to be JSON serializable so is a little more verbose than what might be exposed in client libs. It'll be used in a few different contexts with tooling on top: vanilla JS, React hooks, SSE API.

type Query = {
  from: TableSubject[];
  except?: TableSubject[];
  where?: QueryCondition[];
  records?: TableSubject[];
};

type TableSubject = {
  tableId: Hex;
  subject: string[];
};

type QueryCondition = ComparisonCondition | InCondition;

type ComparisonCondition = {
  left: TableField,
  op: '<' | '<=' | '=' | '>' | '>=' | `!=`,
  right: TableField | LiteralValue,
};

type InCondition = {
  left: TableField,
  op: 'in',
  right: LiteralValue[],
}

type TableField = {
  tableId: Hex;
  field: string;
};

type LiteralValue = string | number | boolean;

Subjects

The query API spec above refers to a "subject" in several places. You can think of a subject as a field (or fields) on which a relationship (or relationships) is based. Using SQL to demonstrate:

SELECT Position.player
FROM Position
JOIN Inventory ON Inventory.player = Position.player

The subject in this case is player, or rather the tuple (player), for both Position and Inventory tables. Subjects can span multiple fields and all subjects in a query must share the same underlying tuple of ABI types.

A query returns a list of matching subject tuples for all tables in from and all conditions in where.

Any field of any table in from can be used as a query condition in where, not just subject fields. And you can use except to exclude subjects that exist in a given table (like RECS' Not operator).

Getting records

By default, queries only return the list of matched subject tuples. To return the records for a table(s), you can specify the corresponding table subjects in records. This is a separate field because we intend for loading records to be a separate operation internally. This means you also return records for tables that were not part of the query, as long as there's a suitable subject for each table that matches the query subject(s).

Examples

Please note that the examples below are pseudocode using the query API above. I'm also using table names instead of hex table IDs for readability.

We're planning ship this with a strongly-typed query client for better UX and to simplify writing queries, but want to keep the discussion focused on the unlocks + constraints of the API design itself before we flesh out the query client, async + streaming interfaces, etc.

  1. Give me all players with a position.

    // in RECS today
    runQuery([Has(Position)]);
    // new query API
    const query: Query = {
      from: [
        { tableId: 'Position', subject: ['player'] },
      ],
    };
  2. Give me all players at position (3, 5).

    // in RECS today
    runQuery([HasValue(Position, { x: 3, y: 5 })]);
    // new query API
    const query: Query = {
      from: [
        { tableId: 'Position', subject: ['player'] },
      ],
      where: [
        { left: { tableId: 'Position', field: 'x' }, op: '=', right: 3 },
        { left: { tableId: 'Position', field: 'y' }, op: '=', right: 5 },
      ],
    };
  3. Give me all players within the bounds of (-5, -5) and (5, 5).

    // not possible in RECS today
    // need to fetch a bigger data set and manually filter
    const entities = runQuery([Has(Position)]);
    entities.filter((entity) => {
      const position = getComponentValue(Position, entity);
      return position.x >= -5 && position.x <= 5 && position.y >= -5 && position.y <= 5;
    });
    // new query API
    const query: Query = {
      from: [
        { tableId: 'Position', subject: ['player'] },
      ],
      where: [
        { left: { tableId: 'Position', field: 'x' }, op: '>=', right: -5 },
        { left: { tableId: 'Position', field: 'x' }, op: '<=', right: 5 },
        { left: { tableId: 'Position', field: 'y' }, op: '>=', right: -5 },
        { left: { tableId: 'Position', field: 'y' }, op: '<=', right: 5 },
      ],
    };
  4. Give me the position of all players that are still alive.

    // in RECS today with manually mapping to values
    const entities = runQuery([Has(Position), Has(Health), NotValue(Health, 0)]);
    entities.map((entity) => getComponentValue(Position, entity));
    // new query API
    const query: Query = {
      from: [
        { tableId: 'Position', subject: ['player'] },
        { tableId: 'Health', subject: ['player'] },
      ],
      where: [
        { left: { tableId: 'Health', field: 'health' }, op: '!=', right: 0 },
      ],
      records: [
        { tableId: 'Position', subject: ['player'] },
      },
    };
  5. Give me the inventory of all players with a score above 50.

    // not possible in RECS today due to composite keys
    // new query API
    const query: Query = {
      from: [
        { tableId: 'Inventory', subject: ['player'] },
        { tableId: 'Score', subject: ['player'] },
      ],
      where: [
        { left: { tableId: 'Score', field: 'score' }, op: '>', right: 50 },
      ],
      records: [
        { tableId: 'Inventory', subject: ['player'] },
      },
    };
  6. Give me all players standing on a grass tile.

    // not possible in RECS today
    // new query API
    // this is done with two queries, because all subjects in a query must match
    // and there's no common subject between the tables we're querying and filtering on
    const players = query({
        from: [
          { tableId: 'Position', subject: ['player'] },
        ],
      })
      // queries return subject tuples, so we map it to just the player addresses to use below
      .map(([player]) => player);
    
    const query: Query = {
      from: {
        { tableId: 'Position', subject: ['x', 'y'] },
        { tableId: 'Terrain', subject: ['x', 'y'] },
      },
      where: [
        { left: { tableId: 'Position', field: 'player'}, op: 'in', right: players },
        { left: { tableId: 'Terrain', field: 'type'}, op: '=', right: 'grass' },
      ],
      records: [
        { tableId: 'Position', subject: ['x', 'y'] },
      ],
    };

Notes

  • This is a step towards MUD client apps (or sections of) being able to query just for the data it needs when it needs it, rather than having to keep the entire world's state in memory.
  • The query API is designed to be easily understood by folks who work in relational databases (and maps well to our table and record abstractions), while still being flexible to support ECS-style queries in an efficient way. We'll build ECS tooling on top to abstract away the relational-centered bits.
  • Queries are intentionally designed to not "select from" a particular table. Instead, all "joined" tables are defined together. This lets us internally optimize these queries to start from the smallest tables first and filter the result set from there. This is only possible because everything is assumed to be ANDed together.
  • The query constraints defined by the API are optimized for highly efficient incremental updates. We'll write more about this in the future, but we evaluated many different kinds of queries, their complexity in both initial query and incremental updates, and found this API to be a good balance of expressivity and fast/efficient streaming.
  • For more complex queries, we expect queries to be composed and piped together (as shown in the last example). In the future, we may provide APIs to define multiple queries that efficiently do this pipelining for you.
@holic
Copy link
Member Author

holic commented Mar 1, 2024

Should we constrain subjects to just static/fixed length field types?

@holic
Copy link
Member Author

holic commented Mar 14, 2024

The spec in this proposal now lives in https://github.com/latticexyz/mud/blob/main/packages/query/src/api.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

1 participant