Proposal: Query API #2272

holic · 2024-02-16T13:03:58Z

We're rethinking how to query for MUD data both in the client and from the indexer. It's designed in such a way to support incremental/streaming updates in an efficient way, so that we can offer a global automatic indexer with filtering and streaming.

Goals

expressive queries
- filtering of keys/values, using operators like >, >=, in, etc.
- joins across tables (relationships)
efficient streaming/incremental updates given above query constraints
can support existing use cases (table+key filters, RECS queries, etc.)

Reference

For all examples, let's assume a MUD config that looks like:

const tables = {
  Position: {
    keySchema: {
      player: "address"
    },
    valueSchema: {
      x: "int32",
      y: "int32"
    }
  },
  Health: {
    keySchema: {
      player: "address"
    },
    valueSchema: {
      health: "uint256"
    }
  },
  Inventory: {
    keySchema: {
      player: "address",
      item: "uint8"
    },
    valueSchema: {
      amount: "uint32"
    }
  },
  Score: {
    keySchema: {
      player: "address",
      match: "uint256"
    },
    valueSchema: {
      score: "uint256"
    }
  },
  Winner: {
    keySchema: {
      match: "uint256"
    },
    valueSchema: {
      player: "address"
    }
  },
  Terrain: {
    keySchema: {
      x: "int32",
      y: "int32"
    },
    valueSchema: {
      type: "uint8"
    }
  }
}

Proposal

I'm assuming the folks reading this understand TypeScript enough to internalize the constraints of the query API. It's designed to be JSON serializable so is a little more verbose than what might be exposed in client libs. It'll be used in a few different contexts with tooling on top: vanilla JS, React hooks, SSE API.

type Query = {
  from: TableSubject[];
  except?: TableSubject[];
  where?: QueryCondition[];
  records?: TableSubject[];
};

type TableSubject = {
  tableId: Hex;
  subject: string[];
};

type QueryCondition = ComparisonCondition | InCondition;

type ComparisonCondition = {
  left: TableField,
  op: '<' | '<=' | '=' | '>' | '>=' | `!=`,
  right: TableField | LiteralValue,
};

type InCondition = {
  left: TableField,
  op: 'in',
  right: LiteralValue[],
}

type TableField = {
  tableId: Hex;
  field: string;
};

type LiteralValue = string | number | boolean;

Subjects

The query API spec above refers to a "subject" in several places. You can think of a subject as a field (or fields) on which a relationship (or relationships) is based. Using SQL to demonstrate:

SELECT Position.player
FROM Position
JOIN Inventory ON Inventory.player = Position.player

The subject in this case is player, or rather the tuple (player), for both Position and Inventory tables. Subjects can span multiple fields and all subjects in a query must share the same underlying tuple of ABI types.

A query returns a list of matching subject tuples for all tables in from and all conditions in where.

Any field of any table in from can be used as a query condition in where, not just subject fields. And you can use except to exclude subjects that exist in a given table (like RECS' Not operator).

Getting records

By default, queries only return the list of matched subject tuples. To return the records for a table(s), you can specify the corresponding table subjects in records. This is a separate field because we intend for loading records to be a separate operation internally. This means you also return records for tables that were not part of the query, as long as there's a suitable subject for each table that matches the query subject(s).

Examples

Please note that the examples below are pseudocode using the query API above. I'm also using table names instead of hex table IDs for readability.

We're planning ship this with a strongly-typed query client for better UX and to simplify writing queries, but want to keep the discussion focused on the unlocks + constraints of the API design itself before we flesh out the query client, async + streaming interfaces, etc.

Give me all players with a position.

// in RECS today
runQuery([Has(Position)]);

// new query API
const query: Query = {
  from: [
    { tableId: 'Position', subject: ['player'] },
  ],
};

Give me all players at position (3, 5).

// in RECS today
runQuery([HasValue(Position, { x: 3, y: 5 })]);

// new query API
const query: Query = {
  from: [
    { tableId: 'Position', subject: ['player'] },
  ],
  where: [
    { left: { tableId: 'Position', field: 'x' }, op: '=', right: 3 },
    { left: { tableId: 'Position', field: 'y' }, op: '=', right: 5 },
  ],
};

Give me all players within the bounds of (-5, -5) and (5, 5).

// not possible in RECS today
// need to fetch a bigger data set and manually filter
const entities = runQuery([Has(Position)]);
entities.filter((entity) => {
  const position = getComponentValue(Position, entity);
  return position.x >= -5 && position.x <= 5 && position.y >= -5 && position.y <= 5;
});

// new query API
const query: Query = {
  from: [
    { tableId: 'Position', subject: ['player'] },
  ],
  where: [
    { left: { tableId: 'Position', field: 'x' }, op: '>=', right: -5 },
    { left: { tableId: 'Position', field: 'x' }, op: '<=', right: 5 },
    { left: { tableId: 'Position', field: 'y' }, op: '>=', right: -5 },
    { left: { tableId: 'Position', field: 'y' }, op: '<=', right: 5 },
  ],
};

Give me the position of all players that are still alive.

// in RECS today with manually mapping to values
const entities = runQuery([Has(Position), Has(Health), NotValue(Health, 0)]);
entities.map((entity) => getComponentValue(Position, entity));

// new query API
const query: Query = {
  from: [
    { tableId: 'Position', subject: ['player'] },
    { tableId: 'Health', subject: ['player'] },
  ],
  where: [
    { left: { tableId: 'Health', field: 'health' }, op: '!=', right: 0 },
  ],
  records: [
    { tableId: 'Position', subject: ['player'] },
  },
};

Give me the inventory of all players with a score above 50.

// not possible in RECS today due to composite keys

// new query API
const query: Query = {
  from: [
    { tableId: 'Inventory', subject: ['player'] },
    { tableId: 'Score', subject: ['player'] },
  ],
  where: [
    { left: { tableId: 'Score', field: 'score' }, op: '>', right: 50 },
  ],
  records: [
    { tableId: 'Inventory', subject: ['player'] },
  },
};

Give me all players standing on a grass tile.

// not possible in RECS today

// new query API
// this is done with two queries, because all subjects in a query must match
// and there's no common subject between the tables we're querying and filtering on
const players = query({
    from: [
      { tableId: 'Position', subject: ['player'] },
    ],
  })
  // queries return subject tuples, so we map it to just the player addresses to use below
  .map(([player]) => player);

const query: Query = {
  from: {
    { tableId: 'Position', subject: ['x', 'y'] },
    { tableId: 'Terrain', subject: ['x', 'y'] },
  },
  where: [
    { left: { tableId: 'Position', field: 'player'}, op: 'in', right: players },
    { left: { tableId: 'Terrain', field: 'type'}, op: '=', right: 'grass' },
  ],
  records: [
    { tableId: 'Position', subject: ['x', 'y'] },
  ],
};

Notes

This is a step towards MUD client apps (or sections of) being able to query just for the data it needs when it needs it, rather than having to keep the entire world's state in memory.
The query API is designed to be easily understood by folks who work in relational databases (and maps well to our table and record abstractions), while still being flexible to support ECS-style queries in an efficient way. We'll build ECS tooling on top to abstract away the relational-centered bits.
Queries are intentionally designed to not "select from" a particular table. Instead, all "joined" tables are defined together. This lets us internally optimize these queries to start from the smallest tables first and filter the result set from there. This is only possible because everything is assumed to be ANDed together.
The query constraints defined by the API are optimized for highly efficient incremental updates. We'll write more about this in the future, but we evaluated many different kinds of queries, their complexity in both initial query and incremental updates, and found this API to be a good balance of expressivity and fast/efficient streaming.
For more complex queries, we expect queries to be composed and piped together (as shown in the last example). In the future, we may provide APIs to define multiple queries that efficiently do this pipelining for you.

The text was updated successfully, but these errors were encountered:

holic · 2024-03-01T15:11:04Z

Should we constrain subjects to just static/fixed length field types?

holic · 2024-03-14T22:14:52Z

The spec in this proposal now lives in https://github.com/latticexyz/mud/blob/main/packages/query/src/api.ts

github-project-automation bot added this to MUD Feb 16, 2024

holic added this to the MUD v2 stable release milestone Feb 16, 2024

holic moved this to In progress in MUD Feb 16, 2024

holic self-assigned this Feb 16, 2024

This was referenced Feb 21, 2024

consider flattening schema definitions in MUD config #2287

Closed

prototype client store with query API #2300

Closed

MUD indexer - Issue while load testing #2313

Closed

holic removed this from the MUD v2 stable release milestone Mar 1, 2024

holic mentioned this issue Mar 11, 2024

feat(store-sync): initial query client #2355

Merged

holic closed this as completed Mar 14, 2024

github-project-automation bot moved this from In progress to Done in MUD Mar 14, 2024

holic mentioned this issue Mar 21, 2024

Add custom filters when doing queries #1363

Closed

holic mentioned this issue Apr 10, 2024

Addition of option to fetch from indexer even for synchronization after initialize #2628

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Query API #2272

Proposal: Query API #2272

holic commented Feb 16, 2024 •

edited

Loading

holic commented Mar 1, 2024

holic commented Mar 14, 2024

Proposal: Query API #2272

Proposal: Query API #2272

Comments

holic commented Feb 16, 2024 • edited Loading

Goals

Reference

Proposal

Subjects

Getting records

Examples

Notes

holic commented Mar 1, 2024

holic commented Mar 14, 2024

holic commented Feb 16, 2024 •

edited

Loading