mongodb: how to describe a json field which can have different types in _schema #8352

Closed
Ognian opened this issue Jun 22, 2021 · 4 comments · Fixed by #8584
Labels
enhancement New feature or request

Comments

@Ognian

Ognian commented Jun 22, 2021

I'm using Trino with MongoDB, and I'm trying to write a _schema entry for a MongoDB collection. The entries in this collection are described with JSON Schema:

...
  "delta": {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "array",
    "items": {
      "type": "object",
      "properties": {
        "op": { "type": "string" },
        "path": { "type": "string" },
        "value": {
          "anyOf": [
            { "type": "string" },
            { "type": "number" },
            { "type": "integer" },
            { "type": "boolean" },
            { "type": "object" },
            { "type": "array" }
          ]
        }
      },
      "additionalProperties": false
    }
  }
...

The _schema entry would look like this:

          {
            "name": "delta",
            "type": "array(row(\"op\" varchar,\"path\" varchar,\"value\" json))",
            "hidden": false
          }

So the interesting point here is that the field value can have one of the types listed above.
The only way I found to express this was to use the json type, BUT this is the error I got:

[65536] Query failed (#20210621_212459_01468_5jhjw): Unhandled type for Slice: json io.trino.spi.TrinoException: Unhandled type for Slice: json

The question is how to define a field which can have more than one type?
Funnily enough, if I change the type from json to varchar it looks like it works, but that makes no sense to me...

Thanks
Ognian

@ebyhr
Member

ebyhr commented Jun 23, 2021

As far as I know, there is no simple solution other than your varchar workaround. Let me label this as an enhancement.
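
For reference, the varchar workaround could be sketched as a _schema document like the one built below. This is a minimal sketch and not taken from the thread: the collection name "events" and the helper row_type are hypothetical, and the table/fields layout is assumed from the MongoDB connector documentation. Only the declared type of value changes, from json to varchar.

```python
import json

# Hypothetical collection name; the field names come from the question above.
TABLE_NAME = "events"


def row_type(fields):
    """Build a Trino row(...) type string from (name, type) pairs."""
    columns = ",".join(f'"{name}" {trino_type}' for name, trino_type in fields)
    return "row(" + columns + ")"


# "value" may hold a string, number, boolean, object or array, so it is
# declared as varchar (the workaround) instead of json.
delta_fields = [("op", "varchar"), ("path", "varchar"), ("value", "varchar")]
delta_type = "array(" + row_type(delta_fields) + ")"

# Assumed layout of a _schema document: a table name plus a list of fields.
schema_entry = {
    "table": TABLE_NAME,
    "fields": [
        {"name": "delta", "type": delta_type, "hidden": False},
    ],
}

print(json.dumps(schema_entry, indent=2))
```

The printed document is what would be stored in the _schema collection; whether the varchar column then holds text that can be parsed back as JSON depends on how the connector serializes each value, which this sketch does not cover.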

@ebyhr ebyhr added the enhancement New feature or request label Jun 23, 2021
@ebyhr ebyhr self-assigned this Jun 24, 2021
@Ognian
Author

Ognian commented Jun 24, 2021

@ebyhr it gets even funnier when using Superset as the frontend with an address field that has the subfields street, zip, city and country.
If I define it in _schema with type row("street" varchar,"zip" varchar,"city" varchar,"country" varchar)
then it is displayed as an array:
(image)
BUT if I define it in _schema with type varchar:
(image)
it is displayed correctly.
BUT of course this is a Superset error, since if I try to select address.zip it fails...
apache/superset#15364

@academy-codex
Contributor

Looks interesting. @ebyhr If you're not taking it up I can have a look.

@ebyhr
Member

ebyhr commented Jun 25, 2021

@academy-codex I implemented half of this enhancement locally, but please feel free to take it. MongoPageSource.writeSlice() is the relevant method. The catch is that the JSON generated by Document.toJson() is a little different from the JSON that Trino returns.

edit: As we discussed offline on Jul 17, I will send a PR.
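
To illustrate the kind of mismatch that may be meant here (an assumption, since the comment does not spell it out): MongoDB's Extended JSON can wrap some values in type markers such as {"$numberLong": "42"} or {"$oid": "..."}, which a plain JSON consumer would not expect. The sketch below is only an illustration in Python, not the connector's Java code and not the change that was eventually merged. The helper name unwrap_extended_json and the sample document are made up, and only a few wrappers are handled.

```python
import json

# Wrappers from MongoDB Extended JSON that carry a single scalar value.
# Only a few are handled here; this is an illustration, not a full converter.
_SCALAR_WRAPPERS = {
    "$numberInt": int,
    "$numberLong": int,
    "$numberDouble": float,
    "$oid": str,
}


def unwrap_extended_json(value):
    """Recursively replace simple Extended JSON wrappers with plain values."""
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            if key in _SCALAR_WRAPPERS:
                return _SCALAR_WRAPPERS[key](inner)
        return {k: unwrap_extended_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_extended_json(v) for v in value]
    return value


# Made-up sample in Extended JSON form.
extended = json.loads(
    '{"_id": {"$oid": "60d1f0c2a1b2c3d4e5f60718"},'
    ' "value": {"count": {"$numberLong": "42"}, "tags": ["a", "b"]}}'
)
plain = unwrap_extended_json(extended)
print(json.dumps(plain))
```

Which wrappers actually appear depends on the JSON output mode of the driver in use, so the set handled above would need to be checked against real output before relying on it.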


3 participants