Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Report: false negative sql_mode=only_full_group_by error message from vtgate #9889

Closed
Santiclause opened this issue Mar 14, 2022 · 6 comments · Fixed by #10069 or #10139
Closed
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature) Type: Regression

Comments

@Santiclause
Copy link
Contributor

Overview of the Issue

Issuing a query with aggregation which selects a non-aggregated column which is also not in the group by clause will fail with the error Expression of SELECT list is not in GROUP BY clause and contains nonaggregated column [...] which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by. This error message indicates that the problem is with sql_mode, but there is no sql_mode setting involved here; this error happens at the vtgate before it reaches any tablets, and occurs irrespective of your actual sql_mode setting on your tablets (we do not have this only_full_group_by setting set).

It also appears that the query has to have table aliasing in order to run into this problem.

Reproduction Steps

Example vschema:

{
  "tables": {
    "foo": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "foo_id_hash"
        }
      ]
    },
    "bar": {
      "column_vindexes": [
        {
          "column": "foo_id",
          "name": "foo_id_hash"
        }
      ]
    }
  },
  "vindexes": {
    "foo_id_hash": {
      "type": "hash"
    }
  },
  "sharded": true
}

Example schema:

CREATE TABLE `foo` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `bar` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `foo_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Observe that the following query succeeds:

SELECT foo.id, foo.name, COUNT(bar.id)
FROM foo
LEFT JOIN bar on bar.foo_id = foo.id
GROUP BY foo.id;

Yet this query, identical in all respects other than table alias, fails:

SELECT f.id, f.name, COUNT(bar.id)
FROM foo f
LEFT JOIN bar on bar.foo_id = f.id
GROUP BY f.id;

Binary Version

`Version: 13.0.1-SNAPSHOT (Git revision  branch '') built on Wed Mar  9 02:26:01 UTC 2022 by root@buildkitsandbox using go1.17.8 linux/amd64` built manually on commit hash `bc4a9606e1899a4223b80edfb212f32af81000ac`

Operating System and Environment details

k8s on GKE

Log Fragments

mysql> SELECT foo.id, foo.name, COUNT(bar.id)
    -> FROM foo
    -> LEFT JOIN bar on bar.foo_id = foo.id
    -> GROUP BY foo.id;
+------------------+------+---------------+
| id               | name | COUNT(bar.id) |
+------------------+------+---------------+
| 2000000000000000 | fizz |             1 |
+------------------+------+---------------+
1 row in set (0.08 sec)

mysql> SELECT f.id, f.name, COUNT(bar.id)
    -> FROM foo f
    -> LEFT JOIN bar on bar.foo_id = f.id
    -> GROUP BY f.id;
ERROR 1055 (42000): Expression of SELECT list is not in GROUP BY clause and contains nonaggregated column 'f.`name`' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
@Santiclause Santiclause added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Mar 14, 2022
@harshit-gangal
Copy link
Member

harshit-gangal commented Mar 29, 2022

VTGate runs in this mode and will not allow such queries down.
We plan to support this query but currently it is not supported.

@harshit-gangal harshit-gangal added Type: Enhancement Logical improvement (somewhere between a bug and feature) and removed Type: Bug Needs Triage This issue needs to be correctly labelled and triaged labels Mar 29, 2022
@Santiclause
Copy link
Contributor Author

Sorry, I'm not sure I understand - the query, semantically, is fundamentally supported, as I mention in the issue report. It worked as-is in version 11, even. The bug is that, after upgrading to version 13, introducing table aliases to the query breaks the support. Are you saying that table aliases are not currently supported in this circumstance?

@frouioui
Copy link
Member

Hello @Santiclause, this has been fixed on main through #10069. I opened a backport to also fix this bug on the release-13.0 branch (#10079).

@msolters
Copy link
Contributor

Hi @frouioui, we have tried compiling main with this PR, as well as including #10074

In both cases the issue continues to exist.

@msolters
Copy link
Contributor

msolters commented Apr 22, 2022

TO follow up on this: the following query structure continues to error with both #10069 and #10074 compiled in:

SELECT cID, lID 
FROM m
GROUP BY lID

In this situation, cID is the (only) vindex, but not the PK of m.

yielding

ERROR 1055 (42000): Expression of SELECT list is not in GROUP BY clause and contains nonaggregated column 'cID' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

This happens even if we include the hint /*vt+ PLANNER=v3 */.

@msolters
Copy link
Contributor

Hi @frouioui we compiled and tested this branch on our staging vtgates as you instructed: vitessio/vitess/tree/fix-9889-only-full-group-by-upstream

And it seems to successfully eliminate the problem as described in this issue! Thank you so much for your work in resolving this bug!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature) Type: Regression
Projects
None yet
6 participants