-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use AccessControl functionality to filter columns from SELECT *… #9991
Conversation
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to [email protected]. For more information, see https://github.com/trinodb/cla. |
@rtotarocuebiq how does this PR relate to @youngchen7's #7893? |
@findepi It is an alternative implementation that does not need to change the access control signature, so there is not an alternative method to "filterColumns". Moreover it cover the usecase when you need to partition data between user but you would not to show the discrimination column. |
can you elaborate on that use-case, or exemplify? |
The need is to have columns that contain technical information related
so you want give access to a user just to a subset of data but at the same time you would not to show those columns to users. Will exist also users that have right for accessing to the whole table. |
That's a reasonable requirement, and should be possible today, regardless of @rtotarocuebiq do we have existing tests for that? |
@findepi yes in core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java method testRowFilterOnNotAccessibleColumn |
@@ -67,6 +67,7 @@ | |||
public static final String QUERY_MAX_EXECUTION_TIME = "query_max_execution_time"; | |||
public static final String QUERY_MAX_PLANNING_TIME = "query_max_planning_time"; | |||
public static final String QUERY_MAX_RUN_TIME = "query_max_run_time"; | |||
public static final String QUERY_HIDE_INACCESSIBLE_COLUMNS = "query_hide_inaccessible_columns"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be exposed as a session property
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any suggestions about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kokosing i understand you're concerned about the case where inaccessible columns are hidden with a config, but session property overrides and makes them visible (although still inaccessible) again. That would be a security problem
OTOH, i am thinking about the case where server uses default configuration, and user or group of users would want to leverage this new functionality. Providing session toggle would be nice.
I propose we do provide session toggle, but with validation preventing query_hide_inaccessible_columns be set to false, when config is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is about the query semantic. I don't think user should be allowed to change what given query means at will. I think it may cause other issues. For example auditors may no longer know what give query means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kokosing I got your point, you prefer to set it at server config level. Can you give me suggestion on where add the property? (i.e. ServerConfig or AccessControlConfig) and which is the better way for accessing to the value from StatementAnalyzer ?
return hideUnaccessibleColumns; | ||
} | ||
|
||
@Config("query.hide-inaccessible-columns") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ConfigDescription
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Config("query.filter-restricted-columns")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like "hide" more.
also "restricted" could be understood as referring to inaccessible & masked columns, whereas we don't want to hide masked columns
core/trino-main/src/main/java/io/trino/sql/analyzer/Analysis.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/Analysis.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/Analyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/Analyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
@@ -67,6 +67,7 @@ | |||
public static final String QUERY_MAX_EXECUTION_TIME = "query_max_execution_time"; | |||
public static final String QUERY_MAX_PLANNING_TIME = "query_max_planning_time"; | |||
public static final String QUERY_MAX_RUN_TIME = "query_max_run_time"; | |||
public static final String QUERY_HIDE_INACCESSIBLE_COLUMNS = "query_hide_inaccessible_columns"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is about the query semantic. I don't think user should be allowed to change what given query means at will. I think it may cause other issues. For example auditors may no longer know what give query means.
tableColumnMap.asMap() | ||
.forEach((key, value) -> { | ||
//Allow to create a rowfilter on a not accessibile column | ||
if (analysis.hasRowFilter(key, identity.getUser())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about column masking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saving columns accessed via row filter for excluding them from the access control if them are not accessed also directly in the statement. I didn't see a similar usecase for masking. Do you have a specific use case in mind so that I can add a test about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Consider a case where not accessible columns are used for column masking for example:
CASE not_accessible_column
WHEN 'allowed' THEN actual_column
WHEN 'masked' THEN mask(actual_column)
ELSE null
END
Please add tests for cases like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, to support it I moved the filtering logic on io.trino.sql.analyzer.StatementAnalyzer.Visitor#analyzeRowFilter
and io.trino.sql.analyzer.StatementAnalyzer.Visitor#analyzeColumnMask
. Added test cases
@@ -3188,6 +3189,26 @@ private void analyzeSelectAllColumns( | |||
} | |||
} | |||
|
|||
private List<Field> filterInaccessibleFields(List<Field> fields) | |||
{ | |||
if (SystemSessionProperties.isHideInaccesibleColumns(session)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invert condition and remove else block (keyword)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
I moved the property hide-inaccessible-columns on AccessControlConfig and aligned the pull request with master |
tableColumnMap.asMap() | ||
.forEach((key, value) -> { | ||
//Allow to create a rowfilter on a not accessibile column | ||
if (analysis.hasRowFilter(key, identity.getUser())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Consider a case where not accessible columns are used for column masking for example:
CASE not_accessible_column
WHEN 'allowed' THEN actual_column
WHEN 'masked' THEN mask(actual_column)
ELSE null
END
Please add tests for cases like that.
field.getOriginColumnName().isEmpty() || | ||
field.getOriginTable().isEmpty() || | ||
field.isHidden() || | ||
!accessControl.filterInaccessibleColumns( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's define filterInaccessibleColumns
in StatementAnalyzer
. It seem that it is more semantic related thing than access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
.filter(field -> | ||
field.getOriginColumnName().isEmpty() || | ||
field.getOriginTable().isEmpty() || | ||
field.isHidden() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need above conditions? getVisibleFields
already checks if field is hidden. Also I would prefer to avoid other conditions to as they may change semantic of how things work today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I removed the condition
@@ -3188,6 +3188,21 @@ private void analyzeSelectAllColumns( | |||
} | |||
} | |||
|
|||
private List<Field> filterInaccessibleFields(List<Field> fields) | |||
{ | |||
ImmutableList<Field> authorizedFields = fields.stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. inline authorizedFields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
return hideInaccesibleColumns; | ||
} | ||
|
||
@Config("access-control.hide-inaccessible-columns") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move it to FeaturesConfig
and add @ConfigDescription
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I moved it in FeaturesConfig and make accessibile in Metadata
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
accessControl.deny(privilege("columns.data_type", SELECT_COLUMN)); | ||
// SHOW COLUMNS and DESCRIBE fail since data_type isn't visible | ||
assertThatThrownBy(() -> assertions.query("SHOW COLUMNS FROM nation")).hasMessage("Access Denied: Cannot select from columns [ordinal_position, extra_info, table_schema, column_name, data_type, comment, table_name] in table or view local.information_schema.columns"); | ||
assertThatThrownBy(() -> assertions.query("DESCRIBE nation")).hasMessage("Access Denied: Cannot select from columns [ordinal_position, extra_info, table_schema, column_name, data_type, comment, table_name] in table or view local.information_schema.columns"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query should not fail. If you are able to query table then surely you should be able to to DESCRIBE
it. Columns that got hidden should be simply filtered here. Same comment for SHOW COLUMNS FROM nation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added specific test for Describe
and SHOW COLUMNS
, this test check that are not accessible if I deny privilege on Information_schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this test. You are revoking access to a metadata table which should be always accessible. I think it does not make sense. Please have tests where you only manipulate access on test tables like nation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't put enough attention to analysis or statement analyzer yet. I would like first to reach a state where tests are expressing the thing we would like to have.
Notice that row filter and column masking is like a view, where the user given with it is like owner. We should still do access control checks for them. Please take a look how view access control works.
return hideInaccesibleColumns; | ||
} | ||
|
||
@Config("experimental.hide-inaccessible-columns") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have experimental.
here? I think this features is not going to affect a lot of users as it will be disabled by default so there should be no change for them.
/** | ||
* Return true if columns must be hidden if inaccessible, default value is false | ||
*/ | ||
default boolean isHideInaccesibleColumns() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this simply belongs to StatementAnalyzer
. It is about the syntax not metadata. Metadata is not changed when the toggle is enabled or not. We just handle it differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
accessControl.deny(privilege("columns.data_type", SELECT_COLUMN)); | ||
// SHOW COLUMNS and DESCRIBE fail since data_type isn't visible | ||
assertThatThrownBy(() -> assertions.query("SHOW COLUMNS FROM nation")).hasMessage("Access Denied: Cannot select from columns [ordinal_position, extra_info, table_schema, column_name, data_type, comment, table_name] in table or view local.information_schema.columns"); | ||
assertThatThrownBy(() -> assertions.query("DESCRIBE nation")).hasMessage("Access Denied: Cannot select from columns [ordinal_position, extra_info, table_schema, column_name, data_type, comment, table_name] in table or view local.information_schema.columns"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this test. You are revoking access to a metadata table which should be always accessible. I think it does not make sense. Please have tests where you only manipulate access on test tables like nation
.
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
Thx for the suggestion I was misunderstanding the Identity meaning in the View Expression so now I simplified my PR a lot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks much better now. Actually no changes in analyzer.
core/trino-main/src/main/java/io/trino/sql/analyzer/AnalyzerFactory.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Outdated
Show resolved
Hide resolved
return fields; | ||
} | ||
//collect fields by table | ||
ListMultimap<QualifiedObjectName, Field> tableFieldsMap = Multimaps.newListMultimap(new HashMap<QualifiedObjectName, Collection<Field>>(), () -> new LinkedList<>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this that complex?
As I understand these fields always come from the same relation (maybe you do checkArgument
to assert that), so do we need to use multimap here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of join query like:
SELECT * FROM nation,customer WHERE customer.nationkey = nation.nationkey AND nation.name = 'FRANCE' AND customer.name='Customer#000001477'
I receive here fields from multiple tables
and in case of queries like:
SELECT * FROM (select 'test')
SELECT * FROM (SELECT concat(name,'-test') FROM nation WHERE name = 'FRANCE')
I receive fields without OriginaleTable/OriginaleColumn
so I must manage also those cases.
I'll add tests for that.
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, few minor comments. Thank you! Please rebase and squash commits. Make sure that commit message is well formatted and explains that change you are posting. Also please add this new property to the docs.
private static void validateHideInaccesibleColumns(boolean value, boolean defaultValue) | ||
{ | ||
if (defaultValue == true && value == false) { | ||
throw new TrinoException(INVALID_SESSION_PROPERTY, "hide_inaccessible_columns cannot be set to false if default value is true"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not much user friendly error message, it is very technical. Maybe:
hide_inaccessible_columns cannot be disabled with session property when it was enabled with configuration
Also use HIDE_INACCESSIBLE_COLUMNS
constant
Please make sure you have tests for this.
new NodeMemoryConfig(), | ||
new DynamicFilterConfig(), | ||
new NodeSchedulerConfig())); | ||
Assertions.assertThatNoException().isThrownBy(() -> sessionPropertyManager.validateSystemSessionProperty(SystemSessionProperties.HIDE_INACCESSIBLE_COLUMNS, "true")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static import?
core/trino-main/src/test/java/io/trino/sql/query/TestFilterInaccessibleColumns.java
Show resolved
Hide resolved
Set<String> accessibleColumns = accessControl.filterColumns( | ||
session.toSecurityContext(), | ||
table.asCatalogSchemaTableName(), | ||
tableFields.stream().map(field -> field.getOriginColumnName().get()).collect(Collectors.toSet())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please extract variable for column names, also please use toImmutableSet()
.collect(toImmutableList())); | ||
}); | ||
|
||
//filter at the end for preserving the global order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the comment
3a52cdd
to
6ebaddb
Compare
5a9ae45
to
d7d2bb7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase and squash commits (FYI https://github.com/trinodb/trino/blob/master/.github/DEVELOPMENT.md#git-merge-strategy)
Set<String> accessibleColumns = accessControl.filterColumns( | ||
session.toSecurityContext(), | ||
table.asCatalogSchemaTableName(), | ||
tableFields.stream().map(field -> field.getOriginColumnName().get()).collect(toImmutableSet())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
tableFields.stream()
.map(field -> field.getOriginColumnName().get())
.collect(toImmutableSet()));
List<Field> accessibleFields = new ArrayList<>(); | ||
|
||
//collect fields by table | ||
ListMultimap<QualifiedObjectName, Field> tableFieldsMap = Multimaps.newListMultimap(new HashMap<QualifiedObjectName, Collection<Field>>(), () -> new LinkedList<>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ListMultimap<QualifiedObjectName, Field> tableFieldsMap = ArrayListMultimap.create();
0f49c8c
to
358a3af
Compare
@@ -124,6 +124,18 @@ In addition, Trino :doc:`provides an API </develop/system-access-control>` that | |||
allows you to create a custom access control method, or to extend an existing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mosabua Can you please double check wording in the docs here?
358a3af
to
9c13ff7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only reviewed for docs and suggested changes. Also need confirmation if this is a global property or a catalog property.
ca95ebf
to
50e0858
Compare
a32b85c
to
e5ccd69
Compare
With hide-inaccessible-columns configuration toggle now it is possible to exclude non accessible from SELECT * statement
e5ccd69
to
7a1509d
Compare
Merged, thank you! |
I Implemented the feature without modify the access control and making it configurable (by default is off).
The implementation allow also to create a rowfilter on a column that is inaccessible. This could be useful when you need to partition data between user but you would not to show the discrimination column.
fixes #7461