v2.0.0
v2.0.0 is the first major release since v1.0.0 was released last year. This release focuses on implementing the new policy-engine-based authorisation system design.
Overview
Before v2.0.0, we had started introducing the Open Policy Agent (OPA) as the central policy engine serving authorisation decisions at a handful of API endpoints. However, the following problems prevented us from rolling out the design to the whole system:
- Lack of abstraction at the policy level. It is hard to reuse the policy logic when:
  - One resource resides in more than one storage engine (e.g. dataset metadata is stored in both PostgreSQL & Elasticsearch).
  - Or one resource belongs to a subset defined by another type of resource (e.g. a dataset record is a record that carries the `dcat-dataset-strings` aspect).
  - Or the decision on one resource depends on another resource.
- Policy files are created to serve the decision of a designated operation on a resource (e.g. read). The system is required to look up a pre-defined field (e.g. `authnReadPolicyId`) to pick the responsible policy file.
- The partial evaluation feature of OPA was not fully leveraged, which caused performance issues when making group decisions (e.g. get all available records).
- Inaccurate OPA AST parser (e.g. no support for `negated`) and local query translator.
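To make the last two points concrete, the sketch below models the kind of residual rules a partial evaluation might leave behind and applies them to a whole list of records in one pass. The `Expr`, `Rule` and `RecordData` shapes and the sample data are simplified assumptions for illustration, not OPA's actual AST format or Magda's parser; only the idea of a `negated` flag on an expression comes from the text above.

```typescript
// Simplified, hypothetical model of residual rules left after partial evaluation.
// A rule is a conjunction of expressions; a record is allowed if ANY rule matches.
type Expr = { field: string; value: string | boolean; negated?: boolean };
type Rule = Expr[];
type RecordData = { [field: string]: string | boolean };

function exprMatches(expr: Expr, record: RecordData): boolean {
    const equal = record[expr.field] === expr.value;
    // A parser that ignores `negated` would silently invert this expression.
    return expr.negated ? !equal : equal;
}

function isAllowed(rules: Rule[], record: RecordData): boolean {
    return rules.some((rule) => rule.every((expr) => exprMatches(expr, record)));
}

// One residual rule set filters the whole group: no per-record decision request.
function filterAllowed(rules: Rule[], records: RecordData[]): RecordData[] {
    return records.filter((record) => isAllowed(rules, record));
}

// Hypothetical rules: "the user owns the record" OR "the record is not a draft".
const residualRules: Rule[] = [
    [{ field: "ownerId", value: "user-1" }],
    [{ field: "draft", value: true, negated: true }]
];

const records: RecordData[] = [
    { id: "a", ownerId: "user-1", draft: true },
    { id: "b", ownerId: "user-2", draft: false },
    { id: "c", ownerId: "user-2", draft: true }
];

const visibleIds = filterAllowed(residualRules, records).map((r) => r.id);
```

In this model, records `a` (owned by the user) and `b` (not a draft) pass the filter, while `c` does not. Dropping the `negated` check would instead admit `c` and reject `b`, which is why accurate handling of negation matters for the query translator.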
To solve those issues, we introduced a new authorisation system design & implementation that:
- Introduces a new decision API endpoint, a single entry point policy model and operation URIs (e.g. `object/record/read`) to ensure decision requests are properly routed. e.g.:
  - When making group decisions over a superset resource (e.g. `object/record`), a joined decision will be given, including the policies of all subset resources (e.g. `object/dataset`, `object/distribution` etc.).
  - A dataset record will always be governed by the policy of resource `object/dataset`, no matter whether we query it via the registry API (requires the permission of the `object/record/read` operation) or the dataset search API (requires the permission of the `object/dataset/read` operation), as the resource `object/dataset` is a subset of the resource `object/record`.
- Eliminates the need to manually specify the policy file (e.g. via the `authnReadPolicyId` field) with the help of the single entry point policy model. This makes it possible to create user-defined resources & operations.
- Fully leverages the partial evaluation feature of OPA. Only one decision is required, even for group decisions (e.g. get all records that the user is allowed to see).
- Provides a more accurate OPA AST parser with performance improvements, e.g. auto-filtering duplicated & impossible rules and recognising rules that can be evaluated further.
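The routing idea behind operation URIs can be sketched as follows. This is a minimal illustration, not the decision API's implementation: the `supersetOf` table, `parseOperationUri` and `resourcesToJoin` are hypothetical names, and the only relationship taken from the text is that `object/dataset` and `object/distribution` are subsets of `object/record`.

```typescript
// Hypothetical resource hierarchy: each subset resource names its superset.
// The table itself is an illustrative assumption.
const supersetOf: { [resourceUri: string]: string } = {
    "object/dataset": "object/record",
    "object/distribution": "object/record"
};

type Operation = { resourceUri: string; operation: string };

// Split an operation URI such as "object/record/read" into its resource URI
// ("object/record") and operation name ("read").
function parseOperationUri(operationUri: string): Operation {
    const idx = operationUri.lastIndexOf("/");
    if (idx <= 0 || idx === operationUri.length - 1) {
        throw new Error(`Invalid operation URI: ${operationUri}`);
    }
    return {
        resourceUri: operationUri.slice(0, idx),
        operation: operationUri.slice(idx + 1)
    };
}

// For a group decision, list the requested resource plus every subset
// resource whose policy should be joined into the decision.
function resourcesToJoin(resourceUri: string): string[] {
    const subsets = Object.keys(supersetOf).filter(
        (uri) => supersetOf[uri] === resourceUri
    );
    return [resourceUri, ...subsets];
}

const parsed = parseOperationUri("object/record/read");
const joined = resourcesToJoin(parsed.resourceUri);
```

Under these assumptions, a group decision for `object/record/read` would consult the policies of `object/record`, `object/dataset` and `object/distribution`, whereas a request for `object/dataset/read` would only involve `object/dataset`.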
More Powerful APIs with fine-grained access control
Thanks to the new authorisation system design, we are able to re-implement fine-grained access control for ALL our existing APIs.
Many APIs (e.g. indexer APIs) that previously were not accessible outside the cluster can now be exposed outside the cluster, as we are able to grant access to non-admin users. This enables more use cases.
With the help of the new policy model, we are still able to make sure all APIs are compatible with existing plugins & sub-systems. Although all requests between sub-systems are now governed by the policy engine, requests from existing sub-systems will still work, as existing admin users still have access to all resources.
However, authors of existing plugins/sub-systems might consider updating their code to use an account with the least privilege when communicating with the core APIs, as this is now possible with fine-grained access control.
New Settings Panel
v2.0.0 also comes with a new settings panel for all signed-in users. The sections you can see depend on your access: all users should at least be able to access the "My Account" tab, and users with the admin role can access all tabs. The settings panel UI simplifies common admin tasks such as managing users, roles, permissions, resources and operations.
For a brand new system, to assign the admin role to the first user, please refer to this doc to set a user as admin using the `acs-cmd` command line tool.
GitHub Container Registry
Since v2.0.0, we also publish all our Helm charts & Docker images to GitHub Container Registry (in addition to the existing HTTP Helm chart repo & Docker Hub image releases).
If you want to try our Helm chart via an OCI registry, or prefer a Docker registry other than Docker Hub, you can give it a try.
Breaking Changes & Compatibility
- Setting a policy for a record via the `authnReadPolicyId` field is not supported anymore.
  - Please manage access to registry records via built-in permissions & constraints. See the permissions/role section of our architecture doc.
  - You can still supply your own policy files to extend the authorisation model. However, your policy must comply with the single entry point policy model. e.g. a policy governing resource `object/resourceB` should be put into package `object.resourceB`. You might also want to read the decision API doc.
- The `isAdmin` field on the user will no longer grant the user admin permission.
- All scripts / command line tools now require Node 14.
- The previous `dataset-access-control` aspect has been renamed to the `access-control` aspect.
  - The `orgUnitOwnerId` field has been renamed to `orgUnitId`.
  - The `custodianOrgUnitId` field has been moved to the `publishing` aspect.
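For plugin or connector code that builds record aspects by hand, the aspect and field renames above could be applied with a small helper like the one below. This is a sketch under assumptions: `migrateAspects`, `AspectData` and `Aspects` are hypothetical names, and it assumes `custodianOrgUnitId` previously lived alongside `orgUnitOwnerId` in the `dataset-access-control` aspect, which the notes above do not state explicitly.

```typescript
type AspectData = { [field: string]: unknown };
type Aspects = { [aspectId: string]: AspectData };

// Return a copy of a record's aspects using the v2 names. The input is not mutated.
function migrateAspects(aspects: Aspects): Aspects {
    const result: Aspects = {};
    // Copy every aspect except the legacy one.
    for (const aspectId of Object.keys(aspects)) {
        if (aspectId !== "dataset-access-control") {
            result[aspectId] = { ...aspects[aspectId] };
        }
    }
    const legacy = aspects["dataset-access-control"];
    if (legacy === undefined) {
        return result;
    }
    // Merge legacy fields into any `access-control` data already present.
    const accessControl: AspectData = { ...result["access-control"] };
    for (const field of Object.keys(legacy)) {
        if (field === "orgUnitOwnerId") {
            // Renamed field.
            accessControl["orgUnitId"] = legacy[field];
        } else if (field === "custodianOrgUnitId") {
            // Assumption: this field moves from the legacy aspect to `publishing`.
            result["publishing"] = {
                ...result["publishing"],
                custodianOrgUnitId: legacy[field]
            };
        } else {
            accessControl[field] = legacy[field];
        }
    }
    result["access-control"] = accessControl;
    return result;
}

const migrated = migrateAspects({
    "dataset-access-control": { orgUnitOwnerId: "ou-1", custodianOrgUnitId: "ou-2" },
    publishing: { state: "published" }
});
```

Existing records stored in the registry are handled by the upgrade itself, so a helper of this kind would only matter for external code that still emits the old aspect shape.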
Migration
If you are on version <= v1.3.1, you should be able to migrate to v2.0.0 simply by deploying the v2.0.0 Helm Chart.
Please note: any authorisation model implemented with external policy files will need to be reimplemented on top of the v2 authorisation model.
If you are on any v2 alpha version (e.g. v2.0.0-alpha.8), you will have to completely uninstall Magda before upgrading to v2.0.0, as the SQL migration files can't tell the difference between v2.0.0 and any of the v2 alpha versions.
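The migration rules above can be expressed as a small check, for example in a deployment script. This is only a sketch: `migrationPathTo2_0_0` and the `MigrationPath` labels are hypothetical names, and any version string the notes above do not cover is reported as `"not-covered"` rather than guessed.

```typescript
type MigrationPath = "deploy-helm-chart" | "uninstall-first" | "not-covered";

// Decide the upgrade path to v2.0.0 for a currently deployed version string
// such as "v1.3.1" or "v2.0.0-alpha.8".
function migrationPathTo2_0_0(current: string): MigrationPath {
    const match = /^v?(\d+)\.(\d+)\.(\d+)(-alpha\.\d+)?$/.exec(current);
    if (match === null) {
        return "not-covered";
    }
    const major = Number(match[1]);
    const minor = Number(match[2]);
    const patch = Number(match[3]);
    const isAlpha = match[4] !== undefined;

    // Any v2 alpha version requires a complete uninstall first.
    if (major === 2 && isAlpha) {
        return "uninstall-first";
    }
    // Versions up to and including v1.3.1 can upgrade by deploying the chart.
    const atMost1_3_1 =
        major < 1 ||
        (major === 1 && (minor < 3 || (minor === 3 && patch <= 1)));
    return atMost1_3_1 && !isAlpha ? "deploy-helm-chart" : "not-covered";
}

const pathFromStable = migrationPathTo2_0_0("v1.3.1");
const pathFromAlpha = migrationPathTo2_0_0("v2.0.0-alpha.8");
```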
Changes since v1.3.1
- #3231 Upgraded to Open Policy Agent v0.33.x
- #3253 Add New /auth/opa/decision Endpoint
- move pre-defined role ids into a single constant file in magda-typescript-common
- related #3250: Rewrote the OPA AST parser for better evaluation & reference handling
- related #3250: Added common policy entry point: entrypoint/allow.rego
- #3303 rename `dataset-access-control` aspect to `access-control` aspect
- #3256 Unified policy for securing both registry records & Elasticsearch datasets
- #3304 Implement authorisation enforcement for all registry API services according to the more generic auth model defined in #3256
- #3306 Added admin web UI for auth management
- Added new APIs for managing auth objects
- #3308 Policy enforcement on auth objects APIs
- Add `records/:id/aspects` & `records/:id/aspects/count` APIs to registry
- related #3250: Rewrite decision enforcement logic for search API
- #3326 Build OPA docker image with builtin policies & Run OPA as a sidecar
- #3330 Fine-tune Metadata creation tool workflow for the new auth model
- #3332 Policy Enforcement on Storage API using new auth model
- #3333 Rename access-control aspect orgUnitOwnerId field to orgUnitId
- #3335 Reshape Storage API (in progress)
- #3340 OPA AST parser takes too long to evaluate the large response
- Increase registry default request timeout to 60s (from 30s)
- Bump Dataset index version to 49 & publishers index version to 7
- Fixed registry API generated ts client patchRecord API response type
- add `allowAutoCrawlOnStartingUp` option to indexer
- add `/v0/status/live` & `/v0/status/ready` endpoint to indexer
- Related to #3315, introduce `ServiceRunner` to launch an integrated test & dev environment
- #3315 rewrite auth-related integration tests and execute tests using `ServiceRunner` rather than mini k8s cluster `kind`
- #3345 Add dataset index / delete by ID API endpoints to the indexer
- Add `debug` switch to registry API, search API & Auth API
- Related to #3331, remove any logic related to obsolete `authnReadPolicyId` record field
- Related to #3331, remove obsolete policy files `object/registry/*`
- Added APIs for managing API keys
- #3345 Add Adhoc dataset indexing APIs, secure with new auth model and expose via Gateway with other indexer APIs
- "My dataset section": Add Dataset deletion function & performance improvement
- Fixed: helm chart version update script incorrectly skip updating dependencies version when `versionUpdateExclude` config in package.json is empty
- Fixed: incorrect scss-compiler-config data might be generated if non-default docker image config is used.
- #3360: Recategorize auth object-related API docs & more auth object-related APIs for convenience of future UI development.
- Fixed: `set-scss-vars` script will use proper imagePullSecrets to generate the UI CSS update job
- Allow the `no cache` behaviour of `whoami` & `getUserById` api to be turned off
api to be turned off - #3269 Backend support for Access Group
- #3366 New Registry API capability: new filter operator, merge mode, group operations and more
- Add `target` & `rel` fields to footer link item schema
- Make sure the dataset creation tool always creates `dcat-dataset-strings` aspect for draft datasets
- Add `reversePageTokenOrder` parameter to registry APIs `GET /records`, `GET /records/summary` & `GET /records/:recordId/history`
- #3163 Fixed Registry history API incorrectly reports more events available on the last page
- #3383 Registry get records API didn't group aspectOrQuery correctly
- Upgrade to Node 14
- Upgrade bcrypt to 5.0.1
- Fixed auth policy: make sure datasets without `publishing` aspect will be considered published
- #3348 secure openfaas gateway with new auth model & update API docs
- #3317 Use multi-stage builds to build magda-scss-compile
- #3337 Release docker images & helm charts to Github container registry as well
- #3349 secure content API with new auth model
- #3350 secure admin API (managing connectors) with new auth model
- #3351 Secure tenant API with new auth model
- Related to #3331, remove legacy middlewares
- #3352 Remove usage of MUST_BE_ADMIN middleware in auth API
- #3380 Support Graceful Pod Shutdown
- Upgrade to typescript 4.2.4 & api-extractor 7.15.2
- Upgrade react-scripts to 4.0.3 & craco to 6.4.5
- Make region index initial creation execute in the background without blocking the indexer
- #3343 Consolidate admin/settings UI functionalities
- #3395 Allow users to change the publishing status of the published dataset
- Fixed metadata creation tool/editor cache handling issue & workflow improvements.