v2.0.0

@t83714 t83714 released this 16 Aug 13:14
· 629 commits to next since this release

v2.0.0 is the first major release since v1.0.0 was released last year. It focuses on implementing the new authorisation system design, which is built around a policy engine.

Overview

Before v2.0.0, we had started introducing the Open Policy Agent (OPA) as the central policy engine, serving authorisation decisions at a handful of API endpoints. However, the following problems prevented us from rolling out that design to the whole system:

  • Lack of abstraction at the policy level. Policy logic was hard to reuse when:
    • One resource resides in more than one storage engine (e.g. dataset metadata is stored in both PostgreSQL & Elasticsearch).
    • One resource belongs to a subset defined by another type of resource (e.g. a dataset record is a record that carries the dcat-dataset-strings aspect).
    • The decision for one resource depends on another resource.
  • Policy files were created to serve the decision for one designated operation on a resource (e.g. read). The system had to look up a pre-defined field (e.g. authnReadPolicyId) to pick the responsible policy file.
  • The partial evaluation feature of OPA was not fully leveraged, which caused performance issues when making group decisions (e.g. get all available records).
  • The OPA AST parser and the local query translator were inaccurate (e.g. negated expressions were not supported).

To solve those issues, we introduced a new authorisation system design & implementation that:

  • Introduces a new decision API endpoint, a single entry point policy model and operation URIs (e.g. object/record/read) to ensure decision requests are properly routed. For example:
    • When making a group decision over a superset resource (e.g. object/record), a joined decision is given that includes the policies of all subset resources (e.g. object/dataset, object/distribution etc.).
    • Because the resource object/dataset is a subset of the resource object/record, a dataset record is always governed by the policy of object/dataset, whether we query it via the registry API (which requires permission for the object/record/read operation) or the dataset search API (which requires permission for the object/dataset/read operation).
  • Eliminates the need to manually specify the policy file (e.g. via the authnReadPolicyId field) with the help of the single entry point policy model. This also makes it possible to create user-defined resources & operations.
  • Fully leverages the partial evaluation feature of OPA. Only one decision request is required, even for group decisions (e.g. get all records that the user is allowed to see).
  • Provides a more accurate OPA AST parser with improved performance, e.g. it automatically filters out duplicated & impossible rules and recognises rules that can be evaluated further.
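
To make the routing idea concrete, here is a minimal Python sketch (not Magda's actual implementation). It assumes, based on the examples in these notes, that the last segment of an operation URI is the operation name and the rest is the resource URI, and that a resource URI maps to a policy package by replacing "/" with "." (as in object/resourceB and object.resourceB). The function names and the SUBSET_RESOURCES table are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical table of superset -> subset resources, taken from the
# examples in these notes (the real system may define this differently).
SUBSET_RESOURCES: Dict[str, List[str]] = {
    "object/record": ["object/dataset", "object/distribution"],
}


def split_operation_uri(operation_uri: str) -> Tuple[str, str]:
    """Split e.g. 'object/record/read' into ('object/record', 'read')."""
    parts = [p for p in operation_uri.strip("/").split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not an operation URI: {operation_uri!r}")
    return "/".join(parts[:-1]), parts[-1]


def policy_package(resource_uri: str) -> str:
    """Map a resource URI to the policy package expected to govern it."""
    return resource_uri.strip("/").replace("/", ".")


def resources_in_group_decision(resource_uri: str) -> List[str]:
    """List the resource itself plus the subset resources whose policies
    would be joined into a group decision over it."""
    return [resource_uri] + SUBSET_RESOURCES.get(resource_uri, [])
```

Under these assumptions, a group decision over object/record would consult the policies for object/record, object/dataset and object/distribution, while a decision over object/dataset would consult only its own policy.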

More Powerful APIs with fine-grained access control

Thanks to the new authorisation system design, we are able to re-implement fine-grained access control for ALL our existing APIs.

Many APIs (e.g. the indexer APIs) that were previously only reachable from inside the cluster can now be exposed outside it, because we are able to grant access to non-admin users. This enables more use cases.

With the help of the new policy model, all APIs remain compatible with existing plugins & sub-systems. Although all requests between sub-systems are now governed by the policy engine, requests from existing sub-systems will still work, as existing admin users still have access to all resources.

However, authors of existing plugins/sub-systems might consider updating their code to communicate with the core APIs using an account with the least privilege, which is now possible with fine-grained access control of APIs.

New Settings Panel

v2.0.0 also comes with a new settings panel for all signed-in users. Depending on your access, you will see different sections: all users should at least be able to access the "My Account" tab, and users with the admin role can access all tabs. The settings panel UI is supplied to simplify common admin tasks such as managing users, roles, permissions, resources and operations.

To assign the admin role to the first user of a brand new system, please refer to this doc, which explains how to set a user as admin using the acs-cmd command line tool.

GitHub Container Registry

Since v2.0.0, we also publish all our Helm charts & Docker images to the GitHub Container Registry (in addition to the existing HTTP Helm chart repo & Docker Hub image releases).

If you want to try our Helm chart via an OCI registry, or prefer using a Docker registry other than Docker Hub, you can give it a try.

Breaking Changes & Compatibility

  • Setting a policy for a record via the authnReadPolicyId field is no longer supported.
    • Please manage access to registry records via the built-in permissions & constraints. See the permissions/role section of our architecture doc.
    • You can still supply your own policy files to extend the authorisation model. However, your policy must comply with the single entry point policy model, e.g. a policy governing the resource object/resourceB should be put into the package object.resourceB. You might also want to read the decision API doc.
  • The isAdmin field on the user no longer grants the user admin permission.
  • All scripts / command line tools now require Node 14.
  • The dataset-access-control aspect has been renamed to the access-control aspect.
    • The orgUnitOwnerId field has been renamed to orgUnitId.
    • The custodianOrgUnitId field has been moved to the publishing aspect.
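
For plugin authors who still hold records in the old shape, the aspect changes above can be sketched as a small transformation. This is an illustrative Python sketch only, not the migration Magda itself runs. It assumes a record's aspects are available as a plain dictionary keyed by aspect name and that custodianOrgUnitId previously lived in the dataset-access-control aspect; the function name is hypothetical.

```python
import copy
from typing import Any, Dict


def migrate_access_control_aspects(aspects: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `aspects` reshaped to the v2 aspect layout.

    - dataset-access-control is renamed to access-control
    - orgUnitOwnerId is renamed to orgUnitId
    - custodianOrgUnitId is moved to the publishing aspect
    """
    migrated = copy.deepcopy(aspects)  # never mutate the caller's data
    old = migrated.pop("dataset-access-control", None)
    if old is None:
        return migrated  # nothing to migrate

    access = dict(old)
    if "orgUnitOwnerId" in access:
        access["orgUnitId"] = access.pop("orgUnitOwnerId")
    if "custodianOrgUnitId" in access:
        # Assumption: the field moves as-is into the publishing aspect,
        # which is created when the record does not have one yet.
        publishing = migrated.setdefault("publishing", {})
        publishing["custodianOrgUnitId"] = access.pop("custodianOrgUnitId")

    migrated["access-control"] = access
    return migrated
```

Any other fields found in the old aspect are carried over unchanged, so the sketch does not need to know the full aspect schema.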

Migration

If you are on version <= v1.3.1, you should be able to migrate to v2.0.0 simply by deploying the v2.0.0 Helm Chart.

Please note: any authorisation model implemented with external policy files will need to be reimplemented on top of the v2 authorisation model.

If you are on any v2 alpha version (e.g. v2.0.0-alpha.8), you will have to completely uninstall Magda before upgrading to v2.0.0, as the SQL migration files can't tell the difference between v2.0.0 and any of the v2 alpha versions.
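
The two migration rules above can be expressed as a small pre-upgrade check. The following Python sketch is an assumption-laden illustration, not an official tool: the function name and the returned labels are hypothetical, and any version not covered by the notes above is reported as "unknown" rather than guessed.

```python
import re


def upgrade_path(current_version: str) -> str:
    """Classify how a deployment at `current_version` can reach v2.0.0."""
    match = re.fullmatch(
        r"v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?", current_version.strip()
    )
    if match is None:
        raise ValueError(f"unrecognised version: {current_version!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    prerelease = match.group(4) or ""

    # v2 alpha versions: SQL migrations cannot distinguish them from v2.0.0.
    if major == 2 and prerelease.startswith("alpha"):
        return "uninstall-then-install"
    # Versions up to and including v1.3.1 can deploy the v2.0.0 Helm chart.
    if (major, minor, patch) <= (1, 3, 1):
        return "deploy-chart"
    return "unknown"
```

For example, under these assumptions `upgrade_path("v2.0.0-alpha.8")` yields "uninstall-then-install", while `upgrade_path("v1.3.1")` yields "deploy-chart".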

Changes since v1.3.1

  • #3231 Upgraded to Open Policy Agent v0.33.x
  • #3253 Add New /auth/opa/decision Endpoint
  • move pre-defined role ids into a single constant file in magda-typescript-common
  • related #3250: Rewrote the OPA AST parser for better evaluation & reference handling
  • related #3250: Added common policy entry point: entrypoint/allow.rego
  • #3303 rename dataset-access-control aspect to access-control aspect
  • #3256 Unified policy for securing both registry records & Elasticsearch datasets
  • #3304 Implement authorisation enforcement for all registry API services according to the more generic auth model defined in #3256
  • #3306 Added admin web UI for auth management
  • Added new APIs for managing auth objects
  • #3308 Policy enforcement on auth objects APIs
  • Add records/:id/aspects & records/:id/aspects/count APIs to registry
  • related #3250: Rewrite decision enforcement logic for search API
  • #3326 Build OPA docker image with builtin policies & Run OPA as a sidecar
  • #3330 Fine-tune Metadata creation tool workflow for the new auth model
  • #3332 Policy Enforcement on Storage API using new auth model
  • #3333 Rename access-control aspect orgUnitOwnerId field to orgUnitId
  • #3335 Reshape Storage API (in progress)
  • #3340 OPA AST parser takes too long to evaluate the large response
  • Increase registry default request timeout to 60s (from 30s)
  • Bump Dataset index version to 49 & publishers index version to 7
  • Fixed registry API generated ts client patchRecord API response type
  • add allowAutoCrawlOnStartingUp option to indexer
  • add /v0/status/live & /v0/status/ready endpoint to indexer
  • Related to #3315, introduce ServiceRunner to launch an integrated test & dev environment
  • #3315 rewrite auth-related integration tests and execute tests using ServiceRunner rather than mini k8s cluster kind
  • #3345 Add dataset index / delete by ID API endpoints to the indexer
  • Add debug switch to registry API, search API & Auth API
  • Related to #3331, remove any logic related to obsolete authnReadPolicyId record field
  • Related to #3331, remove obsolete policy files object/registry/*
  • Added APIs for managing API keys
  • #3345 Add Adhoc dataset indexing APIs, secure with new auth model and expose via Gateway with other indexer APIs
  • "My dataset section": Add Dataset deletion function & performance improvement
  • Fixed: helm chart version update script incorrectly skips updating dependency versions when the versionUpdateExclude config in package.json is empty
  • Fixed: incorrect scss-compiler-config data might be generated if non-default docker image config is used.
  • #3360: Recategorize auth object-related API docs & more auth object-related APIs for convenience of future UI development.
  • Fixed: set-scss-vars script will use proper imagePullSecrets to generate the UI CSS update job
  • Allow the no cache behaviour of whoami & getUserById api to be turned off
  • #3269 Backend support for Access Group
  • #3366 New Registry API capability: new filter operator, merge mode, group operations and more
  • Add target & rel fields to footer link item schema
  • Make sure the dataset creation tool always creates dcat-dataset-strings aspect for draft datasets
  • Add reversePageTokenOrder parameter to registry APIs GET /records, GET /records/summary & GET /records/:recordId/history
  • #3163 Fixed Registry history API incorrectly reports more events available on the last page
  • #3383 Registry get records API didn't group aspectOrQuery correctly
  • Upgrade to Node 14
  • Upgrade bcrypt to 5.0.1
  • Fixed auth policy: make sure datasets without publishing aspect will be considered published
  • #3348 secure openfaas gateway with new auth model & update API docs
  • #3317 Use multi-stage builds to build magda-scss-compile
  • #3337 Release docker images & helm charts to GitHub Container Registry as well
  • #3349 secure content API with new auth model
  • #3350 secure admin API (managing connectors) with new auth model
  • #3351 Secure tenant API with new auth model
  • Related to #3331, remove legacy middlewares
  • #3352 Remove usage of MUST_BE_ADMIN middleware in auth API
  • #3380 Support Graceful Pod Shutdown
  • Upgrade to typescript 4.2.4 & api-extractor 7.15.2
  • Upgrade react-scripts to 4.0.3 & craco to 6.4.5
  • Make region index initial creation execute in the background without blocking the indexer
  • #3343 Consolidate admin/settings UI functionalities
  • #3395 Allow users to change the publishing status of the published dataset
  • Fixed metadata creation tool/editor cache handling issue & workflow improvements.