Unity/configure notebook multi vs hosts #9

Closed

YuriJin-Unity

Added support for multiple Istio hosts.
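
The diff itself is not reproduced in this conversation. As a rough sketch of what supporting several hosts could look like, assuming the host setting is a comma-separated string that ends up in the `hosts` list of the notebook's VirtualService (the helper names `parse_istio_hosts` and `virtual_service_spec`, and the comma-separated format, are assumptions and not taken from the PR):

```python
from typing import Dict, List


def parse_istio_hosts(raw: str, default: str = "*") -> List[str]:
    """Split a comma-separated host setting into a clean list of hosts.

    Hypothetical helper: the name and the input format are assumptions
    made for illustration, not the PR's actual implementation.
    """
    hosts = [h.strip() for h in raw.split(",") if h.strip()]
    # Fall back to the wildcard host when nothing usable was configured.
    return hosts or [default]


def virtual_service_spec(hosts: List[str], gateway: str) -> Dict:
    """Build the part of a VirtualService spec that lists hosts and gateways."""
    return {"hosts": list(hosts), "gateways": [gateway]}
```

With a single host the result is a one-element list, so existing single-host setups would keep working under this assumption.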

Vivian Pan and others added 26 commits December 1, 2021 12:34
* Fix(manifests): Upgrade rbac.authorization.k8s.io from v1beta1 to v1 (kubeflow#6261)

* proposal: Extend Notebook Controller to expose idleness for Jupyter (kubeflow#6295)

* proposal: Extend Notebook Controller to expose idleness for Jupyter (kubeflow#6270)

Provide a design doc as a proposal for extending Notebook Controller to
expose idleness for Jupyter. Our proposal is in markdown format and follows
the guidelines of the kubeflow/components/proposal/README.md guide.

You can view the kubeflow#6270 issue in the following link:
kubeflow#6270

Signed-off-by: Athanasios Markou <[email protected]>

* review: change the title of the proposal

Change the title of the proposal to only include the
proposed new feature. The new title of the proposal
will now be "Expose Idleness Information for Jupyter
Notebooks".

* review: rename the proposal markdown file

We want to give a more specific name to the markdown
file that contains the proposal. Since this proposal
focuses on a feature for Jupyter Notebooks,
the new name will be:

20220121-jupyter-notebook-idleness.md

* Synchronize jupyter-web-application role with clusterrole (kubeflow#6241)

* Update role.yaml

* Update role.yaml

* Update cluster-role.yaml

* Kubeflow Roadmap update - with 1.5 details (kubeflow#6266)

* Kubeflow Roadmap update - with 1.5 details

These proposed changes identify that 1.4.1 has been delivered, provide themes for 1.5, and provide details of major features in 1.5 by working group. This is an initial proposal that needs review by the working group leads.

* correct formatting in KFP features

Moved KFP features under KFP Control Flow doc

* updating KFP section

updating KFP references with updates from KFP team

* Updated the 1.5 release date to March

updated the 1.5 release date to March

* Update ROADMAP.md

change Hyperparameter leader election to Katib leader election

Co-authored-by: Andrey Velichkevich <[email protected]>

* Update ROADMAP.md

improve description and details of feature for metrics collector

Co-authored-by: Andrey Velichkevich <[email protected]>

* Update Katib description for Early stopping in 1.5

updating with Andrey's suggestion (but without the word proper). * Validation for Early Stopping algorithm settings helps users to proper reduce model overfitting

Co-authored-by: Andrey Velichkevich <[email protected]>

* notebooks: Extend Notebook Controller to expose idleness for Jupyter (kubeflow#6297)

* notebooks: Update image's tag in make

Modify the Makefile to properly update the TAG
based on the git TAG.

Signed-off-by: Athanasios Markou <[email protected]>
Reviewed-by: Kimonas Sotirchos <[email protected]>

* notebooks: Expose last-activity

Extend the notebook-controller to:
* cull idle Notebook Servers based on their new `last-activity`
  annotation
* expose the last activity of each Notebook Server as an annotation
  on the metadata of the corresponding CR object

Modify notebook_controller.go to:
* update the Last Activity of each Notebook Server that has a
  Running pod
* delete the Last Activity Annotation for every Notebook Server
  that does not have a Running pod

Extend culler.go to:
* perform culling based on the new `last-activity` annotation and
  not based on the `/api/status` endpoint.
* update the last activity of a Notebook Server, based on the
  kernels' execution states.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Athanasios Markou <[email protected]>
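
The culling flow described above can be sketched as follows. This is a minimal illustration, assuming the kernel list has the shape returned by Jupyter's `/api/kernels` endpoint (`execution_state` and `last_activity` per kernel) and that the annotation key is `notebooks.kubeflow.org/last-activity`; the key and the Python function names are assumptions, and the real logic lives in Go in `culler.go`.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

# Assumed annotation key, used here for illustration only.
LAST_ACTIVITY_ANNOTATION = "notebooks.kubeflow.org/last-activity"


def all_kernels_are_idle(kernels: List[Dict]) -> bool:
    """Return True when every kernel reports the "idle" execution state."""
    return all(k.get("execution_state") == "idle" for k in kernels)


def compute_last_activity(kernels: List[Dict], now: datetime) -> Optional[datetime]:
    """Derive a last-activity timestamp from the kernels' states.

    A non-idle kernel counts as activity right now; otherwise the most
    recent per-kernel `last_activity` wins. Returns None without kernels.
    """
    if not kernels:
        return None
    if not all_kernels_are_idle(kernels):
        return now
    # fromisoformat handles "+00:00" offsets; a trailing "Z" needs Python 3.11+.
    return max(datetime.fromisoformat(k["last_activity"]) for k in kernels)


def notebook_needs_culling(annotations: Dict[str, str], now: datetime,
                           idle_limit: timedelta) -> bool:
    """Cull only when the annotation exists and is older than `idle_limit`."""
    raw = annotations.get(LAST_ACTIVITY_ANNOTATION)
    if raw is None:
        return False
    return now - datetime.fromisoformat(raw) > idle_limit
```

Keeping the decision on the annotation rather than on a live HTTP call means the culling check only needs the CR's metadata.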

* notebooks: Introduce a DEV env var

We introduce a DEV env var to allow admins
to develop and test their custom Notebook
Controller on their local machine.
We provide information and instructions inside
the components/notebook-controller/README.md.

Signed-off-by: Athanasios Markou <[email protected]>
Reviewed-by: Kimonas Sotirchos <[email protected]>

* notebooks: Add unit tests for last-activity

* Introduce new tests for allKernelsAreIdle()
* Extend the tests for NotebookIsIdle() and for
  NotebookNeedsCulling().

Signed-off-by: Athanasios Markou <[email protected]>
Reviewed-by: Kimonas Sotirchos <[email protected]>

* review: UpdateNotebookLastActivityAnnotation()

Ensure that UpdateNotebookLastActivityAnnotation() does not return
"true". This function should not return any value.

Signed-off-by: Athanasios Markou <[email protected]>

* jwa: Rework the Storage API of the web app (kubeflow#6321)

* wa(back): Add helper for deserializing JSON obj

In some cases we might need to construct Python k8s lib objects from the
JSONs that are provided by clients. For example, the UI will be sending a
PVC object in JSON format, so the backend will need to create the
corresponding client.V1PersistentVolumeClaim object and submit it.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>

* wa(back): Serialization helper

Add helper function for converting a k8s-client object into a dict that
can be sent as an HTTP response.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>

* wa(back): Add dry run to Notebooks and PVCs

The backend will need to be able to create objects with dry-run, in
order to ensure they are valid. The backend will need to check that both
the Notebook and the PVCs can be created beforehand.

This way we avoid the scenario where we create PVCs but the Notebook
fails to be created, and the PVCs are never garbage collected.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>
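
The two-phase idea above can be sketched with an in-memory stand-in for the API. `FakeApi`, `ValidationError`, and `create_notebook_with_pvcs` are hypothetical names for illustration; with the real Python client the dry-run request should correspond to the `dry_run` argument of the create calls.

```python
class ValidationError(Exception):
    """Raised by the stand-in API when an object is rejected."""


class FakeApi:
    """In-memory stand-in for a Kubernetes API client (hypothetical)."""

    def __init__(self):
        self.created = []

    def create(self, obj, dry_run=False):
        # Validation runs for both dry-run and real requests.
        if not obj.get("metadata", {}).get("name"):
            raise ValidationError("metadata.name is required")
        # Only real requests persist anything.
        if not dry_run:
            self.created.append(obj)


def create_notebook_with_pvcs(api, notebook, pvcs):
    """Validate everything with dry-run first, then create for real."""
    # Phase 1: an invalid object aborts here, before anything is persisted.
    for obj in [notebook, *pvcs]:
        api.create(obj, dry_run=True)
    # Phase 2: every object passed validation, so create them.
    for pvc in pvcs:
        api.create(pvc)
    api.create(notebook)
```

If the Notebook is invalid, the first phase raises and no PVC is left behind.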

* wa(back): Update kubernetes to 0.17

In order to support dry-run we must use the 0.17 version of the Python
k8s client.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>

* wa(back): Extend api module to patch pvcs

The backend will need to be able to PATCH PVCs in order to set the
ownerReference to the Notebook that mounts the PVCs.

Ref: arrikto/dev/issues/386#issuecomment-856700392

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>
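
A minimal sketch of the PATCH body such a call could send, assuming the Notebook is available as a plain dict with `apiVersion`, `kind`, `metadata.name`, and `metadata.uid` (the helper name `owner_reference_patch` is hypothetical):

```python
def owner_reference_patch(notebook: dict) -> dict:
    """Build a PATCH body that makes `notebook` the owner of a PVC."""
    meta = notebook["metadata"]
    return {
        "metadata": {
            "ownerReferences": [
                {
                    "apiVersion": notebook["apiVersion"],
                    "kind": notebook["kind"],
                    "name": meta["name"],
                    # Must be the uid of the live Notebook object.
                    "uid": meta["uid"],
                }
            ]
        }
    }
```

Since the uid only exists once the Notebook has been created, the PVC would be patched after the Notebook, and both need to live in the same namespace for the owner reference to be valid.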

* jwa(back): Work with new Volumes API

The backend API should not add any more layers of abstractions on top of
the K8s API. The backend should expect the client/UI to be sending the
entire PVC spec of a new PVC.

Refs: arrikto/dev/issues/386

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>

* jwa(back): Add unittests for new volumes API

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>

* jwa(back): Extend the PVC info returned

We want to show both the access mode and size of the existing PVCs, when
a user clicks on the dropdown to select which PVC to mount.

The backend will need to provide this information to the frontend. We
don't want to send the K8s list of PVCs since this would result in a lot
of unnecessary data being sent.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Ilias Katsakioris <[email protected]>
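
As an illustration of trimming the payload, the sketch below reduces a PVC (as a plain dict in the Kubernetes API shape) to the fields the dropdown needs. The function name and the response keys `name`, `size`, and `mode` are assumptions, not necessarily what the backend returns.

```python
def pvc_summary(pvc: dict) -> dict:
    """Keep only the name, requested size, and first access mode of a PVC."""
    spec = pvc.get("spec", {})
    requests = spec.get("resources", {}).get("requests", {})
    modes = spec.get("accessModes") or [None]
    return {
        "name": pvc["metadata"]["name"],
        "size": requests.get("storage"),
        "mode": modes[0],
    }
```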

* jwa(front): Add proxy config for Rok

When developing the Rok flavor locally we will need to be able to open
the Rok chooser. This can be done by using Angular/webpack proxy to
bring the exposed rok service and the app under the same domain.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa(front): Remove card from form

The form of the app should not be a big card, but a normal form.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa(front): Install AceModule for yaml editing

Install AceModule to allow users to edit yamls of objects.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* wa(front): Change the styling of form sections

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa(front): Create common volume components

Component for:
* New PVC and configuring its spec
* Attaching an existing PVC in a Notebook

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa(front): Update Rok form for new Volume API

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa(front): Mark inputs as dirty when restoring Lab

When the UI autofills the form with values from a JupyterLab snapshot
then it should mark the touched fields as dirty. This way if a field has
errors the UI will make that input red.

Signed-off-by: Kimonas Sotirchos <[email protected]>
Reviewed-by: Tasos Alexiou <[email protected]>

* jwa: Update ConfigMap in manifests

Signed-off-by: Kimonas Sotirchos <[email protected]>

* jwa(front): Fix format errors

Signed-off-by: Kimonas Sotirchos <[email protected]>

* profiles: Update the permissions for notebook idleness (kubeflow#6335)

Extend the Profiles Controller to give the Notebook Controller permission
to make GET requests to each notebook's /api/kernels endpoint.

Refs https://github.com/kubeflow/kubeflow/blob/master/components/proposals/20220121-jupyter-notebook-idleness.md

Signed-off-by: Kimonas Sotirchos <[email protected]>

* notebooks: Graceful handling of events (kubeflow#6338)

* notebooks: Handle events gracefully

The controller is not exiting the reconciliation loop after it has
re-emitted a Pod/STS Event as a Notebook Event. This causes the
controller to later try to GET a Notebook with the name of the Event
that triggered the reconciliation loop.

The controller should exit the reconciliation function once it has
emitted the event.

Signed-off-by: Kimonas Sotirchos <[email protected]>
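
The early exit can be pictured with a simplified reconcile step. The data model here (dicts keyed by name, string return values) is hypothetical and only meant to show the control flow, not the controller's Go code.

```python
def reconcile(request, events, notebooks, emitted):
    """Simplified reconcile step.

    `events` and `notebooks` map object names to objects; `emitted`
    collects the Notebook events the controller re-emits.
    """
    event = events.get(request)
    if event is not None:
        emitted.append({"notebook": event["notebook"], "message": event["message"]})
        # Stop here: the request named an Event, so looking up a
        # Notebook with the same name would fail.
        return "event-handled"
    if request not in notebooks:
        return "not-found"
    return "reconciled"
```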

* notebooks: Don't reconcile on deleted events

We don't want to trigger the reconciliation function when an event gets
deleted.

If a Notebook is deleted then the underlying events are
deleted as well, which causes the reconcile function to be
triggered and to try to GET Events and Notebooks with the name of the
deleted event.

Signed-off-by: Kimonas Sotirchos <[email protected]>
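
Conceptually this is a filter in front of the work queue. A minimal sketch, with a hypothetical `should_reconcile` function and notifications modelled as `(change_type, kind)` pairs; in controller-runtime the same effect is usually achieved with a predicate whose delete handler returns false.

```python
def should_reconcile(change_type: str, kind: str) -> bool:
    """Decide whether a watch notification should enqueue a reconcile."""
    # Deleted Events carry nothing to re-emit, so they are dropped.
    if kind == "Event" and change_type == "delete":
        return False
    return True


def enqueue(notifications):
    """Keep only the notifications that should trigger the reconciler."""
    return [n for n in notifications if should_reconcile(*n)]
```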

* notebooks: Fix endless restarts (kubeflow#6341)

* notebooks: Update notebook if timestamp changed

We don't want to be updating the spec of the notebook if the timestamp
hasn't changed, since this will lead to constant updates and
reconciliation loops.

Signed-off-by: Kimonas Sotirchos <[email protected]>
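
A minimal sketch of the guard, assuming the timestamp is stored as a string under some key of a dict (the helper name `set_if_changed` is hypothetical): the caller only issues an update when the function reports a change.

```python
def set_if_changed(annotations: dict, key: str, new_value: str) -> bool:
    """Write `new_value` under `key`; return True only if it differed."""
    if annotations.get(key) == new_value:
        # Nothing to do: skipping the write avoids a needless update,
        # which would itself trigger another reconciliation.
        return False
    annotations[key] = new_value
    return True
```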

* notebooks: Use a deep-copy of the notebook spec

The controller should use a deep-copy of the notebook spec when
calculating the spec for the StatefulSet. Otherwise we could
unintentionally update the notebook object, since the spec could have
been changed when calculating the STS spec.

Signed-off-by: Kimonas Sotirchos <[email protected]>
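
The aliasing problem is easy to reproduce with nested dicts. In this sketch `build_sts_spec` is a hypothetical stand-in for the StatefulSet generation, and the `workingDir` default is only an example of a mutation made while deriving the spec.

```python
import copy


def build_sts_spec(notebook_spec: dict) -> dict:
    """Derive a StatefulSet-like spec without touching the input."""
    # Deep-copy first so the mutation below stays local to the copy.
    pod_spec = copy.deepcopy(notebook_spec["template"]["spec"])
    pod_spec["containers"][0].setdefault("workingDir", "/home/jovyan")
    return {"replicas": 1, "template": {"spec": pod_spec}}
```

Without the `deepcopy`, the default would be written into the Notebook's own spec and a later update call would persist it.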

* notebooks: Add prefix env var only if missing

The controller should be setting OR updating the NB_PREFIX env var.
Previously it would always blindly append it to the spec, which could
result in double entries for the same env var.

Signed-off-by: Kimonas Sotirchos <[email protected]>
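
The set-or-update behaviour can be sketched on a list of env entries in the Kubernetes container shape. `set_env_var` is a hypothetical helper name and the prefix value is only an example.

```python
def set_env_var(env: list, name: str, value: str) -> list:
    """Set `name` to `value`, updating an existing entry if present."""
    for entry in env:
        if entry.get("name") == name:
            # Update in place instead of appending a duplicate.
            entry["value"] = value
            return env
    env.append({"name": name, "value": value})
    return env
```

Calling it repeatedly with the same name leaves exactly one entry, which is what makes the operation safe to run on every reconciliation.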

* releasing: Update tags for v1.5.0-rc.1 (kubeflow#6343)

Signed-off-by: Kimonas Sotirchos <[email protected]>

Co-authored-by: Andrey Velichkevich <[email protected]>
…ields" (kubeflow#6195)

error comparison between pointer and pointer in "CopyStatefulSetFields"

Configure the dashboard to use the KServe app instead of the KFServing
0.6.1 one.

Signed-off-by: Kimonas Sotirchos <[email protected]>

The controller should not trigger the reconcile loop when an Event is
deleted. Previously the controller would run the reconciliation loop on
any event deletion.

This commit updates it to not run the loop for ANY event.

Signed-off-by: Kimonas Sotirchos <[email protected]>

WIP: Attempt to respect all namespace admin role bindings

…ies/kubeflow into unity/configure-notebook-multi-vs-hosts

# Conflicts:
#	components/access-management/kfam/api_default.go
#	components/notebook-controller/config/manager/kustomization.yaml
#	components/notebook-controller/config/manager/manager.yaml
#	components/notebook-controller/config/manager/params.env
#	components/notebook-controller/controllers/notebook_controller.go

This reverts commit f22cb86.

This reverts commit 3364fbb.

@YuriJin-Unity
Author

Ongoing issue: kubernetes-sigs/controller-runtime#2720

@aubrianna-zhu

@YuriJin-Unity can you rebase the changes on `existing-changes` so the commit history is cleaner? It's hard to maintain our changes in this repo 😬

kimwnasptd and others added 8 commits March 25, 2024 15:13
This reverts commit f22cb86.
This reverts commit 3364fbb.
…s-hosts' into unity/configure-notebook-multi-vs-hosts
@YuriJin-Unity deleted the unity/configure-notebook-multi-vs-hosts branch March 25, 2024 22:33

8 participants