
Support CSI start AlluxioFuse process in separate pod #15221

Merged: 18 commits, Apr 12, 2022

Conversation

@ssz1997 (Contributor) commented Mar 29, 2022:

What changes are proposed in this pull request?

Make CSI launch a separate pod running the AlluxioFuse process, instead of launching the AlluxioFuse process in the CSI nodeserver container.

Why are the changes needed?

If the nodeserver container or the node-plugin pod goes down for any reason, we lose the AlluxioFuse process and it is very cumbersome to bring it back. With a separate Fuse pod, the CSI pod no longer affects the Fuse process.

Solves #14917

Does this PR introduce any user facing changes?

  1. Removed `javaOptions` from the `csi` section in `values.yaml`. Alluxio properties in the Helm chart should be organized in one place, not split between `properties` and `csi`.
  2. Added the property `mountInPod` to the `csi` section. If set to `true`, the Fuse process is launched in a separate pod (see the sketch below).
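
For illustration, the relevant csi block in values.yaml might look roughly like this after the change. Only mountInPod, javaOptions, clientEnabled, and accessModes are discussed in this PR; the other keys and default values here are assumptions, not the chart's actual defaults:

csi:
  enabled: true                 # assumed toggle for the CSI components
  # New in this PR: if true, the AlluxioFuse process runs in a separate Fuse pod
  # instead of inside the CSI nodeserver container.
  mountInPod: true
  # Kept after review; JVM options passed to the Fuse process.
  javaOptions: ""
  # for csi client
  clientEnabled: false
  accessModes:
    - ReadWriteOnce             # the CSI controller only supports single-node writer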

@ssz1997 (Contributor, author) commented Mar 31, 2022:

@Binyang2014 Feel free to review

@jiacheliu3 (Contributor) left a comment:

Honestly I know nothing about CSI as of now, so I could only review the helm chart part :(

Comment on lines 662 to 666
    cpu: 4
    memory: 8G
  requests:
-   cpu: "1"
-   memory: "1G"
+   cpu: 10m
+   memory: 300Mi
Contributor:

I guess this is a FUSE daemon, which is a JVM? Will a low request and a high limit cause the JVM heap to resize? Would it be better to allocate more resources up front? Did you get this 300Mi from a test or an estimate?

Contributor (author):

This is a little complicated; I should have commented beforehand. We now support two modes. In one, the Fuse daemon runs in this container, which requires more resources. In the other, the Fuse daemon runs in another pod, and this container requires much less. The mode depends on the property mountInPod, so I'm not sure what the best way to allocate resources is.

Contributor:

If it's modal based on mountInPod, I'd stick with the default being the greedy request, and then conditionally use some lower value in the template file (i.e. if .Values.csi.mountInPod).
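
For illustration, such a conditional in the nodeserver template could look roughly like this (a sketch; it assumes the csi.mountInPod value and reuses the resource numbers quoted above):

resources:
  limits:
    cpu: 4
    memory: 8G
  requests:
    {{- if .Values.csi.mountInPod }}
    cpu: 10m
    memory: 300Mi
    {{- else }}
    cpu: "1"
    memory: "1G"
    {{- end }}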

Contributor (author):

Makes a lot of sense. Thanks

@ssz1997 (Contributor, author) commented Mar 31, 2022:

Actually let's just keep the resources as they were. If there's no fuse process in the container, 1 cpu and 1G memory is probably excessive but I don't think it's gonna hurt

# for csi client
clientEnabled: false
accessModes:
- ReadWriteMany
- ReadWriteOnce
Contributor:

Any implications of changing this?

Contributor (author):

Our csi controller actually only supports ReadWriteOnce. It will error out if the accessMode is anything else.

for _, cap := range req.VolumeCapabilities {
	if cap.GetAccessMode().GetMode() != supportedAccessMode.GetMode() {
		return &csi.ValidateVolumeCapabilitiesResponse{Message: "Only single node writer is supported"}, nil
	}
}

@ZhuTopher (Contributor) left a comment:

I wasn't able to provide much insight for the CSI changes, k8s library stuff looks fine to me. Some minor style nits here and there. Also remember to update the Helm Chart.md and CHANGELOG.md, thanks!

@@ -49,6 +53,7 @@ func (d *driver) newNodeServer() *nodeServer {
	return &nodeServer{
		nodeId: d.nodeId,
		DefaultNodeServer: csicommon.NewDefaultNodeServer(d.csiDriver),
		client: d.client,
Contributor:

nit: misaligned whitespace

Comment on lines 15 to 21
"fmt"
csicommon "github.com/kubernetes-csi/drivers/pkg/csi-common"
"io/ioutil"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/kubernetes/scheme"
Contributor:

nit: I don't recall what our import style conventions were, but I'd prefer to keep the Go standard libraries in a separate block from packages like github.com or k8s.io.
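
For illustration, grouping those imports that way would look like this (same import list as quoted above):

import (
	"fmt"
	"io/ioutil"

	csicommon "github.com/kubernetes-csi/drivers/pkg/csi-common"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
)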

csiFuseObj, grpVerKind, err := scheme.Codecs.UniversalDeserializer().Decode(csiFuseYaml, nil, nil)
if err != nil {
	glog.V(4).Info("Failed to decode csi-fuse config yaml file")
	return nil, status.Errorf(codes.NotFound, "Failed to decode csi-fuse config yaml file.\n", err.Error())
Contributor:

Is there a more fitting error code than codes.NotFound?

Contributor (author):

Changing to codes.Internal. Indeed this is not a NotFound.
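
(For illustration, the adjusted call would read roughly as follows; note that the original format string has no verb for the error argument, so a %v is assumed here:)

return nil, status.Errorf(codes.Internal, "Failed to decode csi-fuse config yaml file.\n%v", err.Error())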


@ssz1997 (Contributor, author) commented Apr 1, 2022:

@jiacheliu3 @ZhuTopher PTAL. Thanks!

@jiacheliu3 (Contributor) left a comment:

the helm chart parts LGTM except one small nit, thanks!

kind: Pod
apiVersion: v1
metadata:
  name: {{ $fullName }}-fuse-
Contributor:

nit

Suggested change:
-   name: {{ $fullName }}-fuse-
+   name: {{ $fullName }}-fuse

@Binyang2014 (Contributor) left a comment:

After this change, if two jobs use the same volume, they will share the same fuse daemon. This breaks the isolation we assumed before: if Job A uses I/O heavily, Job B will be impacted due to the thread number limitation, and if Job A causes the fuse daemon to crash, Job B will crash as well. The ideal solution is for job A and job B to use different fuse daemons, so the job submitter can configure the fuse daemon's resource limits based on their requirements.

	return nodePublishVolumeMountPod(req)
}

func nodePublishVolumeMountProcess(req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
Contributor:

Make it a member function of nodeServer?

Contributor (author):

Why should we do it? We are not using the nodeServer in it

Contributor:

You can ignore this if it doesn't make sense to you.

	if err != nil {
		return nil, err
	}
	if _, err := ns.client.CoreV1().Pods("default").Create(fusePod); err != nil {
Contributor:

It should be in the same namespace as the nodeserver, not the default namespace.
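
One common way to give the nodeserver container its own namespace is the downward API, e.g. (illustrative; the env var name NAMESPACE matches what the later code reads via os.Getenv):

env:
  - name: NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace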

if nodeId == "" {
return nil, status.Errorf(codes.InvalidArgument, "nodeID is missing in the csi setup.\n%v", err.Error())
}
csiFusePodObj.Name = csiFusePodObj.Name + nodeId
Contributor:

What if multiple pods mount different volumes on the same node? Will they use the same pod name?

Contributor (author):

Good point. Thank you.
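
For illustration, one way to make the Fuse pod name unique per node and per volume (a sketch; the actual naming scheme adopted in the PR may differ):

// Append both the node ID and the volume ID so that different volumes
// mounted on the same node get distinct Fuse pod names.
csiFusePodObj.Name = csiFusePodObj.Name + nodeId + "-" + req.GetVolumeId()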


0.6.41

- Remove javaOptions under CSI
Contributor:

Why remove javaOptions?

@ssz1997 (Contributor, author) commented Apr 4, 2022:

It should be there. Adding it back.

@@ -94,3 +102,19 @@ func startReaper() {
		}
	}()
}

func startKubeClient() (*kubernetes.Clientset, error) {
Contributor:

Change to newKubeClient?
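
For reference, an in-cluster client constructor usually looks roughly like this (a sketch, not necessarily the PR's actual implementation):

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func newKubeClient() (*kubernetes.Clientset, error) {
	// Use the service account mounted into the CSI node plugin pod.
	config, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	return kubernetes.NewForConfig(config)
}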

Comment on lines 121 to 135
	i := 0
	for i < 10 {
		time.Sleep(3 * time.Second)
		command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", stagingPath))
		stdout, err := command.CombinedOutput()
		if err != nil {
			glog.V(3).Infoln("Alluxio is not mounted.")
		}
		if len(stdout) > 0 {
			break
		}
		i++
	}
	if i == 10 {
		glog.V(3).Infoln("alluxio-fuse is not mounted to global mount point in 30s.")
Contributor:

Move this to NodeStageVolume? After NodeStageVolume we should make sure the volume is already ready on this node.

Contributor (author):

Makes a lot of sense. Thanks

@ssz1997 (Contributor, author) commented Apr 4, 2022:

> After this change, if two jobs use the same volume, they will share the same fuse daemon. This breaks the isolation we assumed before: if Job A uses I/O heavily, Job B will be impacted due to the thread number limitation, and if Job A causes the fuse daemon to crash, Job B will crash as well. The ideal solution is for job A and job B to use different fuse daemons, so the job submitter can configure the fuse daemon's resource limits based on their requirements.

@Binyang2014 Thank you so much for your review. I think these are good points.

However, I'm wondering: if jobs A and B are different workloads (one I/O-heavy and one not), does it make more sense to use different PV/PVCs, so that they launch separate Fuse pods and thus use different Fuse processes?

Do you prefer to just get rid of the current way of launching Fuse processes in the nodeserver container and launch them in different pods instead?

@Binyang2014 (Contributor) commented:

> > After this change, if two jobs use the same volume, they will share the same fuse daemon. [...]
>
> @Binyang2014 Thank you so much for your review. I think these are good points.
>
> However, I'm wondering: if jobs A and B are different workloads (one I/O-heavy and one not), does it make more sense to use different PV/PVCs, so that they launch separate Fuse pods and thus use different Fuse processes?
>
> Do you prefer to just get rid of the current way of launching Fuse processes in the nodeserver container and launch them in different pods instead?

For AI scenarios, most jobs share a few well-known datasets such as ImageNet. Users may tune different models based on the same dataset, so a group/cluster admin may create one PV/PVC and all these jobs attach the same volume. Since we don't know the processing speed of each model, some models may process faster than others. I agree the admin can create different PV/PVCs for different jobs, but this approach doesn't seem to be recommended by Kubernetes.

For the second question:
Yes, launching them in different pods will bring benefits for resource isolation and platform upgrades. There are two problems with the current approach:

  1. Isolation: we cannot cap the CPU/memory usage of each fuse daemon; if a less important job consumes data too fast, it will impact other critical jobs.
  2. Upgrading the node-server kills the fuse daemon as well as the related jobs.

So from my point of view, after migrating to the pod method, we need to solve the above problems.

@ssz1997 (Contributor, author) commented Apr 7, 2022:

@Binyang2014 Thanks for the clarifications.

For the next step, we plan to have two modes for the pod method: 1. jobs using the same PV/PVC share one fuse daemon; 2. each job always has its own fuse daemon.

For problem #2 you mentioned, I believe that as long as the fuse process is in a different pod, upgrading the nodeserver should not kill the fuse process. For the second mode, in which each job has its own fuse daemon, we may be able to pass in a cap to limit its resource consumption when the fuse pod is started, which resolves problem #1.

@@ -122,7 +126,6 @@ func (cs *controllerServer) DeleteVolume(ctx context.Context, req *csi.DeleteVol
glog.V(3).Infof("Invalid delete volume req: %v", req)
return nil, err
}
glog.V(4).Infof("Deleting volume %s", volumeID)
Contributor (author):

We don't do anything here. The log is misleading.

Contributor:

Does the returned value not trigger a deletion? If so why is the return type not a CreateVolumeResponse?
return &csi.DeleteVolumeResponse{}, nil

Contributor (author):

Actually, the returned value will trigger the deletion of the PV. However, that does not happen inside this function, so the logging should not be here either. Plus, we are not removing any data stored in Alluxio.

@ZhuTopher (Contributor) left a comment:

Overall LGTM, thanks for all this work Shawn!

Comment on lines 173 to 205
	if req.GetVolumeContext()["mountInPod"] == "true" {
		ns.mutex.Lock()
		defer ns.mutex.Unlock()

		glog.V(4).Infoln("Creating Alluxio-fuse pod and mounting Alluxio to global mount point.")
		fusePod, err := getAndCompleteFusePodObj(ns.nodeId, req)
		if err != nil {
			return nil, err
		}
		if _, err := ns.client.CoreV1().Pods(os.Getenv("NAMESPACE")).Create(fusePod); err != nil {
			return nil, status.Errorf(codes.Internal, "Failed to launch Fuse Pod at %v.\n%v", ns.nodeId, err.Error())
		}
		glog.V(4).Infoln("Successfully creating Fuse pod.")

		// Wait for alluxio-fuse pod finishing mount to global mount point
		i := 0
		for i < 12 {
			time.Sleep(5 * time.Second)
			command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
			stdout, err := command.CombinedOutput()
			if err != nil {
				glog.V(3).Infoln("Alluxio is not mounted yet.")
			}
			if len(stdout) > 0 {
				break
			}
			i++
		}
		if i == 12 {
			glog.V(3).Infoln("alluxio-fuse is not mounted to global mount point in 60s.")
			return nil, status.Error(codes.DeadlineExceeded, "alluxio-fuse is not mounted to global mount point in 60s")
		}
	}
Contributor:

Can we make the timeout & retries configurable? I can imagine this being very varied.

Maybe we could define a readiness probe on the FUSE pod to indicate that it has finished mounting and wait on that through the K8s API, but for now the timeout is fine imo.
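
For illustration, such a readiness probe on the Fuse pod could look roughly like this (a sketch; the mount path is a placeholder, and the field names mirror the FAILURE_THRESHOLD/PERIOD_SECONDS settings that show up later in this thread):

readinessProbe:
  exec:
    # Consider the pod ready once the AlluxioFuse mount shows up on the node.
    command: ["bash", "-c", "mount | grep /mnt/alluxio-fuse | grep alluxio-fuse"]
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 12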

Comment on lines 188 to 205
		i := 0
		for i < 12 {
			time.Sleep(5 * time.Second)
			command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
			stdout, err := command.CombinedOutput()
			if err != nil {
				glog.V(3).Infoln("Alluxio is not mounted yet.")
			}
			if len(stdout) > 0 {
				break
			}
			i++
		}
		if i == 12 {
			glog.V(3).Infoln("alluxio-fuse is not mounted to global mount point in 60s.")
			return nil, status.Error(codes.DeadlineExceeded, "alluxio-fuse is not mounted to global mount point in 60s")
		}
	}
Contributor:

Style nit: Assume the exterior of the for-loop is the error case if you are waiting on a timeout with retries.

So at the top of the method you'd need:

if req.GetVolumeContext()["mountInPod"] == "false" {
    return &csi.NodeStageVolumeResponse{}, nil
}

And then this for-loop would look like the following:

Suggested change
(original)
		i := 0
		for i < 12 {
			time.Sleep(5 * time.Second)
			command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
			stdout, err := command.CombinedOutput()
			if err != nil {
				glog.V(3).Infoln("Alluxio is not mounted yet.")
			}
			if len(stdout) > 0 {
				break
			}
			i++
		}
		if i == 12 {
			glog.V(3).Infoln("alluxio-fuse is not mounted to global mount point in 60s.")
			return nil, status.Error(codes.DeadlineExceeded, "alluxio-fuse is not mounted to global mount point in 60s")
		}
	}
(suggested)
		for i := 0; i < 12; i++ {
			time.Sleep(5 * time.Second)
			command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
			stdout, err := command.CombinedOutput()
			if err != nil {
				glog.V(3).Infoln("Alluxio is not mounted yet.")
			}
			if len(stdout) > 0 {
				return &csi.NodeStageVolumeResponse{}, nil
			}
		}
	}
	glog.V(3).Infoln("alluxio-fuse is not mounted to global mount point in 60s.")
	return nil, status.Error(codes.DeadlineExceeded, "alluxio-fuse is not mounted to global mount point in 60s")

Comment on lines 72 to 75
privileged: true
capabilities:
  add:
    - SYS_ADMIN
Contributor:

I think it'd be good practice to have in-line comments explaining why these are necessary. Same goes for any other CSI files where this is the case.
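
For example, the comments could read roughly like this (a sketch; the exact wording is up to the author):

securityContext:
  # Required so the container can perform FUSE mounts on the host:
  # mounting a filesystem needs CAP_SYS_ADMIN and access to /dev/fuse.
  privileged: true
  capabilities:
    add:
      - SYS_ADMIN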

Comment on lines +191 to +209
	retry, err := strconv.Atoi(os.Getenv("FAILURE_THRESHOLD"))
	if err != nil {
		return nil, status.Errorf(codes.InvalidArgument, "Cannot convert failure threshold %v to int.", os.Getenv("FAILURE_THRESHOLD"))
	}
	timeout, err := strconv.Atoi(os.Getenv("PERIOD_SECONDS"))
	if err != nil {
		return nil, status.Errorf(codes.InvalidArgument, "Cannot convert period seconds %v to int.", os.Getenv("PERIOD_SECONDS"))
	}
	for i := 0; i < retry; i++ {
		time.Sleep(time.Duration(timeout) * time.Second)
		command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
		stdout, err := command.CombinedOutput()
		if err != nil {
			glog.V(3).Infoln(fmt.Sprintf("Alluxio is not mounted in %v seconds.", i * timeout))
		}
		if len(stdout) > 0 {
			return &csi.NodeStageVolumeResponse{}, nil
		}
	}
@Binyang2014 (Contributor) commented Apr 11, 2022:

I am wondering if we can leverage the Kubernetes retry policy. We can let this method return an error if the fuse daemon is not ready; then k8s will retry automatically, so we don't need to write this logic on our own.

Contributor (author):

This will be in the next step.

Contributor:

I mean it seems we can simply write it as follows:

Suggested change
(original)
	retry, err := strconv.Atoi(os.Getenv("FAILURE_THRESHOLD"))
	if err != nil {
		return nil, status.Errorf(codes.InvalidArgument, "Cannot convert failure threshold %v to int.", os.Getenv("FAILURE_THRESHOLD"))
	}
	timeout, err := strconv.Atoi(os.Getenv("PERIOD_SECONDS"))
	if err != nil {
		return nil, status.Errorf(codes.InvalidArgument, "Cannot convert period seconds %v to int.", os.Getenv("PERIOD_SECONDS"))
	}
	for i := 0; i < retry; i++ {
		time.Sleep(time.Duration(timeout) * time.Second)
		command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
		stdout, err := command.CombinedOutput()
		if err != nil {
			glog.V(3).Infoln(fmt.Sprintf("Alluxio is not mounted in %v seconds.", i * timeout))
		}
		if len(stdout) > 0 {
			return &csi.NodeStageVolumeResponse{}, nil
		}
	}
(suggested)
	command := exec.Command("bash", "-c", fmt.Sprintf("mount | grep %v | grep alluxio-fuse", req.GetStagingTargetPath()))
	stdout, err := command.CombinedOutput()
	if err != nil {
		glog.V(3).Infoln(fmt.Sprintf("Alluxio mount point is not ready"))
		return err
	}

Contributor (author):

You mean if fuse is not ready we just return the error and let CSI call this method again? But later calls will first find that the pod already exists, directly return success, and never check the mount point again. Am I interpreting that right?

Contributor:

OK, so we'd better use a pod readiness probe. If the pod is not ready, we return an error directly and let CSI call this method again; if it is already ready, we return success. We should not rely on whether the pod exists to pass the check. Does that make sense?

Contributor (author):

Yes, that makes sense. Just to clarify: here we are checking whether Alluxio fuse has mounted Alluxio to the mount point, not whether the pod exists. I will work on the readiness probe soon.

	if err != nil {
		return nil, err
	}
	if _, err := ns.client.CoreV1().Pods(os.Getenv("NAMESPACE")).Create(fusePod); err != nil {
Contributor:

What if the pod was already created by a previous request? Make it idempotent?

@ssz1997 (Contributor, author) commented Apr 11, 2022:

@Binyang2014 If you think the PR is good to go, please approve it. Thanks!

@Binyang2014 (Contributor) left a comment:

Thanks @ssz1997 for this change


		return nil, err
	}
	if _, err := ns.client.CoreV1().Pods(os.Getenv("NAMESPACE")).Create(fusePod); err != nil {
		if strings.Contains(err.Error(), "already exists") {
Contributor:

Using http code 409 conflict for this?

Contributor (author):

The http code exists in the http result object, and when err is not nil, we don't return the http result.
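
For reference, client-go also exposes the underlying 409 Conflict through its errors helpers, so the check could be written without string matching (a sketch assuming the apimachinery errors package):

import k8serrors "k8s.io/apimachinery/pkg/api/errors"

if _, err := ns.client.CoreV1().Pods(os.Getenv("NAMESPACE")).Create(fusePod); err != nil {
	// IsAlreadyExists inspects the StatusError reason (HTTP 409 Conflict).
	if !k8serrors.IsAlreadyExists(err) {
		return nil, status.Errorf(codes.Internal, "Failed to launch Fuse Pod at %v.\n%v", ns.nodeId, err.Error())
	}
	// The pod was created by a previous request; treat creation as successful.
}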


@HelloHorizon (Contributor) commented:
alluxio-bot, merge this please

@alluxio-bot merged commit 7ed2f64 into Alluxio:master on Apr 12, 2022
flaming-archer pushed a commit to flaming-archer/alluxio that referenced this pull request Sep 1, 2022
Support CSI start AlluxioFuse process in separate pod

Making CSI launch a separate pod running AlluxioFuse process, instead of
launching AlluxioFuse process in the CSI nodeserver container

If nodeserver container or node-plugin pod for any reason is down, we
lose Alluxio Fuse process and it's very cumbersome to bring it back.
With a separate Fuse pod, CSI pod won't affect Fuse process.

Solves Alluxio#14917

1. Removed `javaOptions` from csi section in `values.yaml`. Alluxio
properties in helm chart should be organized in one place, not in
`properties` and in `csi`.
2. Add property `mountInPod` in csi section. If set to `true`, Fuse
process is launched in the separate pod.

pr-link: Alluxio#15221
change-id: cid-b6897172e11f80618decbfdc0758423e71aa387e