Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Charlie/dev/v0.7.0 #2236

Closed
wants to merge 389 commits into from
Closed

Charlie/dev/v0.7.0 #2236

wants to merge 389 commits into from

Conversation

charlieyl
Copy link

[bugfix]1.Enhance GPU cache management by setting initial available GPU IDs when reset available gpus 2.set shm_size to 8G if not specified.

fedml-alex and others added 30 commits May 22, 2024 02:19
[CoreEngine] make the server status work.
[Deploy] Try to convert the gpu_topology value type to int.
[Deploy] Fix timezone issue when using pandas
[CoreEngine] In order to make the inference logs work, we save the co…
In order to make the inference logs work, we save the container inference logs to the single dir
[Deploy] Avoid re-download the same model serving package.
fedml-dimitris and others added 27 commits October 16, 2024 22:27
Enabling grpc to work with docker containers.
1.The work inference proxy port needs to be read from the configuration file
2.occupy_gpu_ids fail,get gpu ids from cache may return [] instead of None
[fixbug]1.read inference proxy port  from config file 2.occupy_gpu_ids fail,get gpu ids from cache may return [] instead of None
… with init gpu ids because of the system gpu resource may change) and logging in hardware and job utilities
Merge pull request #2233 from FedML-AI/charlie/dev/v0.7.0
Revert "Merge pull request #2233 from FedML-AI/charlie/dev/v0.7.0"
+ " [email protected] -n fedml-devops-aggregator-"
+ self.version
)
logging.info("Create secret cmd: " + registry_secret_cmd)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs
sensitive data (secret)
as clear text.

Copilot Autofix AI about 2 months ago

To fix the problem, we need to avoid logging sensitive information in clear text. Instead of logging the entire registry_secret_cmd, we can log a sanitized version of it that omits the sensitive parts. This way, we still get useful logging information without exposing sensitive data.

  • Identify the lines where sensitive information is being logged.
  • Replace the logging statements with sanitized versions that exclude sensitive data.
  • Ensure that the functionality of the code remains unchanged.
Suggested changeset 1
python/fedml/computing/scheduler/master/cloud_server_manager.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/python/fedml/computing/scheduler/master/cloud_server_manager.py b/python/fedml/computing/scheduler/master/cloud_server_manager.py
--- a/python/fedml/computing/scheduler/master/cloud_server_manager.py
+++ b/python/fedml/computing/scheduler/master/cloud_server_manager.py
@@ -83,3 +83,19 @@
         )
-        logging.info("Create secret cmd: " + registry_secret_cmd)
+        sanitized_registry_secret_cmd = (
+            "kubectl create namespace fedml-devops-aggregator-"
+            + self.version
+            + ";kubectl -n fedml-devops-aggregator-"
+            + self.version
+            + " delete secret secret-"
+            + self.cloud_server_name
+            + " ;kubectl create secret docker-registry secret-"
+            + self.cloud_server_name
+            + " --docker-server="
+            + self.agent_config["docker_config"]["registry_server"]
+            + " --docker-username=***"
+            + " --docker-password=***"
+            + " [email protected] -n fedml-devops-aggregator-"
+            + self.version
+        )
+        logging.info("Create secret cmd: " + sanitized_registry_secret_cmd)
         os.system(registry_secret_cmd)
EOF
@@ -83,3 +83,19 @@
)
logging.info("Create secret cmd: " + registry_secret_cmd)
sanitized_registry_secret_cmd = (
"kubectl create namespace fedml-devops-aggregator-"
+ self.version
+ ";kubectl -n fedml-devops-aggregator-"
+ self.version
+ " delete secret secret-"
+ self.cloud_server_name
+ " ;kubectl create secret docker-registry secret-"
+ self.cloud_server_name
+ " --docker-server="
+ self.agent_config["docker_config"]["registry_server"]
+ " --docker-username=***"
+ " --docker-password=***"
+ " [email protected] -n fedml-devops-aggregator-"
+ self.version
)
logging.info("Create secret cmd: " + sanitized_registry_secret_cmd)
os.system(registry_secret_cmd)
Copilot is powered by AI and may make mistakes. Always verify output.
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
except Exception:
logging.error("Failed to connect to the docker daemon, please ensure that you have "
"installed Docker Desktop or Docker Engine, and the docker is running")
return "", "", None, None, None

# Pull the inference image
logging.info(f"Start pulling the inference image {inference_image_name}... with policy {image_pull_policy}")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs
sensitive data (password)
as clear text.

Copilot Autofix AI about 2 months ago

To fix the problem, we should ensure that sensitive information such as passwords is not logged. We can achieve this by sanitizing the inference_image_name before logging it or by avoiding logging it altogether if it contains sensitive data. In this case, we will sanitize the inference_image_name to ensure it does not contain sensitive information before logging it.

  1. Identify the logging statement that includes potentially sensitive information.
  2. Sanitize the inference_image_name to ensure it does not contain sensitive data.
  3. Update the logging statement to use the sanitized version of inference_image_name.
Suggested changeset 1
python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py b/python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py
--- a/python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py
+++ b/python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py
@@ -144,3 +144,4 @@
     # Pull the inference image
-    logging.info(f"Start pulling the inference image {inference_image_name}... with policy {image_pull_policy}")
+    sanitized_inference_image_name = inference_image_name.split('/')[-1]  # Extract the image name without registry info
+    logging.info(f"Start pulling the inference image {sanitized_inference_image_name}... with policy {image_pull_policy}")
     ContainerUtils.get_instance().pull_image_with_policy(image_pull_policy, inference_image_name)
EOF
@@ -144,3 +144,4 @@
# Pull the inference image
logging.info(f"Start pulling the inference image {inference_image_name}... with policy {image_pull_policy}")
sanitized_inference_image_name = inference_image_name.split('/')[-1] # Extract the image name without registry info
logging.info(f"Start pulling the inference image {sanitized_inference_image_name}... with policy {image_pull_policy}")
ContainerUtils.get_instance().pull_image_with_policy(image_pull_policy, inference_image_name)
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
except Exception as e:
inference_response = {"error": True, "message": f"{traceback.format_exc()}"}

return inference_response

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.
Stack trace information
flows to this location and may be exposed to an external user.

Copilot Autofix AI about 2 months ago

To fix the problem, we need to ensure that stack traces are not exposed to end users. Instead, we should log the stack trace on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the stack trace and return a generic error message.

  1. Modify the exception handling code in the predict, predict_openai, predict_with_end_point_id, and custom_inference functions to log the stack trace and return a generic error message.
  2. Ensure that the logging configuration is set up to capture the stack trace information.
Suggested changeset 1
python/fedml/computing/scheduler/model_scheduler/device_model_inference.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/python/fedml/computing/scheduler/model_scheduler/device_model_inference.py b/python/fedml/computing/scheduler/model_scheduler/device_model_inference.py
--- a/python/fedml/computing/scheduler/model_scheduler/device_model_inference.py
+++ b/python/fedml/computing/scheduler/model_scheduler/device_model_inference.py
@@ -117,3 +117,4 @@
     except Exception as e:
-        response = {"error": True, "message": f"{traceback.format_exc()}"}
+        logging.error(traceback.format_exc())
+        response = {"error": True, "message": "An internal error has occurred!"}
 
@@ -141,3 +142,4 @@
     except Exception as e:
-        response = {"error": True, "message": f"{traceback.format_exc()}, exception {e}"}
+        logging.error(traceback.format_exc())
+        response = {"error": True, "message": "An internal error has occurred!"}
 
@@ -167,3 +169,4 @@
     except Exception as e:
-        inference_response = {"error": True, "message": f"{traceback.format_exc()}"}
+        logging.error(traceback.format_exc())
+        inference_response = {"error": True, "message": "An internal error has occurred!"}
 
@@ -183,3 +186,4 @@
     except Exception as e:
-        inference_response = {"error": True, "message": f"{traceback.format_exc()}"}
+        logging.error(traceback.format_exc())
+        inference_response = {"error": True, "message": "An internal error has occurred!"}
 
EOF
@@ -117,3 +117,4 @@
except Exception as e:
response = {"error": True, "message": f"{traceback.format_exc()}"}
logging.error(traceback.format_exc())
response = {"error": True, "message": "An internal error has occurred!"}

@@ -141,3 +142,4 @@
except Exception as e:
response = {"error": True, "message": f"{traceback.format_exc()}, exception {e}"}
logging.error(traceback.format_exc())
response = {"error": True, "message": "An internal error has occurred!"}

@@ -167,3 +169,4 @@
except Exception as e:
inference_response = {"error": True, "message": f"{traceback.format_exc()}"}
logging.error(traceback.format_exc())
inference_response = {"error": True, "message": "An internal error has occurred!"}

@@ -183,3 +186,4 @@
except Exception as e:
inference_response = {"error": True, "message": f"{traceback.format_exc()}"}
logging.error(traceback.format_exc())
inference_response = {"error": True, "message": "An internal error has occurred!"}

Copilot is powered by AI and may make mistakes. Always verify output.
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
@charlieyl charlieyl closed this Dec 17, 2024
Copy link

gitguardian bot commented Dec 17, 2024

⚠️ GitGuardian has uncovered 6 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
11782617 Triggered Generic High Entropy Secret 0491bb7 .github/workflows/registry-runners/Dockerfile View secret
5451874 Triggered Generic Password 87ae30a python/fedml/computing/scheduler/model_scheduler/master_job_runner.py View secret
11782618 Triggered Generic High Entropy Secret a5bbcd2 .github/workflows/registry-runners/windows.ps1 View secret
5692101 Triggered Generic High Entropy Secret a932082 python/fedml/computing/scheduler/model_scheduler/device_model_deployment.py View secret
9453265 Triggered Generic High Entropy Secret 87ae30a python/fedml/api/api_test.py View secret
8762943 Triggered Generic Password 87ae30a python/fedml/computing/scheduler/scheduler_core/compute_cache_manager.py View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants