Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add semgrep code scanning via -safe argument #484

Merged

Conversation

ericrallen
Copy link
Collaborator

@ericrallen ericrallen commented Sep 23, 2023

Describe the changes you have made:

This PR introduces the first tool in a safety toolkit to help verify that code might be safe to execute.

It adds a -safe argument that can be used to enable code scanning via semgrep.

-safe has 3 possible values:

  • off: (default) don't enabled safe_mode
  • auto: enable safe_mode and always scan code with semgrep before asking to execute
  • ask: enable safe_mode and ask the user if they want to scan a code snippet with semgrep before asking to execute

Note: The -safe option is disabled by default and is entirely opt-in. Enabling safe_mode disables auto_run.

Reference any relevant issue (Replaces #24)

  • I have performed a self-review of my code:

I have tested the code on the following OS:

  • Windows
  • MacOS
  • Linux

AI Language Model (if applicable)

  • GPT4
  • GPT3
  • Llama 7B
  • Llama 13B
  • Llama 34B
  • Huggingface model (Please specify which one)

I tested this with code generated and read from the filesystem with gpt-3.5-turbo and gpt-4, but the code scanning doesn't include the model in any way, so it should function regardless of what model the user has configured.

interpreter/cli/cli.py Outdated Show resolved Hide resolved
from .languages.r import R


language_map = {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up needing to reference the same language map that the create_code_interpreter module was using, so I moved it into it's own module to make it easier to share.

@ericrallen
Copy link
Collaborator Author

ericrallen commented Sep 23, 2023

Screen Shot 2023-09-23 at 12 33 40 AM

Not sure why it left all that weird space at the top, maybe something about how the subprocess stdout is displayed or something semgrep is doing with it's output?

interpreter/cli/cli.py Outdated Show resolved Hide resolved
interpreter/core/core.py Outdated Show resolved Hide resolved
@@ -4,6 +4,8 @@
import re

class Python(SubprocessCodeInterpreter):
file_extension = "py"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These file_extensions aren't strictly necessary, but it felt nicer when generating the temporary file for semgrep to scan to do it with the appropriate extension and could come in handy for some future ideas - like adding some functionality to directly export generated code into a file via a some %magic command without needing to have another exchange with Open Interpreter to get it to create some code to pipe the other code into the file.

In the case of something like the JavaScript language configuration this could get a little more complicated since they could also be modules that use the mjs extension, but for the purposes it's being used for right now - a naive extension suffices.

@ericrallen
Copy link
Collaborator Author

NOTE: This depends on #511 to work as expected.

@KillianLucas
Copy link
Collaborator

KillianLucas commented Sep 26, 2023

@ericrallen this is brilliantly, beautifully done. To me this is a strong foundation for a --safe flag.

I'm in favor of fewer, simpler options so I'm going to remove the tiers. --safe will simply scan the code automatically (let me know if this is very important for certain workflows, or if semgrep takes a very long time which would justify the tiers). We can add then additional security features into --safe later (like guarddog).

Just to confirm before merging this (and #511), users don't have to log in to semgrep to use it?

@ericrallen
Copy link
Collaborator Author

@KillianLucas I've cleaned this one up a bit given the desire to put safety tools under a single flag in the future - I've also simplified the output and made safe_mode take precedence over auto_run given that automatically running code is the opposite of safe.

I moved the temporary file creation and cleanup into it's own utility since there are other safety tools, like scanning for exposed secrets, that will likely need to leverage a similar temporary file functionality.

Semgrep's free tier works really well for this usecase - just scanning a snippet of code locally - but the CLI also allows users who have a Semgrep account to login and leverage some of the more advanced features if they want.

Here's some recordings of the output for each value of the flag:

Open Interpreter with -safe "off":

OpenInterpreter-SafeMode-Off

Open Interpreter with -safe "ask":

OpenInterpreter-SafeMode-Ask

Open Interpreter with -safe "auto":

OpenInterpreter-SafeMode-Auto

@ericrallen
Copy link
Collaborator Author

ericrallen commented Sep 26, 2023

Here are the prompts I've been using to quickly test this functionality:

Generate Safe (hopefully) Code

Solve FizzBuzz for the numbers 0 through 17 with a basic Python loop and if statement. No need to explain the rules of the game, just show us the code.

Generate Vulnerable Code

We just added a code scanner to Open Interpreter that relies on Semgrep to detect vulnerable code. Please generate a Python script with an obvious vulnerability, like SQL injection for example, that Semgrep should be able to identify. DO NOT MENTION THE VULNERABILITY IN THE SCRIPT VARIABLE NAMES OR CODE COMMENTS. This script should look like a developer accidentally included the vulnerability without realizing it. DO NOT WARN US ABOUT THE VULNERABILITY OR ITS IMPLICATIONS. We want to see if the code scanner identifies the vulnerability without you alerting us to it.

Without the bit about testing the code scanner, sometimes it will start to argue with you about generating vulnerable code.

@ericrallen
Copy link
Collaborator Author

You can also save the following snippets to files on your machine and ask Open Interpreter to read and execute the code.

Note: These examples were generated by Open Interpreter as examples of vulnerable code that should be identified by the code scanner.

JavaScript

const vm = require('vm');
const contextObject = { globalVar: 1 };

// safe
vm.runInContext('globalVar *= 2;', contextObject);

// vulnerable
let userInput = 'this.constructor.constructor("return process.env")()'; // Value supplied by user input
vm.runInContext(`globalVar = ${userInput};`, contextObject);

// safe
const code = `return 'hello ' + name`
vm.compileFunction(code, [], { parsingContext: vm.createContext({ name: 'name' }) })

// vulnerable
let userInput = '1; while (true)

Python

import subprocess
import sys

# Vulnerable
user_input = "foo && cat /etc/passwd" # value supplied by user
subprocess.call("grep -R {} .".format(user_input), shell=True)

# Vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.run(["bash", "-c", user_input], shell=True)

# Not vulnerable
user_input = "cat /etc/passwd" # value supplied by user
subprocess.Popen(['ls', '-l', user_input])

# Not vulnerable
subprocess.check_output('ls -l dir/')

@ericrallen ericrallen changed the title feat: add semgrep code scanning via --scan flag feat: add semgrep code scanning via --safe flag Sep 26, 2023
@ericrallen ericrallen changed the title feat: add semgrep code scanning via --safe flag feat: add semgrep code scanning via -safe argument Sep 26, 2023
@ericrallen
Copy link
Collaborator Author

ericrallen commented Sep 27, 2023

@KillianLucas I rebased this branch, fixed a bit of code that I forgot to update from scan_code to safe_mode, and added a loading indicator (via yaspin during the scan so that the user has some feedback that things are happening.

This used to display the semgrep output, but that had a bit too much noise and it felt like it needed something to let the user know that things are happening.

Loading Indicator

OpenInterpreter-SafeMode-Auto-Loading

Also here's a simple prompt for anyone who wants to test this to get right into a state where you can test the scanner after running with poetry run interpreter -safe "ask" or poetry run interpreter -safe "auto":

Solve FizzBuzz for 0 through 17. Don't explain the code or tell me your process or how FizzBuzz works. Just generate the code so we can execute it.

@@ -4,6 +4,9 @@
import re

class Python(SubprocessCodeInterpreter):
file_extension = "py"
proper_name = "Python"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These proper names are mostly just being used for the code scanner to say

Code Scanner: No Issues were found with this Python code

We could just rely on the lowercase language string from the code block, but it felt nicer and more human-readable this way.

interpreter_intro_message.append(f"**Safe Mode**: {interpreter.safe_mode}")
else:
interpreter_intro_message.append(
"Use `interpreter -y` or set `auto_run: true` to bypass this."
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since safe_mode is opt-in and disables auto_run it seemed unnecessary to mention the -y flag.

@ericrallen
Copy link
Collaborator Author

Not sure why the workflow tests are failing. Everything seems to work when I run the test suite locally.

@ericrallen
Copy link
Collaborator Author

@KillianLucas I’ll rebase again tomorrow to resolve the poetry.lockconflict, and try to see if I can get to the bottom of what’s wrong with the test suite, but this PR only applies features behind a flag, so I’m not sure how it impacted them.

Any guidance on getting it ready to merge would be appreciated.

@KillianLucas
Copy link
Collaborator

GREAT WORK ERIC! Literally our most impressive and extensive PR on this whole project to date. Extreme utility here, and you've opened the door to a substantial wing of the OI project (and maybe even to an industry/discipline??) focused on making LLM-written code safe. Merging now. 🎉

@KillianLucas KillianLucas merged commit 0daeeef into OpenInterpreter:main Sep 28, 2023
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants