feat: Add support for containerized code execution, and utilities (upload / download fn). #459
Conversation
Force-pushed from 0fad28d to 320b463
Can't wait for this PR to be merged.
I should be done today! Just some fixes to the container build are all that's left, and those problems are only because I can't write bash code, so GPT is basically doing it for me lol
@unaidedelf8777 Thanks for the update. Just curious about this: "Remote Execution: Allows Docker containers to be run on remote servers by setting the DOCKER_HOST environment variable. This is particularly useful for businesses that wish to keep data on company servers." Does this mean we can run Open Interpreter in multiple docker containers to serve multiple users (concurrent connections)? Just like how ChatGPT's Code Interpreter does.
Yeah, that's exactly why I wanted to do this; so I can sell the setup to businesses (code still opensource ofc). Right now, with how it's set up, it would just connect to the Docker Engine API on the specified host (whether remote or local), spawn a container on the server, and interact with the container via the terminal on the user's machine. I also want to make it so companies can mount their data in the container easily, and still have session paths where session-specific data gets stored; but this can be done down the line, since it took me a while to understand Docker to this level, let alone to one where I can understand all these other things. I'm also working on making it so that you can upload and download files from the container session, in a way that you can do manually or where the model can specify the URL easily. I think I will also make the downloads / uploads disable-able. Any thoughts?
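For context, the docker SDK makes the remote part almost free. A minimal sketch (the host values here are just examples, not anything from the PR):

```python
import docker

# docker.from_env() reads DOCKER_HOST, e.g. "tcp://10.0.0.5:2375" or
# "ssh://user@company-server"; if unset it falls back to the local socket.
client = docker.from_env()
client.ping()  # verify the (possibly remote) engine is reachable
```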
Amazing. I appreciate your effort. I think uploading and downloading files from the docker container in real time is very important here, along with the streaming text responses. There is a similar implementation in codeinterpreter-api: it uses a docker-contained Jupyter notebook to execute the Python code. It's rather slow and doesn't have a streaming response. Check out the file upload and download implementation in each session for reference: https://github.com/shroominic/codeinterpreter-api I think the speed to spin up new docker instances for each session should be very quick. Any ideas on how to achieve this?
Yes. Currently it spins up one container, and then execs into that container for each 'CodeInterpreter'. It seems to be pretty fast, though I haven't measured it yet; usually less than a second or 2 to spin up a new one, and less than a second to send a command and receive output, since we use the container's socket directly.
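The flow is roughly this (a sketch, not the PR's exact code):

```python
import docker

client = docker.from_env()

# one long-lived container per session...
session = client.containers.run("debian:bullseye", "sleep infinity", detach=True)

# ...and each 'CodeInterpreter' just execs into it, so there is no
# per-command container startup cost
exit_code, output = session.exec_run(["echo", "hello from the session"])
print(exit_code, output.decode())  # 0 hello from the session
```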
Force-pushed from f7a65ef to 040fd6a
Around 2 seconds is amazing. I am happy to test this for you when the PR is merged. Let me know.
@vinodvarma24, it is fully functional on my fork right now. To use the containers, you will need to run interpreter with the `--use_container` flag. I also implemented 2 CLI functions. Please let me know any critique you have or what could be improved!
Force-pushed from d983560 to 6ae3f20
Sure, I will try this and let you know of any feedback.

Really looking forward to this being merged!

I am TOO! I don't want to have to resolve merge conflicts again lol.
It looks like this PR is applying some automated formatting pretty broadly across the codebase, which creates a lot of noise for review and also makes its footprint much larger than the actual changes it proposes.
Could you clean these up so that it only includes the actual changes? Most editors support some sort of "save without autoformatting" or "only autoformat changed lines" setting.
Ahh, thank you. I didn't even notice that VSCode was doing that, sorry.
If you need to make adjustments, kindly use the 'DockerManager' class. It offers convenient methods like:
- add_dependency
- remove_dependency
- add_language

Your cooperation helps maintain a smooth and reliable development workflow.
This seems a little unintuitive and would really benefit from concrete examples and explanations of the how, what, and why of this design choice.
But I wonder if, given that Python has a `requirements.txt` and Node has a `package.json`, it would make more sense to just leverage those files instead of putting everything into a `requirements.txt` and adding a layer of indirection and abstraction? I'm not sure if R manages dependencies similarly, but maybe we could implement something unique for that edge case instead.
The addition of the `DockerManager` class was mainly aimed at providing a convenient interface for adding and removing dependencies in the container, so that users / the language model don't have to re-install the dependencies they want at runtime every time they use it.
Merging all of the dependency lists into one file served 3 main purposes:
- to avoid clutter;
- so that the interface `DockerManager` uses for adding and removing dependencies could be standardized;
- and so there's only one file to copy into the container when building the image, which makes image construction faster, albeit by a small amount.
I also think I am going to remove the `add_language` method of the manager class. The rationale is that it would take a decent amount of work in the rest of the codebase to truly support a new language; at that point, users may as well just modify the Dockerfile and requirements themselves.
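For illustration, usage would look roughly like this (the import path and exact signatures are assumptions; they aren't shown in this thread):

```python
from interpreter.container_utils import DockerManager  # path assumed

manager = DockerManager()
manager.add_dependency("pandas")     # persisted in the merged dependency file
manager.remove_dependency("pandas")  # removed again; image is rebuilt as needed
```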
poetry.lock (outdated)
This still has some headers from a merge conflict in it.
I think you can resolve them by rebasing and then doing:

```bash
git checkout --theirs poetry.lock && poetry lock --no-update
```

Or by checking out the version from `main` (in my case the remote for this original repo is named `upstream`; replace `upstream` with your remote name for this repo, not your fork):

```bash
git checkout upstream/main poetry.lock && poetry lock --no-update
```
```python
if current_hash == original_hash:
    images = client.images.list(name=image_name, all=True)
    if not images:
        print("Downloading default image from Docker Hub, please wait...")
```
We might want to put this behind `debug_mode` as it seems to pop up more often than I would have expected.
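Something like this, assuming the wrapper can see the interpreter's `debug_mode` flag (names illustrative):

```python
# only surface the pull notice when the user opted into verbose output
if self.debug_mode:
    print("Downloading default image from Docker Hub, please wait...")
```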
Huh, it wasn't doing that before. Lemme look into it; I think I know what's causing it.
This has been resolved. Docker was tagging the image with the repository name when pulling it from Docker Hub, but I had it configured to look for the image locally. We now re-tag the image using `docker tag [repo image name] [new local image name]`.
I will push this in my next commit.
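For reference, the same re-tag via the Python SDK would look roughly like this (image names here are placeholders, not the PR's actual values):

```python
import docker

client = docker.from_env()
image = client.images.pull("unaidedelf8777/interpreter-runtime")  # repo-tagged on pull
image.tag("interpreter-runtime")  # local name the code actually looks up
```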
Also @ericrallen, referencing a file on your local system won't work, since the file isn't in the container. To upload it to the container, use `%upload`.
Incredible work Nathan @unaidedelf8777! Seriously floored by the work you've put into this.
And yes, containerized execution to me is critical to getting OI to be fully safe for sensitive applications. It could potentially become the number 1 choice for secure LLM code execution if we do this right. Also, thank you so much to @ericrallen for reviewing this.

Why not simply run Open Interpreter inside a docker container, though? I feel we can benefit from the community development of OI by keeping it in one piece as much as possible. Here, the boundary to the docker container is at the code-execution part. That might be the best way to do it. But why not put all of OI into one docker container, then just send the user message into it / stream a message out of it? That way we still use subprocess and everything, no added complexity there. The container just holds the OI core, and we make it so the terminal interface can stream messages to/from that just as easily as it does now (to the OI core running in the same process as it).

Re: each language having its own container setup: is it feasible to simply have a single official OI docker image? I'm sure there's a reason (perhaps speed of spinning up the docker?) but let me know. This is a fair number of files + complexity to introduce to the codebase, so I want to be sure we implement this as minimally as possible.
Yes. So currently the container image uncompressed sits at about 1.81 GB, with the main contributor to the size being the basic dependencies I added for each language (we can change this as needed, of course, if making it lightweight is a priority). While running OI inside a single container is feasible and may be a good option for some, it also takes away the ability to add dependencies to the container, since the container would be starting anew each time you use it. And that's the other thing: you're starting anew each time, users would need to re-import files each time they want to use a file from their host system, and exporting files from a single container to the user's host would be a nightmare, since Docker isn't too keen on letting containers affect the host system. That's why I opted for a more session-based approach: a user's container can persist as long as needed, whether it be local or remote on a host server. Anyway, that's my two cents.
@vinodvarma24, I am not 100% sure what the issue is with that one. Do you have the docker Python SDK installed? I would suspect it's because macOS probably has some weird sandboxing protocol or something... maybe? I know Apple sandboxes stuff weirdly on iPhone / iPad; I wouldn't be surprised if they sandboxed the Mac the same way. To be sure, I'd try running the following:

```python
import docker

def check_docker_connection():
    try:
        client = docker.DockerClient(base_url='unix://var/run/docker.sock')
        client.ping()
        return "Connection to Docker daemon succeeded!"
    except docker.errors.APIError as e:
        return f"Failed to connect to Docker daemon: {str(e)}"
    except Exception as e:
        return f"An unexpected error occurred: {str(e)}"

if __name__ == "__main__":
    print(check_docker_connection())
```

You shouldn't have to set it explicitly. If that errors out, then please lmk.

Edit: GPT has hailed me with a solution from the gods. Working on setting it up and testing.
@ericrallen @KillianLucas @vinodvarma24 @nbbaier Hi all, just letting you know that I have everything finished and ready for merge. Tested all languages. I also added a basic file selector UI for the `%upload` command. I normally would also send screen recordings and screenshots, but I am currently using a school PC, and they don't let us screen record for some reason.
```diff
@@ -0,0 +1,78 @@
+# Base image
+FROM debian:bullseye
```
I think we can resolve a lot of the `sqlite3` issues by changing from `debian:bullseye` to `debian:bookworm` as the base image, as described by the ChromaDB docs on troubleshooting SQLite issues.
The fixes implemented here to get it working actually don't work on my system, and they add some complexity that might not work with every user's or developer's environment.
Alright, yeah. I simply included it because it made things work on my GitHub Codespace where I was doing dev. When tested on my Windows PC it worked perfectly as well, so I assumed it was identical and worked on every system, but maybe not, idk.
The bookworm tag should work, but it may be better to explicitly add the versioned dependency.
Untested (on mobile atm), but something close to this should work:
```dockerfile
# it could be worth versioning this as well for consistency
# latest is debian:bullseye-20231009
FROM debian:bullseye

# SQLite versioning in URL
# <major><minor>0<patch> seems to be the scheme
# https://www.sqlite.org/download.html
ARG SQLITE_VERSION=3350000

RUN apt-get update \
    && apt-get install -y wget build-essential

# note: the date path param may need to be set as an ARG too in the future
RUN wget https://www.sqlite.org/2023/sqlite-autoconf-${SQLITE_VERSION}.tar.gz \
    && tar xvfz sqlite-autoconf-${SQLITE_VERSION}.tar.gz \
    && cd sqlite-autoconf-${SQLITE_VERSION} \
    && ./configure \
    && make \
    && make install
```
This will give the image longevity by versioning it with respect to OI's needs rather than upstream's.
I think upgrading to the current stable release (`bookworm`) makes sense.
There's only one dependency that cares about the SQLite version, and it just relies on something introduced in 3.35; until there's a version they are incompatible with, I'm not sure there's a need to pin to a specific SQLite version.
I would suspect that upstream Debian is likely to be more stable than this project and any of our dependencies.
```toml
[tool.poetry.dependencies.pyreadline3]
version = "^3.4.1"
markers = "sys_platform == 'win32'"
```
These can be removed. They are commented out below due to some issues with Windows and `readline`.
```toml
ooba = "^0.0.21"
chroma = "^0.2.0"
chromadb = "^0.4.14"
pysqlite3-binary = "^0.5.2.post1"
```
Like I mentioned in the Dockerfile, I think upgrading to `bookworm` is a better choice than trying to deal with the complexity of bundled `sqlite3` versions.
```python
# Add 'powershell' if the OS is Windows, 'applescript' if macOS, or none if it's another OS
if os_type == 'windows':
    base_languages.append('powershell')
elif os_type == 'darwin':  # Darwin is the system name for macOS
    base_languages.append('applescript')

corrected_schema['parameters']['properties']['language']['enum'] = base_languages
```
This is a nice little quality-of-life change. Great idea!
Generally, I'd prefer to see a change like this that isn't specifically related to the core changes of the Pull Request in a separate PR that we can more easily review, test, and merge, but I can see how this would benefit the container approach, too.
```python
import sys
import pysqlite3

# Alias pysqlite3 as sqlite3 in sys.modules. This fixes a chromadb error where it
# whines about the wrong version being installed, but we can't change the container's sqlite.
# 'pysqlite3' is a drop-in replacement for the default python sqlite3 lib (identical APIs).
sys.modules['sqlite3'] = pysqlite3
```
Just putting a note here so that we don't forget to remove this when we switch to a different base container image.
why can't the container sqlite be changed?
```diff
@@ -2,32 +2,33 @@
 from random import randint
 import time
 import pytest
-import interpreter
+import interpreter as i
```
I think we should give this a more useful and readable name than `i`. Maybe using PascalCase and calling it `Interpreter` would be better?
```python
@pytest.fixture(scope="function")  # This will make the interpreter instance available to all test cases.
def interpreter():
    interpreter = i.create_interpreter()
```
Does this mean that every test case is using a new instance of interpreter?
One of the things we were doing previously was testing that clearing messages via the `.reset()` method worked as expected.
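For example, a test along these lines (a sketch, assuming the fixture yields a fresh instance and `.reset()` empties `.messages`):

```python
def test_reset(interpreter):
    interpreter.messages.append({"role": "user", "content": "hi"})
    interpreter.reset()
    assert interpreter.messages == []
```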
I love the idea of trying to improve stuff as we go incrementally, but were the changes in this file related to the container implementation or necessary based on changes in the rest of the Pull Request?
Those changes were to accommodate the factory function in the `__init__.py`, which just makes integrating the interpreters easier. With those changes, `pytest` will also give a fresh interpreter instance to each test, which makes the tests more robust and reliable.
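In other words, roughly this (a sketch; the real `create_interpreter()` may take configuration arguments):

```python
from interpreter import create_interpreter

agent_a = create_interpreter()
agent_b = create_interpreter()

# each call returns independent state, so tests can't leak into each other
assert agent_a is not agent_b
```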
```python
from typing import (Optional,
                    Union,
                    Iterator,
                    Any,
                    Callable,
                    List,
                    Dict
                    )
```
I'm all for adding types, but I don't think they are necessary for this PR, and end up creating more noise in the diffs. I think this kind of change is better handled in a separate PR.
Alright, yeah. I do agree; I simply added them for testing. We can remove them.
```python
    display_markdown_message(
        f"> messages json loaded from {os.path.abspath(json_path)}")

def handle_container_upload(self, type=None, *args):
```
We can address this in a separate PR afterward, but I want to leave a note here to refer back to.
From a user experience perspective, I think this should just be a polymorphic function that lets me run `%upload`, `%upload misc/tests/OpenInterpreter-Notebook-Mode.gif`, or `%upload misc/tests/data`, and figures out whether I referenced a file or a directory and acts accordingly, instead of requiring the `file` or `folder` keyword.
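For example (a sketch with hypothetical helper names, not the PR's actual methods):

```python
import os

def handle_container_upload(self, *args):
    if not args:                       # bare %upload: fall back to a dialog
        self.open_upload_dialog()      # hypothetical helper
    elif os.path.isdir(args[0]):
        self.upload_folder(args[0])    # hypothetical helper
    elif os.path.isfile(args[0]):
        self.upload_file(args[0])      # hypothetical helper
    else:
        print(f"No such file or directory: {args[0]}")
```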
"%help": "Show this help message.", | ||
"%upload": "open a File Dialog, and select a file to upload to the container. only used when using containerized code execution", | ||
"%upload folder": "same as upload command, except you can upload a folder instead of just a file.", | ||
"%upload file": "same as upload command, except you can upload a file.", |
When using `%upload file path/to/my/file` or `%upload folder path/to/directory`, it still seems to open the "Do you want to upload a folder or a file?" window.
Maybe we should just describe the `%upload` magic command for now and revisit the `file` and `folder` functionality in a separate PR after we are able to get this one ready and merged in?
I think I will just remove `file` and `folder` from the command. It will still work; they only exist because you cannot open a file dialog that can select both files and folders, so I need to know which to open. But when using it via the CLI, it's not needed.
Oh, I think I misunderstood the purpose of them.
I’ll retest them in a bit and see if they open the correct file/folder selector.
We might just need to add some docs about usage to avoid confusion.
```diff
     def handle_undo(self, arguments):
         # Removes all messages after the most recent user entry (and the entry itself).
         # Therefore user can jump back to the latest point of conversation.
         # Also gives a visual representation of the messages removed.

         if len(self.messages) == 0:
-            return
+            return
```
Little changes like this are unnecessary and add a lot of extra noise to the diffs in this PR, which is already very complex to review.
There appear to be several of them in this file.
Yeah, sorry about those. I don't know what my editor is doing; I disabled file formatting in VSCode, so I don't know what keeps doing it. Any idea what is causing it?
It looks like maybe a tab width thing?
We should probably add an EditorConfig file to try to help avoid that in the future.
"%upload": "open a File Dialog, and select a file to upload to the container. only used when using containerized code execution", | ||
"%upload folder": "same as upload command, except you can upload a folder instead of just a file.", | ||
"%upload file": "same as upload command, except you can upload a file.", | ||
"%download" : "Download a file or directory given the file or folder name in the container." |
Oh wow. I guess it tried to download the entire /mnt/data dir. I would say that's weird, but it's really just me being blind. Simple fix.
```python
def wrap_in_trap(code):
```
I think we removed this in c93eb67 as it doesn't properly emit an "end_of_execution" flag (`exit` here will fire before it reaches that), so if it errors it just hangs forever.
Hey @unaidedelf8777— incredible work and progress on this PR. Really excited for the enhanced safety + remote execution features this will open up. There's something really, really exciting about figuring out how to use a Docker container as a subprocess...

I'm ready to merge if you can remove the factory-function changes to the init and the tests. How easy would that be to revert? This is unrelated to this PR, but ideally we'd have the behavior below. Working on this. ...or is it related to this PR? Is it important for you to have more Instance control?

```python
# Get an instance of OI quickly
import interpreter
interpreter.chat()

# Create multiple instances
from interpreter import Interpreter

agent_1 = Interpreter()
agent_2 = Interpreter()  # all parameters from config can be set here
agent_1.chat()
```

(Then we'd stop inheriting from the …)
Should be less than 20 minutes: just change the init, and revert the tests. I will just copy and paste them into tests.py, so it may show a ton of changes, but that's all it is.
Creating multiple Interpreter instances with the generator function and docker code execution would be amazing to have.
Just a quick Q: how have you addressed, for example, when the interpreter wants to show plots with `plt.show()`?
@Falven This is a difficult thing to implement, and no, I have not addressed it. However, how this works is: I connect to the docker container via its allocated socket from the Docker Engine API, then we simply stream the stdout and stderr streams from the container. So it may work; I frankly don't know. However, I do want to make a way, down the line, to allow the model to give a URL, possibly in markdown, which we could likely use to show the image. But that is beyond the scope of this PR.
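A sketch of the streaming side with the high-level SDK (the PR talks to the raw socket directly, but `attach()` illustrates the same idea; the image and command are just examples):

```python
import docker

client = docker.from_env()
container = client.containers.run(
    "debian:bullseye",
    'sh -c "echo to-stdout; echo to-stderr >&2"',
    detach=True,
)

# logs=True replays anything emitted before we attached
for chunk in container.attach(stream=True, stdout=True, stderr=True, logs=True):
    print(chunk.decode(), end="")
```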
Great job @unaidedelf8777. However, after looking through the extensive list of changes, I would have to agree with @KillianLucas here. There's a lot of complexity in these changes and abstractions over infrastructure responsibilities; you are trying to do too much. It seems like you're trying to tackle a lot of issues at the application level while not tackling some of the more basic issues that are needed for a containerized deployment. The file download and upload is a nice addition, but I don't understand, for example, the cleanup. Docker containers are stateless: once the container is shut down, nothing should be persisted anyway unless you mount a volume or something (and even then, a better solution would be temporary storage). You're also trying to manage Docker through the application, even building or running the image; that should NOT be the responsibility of the application. These kinds of things are usually handled at an orchestration or infrastructure level, creating a container instance per user for a session, etc. Think: a Kubernetes cluster with some dedicated ephemeral storage for users to upload their files for the duration of the session. I can't stress this part enough:
Get the basic scenarios working first: running in a container completely abstracted from the application (e.g. not needing to start the container from the open-interpreter CLI) and being able to send and receive messages to the container via a socket or websocket connection; informing the assistant through the prefix prompt once it is running in a container, so it knows it can only return data using stdout and stderr (and not try to use `plt.show`); having the assistant return links to generated files or images; and then maybe adding a couple of configurable endpoints to upload and download files. Again, I appreciate your efforts and contribution; I'm not trying to just bash on you here. Hope I'm not missing anything.
Wow! What a PR and conversation. It's a shame that so much has changed since this PR was created. @KillianLucas I suggest that we close this PR. I think the idea behind it should be discussed again, but unfortunately it seems like refactoring/updating this PR would be more work than starting from scratch.
@MikeBirdTech, yeah. Although the base classes for the daemon connections / stderr & stdout streams should be quite portable, so feel free to reuse them. If you have questions about them, feel free to ask me; I know they're documented terribly lol.
💯
Agreed, and thank you @unaidedelf8777; the upload stuff is brilliant, great handling of the streams. Honestly, I wonder if this could be exported to its own project: a sort of better Docker experience with wide pipes going in/out?
Changes Made

Added Containerized Execution Support

- Docker Integration: Introduced a new class, `DockerProcWrapper`, that mimics the interface of Python's `subprocess.Popen` for seamless integration (see the sketch after this list).
- Session Management: Each Docker container and code interpreter is assigned a unique session ID, which also specifies a mount point inside the container at `/mnt/data`.
- Automatic Cleanup: Utilized Python's `atexit` module to ensure that all session-specific folders are deleted when the application exits.
- Dependency: This feature introduces a new dependency (`docker` from PyPI) and requires Docker Engine to be installed on the host machine of the container.
- Activation: Use the `.contain` attribute of the interpreter class or the CLI flag `--use_container` to enable this feature.
- Remote Execution: Allows Docker containers to be run on remote servers by setting the `DOCKER_HOST` environment variable. This is particularly useful for businesses that wish to keep data on company servers.
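For illustration, a minimal sketch of the Popen-mimicking idea (this is not the actual `DockerProcWrapper`; the class name and structure here are assumptions):

```python
import docker

class ContainerProc:
    """Popen-flavored wrapper over `docker exec` (illustrative only)."""

    def __init__(self, container_id: str, cmd: list):
        self._api = docker.APIClient()
        exec_info = self._api.exec_create(container_id, cmd, stdout=True, stderr=True)
        # stream=True yields raw output chunks as the command runs
        self._stream = self._api.exec_start(exec_info["Id"], stream=True)

    def stdout_lines(self):
        # roughly analogous to iterating subprocess.Popen.stdout
        for chunk in self._stream:
            yield chunk.decode()
```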
Customizable Container Images

- Dockerfiles: Added a `dockerfiles` directory containing the Dockerfile for building the runtime container. The default image is based on Debian and includes package managers for Node, R, and Python.
- Runtime Customization: Introduced a utility class for users to add or remove languages and dependencies to/from the runtime container.

Magic commands

- `%upload [local_file_path]`: uploads a file or directory to the container via the Docker Engine APIs.
- `%download [path_in_container]`: downloads a specific file from the container given its path in the container (the default workdir in the container is /mnt/data). NOTE: in my next PR I plan to make it so the language model can give download links to files, just like ChatGPT's code interpreter; but this requires some UI work.

Self-Review

Testing

- Tested on any system which supports Docker Engine.

AI Language Model Utilized

- Used GPT-3/4 for testing containerized execution; it is implicit that the other language models OI supports are supported by this feature.