diff --git a/docs/trouble_shooting.md b/docs/trouble_shooting.md index 8d79e140..b82cde29 100644 --- a/docs/trouble_shooting.md +++ b/docs/trouble_shooting.md @@ -1,103 +1,57 @@ -# Trouble shooting - -## When `flux` fails: - -### Step-by-Step Guide to Create a Custom Jupyter Kernel for Flux - -#### Step 1: Create a New Kernel Specification - -1. Install [`flux-core`](https://anaconda.org/conda-forge/flux-core) in your Jupyter environment: - - ```bash - conda install -c conda-forge flux-core - ``` - -2. **Find the Jupyter Kernel Directory**: - - Open your terminal or command prompt and run: - - ```bash - jupyter --paths - ``` - - This command will display the paths where Jupyter looks for kernels. You'll usually find a directory named `kernels` under the `jupyter` data directory. You will create a new directory for the Flux kernel in the `kernels` directory. - -3. **Create the Kernel Directory**: - - Navigate to the kernels directory (e.g., `~/.local/share/jupyter/kernels` on Linux or macOS) and create a new directory called `flux`. - - ```bash - mkdir -p ~/.local/share/jupyter/kernels/flux - ``` - - If you're using Windows, the path will be different, such as `C:\Users\\AppData\Roaming\jupyter\kernels`. - -4. **Create the `kernel.json` File**: - - Inside the new `flux` directory, create a file named `kernel.json`: - - ```bash - nano ~/.local/share/jupyter/kernels/flux/kernel.json - ``` - - Paste the following content into the file: - - ```json - { - "argv": [ - "flux", - "start", - "/srv/conda/envs/notebook/bin/python", - "-m", - "ipykernel_launcher", - "-f", - "{connection_file}" - ], - "display_name": "Flux", - "language": "python", - "metadata": { - "debugger": true - } - } - ``` - - - **`argv`**: This array specifies the command to start the Jupyter kernel. It uses `flux start` to launch Python in the Flux environment. - - **`display_name`**: The name displayed in Jupyter when selecting the kernel. - - **`language`**: The programming language (`python`). 
- - **Note**: - - Make sure to replace `"/srv/conda/envs/notebook/bin/python"` with the correct path to your Python executable. You can find this by running `which python` or `where python` in your terminal. - - If you installed `flux` in a specific environment, you have to write the absolute path to `flux` in the `argv` array. - -#### Step 2: Restart Jupyter Notebook - -1. **Restart the Jupyter Notebook Server**: - - Close the current Jupyter Notebook server and restart it: - - ```bash - jupyter notebook - ``` - - ```bash - jupyter lab - ``` - - Or simply restart your server. - -2. **Select the Flux Kernel**: - - When creating a new notebook or changing the kernel of an existing one, you should see an option for "Flux" in the list of available kernels. Select it to run your code with the Flux environment. - -#### Step 3: Run Your Code with `FluxExecutor` - -Now, your Jupyter environment is set up to use `flux-core`. You can run your code like this: - -```python -import flux.job - -# Use FluxExecutor within the Flux kernel -with flux.job.FluxExecutor() as flux_exe: - print("FluxExecutor is running within the Jupyter Notebook") -``` +# Trouble Shooting +Some of the most frequent issues are covered below. For everything else, do not be shy and [open an issue on Github](https://github.com/pyiron/executorlib/issues). + +## Filesystem Usage +The cache of executorlib is not removed after the Python process has completed. So it is the responsibility of the user to +clean up the cache directories they created. This is easily forgotten, so it is important to check for remaining cache +directories from time to time and remove them. + +## Firewall Issues +MacOS comes with a rather strict firewall, which does not allow connecting to a MacOS computer using its hostname, even +if it is the hostname of the current computer. MacOS only supports connections based on the hostname `localhost`. 
To use +`localhost` rather than the hostname to connect to the Python processes executorlib uses for the execution of Python +functions, executorlib provides the option to set `hostname_localhost=True`. On MacOS this option is enabled by default; +if other operating systems implement similarly strict firewall rules, the option can also be set manually to enable +local mode on computers with strict firewall rules. + +## Message Passing Interface +To use the message passing interface (MPI) executorlib requires [mpi4py](https://mpi4py.readthedocs.io/) as an optional +dependency. The installation of this and other optional dependencies is covered in the [installation section](). + +## Missing Dependencies +The default installation of executorlib only comes with a limited number of dependencies, especially the [zero message queue](https://zeromq.org) +and [cloudpickle](https://github.com/cloudpipe/cloudpickle). Additional features like [caching](), [HPC submission mode]() +and [HPC allocation mode]() require additional dependencies. The dependencies are explained in more detail in the +[installation section](). + +## Python Version +Executorlib supports all current Python versions, ranging from 3.9 to 3.13. Still, some of the dependencies, especially +the [flux](http://flux-framework.org) job scheduler, are currently limited to Python 3.12 and below. Consequently, for high +performance computing installations Python 3.12 is the recommended Python version. 
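The supported-version window can be expressed as a small helper. This is a hypothetical snippet for illustration only; `flux_compatible()` is not part of executorlib:

```python
import sys


def flux_compatible(version_info=None):
    """Check whether a Python version works with both executorlib and flux.

    executorlib supports Python 3.9 through 3.13, while the flux job
    scheduler is currently limited to Python 3.12 and below.
    """
    major_minor = tuple(version_info or sys.version_info)[:2]
    return (3, 9) <= major_minor <= (3, 12)


print(flux_compatible((3, 12, 5)))  # True
print(flux_compatible((3, 13, 0)))  # False: flux is not yet available for Python 3.13
```

Calling `flux_compatible()` without arguments checks the currently running interpreter.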
+ +## Resource Dictionary +The resource dictionary parameter `resource_dict` can contain one or more of the following options: +* `cores_per_worker` (int): number of MPI cores to be used for each function call +* `threads_per_core` (int): number of OpenMP threads to be used for each function call +* `gpus_per_worker` (int): number of GPUs per worker - defaults to 0 +* `cwd` (str/None): current working directory where the parallel Python task is executed +* `openmpi_oversubscribe` (bool): adds the `--oversubscribe` command line flag (OpenMPI and SLURM only) - default False +* `slurm_cmd_args` (list): additional command line arguments for the srun call (SLURM only) + +For the special case of the [HPC allocation mode]() the resource dictionary parameter `resource_dict` can also include +additional parameters defined in the submission script of the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io). +These include, but are not limited to: +* `run_time_max` (int): the maximum time the execution of the submitted Python function is allowed to take in seconds. +* `memory_max` (int): the maximum amount of memory the Python function is allowed to use in Gigabytes. +* `partition` (str): the partition of the queuing system the Python function is submitted to. +* `queue` (str): the name of the queue the Python function is submitted to. + +All parameters in the resource dictionary `resource_dict` are optional. + +## SSH Connection +While the [Python simple queuing system adapter (pysqa)](https://pysqa.readthedocs.io) provides the option to connect to +high performance computing (HPC) clusters via SSH, this functionality is not supported by executorlib. The background +is the use of [cloudpickle](https://github.com/cloudpipe/cloudpickle) for serialization inside executorlib, which requires +the same Python version and dependencies on both computers connected via SSH. 
As tracking those parameters is rather +complicated, the SSH connection functionality of [pysqa](https://pysqa.readthedocs.io) is not officially supported in +executorlib. \ No newline at end of file diff --git a/notebooks/4-developer.ipynb b/notebooks/4-developer.ipynb index 34774b43..aa2427bb 100644 --- a/notebooks/4-developer.ipynb +++ b/notebooks/4-developer.ipynb @@ -5,16 +5,31 @@ "id": "511b34e0-12af-4437-8915-79f033fe7cda", "metadata": {}, "source": [ - "# Developer" + "# Developer\n", + "executorlib is designed to work out of the box for up-scaling Python functions and distributing them on a high performance computing (HPC) cluster. Most users should only import the `Executor` class from executorlib and should not need to use any of the internal functionality covered in this section. Still, for more advanced applications beyond the submission of Python functions, executorlib provides additional functionality. The functionality in this section is not officially supported and might change in future versions without further notice. " ] }, { "cell_type": "markdown", - "id": "18f36227-4389-44b8-9968-93b55101c642", + "id": "7bc073aa-6036-48e7-9696-37af050d438a", "metadata": {}, "source": [ - "## Serialisation \n", - "* `pickle` vs. `cloudpickle`" + "## Communication\n", + "The key functionality of the `executorlib` package is the up-scaling of Python functions with thread based parallelism, \n", + "MPI based parallelism or by assigning GPUs to individual Python functions. In the background this is realized using a \n", + "combination of the [zero message queue](https://zeromq.org) and [cloudpickle](https://github.com/cloudpipe/cloudpickle) \n", + "to communicate binary Python objects. The `executorlib.standalone.interactive.communication.SocketInterface` is an abstraction of this \n", + "interface, which is used in the other classes inside `executorlib` and might also be helpful for other projects. 
It \n", + "comes with a series of utility functions:\n", + "\n", + "* `executorlib.standalone.interactive.communication.interface_bootup()`: To initialize the interface\n", + "* `executorlib.standalone.interactive.communication.interface_connect()`: To connect the interface to another instance\n", + "* `executorlib.standalone.interactive.communication.interface_send()`: To send messages via this interface \n", + "* `executorlib.standalone.interactive.communication.interface_receive()`: To receive messages via this interface \n", + "* `executorlib.standalone.interactive.communication.interface_shutdown()`: To shut down the interface\n", + "\n", + "While `executorlib` was initially designed for up-scaling Python functions for HPC, the same functionality can be \n", + "leveraged to up-scale any executable independent of the programming language it is developed in." ] }, { "cell_type": "markdown", "id": "8754df33-fa95-4ca6-ae02-6669967cf4e7", "metadata": {}, "source": [ - "## External executables " + "## External Executables\n", + "One extension beyond the submission of Python functions is the communication with an external executable. This could be any kind of program written in any programming language which does not provide Python bindings and therefore cannot be wrapped in a Python function directly. " 
] }, { "cell_type": "markdown", "id": "75af1f8a-7ad7-441f-80a2-5c337484097f", "metadata": {}, "source": [ - "### Subprocess" + "### Subprocess\n", + "If the external executable is called only once, then the call to the external executable can be represented in a Python function with the [subprocess](https://docs.python.org/3/library/subprocess.html) module of the Python standard library. In the example below the shell command `echo test` is submitted to the `execute_shell_command()` function, which itself is submitted to the `Executor` class." ] }, { @@ -46,23 +63,15 @@ { "cell_type": "code", "execution_count": 2, - "id": "14339156-3e0b-4856-aac8-db15089d5e7f", - "metadata": {}, - "outputs": [], - "source": [ - "import subprocess" - ] - }, - { - "cell_type": "code", - "execution_count": 3, "id": "f1ecee94-24a6-4bf9-8a3d-d50eba994367", "metadata": {}, "outputs": [], "source": [ - "def submit_shell_command(\n", + "def execute_shell_command(\n", " command: list, universal_newlines: bool = True, shell: bool = False\n", "):\n", + " import subprocess\n", + " \n", " return subprocess.check_output(\n", " command, universal_newlines=universal_newlines, shell=shell\n", " )" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 3, "id": "32ef5b63-3245-4336-ac0e-b4a6673ee362", "metadata": {}, "outputs": [ @@ -86,7 +95,7 @@ "source": [ "with Executor(backend=\"local\") as exe:\n", " future = exe.submit(\n", - " submit_shell_command,\n", + " execute_shell_command,\n", " [\"echo\", \"test\"],\n", " universal_newlines=True,\n", " shell=False,\n", @@ -99,29 +108,219 @@ "id": "54837938-01e0-4dd3-b989-1133d3318929", "metadata": {}, "source": [ - "### Interactive" + "### Interactive\n", + "The more complex case is the interaction with an external executable during the run time of the executable. This can be implemented with executorlib using the block allocation `block_allocation=True` feature. The external executable is started as part of the initialization function `init_function` and then the individual functions submitted to the `Executor` class interact with the process which is connected to the external executable. \n", + "\n", + "Starting with the definition of the executable, in this example it is a simple script which just increases a counter. The script is written in the file `count.py` so it behaves like an external executable, which could also use any other programming language. 
" ] }, { "cell_type": "code", - "execution_count": null, - "id": "747c1b78-4804-467b-9ac8-8144d8031da3", + "execution_count": 4, + "id": "dedf138f-3003-4a91-9f92-03983ac7de08", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "count_script = \"\"\"\\\n", + "def count(iterations):\n", + " for i in range(int(iterations)):\n", + " print(i)\n", + " print(\"done\")\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " while True:\n", + " user_input = input()\n", + " if \"shutdown\" in user_input:\n", + " break\n", + " else:\n", + " count(iterations=int(user_input))\n", + "\"\"\"\n", + "\n", + "with open(\"count.py\", \"w\") as f:\n", + " f.writelines(count_script)" + ] + }, + { + "cell_type": "markdown", + "id": "771b5b84-48f0-4989-a2c8-c8dcb4462781", + "metadata": {}, + "source": [ + "The connection to the external executable is established in the initialization function `init_function` of the `Executor` class. By using the [subprocess](https://docs.python.org/3/library/subprocess.html) module from the standard library two process pipes are created to communicate with the external executable. One process pipe is connected to the standard input `stdin` and the other is connected to the standard output `stdout`. 
" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "3e6fe1ca-200a-4d1b-938d-651838136bd7", + "execution_count": 5, + "id": "8fe76668-0f18-40b7-9719-de47dacb0911", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "def init_process():\n", + " import subprocess\n", + " return {\n", + " \"process\": subprocess.Popen(\n", + " [\"python\", \"count.py\"],\n", + " stdin=subprocess.PIPE,\n", + " stdout=subprocess.PIPE,\n", + " universal_newlines=True,\n", + " shell=False,\n", + " )\n", + " }" + ] + }, + { + "cell_type": "markdown", + "id": "09dde7a1-2b43-4be7-ba36-38200b9fddf0", + "metadata": {}, + "source": [ + "The interaction function handles the data conversion from the Python datatypes to the strings which can be communicated to the external executable. It is important to always add a new line `\\n` to each command send via the standard input `stdin` to the external executable and afterwards flush the pipe by calling `flush()` on the standard input pipe `stdin`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "7556f2bd-176f-4275-a87d-b5c940267888", + "metadata": {}, + "outputs": [], + "source": [ + "def interact(shell_input, process, lines_to_read=None, stop_read_pattern=None):\n", + " process.stdin.write(shell_input)\n", + " process.stdin.flush()\n", + " lines_count = 0\n", + " output = \"\"\n", + " while True:\n", + " output_current = process.stdout.readline()\n", + " output += output_current\n", + " lines_count += 1\n", + " if stop_read_pattern is not None and stop_read_pattern in output_current:\n", + " break\n", + " elif lines_to_read is not None and lines_to_read == lines_count:\n", + " break\n", + " return output" + ] + }, + { + "cell_type": "markdown", + "id": "5484b98b-546f-4f2c-8db1-919ce215e228", + "metadata": {}, + "source": [ + "Finally, to close the process after the external executable is no longer required it is recommended to define a shutdown function, which communicates to the external executable that it should shutdown. In the case of the `count.py` script defined above this is achieved by sending the keyword `shutdown`. " + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "d5344d2b-cb53-4d38-8cae-621e3b98bb56", + "metadata": {}, + "outputs": [], + "source": [ + "def shutdown(process):\n", + " process.stdin.write(\"shutdown\\n\")\n", + " process.stdin.flush()" + ] + }, + { + "cell_type": "markdown", + "id": "3899467c-dc54-41cb-b05e-b60f5cf97e46", + "metadata": {}, + "source": [ + "With these utility functions is to possible to communicate with any kind of external executable. Still for the specific implementation of the external executable it might be necessary to adjust the corresponding Python functions. Therefore this functionality is currently limited to developers and not considered a general feature of executorlib. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "747c1b78-4804-467b-9ac8-8144d8031da3", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0\n", + "1\n", + "2\n", + "3\n", + "done\n", + "\n", + "None\n" + ] + } + ], + "source": [ + "with Executor(\n", + " max_workers=1,\n", + " init_function=init_process,\n", + " block_allocation=True,\n", + ") as exe:\n", + " future = exe.submit(\n", + " interact, shell_input=\"4\\n\", lines_to_read=5, stop_read_pattern=None\n", + " )\n", + " print(future.result())\n", + " future_shutdown = exe.submit(shutdown)\n", + " print(future_shutdown.result())" + ] + }, + { + "cell_type": "markdown", + "id": "96e56af9-3031-4d7b-9111-d2d031a0a6e4", + "metadata": {}, + "source": [ + "## License\n", + "```\n", + "BSD 3-Clause License\n", + "\n", + "Copyright (c) 2022, Jan Janssen\n", + "All rights reserved.\n", + "\n", + "Redistribution and use in source and binary forms, with or without\n", + "modification, are permitted provided that the following conditions are met:\n", + "\n", + "* Redistributions of source code must retain the above copyright notice, this\n", + " list of conditions and the following disclaimer.\n", + "\n", + "* Redistributions in binary form must reproduce the above copyright notice,\n", + " this list of conditions and the following disclaimer in the documentation\n", + " and/or other materials provided with the distribution.\n", + "\n", + "* Neither the name of the copyright holder nor the names of its\n", + " contributors may be used to endorse or promote products derived from\n", + " this software without specific prior written permission.\n", + "\n", + "THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\n", + "AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\n", + "IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\n", + "DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\n", + "FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\n", + "DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\n", + "SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n", + "CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\n", + "OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n", + "OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "c2522d54-a00b-49ae-81e4-69e8fa05c9c3", + "metadata": {}, + "source": [ + "## Modules\n", + "While external Python packages should not link to specific internal components of executorlib, but rather use only the `Executor` class as the central interface to executorlib, the internal architecture is briefly outlined below. \n", + "* `backend` - the backend module contains the functionality for the Python processes created by executorlib to execute the submitted Python functions.\n", + "* `base` - the base module contains the definition of the executorlib `ExecutorBase` class which is internally used to create the different interfaces. To check whether a given `Executor` class is based on executorlib, compare it with the `ExecutorBase` class, which can be imported as `from executorlib.base.executor import ExecutorBase`.\n", + "* `cache` - the cache module defines the file based communication for the [HPC submission mode]().\n", + "* `interactive` - the interactive module defines the [zero message queue](https://zeromq.org) based communication for the [local mode]() and the [HPC allocation mode]().\n", + "* `standalone` - the standalone module contains a number of utility functions which only depend on external libraries and do not have any internal dependency on other parts of `executorlib`. 
This includes the functionality to generate executable commands, the [h5py](https://www.h5py.org) based interface for caching, a number of input checks, routines to plot the dependencies of a number of future objects, functionality to interact with the [queues defined in the Python standard library](https://docs.python.org/3/library/queue.html), the interface for serialization based on [cloudpickle](https://github.com/cloudpipe/cloudpickle) and finally an extension to the [threading](https://docs.python.org/3/library/threading.html) module of the Python standard library.\n", + "\n", + "Given this level of separation, integrating submodules from the standalone module into external software packages should be the easiest way to benefit from the developments in executorlib beyond just using the `Executor` class. " ] }, { "cell_type": "code", "execution_count": null, - "id": "1de93586-d302-4aa6-878a-51acfb1d3009", + "id": "39096340-f169-4438-b9c6-90c48ea37e4d", "metadata": {}, "outputs": [], "source": [] } ], @@ -143,7 +342,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.10" + "version": "3.12.5" } }, "nbformat": 4, diff --git a/notebooks/examples.ipynb b/notebooks/examples.ipynb deleted file mode 100644 index 1546dcb8..00000000 --- a/notebooks/examples.ipynb +++ /dev/null @@ -1,771 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "c31c95fe-9af4-42fd-be2c-713afa380e09", - "metadata": {}, - "source": [ - "# Examples\n", - "The `executorlib.Executor` extends the interface of the [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\n", - "to simplify the up-scaling of individual functions in a given workflow." - ] - }, - { - "cell_type": "markdown", - "id": "a1c6370e-7c8a-4da2-ac7d-42a36e12b27c", - "metadata": {}, - "source": "## Compatibility\nStarting with the basic example of `1+1=2`. 
With the `ThreadPoolExecutor` from the [`concurrent.futures`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\nstandard library this can be written as: " - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "8b663009-60af-4d71-8ef3-2e9c6cd79cce", - "metadata": { - "trusted": true - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": "2\n" - } - ], - "source": [ - "from concurrent.futures import ThreadPoolExecutor\n", - "\n", - "with ThreadPoolExecutor(max_workers=1) as exe:\n", - " future = exe.submit(sum, [1, 1])\n", - " print(future.result())" - ] - }, - { - "cell_type": "markdown", - "id": "56192fa7-bbd6-43fe-8598-ff764addfbac", - "metadata": {}, - "source": "In this case `max_workers=1` limits the number of threads used by the `ThreadPoolExecutor` to one. Then the `sum()`\nfunction is submitted to the executor with a list with two ones `[1, 1]` as input. A [`concurrent.futures.Future`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\nobject is returned. The `Future` object allows to check the status of the execution with the `done()` method which \nreturns `True` or `False` depending on the state of the execution. Or the main process can wait until the execution is \ncompleted by calling `result()`. \n\nThe result of the calculation is `1+1=2`. " - }, - { - "cell_type": "markdown", - "id": "99aba5f3-5667-450c-b31f-2b53918b1896", - "metadata": {}, - "source": [ - "The `executorlib.Executor` class extends the interface of the [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\n", - "class by providing more parameters to specify the level of parallelism. 
In addition, to specifying the maximum number \n", - "of workers `max_workers` the user can also specify the number of cores per worker `cores_per_worker` for MPI based \n", - "parallelism, the number of threads per core `threads_per_core` for thread based parallelism and the number of GPUs per\n", - "worker `gpus_per_worker`. Finally, for those backends which support over-subscribing this can also be enabled using the \n", - "`oversubscribe` parameter. All these parameters are optional, so the `executorlib.Executor` can be used as a drop-in\n", - "replacement for the [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures).\n", - "\n", - "The previous example is rewritten for the `executorlib.Executor` in:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2ed59582cab0eb29", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "with Executor(max_cores=1, backend=\"flux_allocation\") as exe:\n", - " future = exe.submit(sum, [1, 1])\n", - " print(future.result())" - ] - }, - { - "cell_type": "markdown", - "id": "e1ae417273ebf0f5", - "metadata": {}, - "source": "The result of the calculation is again `1+1=2`." - }, - { - "cell_type": "markdown", - "id": "bcf8a85c015d55da", - "metadata": {}, - "source": [ - "Beyond pre-defined functions like the `sum()` function, the same functionality can be used to submit user-defined \n", - "functions. 
In the next example a custom summation function is defined:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "70ff8c30cc13bfd5", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc(*args):\n", - " return sum(*args)\n", - "\n", - "\n", - "with Executor(max_cores=2, backend=\"flux_allocation\") as exe:\n", - " fs_1 = exe.submit(calc, [2, 1])\n", - " fs_2 = exe.submit(calc, [2, 2])\n", - " fs_3 = exe.submit(calc, [2, 3])\n", - " fs_4 = exe.submit(calc, [2, 4])\n", - " print(\n", - " [\n", - " fs_1.result(),\n", - " fs_2.result(),\n", - " fs_3.result(),\n", - " fs_4.result(),\n", - " ]\n", - " )" - ] - }, - { - "cell_type": "markdown", - "id": "495e6e17964fe936", - "metadata": {}, - "source": [ - "In contrast to the previous example where just a single function was submitted to a single worker, in this case a total\n", - "of four functions is submitted to a group of two workers `max_cores=2`. Consequently, the functions are executed as a\n", - "set of two pairs.\n", - "\n", - "It returns the corresponding sums as expected. The same can be achieved with the built-in [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\n", - "classes. Still one advantage of using the `executorlib.Executor` rather than the built-in ones, is the ability to execute\n", - "the same commands in interactive environments like [Jupyter notebooks](https://jupyter.org). This is achieved by using \n", - "[cloudpickle](https://github.com/cloudpipe/cloudpickle) to serialize the python function and its parameters rather than\n", - "the regular pickle package." 
- ] - }, - { - "cell_type": "markdown", - "id": "7f13ea3733327ff8", - "metadata": {}, - "source": [ - "For backwards compatibility with the [`multiprocessing.Pool`](https://docs.python.org/3/library/multiprocessing.html) \n", - "class the [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\n", - "also implements the `map()` function to map a series of inputs to a function. The same `map()` function is also \n", - "available in the `executorlib.Executor`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c320897f8c44f364", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc(*args):\n", - " return sum(*args)\n", - "\n", - "\n", - "with Executor(max_cores=2, backend=\"flux_allocation\") as exe:\n", - " print(list(exe.map(calc, [[2, 1], [2, 2], [2, 3], [2, 4]])))" - ] - }, - { - "cell_type": "markdown", - "id": "6a22677b67784c97", - "metadata": {}, - "source": "The results remain the same. " - }, - { - "cell_type": "markdown", - "id": "240ad1f5dc0c43c2", - "metadata": {}, - "source": [ - "## Resource Assignment\n", - "By default, every submission of a python function results in a flux job (or SLURM job step) depending on the backend. \n", - "This is sufficient for function calls which take several minutes or longer to execute. For python functions with shorter \n", - "run-time `executorlib` provides block allocation (enabled by the `block_allocation=True` parameter) to execute multiple\n", - "python functions with similar resource requirements in the same flux job (or SLURM job step). \n", - "\n", - "The following example illustrates the resource definition on both level. This is redundant. For block allocations the \n", - "resources have to be configured on the **Executor level**, otherwise it can either be defined on the **Executor level**\n", - "or on the **Submission level**. 
The resource defined on the **Submission level** overwrite the resources defined on the \n", - "**Executor level**." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "631422e52b7f8b1d", - "metadata": {}, - "outputs": [], - "source": [ - "import flux.job\n", - "from executorlib import Executor\n", - "\n", - "\n", - "def calc_function(parameter_a, parameter_b):\n", - " return parameter_a + parameter_b\n", - "\n", - "\n", - "with flux.job.FluxExecutor() as flux_exe:\n", - " with Executor(\n", - " # Resource definition on the executor level\n", - " max_workers=2, # total number of cores available to the Executor\n", - " backend=\"flux_allocation\", # optional in case the backend is not recognized\n", - " # Optional resource definition\n", - " resource_dict={\n", - " \"cores\": 1,\n", - " \"threads_per_core\": 1,\n", - " \"gpus_per_core\": 0,\n", - " \"cwd\": \"/home/jovyan/notebooks\",\n", - " \"openmpi_oversubscribe\": False,\n", - " \"slurm_cmd_args\": [],\n", - " },\n", - " flux_executor=flux_exe,\n", - " flux_executor_pmi_mode=None,\n", - " flux_executor_nesting=False,\n", - " hostname_localhost=None, # only required on MacOS\n", - " block_allocation=False, # reuse existing processes with fixed resources\n", - " init_function=None, # only available with block_allocation=True\n", - " disable_dependencies=False, # disable dependency check for faster execution\n", - " refresh_rate=0.01, # for refreshing the dependencies\n", - " plot_dependency_graph=False, # visualize dependencies for debugging\n", - " ) as exe:\n", - " future_obj = exe.submit(\n", - " calc_function,\n", - " 1, # parameter_a\n", - " parameter_b=2,\n", - " # Resource definition on the submission level - optional\n", - " resource_dict={\n", - " \"cores\": 1,\n", - " \"threads_per_core\": 1,\n", - " \"gpus_per_core\": 0, # here it is gpus_per_core rather than gpus_per_worker\n", - " \"cwd\": \"/home/jovyan/notebooks\",\n", - " \"openmpi_oversubscribe\": False,\n", - " # 
\"slurm_cmd_args\": [], # additional command line arguments for SLURM\n", - " \"flux_executor\": flux_exe,\n", - " \"flux_executor_pmi_mode\": None,\n", - " \"flux_executor_nesting\": False,\n", - " \"hostname_localhost\": None, # only required on MacOS\n", - " },\n", - " )\n", - " print(future_obj.result())" - ] - }, - { - "cell_type": "markdown", - "id": "ab12ff4ebd5efb98", - "metadata": {}, - "source": [ - "The `max_cores` which defines the total number of cores of the allocation, is the only mandatory parameter. All other\n", - "resource parameters are optional. If none of the submitted Python function uses [mpi4py](https://mpi4py.readthedocs.io)\n", - "or any GPU, then the resources can be defined on the **Executor level** as: `cores_per_worker=1`, `threads_per_core=1` \n", - "and `gpus_per_worker=0`. These are defaults, so they do even have to be specified. In this case it also makes sense to \n", - "enable `block_allocation=True` to continuously use a fixed number of python processes rather than creating a new python\n", - "process for each submission. 
In this case the above example can be reduced to: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "efe054c93d835e4a", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc_function(parameter_a, parameter_b):\n", - " return parameter_a + parameter_b\n", - "\n", - "\n", - "with Executor(\n", - " # Resource definition on the executor level\n", - " max_cores=2, # total number of cores available to the Executor\n", - " block_allocation=True, # reuse python processes\n", - " backend=\"flux_allocation\",\n", - ") as exe:\n", - " future_obj = exe.submit(\n", - " calc_function,\n", - " 1, # parameter_a\n", - " parameter_b=2,\n", - " )\n", - " print(future_obj.result())" - ] - }, - { - "cell_type": "markdown", - "id": "c6983f28b18f831b", - "metadata": {}, - "source": [ - "The working directory parameter `cwd` can be helpful for tasks which interact with the file system, as it defines the folder in which each task\n", - "is executed, but for most python functions it is not required." - ] - }, - { - "cell_type": "markdown", - "id": "3bf7af3ce2388f75", - "metadata": {}, - "source": [ - "## Data Handling\n", - "A limitation of many parallel approaches is the overhead in communication when working with large datasets. Instead of\n", - "reading the same dataset repeatedly, the `executorlib.Executor` in block allocation mode (`block_allocation=True`) loads the dataset only once per worker and afterwards\n", - "each function submitted to this worker has access to the dataset, as it is already loaded in memory. To achieve this,\n", - "the user defines an initialization function `init_function` which returns a dictionary with one key per dataset. The \n", - "keys of the dictionary can then be used as additional input parameters in each function submitted to the `executorlib.Executor`. 
When block allocation is disabled this functionality is not available, as each function is executed in a separate process, so no data can be preloaded.\n", - "\n", - "This functionality is illustrated below: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "74552573e3e3d3d9", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc(i, j, k):\n", - " return i + j + k\n", - "\n", - "\n", - "def init_function():\n", - " return {\"j\": 4, \"k\": 3, \"l\": 2}\n", - "\n", - "\n", - "with Executor(\n", - " max_cores=1,\n", - " init_function=init_function,\n", - " backend=\"flux_allocation\",\n", - " block_allocation=True,\n", - ") as exe:\n", - " fs = exe.submit(calc, 2, j=5)\n", - " print(fs.result())" - ] - }, - { - "cell_type": "markdown", - "id": "c71bc876a65349cf", - "metadata": {}, - "source": [ - "The function `calc()` requires three inputs: `i`, `j` and `k`. But when the function is submitted to the executor, only \n", - "two inputs are provided: `fs = exe.submit(calc, 2, j=5)`. In this case the first input parameter is mapped to `i=2`, the\n", - "second input parameter is specified explicitly as `j=5`, but the third input parameter `k` is not provided. So the \n", - "`executorlib.Executor` automatically checks the keys set in the `init_function()` function. In this case the returned\n", - "dictionary `{\"j\": 4, \"k\": 3, \"l\": 2}` defines `j=4`, `k=3` and `l=2`. For this specific call of the `calc()` function,\n", - "`i` and `j` are already provided, so the value `j=4` is not used, but `k=3` is taken from the `init_function()`; as the `calc()`\n", - "function does not define an `l` parameter, this key is ignored as well. \n", - "\n", - "The result is `2+5+3=10` as `i=2` and `j=5` are provided during the submission and `k=3` is defined in the `init_function()`\n", - "function."
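The argument-resolution order described above can be sketched in plain Python. The helper `resolve_arguments()` below is hypothetical and not part of the executorlib API; it only illustrates how explicitly provided arguments take precedence over the `init_function()` dictionary and how unmatched keys are dropped:

```python
import inspect


def resolve_arguments(func, args, kwargs, init_data):
    """Hypothetical sketch of executorlib's lookup order: explicitly
    provided arguments first, then the init_function() dictionary."""
    bound = inspect.signature(func).bind_partial(*args, **kwargs)
    for name in inspect.signature(func).parameters:
        # fill still-missing parameters from init_data; keys without a
        # matching parameter (like "l" here) are ignored
        if name not in bound.arguments and name in init_data:
            bound.arguments[name] = init_data[name]
    return dict(bound.arguments)


def calc(i, j, k):
    return i + j + k


resolved = resolve_arguments(calc, (2,), {"j": 5}, {"j": 4, "k": 3, "l": 2})
print(resolved)  # {'i': 2, 'j': 5, 'k': 3}
print(calc(**resolved))  # 2 + 5 + 3 = 10
```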
- ] - }, - { - "cell_type": "markdown", - "id": "a4d4d5447e68a834", - "metadata": {}, - "source": [ - "## Up-Scaling \n", - "[flux](https://flux-framework.org) provides fine-grained resource assignment via `libhwloc` and `pmi`." - ] - }, - { - "cell_type": "markdown", - "id": "ad6fec651dfbc263", - "metadata": {}, - "source": [ - "### Thread-based Parallelism\n", - "The number of threads per core can be controlled with the `threads_per_core` parameter during the initialization of the \n", - "`executorlib.Executor`. Unfortunately, there is no uniform way to control the number of cores a given underlying library\n", - "uses for thread-based parallelism, so it might be necessary to set certain environment variables manually: \n", - "\n", - "* `OMP_NUM_THREADS`: for openmp\n", - "* `OPENBLAS_NUM_THREADS`: for openblas\n", - "* `MKL_NUM_THREADS`: for mkl\n", - "* `VECLIB_MAXIMUM_THREADS`: for accelerate on Mac OS X\n", - "* `NUMEXPR_NUM_THREADS`: for numexpr\n", - "\n", - "At the current stage `executorlib.Executor` does not set these parameters itself, so you have to add them in the function\n", - "you submit before importing the corresponding library: \n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1fbcc6242f13973b", - "metadata": {}, - "outputs": [], - "source": [ - "def calc(i):\n", - " import os\n", - "\n", - " os.environ[\"OMP_NUM_THREADS\"] = \"2\"\n", - " os.environ[\"OPENBLAS_NUM_THREADS\"] = \"2\"\n", - " os.environ[\"MKL_NUM_THREADS\"] = \"2\"\n", - " os.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"2\"\n", - " os.environ[\"NUMEXPR_NUM_THREADS\"] = \"2\"\n", - " import numpy as np # import only after setting the environment variables\n", - "\n", - " return i" - ] - }, - { - "cell_type": "markdown", - "id": "aadd8aa9902d854e", - "metadata": {}, - "source": [ - "Most modern CPUs use hyper-threading to present the operating system with double the number of virtual cores compared to\n", - "the number of physical cores available. 
So unless this functionality is disabled `threads_per_core=2` is a reasonable \n", - "default. Just be careful: if the number of threads is not specified, it is possible that all workers try to access all \n", - "cores at the same time, which can lead to poor performance. So it is typically a good idea to monitor the CPU utilization\n", - "with an increasing number of workers. \n", - "\n", - "Specific manycore CPU models like the Intel Xeon Phi processors provide a much higher hyper-threading ratio and require\n", - "a higher number of threads per core for optimal performance. \n" - ] - }, - { - "cell_type": "markdown", - "id": "d19861a257e40fc3", - "metadata": {}, - "source": [ - "### MPI Parallel Python Functions\n", - "Beyond thread-based parallelism, the message passing interface (MPI) is the de facto standard for parallel execution in \n", - "scientific computing and the [`mpi4py`](https://mpi4py.readthedocs.io) bindings to the MPI libraries are commonly used\n", - "to parallelize existing workflows. The limitation of this approach is that it requires the whole code to adopt the MPI\n", - "communication standards to coordinate how information is distributed. Just like the `executorlib.Executor` the\n", - "[`mpi4py.futures.MPIPoolExecutor`](https://mpi4py.readthedocs.io/en/stable/mpi4py.futures.html#mpipoolexecutor) \n", - "implements the [`concurrent.futures.Executor`](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)\n", - "interface. Still, in this case each python function submitted to the executor is limited to serial execution. The\n", - "novel approach of the `executorlib.Executor` is mixing these two types of parallelism. Individual functions can use\n", - "the [`mpi4py`](https://mpi4py.readthedocs.io) library to handle the parallel execution within the context of this \n", - "function while these functions can still be submitted to the `executorlib.Executor` just like any other function. 
The\n", - "advantage of this approach is that the users can parallelize their workflows one function at a time. \n", - "\n", - "The example in `test_mpi.py` illustrates the submission of a simple MPI parallel python function: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e00d8448d882dfd5", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc(i):\n", - " from mpi4py import MPI\n", - "\n", - " size = MPI.COMM_WORLD.Get_size()\n", - " rank = MPI.COMM_WORLD.Get_rank()\n", - " return i, size, rank\n", - "\n", - "\n", - "with Executor(\n", - " max_cores=2,\n", - " resource_dict={\"cores\": 2},\n", - " backend=\"flux_allocation\",\n", - " flux_executor_pmi_mode=\"pmix\",\n", - ") as exe:\n", - " fs = exe.submit(calc, 3)\n", - " print(fs.result())" - ] - }, - { - "cell_type": "markdown", - "id": "35c49013c2de3907", - "metadata": {}, - "source": [ - "In the example environment OpenMPI version 5 is used, so the `pmi` parameter has to be set to `pmix` rather than `pmi1` or `pmi2`, which is the default. For `mpich` it is not necessary to specify the `pmi` interface manually.\n", - "The `calc()` function initializes the [`mpi4py`](https://mpi4py.readthedocs.io) library and gathers the size of the \n", - "allocation and the rank of the current process within the MPI allocation. This function is then submitted to an \n", - "`executorlib.Executor` which is initialized with a single worker with two cores (`resource_dict={\"cores\": 2}`). So each function\n", - "call is going to have access to two cores. 
\n", - "\n", - "Just like before, the script can be called with any python interpreter. Even though it is using the [`mpi4py`](https://mpi4py.readthedocs.io)\n", - "library in the background, it is not necessary to execute the script with `mpiexec` or `mpirun`.\n", - "\n", - "The response consists of a list of two tuples, one for each MPI parallel process, with the first entry of the tuple \n", - "being the parameter `i=3`, followed by the number of MPI parallel processes assigned to the function call (`resource_dict={\"cores\": 2}`)\n", - "and finally the index of the specific process, `0` or `1`. " - ] - }, - { - "cell_type": "markdown", - "id": "6960ccc01268e1f7", - "metadata": {}, - "source": [ - "### GPU Assignment\n", - "With the rise of machine learning applications, the use of GPUs for scientific applications becomes more and more popular.\n", - "Consequently, it is essential to have full control over the assignment of GPUs to specific python functions. In the \n", - "`test_gpu.py` example the `tensorflow` library is used to identify the GPUs and return their configuration: " - ] - }, - { - "cell_type": "markdown", - "id": "db3727c5da7072cd", - "metadata": {}, - "source": [ - "```\n", - "import socket\n", - "from executorlib import Executor\n", - "from tensorflow.python.client import device_lib\n", - "\n", - "def get_available_gpus():\n", - " local_device_protos = device_lib.list_local_devices()\n", - " return [\n", - " (x.name, x.physical_device_desc, socket.gethostname()) \n", - " for x in local_device_protos if x.device_type == 'GPU'\n", - " ]\n", - "\n", - "with Executor(\n", - " max_workers=2, \n", - " gpus_per_worker=1,\n", - " backend=\"flux_allocation\",\n", - ") as exe:\n", - " fs_1 = exe.submit(get_available_gpus)\n", - " fs_2 = exe.submit(get_available_gpus)\n", - " print(fs_1.result(), fs_2.result())\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "e7ccb6c390b33c73", - "metadata": {}, - "source": [ - "The additional parameter `gpus_per_worker=1` 
specifies that one GPU is assigned to each worker. This functionality \n", - "requires `executorlib` to be connected to a resource manager like the [SLURM workload manager](https://www.schedmd.com)\n", - "or preferably the [flux framework](https://flux-framework.org). The rest of the script follows the previous examples, \n", - "as two functions are submitted and the results are printed. \n", - "\n", - "To illustrate the execution of such an example on a high performance computing (HPC) cluster using the [SLURM workload manager](https://www.schedmd.com),\n", - "the submission script is given below: " - ] - }, - { - "cell_type": "markdown", - "id": "8aa7df69d42b5b74", - "metadata": {}, - "source": [ - "```\n", - "#!/bin/bash\n", - "#SBATCH --nodes=2\n", - "#SBATCH --gpus-per-node=1\n", - "#SBATCH --get-user-env=L\n", - "\n", - "python test_gpu.py\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "8a6636284ba16750", - "metadata": {}, - "source": [ - "The important part is that when using the `executorlib.slurm.PySlurmExecutor` backend, the script `test_gpu.py` does not\n", - "need to be executed with `srun`; it is sufficient to execute it with the python interpreter. `executorlib`\n", - "internally calls `srun` to assign the individual resources to a given worker. \n", - "\n", - "For the more complex setup of running the [flux framework](https://flux-framework.org) as a secondary resource scheduler\n", - "within the [SLURM workload manager](https://www.schedmd.com) it is essential that the resources are passed from the \n", - "[SLURM workload manager](https://www.schedmd.com) to the [flux framework](https://flux-framework.org). 
This is achieved\n", - "by calling `srun flux start` in the submission script: " - ] - }, - { - "cell_type": "markdown", - "id": "888454c1532ad432", - "metadata": {}, - "source": [ - "```\n", - "#!/bin/bash\n", - "#SBATCH --nodes=2\n", - "#SBATCH --gpus-per-node=1\n", - "#SBATCH --get-user-env=L\n", - "\n", - "srun flux start python test_gpu.py\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "d1285038563eee32", - "metadata": {}, - "source": [ - "As a result the GPUs available on the two compute nodes are reported: \n", - "```\n", - ">>> [('/device:GPU:0', 'device: 0, name: Tesla V100S-PCIE-32GB, pci bus id: 0000:84:00.0, compute capability: 7.0', 'cn138'),\n", - ">>> ('/device:GPU:0', 'device: 0, name: Tesla V100S-PCIE-32GB, pci bus id: 0000:84:00.0, compute capability: 7.0', 'cn139')]\n", - "```\n", - "In this case each of the compute nodes `cn138` and `cn139` is equipped with one `Tesla V100S-PCIE-32GB`.\n" - ] - }, - { - "cell_type": "markdown", - "id": "df3ff4f3c9ee10b8", - "metadata": {}, - "source": [ - "## Coupled Functions \n", - "For submitting two functions with rather different computing resource requirements it is essential to represent this \n", - "dependence during the submission process. In `executorlib` this can be achieved by leveraging the separate submission of\n", - "individual python functions and including the `concurrent.futures.Future` object of the first submitted function as \n", - "input for the second function during the submission. Consequently, this functionality can be used for directed acyclic \n", - "graphs, but it does not enable cyclic graphs. 
As a simple example we can add one to the result of the addition of one\n", - "and two:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1dbc77aadc5b6ed0", - "metadata": {}, - "outputs": [], - "source": [ - "from executorlib import Executor\n", - "\n", - "\n", - "def calc_function(parameter_a, parameter_b):\n", - " return parameter_a + parameter_b\n", - "\n", - "\n", - "with Executor(max_cores=2, backend=\"flux_allocation\") as exe:\n", - " future_1 = exe.submit(\n", - " calc_function,\n", - " 1,\n", - " parameter_b=2,\n", - " resource_dict={\"cores\": 1},\n", - " )\n", - " future_2 = exe.submit(\n", - " calc_function,\n", - " 1,\n", - " parameter_b=future_1,\n", - " resource_dict={\"cores\": 1},\n", - " )\n", - " print(future_2.result())" - ] - }, - { - "cell_type": "markdown", - "id": "bd3e6eea-3a77-49ec-8fec-d88274aeeda5", - "metadata": {}, - "source": "Here the first addition `1+2` is computed and the output `3` is returned as the result of `future_1.result()`. Even \nbefore the computation of this addition is completed, the next addition is already submitted, which uses the future object\n`future_1` as an input and adds `1`. The result of both additions is `4` as `1+2+1=4`. \n\nTo disable this functionality the parameter `disable_dependencies=True` can be set on the executor level. Still, at the\ncurrent stage the performance improvement of disabling this functionality seems to be minimal. Furthermore, this \nfunctionality introduces the `refresh_rate=0.01` parameter, which defines how frequently (in seconds) the \nqueue of submitted functions is queried. Typically, there is no need to change these default parameters. 
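For comparison, the future-as-input pattern can be mimicked with the standard library alone. This is only a sketch: `concurrent.futures` does not resolve `Future` inputs automatically, so the hypothetical `resolve()` helper blocks on them by hand, a step `executorlib` performs internally:

```python
from concurrent.futures import Future, ThreadPoolExecutor


def calc_function(parameter_a, parameter_b):
    return parameter_a + parameter_b


def resolve(value):
    # executorlib resolves Future inputs automatically; here we wait manually
    return value.result() if isinstance(value, Future) else value


with ThreadPoolExecutor(max_workers=2) as exe:
    future_1 = exe.submit(calc_function, 1, 2)
    # pass the unfinished future_1 as an input to the second submission
    future_2 = exe.submit(
        lambda a, b: calc_function(resolve(a), resolve(b)), 1, future_1
    )
    print(future_2.result())  # 1 + (1 + 2) = 4
```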
" - }, - { - "cell_type": "markdown", - "id": "d1086337-5291-4e06-96d1-a6e162d28c58", - "metadata": {}, - "source": [ - "## SLURM Job Scheduler\n", - "Using `executorlib` without the [flux framework](https://flux-framework.org) results in one `srun` call per worker in\n", - "`block_allocation=True` mode and one `srun` call per submitted function in `block_allocation=False` mode. As each `srun`\n", - "call represents a request to the central database of SLURM, this can drastically reduce the performance, especially for\n", - "large numbers of small python functions. That is why the hierarchical job scheduler [flux framework](https://flux-framework.org)\n", - "is recommended as a secondary job scheduler even within the context of the SLURM job manager. \n", - "\n", - "Still the general usage of `executorlib` remains similar even with SLURM as the backend:" - ] - }, - { - "cell_type": "markdown", - "id": "27569937-7d99-4697-b3ee-f68c43b95a10", - "metadata": {}, - "source": [ - "```\n", - "from executorlib import Executor\n", - "\n", - "with Executor(max_cores=1, backend=\"slurm_allocation\") as exe:\n", - " future = exe.submit(sum, [1, 1])\n", - " print(future.result())\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "ae8dd860-f90f-47b4-b3e5-664f5c949350", - "metadata": {}, - "source": [ - "The `backend=\"slurm_allocation\"` parameter is optional as `executorlib` automatically recognizes if [flux framework](https://flux-framework.org)\n", - "or SLURM are available. \n", - "\n", - "In addition, the SLURM backend introduces the `command_line_argument_lst=[]` parameter, which allows the user to provide\n", - "a list of command line arguments for the `srun` command. 
" - ] - }, - { - "cell_type": "markdown", - "id": "449d2c7a-67ba-449e-8e0b-98a228707e1c", - "metadata": {}, - "source": [ - "## Workstation Support\n", - "While the high performance computing (HPC) setup is limited to the Linux operating system, `executorlib` can also be used\n", - "in combination with MacOS and Windows. These setups are limited to a single compute node. \n", - "\n", - "Still the general usage of `executorlib` remains similar:" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "fa147b3b-61df-4884-b90c-544362bc95d9", - "metadata": { - "trusted": true - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": "2\n" - } - ], - "source": [ - "from executorlib import Executor\n", - "\n", - "with Executor(max_cores=1, backend=\"local\") as exe:\n", - " future = exe.submit(sum, [1, 1], resource_dict={\"cores\": 1})\n", - " print(future.result())" - ] - }, - { - "cell_type": "markdown", - "id": "0370b42d-237b-4169-862a-b0bac4bb858b", - "metadata": {}, - "source": [ - "The `backend=\"local\"` parameter is optional as `executorlib` automatically recognizes if [flux framework](https://flux-framework.org)\n", - "or SLURM are available. \n", - "\n", - "Workstations, especially workstations with MacOS, can have rather strict firewall settings. These include limiting the\n", - "lookup of hostnames and preventing a computer from communicating with itself via its own hostname. To directly connect to `localhost` rather\n", - "than using the hostname, which is the default for distributed systems, the `hostname_localhost=True` parameter is \n", - "introduced."
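The underlying firewall behaviour can be probed with Python's standard `socket` module. This is a diagnostic sketch, not part of `executorlib`; the exact outcome depends on the local network and firewall configuration:

```python
import socket

# "localhost" always resolves to the loopback interface
print(socket.gethostbyname("localhost"))  # 127.0.0.1

# Resolving the machine's own hostname may fail, or return an address the
# MacOS firewall refuses connections to -- the situation hostname_localhost=True avoids
try:
    print(socket.gethostbyname(socket.gethostname()))
except socket.gaierror as error:
    print("hostname lookup failed:", error)
```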
- ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Flux", - "language": "python", - "name": "flux" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.3" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}