diff --git a/notebooks/1-local.ipynb b/notebooks/1-local.ipynb index efd39b38..3ba8a73d 100644 --- a/notebooks/1-local.ipynb +++ b/notebooks/1-local.ipynb @@ -6,9 +6,9 @@ "metadata": {}, "source": [ "# Local Mode\n", - "The local mode in executorlib which is selected by setting the `backend` parameter to `\"local\"` is primarily used to enable rapid prototyping on a workstation computer to test your parallel Python program with executorlib before transferring it to an high performance computer (HPC). With the added capability of executorlib it is typically 10% slower than the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) from the Python standard library on a single node, when all acceleration features are enabled. This overhead is primarily related to the creation of new tasks. So the performance of executorlib improves when the individual Python function calls require extensive computations. \n", + "The local mode in executorlib, which is selected by setting the `backend` parameter to `\"local\"`, is primarily used to enable rapid prototyping on a workstation computer to test your parallel Python program with executorlib before transferring it to an high performance computer (HPC). When the `backend` parameter is not set, it defaults to `\"local\"`. With the added capability of executorlib it is typically 10% slower than the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) from the Python standard library on a single node, when all acceleration features are enabled. This overhead is primarily related to the creation of new tasks. So the performance of executorlib improves when the individual Python function calls require extensive computations. \n", "\n", - "On advantage that executorlib has over the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) and the [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) from the Python standard libary, is the use of [cloudpickle](https://github.com/cloudpipe/cloudpickle) as serialization backend to transfer Python functions between processes. This enables the use of dynamically defined Python functions for example in the case of a Jupyter notebook. " + "An advantage that executorlib has over the [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor) and the [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor) from the Python standard libary, is the use of [cloudpickle](https://github.com/cloudpipe/cloudpickle) as serialization backend to transfer Python functions between processes. This enables the use of dynamically defined Python functions for example in the case of a Jupyter notebook. " ] }, { @@ -49,14 +49,14 @@ "output_type": "stream", "text": [ "2\n", - "CPU times: user 19.9 ms, sys: 14.6 ms, total: 34.5 ms\n", - "Wall time: 210 ms\n" + "CPU times: user 73.1 ms, sys: 49.3 ms, total: 122 ms\n", + "Wall time: 804 ms\n" ] } ], "source": [ "%%time\n", - "with Executor() as exe:\n", + "with Executor(backend=\"local\") as exe:\n", " future = exe.submit(sum, [1, 1])\n", " print(future.result())" ] @@ -80,14 +80,14 @@ "output_type": "stream", "text": [ "[4, 6, 8]\n", - "CPU times: user 7.72 ms, sys: 7.54 ms, total: 15.3 ms\n", - "Wall time: 187 ms\n" + "CPU times: user 44.3 ms, sys: 27 ms, total: 71.3 ms\n", + "Wall time: 1.35 s\n" ] } ], "source": [ "%%time\n", - "with Executor() as exe:\n", + "with Executor(backend=\"local\") as exe:\n", " future_lst = [exe.submit(sum, [i, i]) for i in range(2, 5)]\n", " print([f.result() for f in future_lst])" ] @@ -111,14 +111,14 @@ "output_type": "stream", "text": [ "[10, 12, 14]\n", - "CPU times: user 7.16 ms, sys: 7.72 ms, total: 14.9 ms\n", - "Wall time: 191 ms\n" + "CPU times: user 32.9 ms, sys: 25.2 ms, total: 58.1 ms\n", + "Wall time: 968 ms\n" ] } ], "source": [ "%%time\n", - "with Executor() as exe:\n", + "with Executor(backend=\"local\") as exe:\n", " results = exe.map(sum, [[5, 5], [6, 6], [7, 7]])\n", " print(list(results))" ] @@ -148,7 +148,9 @@ "### MPI Parallel Functions\n", "MPI is the default way to develop parallel programs for HPCs. Still it can be challenging to refactor a previously serial program to efficiently use MPI to achieve optimal computational efficiency for parallel execution, even with libraries like [mpi4py](https://mpi4py.readthedocs.io). To simplify the up-scaling of Python programs executorlib provides the option to use MPI parallel Python code inside a given Python function and then submit this parallel Python function to an `Executor` for evaluation. \n", "\n", - "The following `calc_mpi()` function imports the [mpi4py](https://mpi4py.readthedocs.io) package and then uses the internal functionality of MPI to get the total number of parallel CPU cores in the current MPI group `MPI.COMM_WORLD.Get_size()` and the index of the current processor in the MPI group `MPI.COMM_WORLD.Get_rank()`." + "The following `calc_mpi()` function imports the [mpi4py](https://mpi4py.readthedocs.io) package and then uses the internal functionality of MPI to get the total number of parallel CPU cores in the current MPI group `MPI.COMM_WORLD.Get_size()` and the index of the current processor in the MPI group `MPI.COMM_WORLD.Get_rank()`.\n", + "\n", + "The [mpi4py](https://mpi4py.readthedocs.io) package is an optional dependency of executorlib. The installation of the [mpi4py](https://mpi4py.readthedocs.io) package is covered in the installation section." ] }, { @@ -166,6 +168,14 @@ " return i, size, rank" ] }, + { + "cell_type": "markdown", + "id": "adbf8a10-04e1-4fd9-8768-4375bcba9ec3", + "metadata": {}, + "source": [ + "The computational resources for the execution of the `calc_mpi()` Python function are defined using the resource dictionary parameter `resource_dict={}`. The reseource dictionary can either be provided as additional parameter for the `submit()` function. It is important that the parameter name `resource_dict` is reserved exclusively for the `submit()` function and cannot be used in the function which is submitted, like the `calc_mpi()` function in this example:" + ] + }, { "cell_type": "code", "execution_count": 6, @@ -186,6 +196,22 @@ " print(fs.result())" ] }, + { + "cell_type": "markdown", + "id": "3a449e3f-d7a4-4056-a1d0-35dfca4dad22", + "metadata": {}, + "source": [ + "Another option is to set the resource dictionary parameter `resource_dict` during the initialization of the `Executor`. In this case it is internally set for every call of the `submit()` function, without the need to specify it again." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dcc06777-ae51-469b-bead-63ce402efe04", + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "code", "execution_count": 7, @@ -201,17 +227,27 @@ } ], "source": [ - "with Executor(resource_dict={\"cores\": 2}, backend=\"local\") as exe:\n", + "with Executor(backend=\"local\", resource_dict={\"cores\": 2}) as exe:\n", " fs = exe.submit(calc_mpi, 3)\n", " print(fs.result())" ] }, + { + "cell_type": "markdown", + "id": "c1d1d7b1-64fa-4e47-bbde-2a16036568d6", + "metadata": {}, + "source": [ + "In addition, to the compute cores `cores`, the resource dictionary parameter `resource_dict` can also define the threads per core as `threads_per_core`, the GPUs per core as `gpus_per_core`, the working directory with `cwd`, the option to use the OpenMPI oversubscribe feature with `openmpi_oversubscribe` and finally for the [Simple Linux Utility for Resource \n", + "Management (SLURM)](https://slurm.schedmd.com) queuing system the option to provide additional command line arguments with the `slurm_cmd_args` parameter - [resource dictionary]()." + ] + }, { "cell_type": "markdown", "id": "4f5c5221-d99c-4614-82b1-9d6d3260c1bf", "metadata": {}, "source": [ - "### Thread Parallel Functions" + "### Thread Parallel Functions\n", + "An alternative option of parallelism is [thread based parallelism](https://docs.python.org/3/library/threading.html). executorlib supports thread based parallelism with the `threads_per_core` parameter in the resource dictionary `resource_dict`. Given the [global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) in the cPython implementation a common application of thread based parallelism in Python is using additional threads in linked libraries. The number of threads is commonly controlled with environment variables like `OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS`, `MKL_NUM_THREADS`, `VECLIB_MAXIMUM_THREADS` and `NUMEXPR_NUM_THREADS`. Specific libraries might require other environment variables. The environment variables can be set using the environment interface of the Python standard library `os.environ`." ] }, { @@ -234,6 +270,14 @@ " return i" ] }, + { + "cell_type": "markdown", + "id": "82ed8f46-836c-402e-9363-be6e16c2a0b0", + "metadata": {}, + "source": [ + "Again the resource dictionary parameter `resource_dict` can be set either in the `submit()` function:" + ] + }, { "cell_type": "code", "execution_count": 9, @@ -254,10 +298,18 @@ " print(fs.result())" ] }, + { + "cell_type": "markdown", + "id": "63222cd5-664b-4aba-a80c-5814166b1239", + "metadata": {}, + "source": [ + "Or alternatively, the resource dictionary parameter `resource_dict` can also be set during the initialization of the `Executor` class:" + ] + }, { "cell_type": "code", "execution_count": 10, - "id": "cfb2e2ce-0030-40fe-a9f5-3be420d0ae23", + "id": "31562f89-c01c-4e7a-bbdd-fa26ca99e68b", "metadata": {}, "outputs": [ { @@ -276,33 +328,35 @@ }, { "cell_type": "markdown", - "id": "ca9bc450-2762-4d49-b7f8-48cc83e068fd", + "id": "8b78a7b4-066e-4cbc-858e-606c8bbbbf0c", "metadata": {}, "source": [ - "## Performance Optimization" + "For most cases MPI based parallelism leads to higher computational efficiency in comparison to thread based parallelism, still the choice of parallelism depends on the specific Python function which should be executed in parallel. Careful benchmarks are required to achieve the optimal performance for a given computational architecture. \n", + "\n", + "Beyond MPI based parallelism and thread based parallelism the [HPC Submission Mode]() and the [HPC Allocation Mode]() also provide the option to assign GPUs to the execution of individual Python functions. " ] }, { "cell_type": "markdown", - "id": "e9b52ecf-3984-4695-98e7-315aa3712104", + "id": "ca9bc450-2762-4d49-b7f8-48cc83e068fd", "metadata": {}, "source": [ - "### Block Allocation" + "## Performance Optimization\n", + "The default settings of executorlib are chosen to favour stability over performance. Consequently, the performance of executorlib can be improved by setting additional parameters. It is commonly recommended to start with an initial implementation based on executorlib and then improve the performance by enabling specialized features." ] }, { - "cell_type": "code", - "execution_count": 11, - "id": "1271887b-68a5-4dbb-afb7-c88c04ebbdf1", + "cell_type": "markdown", + "id": "e9b52ecf-3984-4695-98e7-315aa3712104", "metadata": {}, - "outputs": [], "source": [ - "from executorlib import Executor" + "### Block Allocation\n", + "By default each submitted Python function is executed in a dedicated process. This gurantees that the execution of the submitted Python function starts with a fresh process. Still the initialization of the Python process takes time. Especially when the call of the Python function requires only limited computational resources it makes sense to reuse the existing Python process for the execution of multiple Python functions. In executorlib this functionality is enabled by setting the `block_allocation` parameter to `Ture`." ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 11, "id": "0da4c7d0-2268-4ea8-b62d-5d94c79ebc72", "metadata": {}, "outputs": [ @@ -311,8 +365,8 @@ "output_type": "stream", "text": [ "2\n", - "CPU times: user 12.2 ms, sys: 27.3 ms, total: 39.5 ms\n", - "Wall time: 280 ms\n" + "CPU times: user 34.5 ms, sys: 32.1 ms, total: 66.6 ms\n", + "Wall time: 1.26 s\n" ] } ], @@ -324,59 +378,21 @@ ] }, { - "cell_type": "code", - "execution_count": 13, - "id": "ea1eaaf7-15a1-4032-83f0-b6e5ae6267b9", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[4, 6, 8]\n", - "CPU times: user 11.7 ms, sys: 28.1 ms, total: 39.8 ms\n", - "Wall time: 288 ms\n" - ] - } - ], - "source": [ - "%%time\n", - "with Executor(backend=\"local\", block_allocation=True) as exe:\n", - " future_lst = [exe.submit(sum, [i, i]) for i in range(2, 5)]\n", - " print([f.result() for f in future_lst])" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "id": "4988a169-cdd9-4e95-aadc-eedf1cd693d9", + "cell_type": "markdown", + "id": "d38163b3-1c04-431c-964b-2bad4f823a4d", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[10, 12, 14]\n", - "CPU times: user 11.6 ms, sys: 26.2 ms, total: 37.8 ms\n", - "Wall time: 268 ms\n" - ] - } - ], "source": [ - "%%time\n", - "with Executor(backend=\"local\", block_allocation=True) as exe:\n", - " results = exe.map(sum, [[5, 5], [6, 6], [7, 7]])\n", - " print(list(results))" + "The same functionality also applies to MPI parallel Python functions. The important part is that while it is possible to assign more than one Python process to the execution of a Python function in block allocation mode, it is not possible to assign resources during the submission of the function with the `submit()` function. Starting again with the `calc_mpi()` function: " ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 12, "id": "cb8c4943-4c78-4203-95f2-1db758e588d9", "metadata": {}, "outputs": [], "source": [ - "def calc(i):\n", + "def calc_mpi(i):\n", " from mpi4py import MPI\n", "\n", " size = MPI.COMM_WORLD.Get_size()\n", @@ -384,9 +400,17 @@ " return i, size, rank" ] }, + { + "cell_type": "markdown", + "id": "9e1212c4-e3fb-4e21-be43-0a4f0a08b856", + "metadata": {}, + "source": [ + "Still the resource dictionary parameter can still be set during the initialisation of the `Executor` class. Internally, this groups the created Python processes in fixed allocations and afterwards submit Python functions to these allocations." + ] + }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 13, "id": "5ebf7195-58f9-40f2-8203-2d4b9f0e9602", "metadata": {}, "outputs": [ @@ -396,37 +420,40 @@ "text": [ "[(3, 2, 0), (3, 2, 1)]\n" ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "--------------------------------------------------------------------------\n", + "A system call failed during shared memory initialization that should\n", + "not have. It is likely that your MPI job will now either abort or\n", + "experience performance degradation.\n", + "\n", + " Local host: MacBook-Pro.local\n", + " System call: unlink(2) /var/folders/z7/3vhrmssx60v240x_ndq448h80000gn/T//ompi.MacBook-Pro.501/pid.14134/1/vader_segment.MacBook-Pro.501.765b0001.1\n", + " Error: No such file or directory (errno 2)\n", + "--------------------------------------------------------------------------\n" + ] } ], "source": [ - "with Executor(\n", - " max_workers=1,\n", - " max_cores=2,\n", - " resource_dict={\"cores\": 2},\n", - " backend=\"local\",\n", - " block_allocation=True,\n", - ") as exe:\n", - " fs = exe.submit(\n", - " calc,\n", - " 3,\n", - " )\n", + "with Executor(backend=\"local\", resource_dict={\"cores\": 2}, block_allocation=True) as exe:\n", + " fs = exe.submit(calc_mpi, 3)\n", " print(fs.result())" ] }, { - "cell_type": "code", - "execution_count": 17, - "id": "cc648799-a0c6-4878-a469-97457bce024f", + "cell_type": "markdown", + "id": "b75fb95f-f2f5-4be9-9f2a-9c2e9961c644", "metadata": {}, - "outputs": [], "source": [ - "def calc(i, j, k):\n", - " return i + j + k" + "The weakness of memory from a previous Python function remaining in the Python process can at the same time be an advantage for working with large datasets. In executorlib this is achieved by introducing the `init_function` parameter. The `init_function` returns a dictionary of parameters which can afterwards be reused as keyword arguments `**kwargs` in the functions submitted to the `Executor`. When block allocation `block_allocation` is disabled this functionality is not available, as each function is executed in a separate process, so no data can be preloaded." ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 14, "id": "8aa754cc-eb1a-4fa1-bd72-272246df1d2f", "metadata": {}, "outputs": [], @@ -437,8 +464,27 @@ }, { "cell_type": "code", - "execution_count": 19, - "id": "18bafb74-aacd-4744-9dfe-2a8ed07a1d25", + "execution_count": 15, + "id": "1854895a-7239-4b30-b60d-cf1a89234464", + "metadata": {}, + "outputs": [], + "source": [ + "def calc_with_preload(i, j, k):\n", + " return i + j + k" + ] + }, + { + "cell_type": "markdown", + "id": "d07cf107-3627-4cb0-906c-647497d6e0d2", + "metadata": {}, + "source": [ + "The function `calc_with_preload()` requires three inputs `i`, `j` and `k`. But when the function is submitted to the executor only two inputs are provided `fs = exe.submit(calc, 2, j=5)`. In this case the first input parameter is mapped to `i=2`, the second input parameter is specified explicitly `j=5` but the third input parameter `k` is not provided. So the `Executor` automatically checks the keys set in the `init_function()` function. In this case the returned dictionary `{\"j\": 4, \"k\": 3, \"l\": 2}` defines `j=4`, `k=3` and `l=2`. For this specific call of the `calc_with_preload()` function, `i` and `j` are already provided so `j` is not required, but `k=3` is used from the `init_function()` and as the `calc_with_preload()` function does not define the `l` parameter this one is also ignored." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "cc648799-a0c6-4878-a469-97457bce024f", "metadata": {}, "outputs": [ { @@ -451,33 +497,34 @@ ], "source": [ "with Executor(\n", - " init_function=init_function, backend=\"local\", block_allocation=True\n", + " backend=\"local\", init_function=init_function, block_allocation=True\n", ") as exe:\n", - " fs = exe.submit(calc, 2, j=5)\n", + " fs = exe.submit(calc_with_preload, 2, j=5)\n", " print(fs.result())" ] }, { "cell_type": "markdown", - "id": "24397d78-dff1-4834-830c-a8f390fe6b9c", + "id": "1073b8ca-1492-46e9-8d1f-f52ad48d28a2", "metadata": {}, "source": [ - "### Cache " + "The result is `2+5+3=10` as `i=2` and `j=5` are provided during the submission and `k=3` is defined in the `init_function()` function." ] }, { - "cell_type": "code", - "execution_count": 20, - "id": "7722b49c-7253-4ec8-9c73-d70e4d475106", + "cell_type": "markdown", + "id": "24397d78-dff1-4834-830c-a8f390fe6b9c", "metadata": {}, - "outputs": [], "source": [ - "from executorlib import Executor" + "### Cache\n", + "The development of scientific workflows is commonly an interactive process, extending the functionality step by step. This lead to the development of interactive environments like [Jupyter](https://jupyter.org) which is fully supported by executorlib. Still many of the computationally intensive Python functions can take in the order of minutes to hours or even longer to execute, so reusing an existing Python process is not feasible. To address this challenge executorlib provides a file based cache to store the results of previously computed [concurrent future Futures](https://docs.python.org/3/library/concurrent.futures.html#future-objects) objects. The results are serialized using [cloudpickle](https://github.com/cloudpipe/cloudpickle) and stored in a user-defined cache directory `cache_directory` to be reloaded later on. Internally, the hierarchical data format (HDF5) is used via the [h5py](https://www.h5py.org), which is an optional dependency for executorlib. \n", + "\n", + "The [h5py](https://www.h5py.org) package is an optional dependency of executorlib. The installation of the [h5py](https://www.h5py.org) package is covered in the installation section. " ] }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 17, "id": "ecdcef49-5c89-4538-b377-d53979673bf7", "metadata": {}, "outputs": [ @@ -486,8 +533,8 @@ "output_type": "stream", "text": [ "[2, 4, 6]\n", - "CPU times: user 159 ms, sys: 50.1 ms, total: 209 ms\n", - "Wall time: 521 ms\n" + "CPU times: user 526 ms, sys: 134 ms, total: 661 ms\n", + "Wall time: 1.33 s\n" ] } ], @@ -498,9 +545,19 @@ " print([f.result() for f in future_lst])" ] }, + { + "cell_type": "markdown", + "id": "32d0fb2e-5ac1-4249-b6c8-953c92fdfded", + "metadata": {}, + "source": [ + "When the same code is executed again, executorlib finds the existing results in the cache directory specified by the `cache_directory` parameter and reloads the result, accelerating the computation especially during the prototyping phase when similar computations are repeated frequently for testing. \n", + "\n", + "Still it is important to mention, that this cache is not designed to identify the submission of the same parameters within the context of one `with`-statement. It is the task of the user to minimize duplicate computations, the cache is only designed to restore previous calculation results when the Python process managing executorlib was stopped after the successful execution. " + ] + }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 18, "id": "c39babe8-4370-4d31-9520-9a7ce63378c8", "metadata": {}, "outputs": [ @@ -509,8 +566,8 @@ "output_type": "stream", "text": [ "[2, 4, 6]\n", - "CPU times: user 10.8 ms, sys: 10.6 ms, total: 21.4 ms\n", - "Wall time: 188 ms\n" + "CPU times: user 41.4 ms, sys: 31.7 ms, total: 73.1 ms\n", + "Wall time: 989 ms\n" ] } ], @@ -521,9 +578,17 @@ " print([f.result() for f in future_lst])" ] }, + { + "cell_type": "markdown", + "id": "68092479-e846-494a-9ac9-d9638b102bd8", + "metadata": {}, + "source": [ + "After the development phase is concluded it is the task of the user to remove the cache directory defined with the `cache_directory` parameter. The cache directory is never removed by executorlib to prevent the repeation of expensive computations. Still as disk space on shared file systems in HPC environments is commonly limited it is recommended to remove the cache directory once the development process concluded. " + ] + }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 19, "id": "34a9316d-577f-4a63-af14-736fb4e6b219", "metadata": {}, "outputs": [ @@ -550,36 +615,45 @@ }, { "cell_type": "markdown", - "id": "71a8a0be-a933-4e83-9da5-50da35e9975b", + "id": "1cea95b5-4110-444c-82af-fa6718bfa17f", "metadata": {}, "source": [ - "### Dependencies" + "Typically the use of the cache is recommended for development processes only and for production workflows the user should implement their own long-term storage solution. The binary format used by executorlib is based on [cloudpickle](https://github.com/cloudpipe/cloudpickle) and might change in future without further notice, rendering existing data in the cache unusable. Consequently, using the cache beyond the development process is not recommended. In addition the writing of the results to files might result in additional overhead for accessing the shared file system. " ] }, { - "cell_type": "code", - "execution_count": 24, - "id": "e7fe0048-8ab4-4a9e-bd25-d56911153528", + "cell_type": "markdown", + "id": "71a8a0be-a933-4e83-9da5-50da35e9975b", "metadata": {}, - "outputs": [], "source": [ - "from executorlib import Executor" + "### Dependencies\n", + "Many scientific Python programs consist of series of Python function calls with varying level of parallel computations or map-reduce patterns where the same function is first mapped to a number of parameters and afterwards the results are reduced in a single function. To extend the [Executor interface](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor) of the Python standard library to support this programming pattern, the `Executor` class from executorlib supports submitting Python [concurrent futures Future](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future) objects to the `Executor` which are resolved before submission. So the `Executor` internally waits until all Python [concurrent futures Future](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future) objects are successfully executed before it triggers the execution of the submitted Python function. " ] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 20, "id": "d8b75a26-479d-405e-8895-a8d56b3f0f4b", "metadata": {}, "outputs": [], "source": [ - "def add_funct(a, b):\n", + "def calc_add(a, b):\n", " return a + b" ] }, + { + "cell_type": "markdown", + "id": "36118ae0-c13c-4f7a-bcd3-3d7f4bb5a078", + "metadata": {}, + "source": [ + "For example the function which adds two numbers `calc_add()` is used in a loop which adds a counter to the previous numbers. In the first iteration the `future` parameter is set to `0` but already in the second iteration it is the Python [concurrent futures Future](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future) object of the first iteration and so on. \n", + "\n", + "The important part is that the user does not have to wait until the first function is executed but instead the waiting happens internally in the `Executor`. " + ] + }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 21, "id": "35fd5747-c57d-4926-8d83-d5c55a130ad6", "metadata": {}, "outputs": [ @@ -587,24 +661,33 @@ "name": "stdout", "output_type": "stream", "text": [ - "7\n" + "6\n" ] } ], "source": [ "with Executor(backend=\"local\") as exe:\n", - " future = None\n", + " future = 0\n", " for i in range(1, 4):\n", - " if future is None:\n", - " future = exe.submit(add_funct, i, i)\n", - " else:\n", - " future = exe.submit(add_funct, i, future)\n", + " future = exe.submit(calc_add, i, future)\n", " print(future.result())" ] }, + { + "cell_type": "markdown", + "id": "38e1bbb3-1028-4f50-93c1-d2427f399a7d", + "metadata": {}, + "source": [ + "As the reusing of existing [concurrent futures Future](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future) object can lead to rather complex dependencies, executorlib provides the option to plot the dependency graph by setting the `plot_dependency_graph=True` during the initialization of the `Executor` class. \n", + "\n", + "No computation is executed when the `plot_dependency_graph=True` is set. This parameter is for debugging only. \n", + "\n", + "Internally, the [pygraphviz](https://pygraphviz.github.io/documentation/stable) package is used for the visualisation of these dependency graphs. It is an optional dependency of executorlib. The installation of the [pygraphviz](https://pygraphviz.github.io/documentation/stable) package is covered in the installation section. " + ] + }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 22, "id": "f67470b5-af1d-4add-9de8-7f259ca67324", "metadata": {}, "outputs": [ @@ -618,74 +701,74 @@ { "data": { "image/svg+xml": [ - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\n", "0\n", - "\n", - "add_funct\n", + "\n", + "calc_add\n", "\n", "\n", "\n", "1\n", - "\n", - "add_funct\n", + "\n", + "calc_add\n", "\n", "\n", "\n", "0->1\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "\n", "2\n", - "\n", - "add_funct\n", + "\n", + "calc_add\n", "\n", "\n", "\n", "1->2\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "\n", "3\n", - "\n", - "1\n", + "\n", + "1\n", "\n", "\n", "\n", "3->0\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "\n", "4\n", - "\n", - "1\n", + "\n", + "0\n", "\n", "\n", "\n", "4->0\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "\n", "5\n", - "\n", - "2\n", + "\n", + "2\n", "\n", "\n", "\n", "5->1\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "\n", @@ -696,8 +779,8 @@ "\n", "\n", "6->2\n", - "\n", - "\n", + "\n", + "\n", "\n", "\n", "" @@ -712,19 +795,16 @@ ], "source": [ "with Executor(backend=\"local\", plot_dependency_graph=True) as exe:\n", - " future = None\n", + " future = 0\n", " for i in range(1, 4):\n", - " if future is None:\n", - " future = exe.submit(add_funct, i, i)\n", - " else:\n", - " future = exe.submit(add_funct, i, future)\n", + " future = exe.submit(calc_add, i, future)\n", " print(future.result())" ] }, { "cell_type": "code", "execution_count": null, - "id": "1de93586-d302-4aa6-878a-51acfb1d3009", + "id": "67b52bd0-bd51-4538-a089-2776b8034547", "metadata": {}, "outputs": [], "source": []