diff --git a/docs/api.rst b/docs/api.rst index 84284ac..3dc1d0f 100644 --- a/docs/api.rst +++ b/docs/api.rst @@ -63,6 +63,10 @@ ENV Variables supported DFTRACER_LOG_FILE STRING PATH To log file. In this case process id and app name is appended to file. DFTRACER_DATA_DIR STRING Colon separated paths that will be traced for I/O accesses by profiler. For tracing all directories use the string "all" (not recommended). + Note: each DFTRACER_DATA_DIR entry is matched as a path prefix. If both + ``/local/scratch`` and ``/local/scratch/data`` are in the list, the order + matters: the later entry overrides the earlier one, so the earlier path is + not traced. To avoid this, list only ``/local/scratch``. DFTRACER_INC_METADATA INT Include or exclude metadata (default 0) DFTRACER_SET_CORE_AFFINITY INT Include or exclude core affinity (default 0). ``DFTRACER_INC_METADATA`` needs to be enabled. @@ -73,7 +77,7 @@ ENV Variables supported DFTRACER_DISABLE_STDIO INT Disable automatic binding of STDIO I/O calls (default: 0). DFTRACER_TRACE_COMPRESSION INT Enable trace compression (default 0). DFTRACER_DISABLE_TIDS INT Disable tracing of thread ids (default 0). - DFTRACER_WRITE_BUFFER_SIZE INT Setup the buffering size for write optimization (default 0). Note: Disabled as + DFTRACER_WRITE_BUFFER_SIZE INT Setup the buffering size for write optimization (default 0). Note: Disabled as this won't work for AI workloads which uses ``fork`` and ``spawn`` without a clear ``exit``. Also, it does not work for workloads which uses ``exec`` and rewrite process buffer state. ================================ ====== =========================================================================== @@ -86,7 +90,6 @@ This section describes how to use DFTracer for profiling C++ application using C ----- - Include the DFTracer Header for C++ **************************************** @@ -96,8 +99,6 @@ In C or C++ applications, include ``dftracer/dftracer.h``. 
#include <dftracer/dftracer.h> - - Initialization of DFTracer **************************************** @@ -111,7 +112,6 @@ Additionally, if users pass nullptr to process_id, then getpid() function would DFTRACER_CPP_INIT(log_file, data_dirs, process_id); - Finalization of DFTracer **************************************** @@ -121,8 +121,6 @@ Finalization call to clean DFTracer entries (Optional). If users do not call thi DFTRACER_CPP_FINI(); - - Function Profiling **************************************** @@ -135,7 +133,6 @@ To profile a function, add the wrapper ``DFTRACER_CPP_FUNCTION`` at the start of sleep(1); } // DFTRACER_CPP_FUNCTION ends here. - Region Level Profiling for Code blocks **************************************** @@ -154,7 +151,6 @@ The name of the region should unique within the scope of the function/code block } // DFTRACER_CPP_REGION ends here implicitly } // DFTRACER_CPP_FUNCTION ends here. - Region Level Profiling for lines of code **************************************** @@ -175,7 +171,6 @@ The ``START`` and ``END`` calls should be in the same scope of the function. } // DFTRACER_CPP_REGION ends here implicitly } // DFTRACER_CPP_FUNCTION ends here. - --------------------- DFTracer C APIs --------------------- @@ -184,7 +179,6 @@ This section describes how to use DFTracer for profiling C application using C A ----- - Include the DFTracer Header for C **************************************** @@ -194,8 +188,6 @@ In C application, include ``dftracer/dftracer.h``. #include <dftracer/dftracer.h> - - Initialization of DFTracer **************************************** @@ -209,7 +201,6 @@ Additionally, if users pass NULL to process_id, then getpid() function would be DFTRACER_C_INIT(log_file, data_dirs, process_id); - Finalization of DFTracer **************************************** @@ -219,7 +210,6 @@ Finalization call to clean DFTracer entries (Optional). 
If users do not call thi DFTRACER_C_FINI(); - Function Profiling **************************************** @@ -242,7 +232,6 @@ To profile a function, add the wrapper ``DFTRACER_C_FUNCTION_START`` at the star For capturing all code branches, every return statement should have a corresponding ``DFTRACER_C_FUNCTION_END`` block within the function. - Region Level Profiling for lines of code **************************************** @@ -268,9 +257,9 @@ DFTracer C/C++ Function Profiling using GCC GCC supports function level tracing using ``-finstrument-functions``. DFTracer allows application to compile with ``-g -finstrument-functions -Wl,-E -fvisibility=default``. If the applications are using cmake, they can find_package and then use the CMAKE Variable `DFTRACER_FUNCTION_FLAGS` for compile flags. -This can be applied globally or on a target. +This can be applied globally or on a target. -Internally DFTracer uses ``dladdr`` to resolve symbol names which work for shared libraries. +Internally DFTracer uses ``dladdr`` to resolve symbol names which work for shared libraries. For executables or binaries, we store the address and the name which can be used to derive the function name at analysis time. This can be done using ``nm -D`` or ``readelf -S`` utilities. @@ -282,7 +271,6 @@ This section describes how to use DFTracer for profiling python applications. ----- - Include the DFTracer module **************************************** @@ -292,8 +280,6 @@ In C application, include ``dftracer/dftracer.h``. from dftracer.logger import dftracer - - Initialization of DFTracer **************************************** @@ -307,8 +293,6 @@ Additionally, if users pass -1 to process_id, then getpid() function would be us dft_logger = dftracer.initialize_log(logfile, data_dir, process_id) - - Finalization of DFTracer **************************************** @@ -318,8 +302,6 @@ Finalization call to clean DFTracer entries (Optional). 
If users do not call thi dft_logger.finalize() - - Function decorator style profiling **************************************** @@ -356,7 +338,6 @@ For logging ``__init__`` function within a class, applications can use ``log_ini For logging ``@staticmethod`` function within a class, applications can use ``log_static`` function. - Iteration/Loop Profiling **************************************** @@ -370,7 +351,6 @@ For logging every block within a loop, we have an ``dft_fn.iter`` which takes a for batch in dft_fn.iter(loader.next()): sleep(1) - Context style Profiling **************************************** @@ -383,7 +363,6 @@ We can also profile a block of code using Python's context managers using ``dft_ sleep(1) dft.update(step=1) - Custom Profiling **************************************** diff --git a/docs/examples.rst b/docs/examples.rst index afafa7d..a9e4392 100644 --- a/docs/examples.rst +++ b/docs/examples.rst @@ -70,6 +70,12 @@ Example of running this configurations are: # Enable profiler DFTRACER_ENABLE=1 +.. warning:: + + Note: each DFTRACER_DATA_DIR entry is matched as a path prefix. If both + ``/local/scratch`` and ``/local/scratch/data`` are in the list, the order + matters: the later entry overrides the earlier one, so the earlier path is + not traced. To avoid this, list only ``/local/scratch``. 
LD_PRELOAD Example: ************************** @@ -109,7 +115,6 @@ Example of running this configurations are: # Enable profiler export DFTRACER_ENABLE=1 - Hybrid Example: ************************** @@ -247,7 +252,6 @@ Example of running this configurations are: # Enable profiler DFTRACER_ENABLE=1 - LD_PRELOAD Example: ************************** @@ -286,7 +290,6 @@ Example of running this configurations are: # Enable profiler export DFTRACER_ENABLE=1 - Hybrid Example: ************************** @@ -356,8 +359,6 @@ Example of running this configurations are: # Enable profiler DFTRACER_ENABLE=1 - - ---------------- Python Example ---------------- @@ -407,7 +408,6 @@ Application Level Example: pool.map(posix_calls, ((2, True),)) log_inst.finalize() - if __name__ == "__main__": main() @@ -426,7 +426,6 @@ Example of running this configurations are: # Enable profiler DFTRACER_ENABLE=1 - LD_PRELOAD Example: ******************* @@ -480,7 +479,6 @@ Example of running this configurations are: # Enable profiler export DFTRACER_ENABLE=1 - .. _python-hybrid-mode: Hybrid Example: @@ -528,7 +526,6 @@ Hybrid Example: pool.map(posix_calls, ((2, True),)) log_inst.finalize() - if __name__ == "__main__": main() @@ -550,7 +547,6 @@ Example of running this configurations are: # Enable profiler DFTRACER_ENABLE=1 - ---------------------------------------------------------------- Resnet50 with pytorch and torchvision example from ALCF Polaris: ---------------------------------------------------------------- @@ -559,15 +555,15 @@ Create a separate conda environment for the application and install dftracer .. 
code-block:: bash :linenos: - + #!/bin/bash +x set -e set -x export MODULEPATH=/soft/modulefiles/conda/:$MODULEPATH module load 2023-10-04 # This is the latest conda module on Polaris - - export ML_ENV=$PWD/PolarisAT/conda-envs/ml_workload_latest_conda_2 # Please change the following path accordingly - + + export ML_ENV=$PWD/PolarisAT/conda-envs/ml_workload_latest_conda_2 # Please change the following path accordingly + if [[ -e $ML_ENV ]]; then conda activate $ML_ENV else @@ -575,13 +571,13 @@ Create a separate conda environment for the application and install dftracer conda activate $ML_ENV yes | MPICC="cc -shared -target-accel=nvidia80" pip install --force-reinstall --no-cache-dir --no-binary=mpi4py mpi4py yes | pip install --no-cache-dir git+https://github.com/hariharan-devarajan/dftracer.git - pip uninstall -y torch horovod + pip uninstall -y torch horovod yes | pip install --no-cache-dir horovod - #INSTALL OTHER MISSING FILES + #INSTALL OTHER MISSING FILES fi -Since, torchvision.datasets.ImageFolder spawns separate python processes to help the parallel data loading in torch, we will be using the `HYBRID MODE` of the DFTracer (e.g., see -:ref:`Python Hybrid mode <python-hybrid-mode>`), so that the application can use both APP and PRELOAD Mode to log I/O from all dynamically spawned processes and function profiling from application. +Since torchvision.datasets.ImageFolder spawns separate python processes to help the parallel data loading in torch, we will be using the `HYBRID MODE` of the DFTracer (e.g., see +:ref:`Python Hybrid mode <python-hybrid-mode>`), so that the application can use both APP and PRELOAD Mode to log I/O from all dynamically spawned processes and function profiling from application. The following dftracer code is added to profile the application at the function level. 
Note: dftracer python level log file location is provided inside the python code in the dftracer.initialize_log() function and the POSIX or STDIO calls level log file location is provided in the job scirpt environment variable `DFTRACER_LOG_FILE` @@ -615,27 +611,26 @@ Note: dftracer python level log file location is provided inside the python code # At the end of main function log_inst.finalize() -Job submition script +Job submission script .. code-block:: bash :linenos: - + export MODULEPATH=/soft/modulefiles/conda/:$MODULEPATH module load 2023-10-04 conda activate ./dlio_ml_workloads/PolarisAT/conda-envs/ml_workload_latest_conda - + export LD_LIBRARY_PATH=$env_path/lib/:$LD_LIBRARY_PATH export DFTRACER_LOG_LEVEL=ERROR export DFTRACER_ENABLE=1 export DFTRACER_INC_METADATA=1 export DFTRACER_INIT=PRELOAD - export DFTRACER_DATA_DIR=./resnet_original_data #Path to the orignal resnet 50 dataset + export DFTRACER_DATA_DIR=./resnet_original_data #Path to the original resnet 50 dataset export DFTRACER_LOG_FILE=./dft_fn_posix_level.pfw - + LD_PRELOAD=./dlio_ml_workloads/PolarisAT/conda-envs/ml_workload_latest_conda/lib/python*/site-packages/dftracer/lib/libdftracer_preload.so aprun -n 4 -N 4 python resnet_hvd_dlio.py --batch-size 64 --epochs 1 > dft_fn 2>&1 - - cat *.pfw > combined_logs.pfw # To combine to a single pfw file. + cat *.pfw > combined_logs.pfw # To combine to a single pfw file. ----------------------- Integrated Applications ----------------------- @@ -657,5 +652,3 @@ Here, we can see that we can get application level calls (e.g., ``train`` and `` .. image:: images/tracing/trace.png :width: 400 :alt: Unet3D applications - -
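The DFTRACER_DATA_DIR note added in this change recommends listing only the top-most directory when two entries overlap. A minimal sketch of that advice follows, assuming entries nest component-wise (so ``/local/scratch`` covers ``/local/scratch/data`` but not ``/local/scratch2``). ``collapse_data_dirs`` is a hypothetical helper written for illustration; it is not part of DFTracer and does not reproduce DFTracer's internal matching.

```python
import posixpath


def is_under(child: str, parent: str) -> bool:
    """Return True if `child` is `parent` or lies inside it (component-wise)."""
    return child == parent or child.startswith(parent.rstrip("/") + "/")


def collapse_data_dirs(value: str) -> str:
    """Hypothetical helper: drop entries nested under another listed entry.

    `value` is a colon-separated path list in the style of DFTRACER_DATA_DIR.
    The result keeps only the top-most directories, in their original order.
    """
    # Normalize entries so trailing slashes do not matter; skip empties and duplicates.
    paths = []
    for entry in value.split(":"):
        if not entry:
            continue
        normalized = posixpath.normpath(entry)
        if normalized not in paths:
            paths.append(normalized)

    # Keep a path only if no *other* listed path already contains it.
    kept = [
        path
        for path in paths
        if not any(other != path and is_under(path, other) for other in paths)
    ]
    return ":".join(kept)


if __name__ == "__main__":
    print(collapse_data_dirs("/local/scratch:/local/scratch/data"))
```

A job script could pass its intended list through such a helper before exporting ``DFTRACER_DATA_DIR``, so that the ordering of overlapping entries no longer matters.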