Commit 45d97c1

Added all files for profiling and loadgen.
Signed-off-by: L Lakshmanan <[email protected]>

Added yamls to loadgen.

Signed-off-by: L Lakshmanan <[email protected]>

Edited invocations.csv.

Signed-off-by: Lakshman <Lakshman@localhost>
Lakshman authored and Lakshman committed Aug 17, 2024
1 parent 135a667 commit 45d97c1
Showing 486 changed files with 104,430 additions and 6 deletions.
30 changes: 28 additions & 2 deletions tools/invoker/client.go
@@ -133,6 +133,30 @@ func runExperiment(endpoints []*endpoint.Endpoint, runDuration int, targetRPS fl

    Start(TimeseriesDBAddr, endpoints, workflowIDs)

    // Warm-up pass to find the right RPS: invoke every endpoint once,
    // record the maximum observed latency, then reset the measurement
    // slices before the actual experiment.
    for _, ep := range endpoints {
        if ep.Eventing {
            invokeEventingFunction(ep)
        } else {
            invokeServingFunction(ep)
        }
    }
    latSlice.Lock()
    var maxLatency int64 // zero value; holds the largest warm-up latency
    for _, value := range latSlice.slice {
        if value > maxLatency {
            maxLatency = value
        }
    }
    latSlice.slice = []int64{}
    latSlice.Unlock()
    profSlice.Lock()
    profSlice.slice = []int64{}
    profSlice.Unlock()

    // ACTUAL EXPERIMENT

    timeout := time.After(time.Duration(runDuration) * time.Second)
    d := time.Duration(1000000/targetRPS) * time.Microsecond
    if d <= 0 {
@@ -196,11 +220,14 @@ func SayHello(address, workflowID string) {
log.Warnf("Failed to invoke %v, err=%v", address, err)
} else {
if *funcDurEnableFlag {
log.Debugf("Inside if\n")
words := strings.Fields(response.Message)
lastWord := words[len(words)-1]
duration, err := strconv.ParseInt(lastWord, 10, 64)
if err != nil {
log.Warnf("Failed to parse the duration from the response: %v", err)
}
if err == nil {
log.Debugf("Invoked %v. Response: %v\n", address, response.Message)
profSlice.Lock()
profSlice.slice = append(profSlice.slice, duration)
profSlice.Unlock()
@@ -290,7 +317,6 @@ func writeFunctionDurations(funcDurationOutputFile string) {
    }

    datawriter := bufio.NewWriter(file)

    for _, dur := range profSlice.slice {
        _, err := datawriter.WriteString(strconv.FormatInt(dur, 10) + "\n")
        if err != nil {
251 changes: 251 additions & 0 deletions tools/load-generator/README.md
@@ -0,0 +1,251 @@
# Load Generator

The Load Generator tool can be used to analyze the performance of serverless cluster deployments. It reconstructs the invocation traffic (load) of a given trace, using functions from the vSwarm benchmark suite as proxies. It reads the trace (containing memory utilization, invocation durations, and invocation timestamps of the trace functions) and generates a load of timestamped invocations of the proxy functions that closely mimics the trace.

Each function in the trace is mapped to its closest proxy in the benchmark suite (based on memory and duration), and the proxy functions are invoked in its place. The tool uses the `profile.json` output file generated by the [`profiler` tool](https://github.com/vhive-serverless/vSwarm/tree/load-generator/tools/profiler#profiler) to obtain the profiles of the benchmark suite functions. Once the load is generated (a `load.json` file by default), the `invoker` can be used to run it.

## Setup

The tool assumes a stock-only cluster set up as per the [vHive quickstart guide](https://github.com/vhive-serverless/vhive/blob/main/docs/quickstart_guide.md) and [vHive developer guide](https://github.com/vhive-serverless/vHive/blob/main/docs/developers_guide.md).

### Python packages

```bash
pip3 install -r requirements.txt
```

## Generating load

```
python3 main.py loadgen -h
usage: main.py loadgen [-h] [-o path] [-t path] [-p path] [-c path] [-b path] [-m bool] [-u bool] [-i string]
                       [-d int] [-w int] [-dbg bool]
optional arguments:
  -h, --help            show this help message and exit
  -o path, --output path
                        Output JSON file containing timestamps and endpoints of load. Default: load.json
  -t path, --trace path
                        Directory in which durations, invocations, memory CSV files of trace are located. Default: trace
  -p path, --profile path
                        JSON file containing profile details of the proxy functions. Default: profile.json
  -c path, --config_proxy path
                        Contains details about proxy functions used for deploying. Default: config.json
  -b path, --build path
                        Directory in which temporary build files are located. Default: build
  -m bool, --minute bool
                        Trace contains information at minute/second granularity. True if minute granularity. Default: True
  -u bool, --unique bool
                        Proxy-Trace functions mapping. Should it be unique? Default: False
  -i string, --iat_distribution string
                        IAT Distribution: equidistant, uniform or exponential. Default: uniform
  -d int, --duration int
                        Experiment Duration. Default: 20
  -dbg bool, --dbg bool
                        Show debug messages
```
### Command:

```bash
python3 main.py loadgen
```

The tool reads the trace information (memory, duration, and invocation details) from the `trace/` directory (configurable via the `-t` or `--trace` flag). The `trace/` directory must contain `memory.csv`, `durations.csv`, and `invocations.csv` files with the respective trace information, in the format described in [*Azure Functions Dataset 2019*](https://github.com/Azure/AzurePublicDataset/blob/master/AzureFunctionsDataset2019.md).

#### Function Invocation Counts `invocations.csv` Schema

| Field | Description |
|--|--|
| HashOwner | unique id of the application owner |
| HashApp | unique id for application name |
| HashFunction | unique id for the function name within the app |
| Trigger | trigger for the function |
| x .. y | fields describing the number of invocations of the function per minute/second |

If the `invocations.csv` file contains the number of invocations of functions per minute, the `-m` (or `--minute`) flag must be set to `true`. If it is per second, the flag must be set to `false`.
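
To make the schema concrete, here is a minimal sketch (not part of the tool) that reads `invocations.csv` and sums the per-function counts; it assumes the count columns are every column other than the metadata ones, named `1` through `1440` (one per minute of the day) in the Azure dataset:

```python
# Illustrative only: read invocations.csv and report each function's total
# invocation count. Assumes the count columns are everything except the
# metadata columns.
import pandas as pd

df = pd.read_csv("trace/invocations.csv")
meta = {"HashOwner", "HashApp", "HashFunction", "Trigger"}
count_cols = [c for c in df.columns if c not in meta]

for _, row in df.iterrows():
    total = int(row[count_cols].sum())
    print(f"{row['HashFunction']}: {total} invocations")
```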

#### Function Execution Duration `durations.csv` Schema

| Field | Description |
|--|--|
| HashOwner | unique id of the application owner |
| HashApp | unique id for application name |
| HashFunction | unique id for the function name within the app |
| Average | Average execution time (ms) across all invocations of the 24-hour period |
| Count | Number of executions used in computing the average |
| Minimum | Minimum execution time |
| Maximum | Maximum execution time |
| percentile_Average_0 | Weighted 0th percentile of the execution time *average* |
| percentile_Average_1 | Weighted 1st percentile of the execution time *average* |
| percentile_Average_25 | Weighted 25th percentile of the execution time *average* |
| percentile_Average_50 | Weighted 50th percentile of the execution time *average* |
| percentile_Average_75 | Weighted 75th percentile of the execution time *average* |
| percentile_Average_99 | Weighted 99th percentile of the execution time *average* |
| percentile_Average_100 | Weighted 100th percentile of the execution time *average* |

Execution times are in milliseconds.

#### Function Memory Usage `memory.csv` Schema

| Field | Description |
|--|--|
| HashOwner | unique id of the application owner |
| HashApp | unique id for application name |
| SampleCount | Number of samples used for computing the average |
| AverageAllocatedMb | Average allocated memory across all SampleCount measurements |
| AverageAllocatedMb_pct1 | 1st percentile of the average allocated memory |
| AverageAllocatedMb_pct5 | 5th percentile of the average allocated memory |
| AverageAllocatedMb_pct25 | 25th percentile of the average allocated memory |
| AverageAllocatedMb_pct50 | 50th percentile of the average allocated memory |
| AverageAllocatedMb_pct75 | 75th percentile of the average allocated memory |
| AverageAllocatedMb_pct95 | 95th percentile of the average allocated memory |
| AverageAllocatedMb_pct99 | 99th percentile of the average allocated memory |
| AverageAllocatedMb_pct100 | 100th percentile of the average allocated memory |

*One can utilize the [`sampler`](https://github.com/vhive-serverless/invitro/tree/main/sampler) from [`invitro`](https://github.com/vhive-serverless/invitro/tree/main) to generate a sampled trace from the original Azure dataset trace. The sampled trace can be used as the input trace to this tool.*

For every function in the trace, the closest function in the benchmark suite is chosen as its proxy, using the 75th-percentile memory and 75th-percentile duration to find the closest match. If the `-u` (or `--unique`) flag is set to `true`, the tool tries to find a one-to-one (injective) mapping between trace functions and proxy functions by modelling it as a *linear sum assignment* problem, as sketched below. If the number of trace functions is greater than the number of proxy functions, or if no injective mapping is found, the constraint is relaxed and the closest proxy function is used instead. Currently, the tool uses only _Serving Functions_ that are _NOT Pipelined_ as proxy functions.
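
As a rough illustration of the unique-mapping mode, here is a sketch using SciPy's `linear_sum_assignment`; the profile field names (`duration75`, `memory75`) are hypothetical placeholders, not the tool's actual schema:

```python
# Sketch of the injective trace-to-proxy mapping (the -u / --unique mode).
# Cost of assigning trace function i to proxy j is the distance between
# their 75th-percentile (duration, memory) points.
import numpy as np
from scipy.optimize import linear_sum_assignment

def map_trace_to_proxies(trace_funcs, proxy_funcs):
    cost = np.array([[abs(t["duration75"] - p["duration75"]) +
                      abs(t["memory75"] - p["memory75"])
                      for p in proxy_funcs] for t in trace_funcs])
    if len(trace_funcs) > len(proxy_funcs):
        # No injective mapping exists: fall back to the nearest proxy,
        # allowing proxies to be reused (mirrors the tool's fallback).
        return cost.argmin(axis=1).tolist()
    _, cols = linear_sum_assignment(cost)  # one-to-one, minimum total cost
    return cols.tolist()
```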

This mapping requires the profiles of the benchmark functions. The tool obtains them from the `profile.json` output file generated by the [`profiler` tool](https://github.com/vhive-serverless/vSwarm/tree/load-generator/tools/profiler#profiler); the user can configure the path of this file through the `-p` (or `--profile`) flag (`profile.json` by default).

The tool computes the trace-proxy mapping and automatically deploys the proxies. Deployment requires details about the `predeployment-commands`, `postdeployment-commands`, and `yaml-location` of the proxy functions in JSON format. By default, the tool reads this information from `config.json` (use the `-c` or `--config_proxy` flag to change the path). More details about the format can be found in the [profiler tool README.md](https://github.com/vhive-serverless/vSwarm/tree/load-generator/tools/profiler#command); one can directly copy `config.json` from `vSwarm/tools/profiler/` and use it here.
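
For orientation only, one *possible* shape of such a file, using the key names above and a function name from the example load below; this is a hypothetical illustration, not the authoritative schema (see the profiler README for that):

```json
{
    "aes-nodejs": {
        "yaml-location": "./yamls/aes-nodejs.yaml",
        "predeployment-commands": [],
        "postdeployment-commands": []
    }
}
```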

An example generated load file looks like this:
```json
[
    {
        "timestamp": [
            1365552,
            2418429,
            5343091,
            5548417,
            9047101
        ],
        "endpoint": [
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io"
        ]
    },
    {
        "timestamp": [
            980005,
            1836228,
            2068303,
            2145022,
            2162754,
            3101559,
            3425828,
            4719316,
            4915234,
            5536220,
            5743669,
            5819248,
            5893692,
            6423868,
            6519094,
            6713161,
            7456063,
            7492532,
            8710957,
            8736089
        ],
        "endpoint": [
            "currencyservice-i2lki1cie2nv762ki2o6ugaxeebr03qd.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "currencyservice-i2lki1cie2nv762ki2o6ugaxeebr03qd.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "currencyservice-i2lki1cie2nv762ki2o6ugaxeebr03qd.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "currencyservice-i2lki1cie2nv762ki2o6ugaxeebr03qd.default.192.168.1.240.sslip.io",
            "gptj-python-q0j5cwv0iri9h1v0enba17hl46636ksx.default.192.168.1.240.sslip.io",
            "gptj-python-q0j5cwv0iri9h1v0enba17hl46636ksx.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "currencyservice-i2lki1cie2nv762ki2o6ugaxeebr03qd.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "gptj-python-q0j5cwv0iri9h1v0enba17hl46636ksx.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "emailservice-2hab4uja57g7852i10xk8zx6joltcklu.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "aes-nodejs-8vicg27vtbz07fi6iu0uu3d42ha7c0vl.default.192.168.1.240.sslip.io",
            "gptj-python-q0j5cwv0iri9h1v0enba17hl46636ksx.default.192.168.1.240.sslip.io"
        ]
    }
]
```
The generated load consists of a list of timestamp-endpoint objects, each object covering one minute (or second) of the trace. `timestamp` lists the offsets, in microseconds within that minute/second, at which the endpoint at the corresponding index of `endpoint` is invoked. A minimal sketch of how such a file is consumed follows.
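
The sketch below replays `load.json` in Python; the shipped Go `invoker` is the real implementation, and vSwarm functions are served over gRPC, so the plain HTTP call here is only to keep the sketch self-contained:

```python
# Sketch: replay load.json by sleeping until each microsecond offset within
# the current minute/second window. Purely illustrative -- the real invoker
# uses gRPC and handles warm-up, latency recording, etc. A faithful replayer
# would also wait out the remainder of each window.
import json
import time
import urllib.request

with open("load.json") as f:
    load = json.load(f)

for window in load:
    start = time.monotonic()
    for ts_us, endpoint in sorted(zip(window["timestamp"], window["endpoint"])):
        delay = ts_us / 1e6 - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        urllib.request.urlopen(f"http://{endpoint}")  # illustrative invocation
```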

## Invoking load

The invoker reads the target addresses from the previously generated input file (`load.json` by default) and invokes each endpoint at its specified time. `make invoker` builds the `invoker` binary. Once the binary and the input file are ready, running `./invoker` starts the process.

### Command

```bash
cd ./invoker
make invoker
./invoker -duration 15 -min=true -traceFile load.json
```

#### More details:

```
./invoker -h
Usage of ./invoker:
  -dbg
        Enable debug logging
  -duration int
        Experiment duration (default 8)
  -grpcTimeout int
        Timeout in seconds for gRPC requests (default 30)
  -latf string
        CSV file for the latency measurements in microseconds (default "lat.csv")
  -min
        Is it minute granularity (default true)
  -port int
        The port that functions listen to (default 80)
  -trace
        Enable tracing in the client
  -traceFile string
        File with trace endpoints' metadata (default "load.json")
  -warmup int
        Warm up duration (default 2)
  -zipkin string
        zipkin url (default "http://localhost:9411/api/v2/spans")
```

### Multi-node setup

In the case of a multi-node setup, `python3 main.py loadgen` must be run on the Master node to generate the load and deploy the proxies. Copy the generated `load.json` to the worker nodes and run the `invoker` on each of them with `load.json` as input. Note, however, that the generated `load.json` is independent of the number of worker nodes: with more than one worker, it is the user's responsibility to distribute the load among them, for example as in the sketch below.
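
A minimal sketch of one way to perform that split (round-robin over the invocations of each window; any partitioning that preserves the timestamps works equally well, and the output file names are hypothetical):

```python
# Sketch: shard load.json across N worker nodes, round-robin per invocation.
import json

N = 2  # assumed number of worker nodes
with open("load.json") as f:
    load = json.load(f)

shards = [[] for _ in range(N)]
for window in load:
    parts = [{"timestamp": [], "endpoint": []} for _ in range(N)]
    for k, (ts, ep) in enumerate(zip(window["timestamp"], window["endpoint"])):
        parts[k % N]["timestamp"].append(ts)
        parts[k % N]["endpoint"].append(ep)
    for w in range(N):
        shards[w].append(parts[w])

for w, shard in enumerate(shards):
    with open(f"load_worker{w}.json", "w") as f:  # hypothetical file names
        json.dump(shard, f, indent=4)
```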


## Plotting

```bash
python3 main.py plot -h
usage: main.py plot [-h] [-t path] [-p path] [-o path] [-dbg bool]

optional arguments:
-h, --help show this help message and exit
-t path, --trace path
Directory in which durations, invocations, memory CSV files of trace are located
-p path, --profile path
JSON file containing profile details of the proxy functions
-o path, --png_folder path
Output folder where plots are stored
-dbg bool, --dbg bool
Show debug messages
```
The tool can also plot the profiles of the trace and proxy functions as histograms. The command takes the `profile.json` file (`-p` or `--profile`) and the trace location (`-t` or `--trace`) as input and saves the histogram PNG files in the `png/` folder (`-o` or `--png_folder` flag).

This functionality can be used to understand how the distribution of the proxy functions differs from that of the trace functions; a good set of proxy functions should cover the entire distribution of the trace functions. A minimal sketch of such a comparison is shown below.
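
The sketch assumes `profile.json` maps function names to objects with a `duration75` field; that layout is an assumption for illustration, not the tool's actual schema:

```python
# Sketch: overlay the 75th-percentile duration distributions of trace and
# proxy functions. profile.json field names are assumed, not authoritative.
import json

import matplotlib.pyplot as plt
import pandas as pd

trace_durs = pd.read_csv("trace/durations.csv")["percentile_Average_75"]
with open("profile.json") as f:
    profile = json.load(f)
proxy_durs = [entry["duration75"] for entry in profile.values()]  # assumed field

plt.hist(trace_durs, bins=50, alpha=0.5, label="trace functions")
plt.hist(proxy_durs, bins=50, alpha=0.5, label="proxy functions")
plt.xlabel("75th-percentile duration (ms)")
plt.ylabel("number of functions")
plt.legend()
plt.savefig("png/duration_histogram.png")
```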

### Example:

![image](https://github.com/vhive-serverless/vSwarm/assets/70060966/017ae173-343c-4b6d-99b2-478331d4d04d)
![image](https://github.com/vhive-serverless/vSwarm/assets/70060966/3f9482aa-3544-40b1-959c-80f69cc143bf)
![image](https://github.com/vhive-serverless/vSwarm/assets/70060966/e1ef35ce-2e6a-4d2c-bf6c-b16e125ee1de)
![image](https://github.com/vhive-serverless/vSwarm/assets/70060966/7ccb4548-83e4-4449-95c2-7fea27360ea1)
![image](https://github.com/vhive-serverless/vSwarm/assets/70060966/d18f1da6-7373-4769-a84d-a563ab7b6a37)