cpr-perf-model

Welcome!

This repository hosts a software framework for multi-parameter application performance modeling via tensor completion. See our paper for experimental studies: [https://arxiv.org/abs/2210.10184]

This C++ framework leverages high-performance tensor computation software (publically available within the Cyclops Tensor Framework) to optimize canonical-polyadic tensor decomposition models from provided runtime data.

Build and Use

First, clone https://github.com/huttered40/ctf and build using CTF's build system. Then, build cpr-perf-model by modifying the config.mk and running make static or make shared. Example programs using cpr-perf-model may be found in the test directory. See file model_parameter_selection.sh for optional selection of model hyperparameters at runtime. For additional information on our model interface, see src/model.h

For Python users

Performance data is specified separately as training data within file training_file and test data within file test_file. Parameters and data within these files must be comma-delimited and castable to non-negative floats (e.g., categorical parameters must first be mapped onto a non-negative real number scale). An example is provided below, in which input_columns=0,1,2 and data_columns=3.

m,n,k,runtime
0,89,61,4075,0.00237545
1,1075,34,2247,0.00800553
2,1344,109,845,0.00987118
3,1968,216,1765,0.0497626
4,293,64,187,0.00029943
5,288,716,425,0.00615035
...

training_set_size samples are randomly selected from training_file and used to optimize CP decompositions. Similarly, test_set_size samples are randomly selected from test_file and used to evaluate optimized models. However, a random subset of the randomly selected test samples of size 100% x test_set_split_percentage x test_set_size will partition the data within test_file to reserve data for hyper-parameter selection. The remaining partition will be used to evaluate the configured CP decomposition model.

Minimum and maximum values can be specified as comma-delimited lists for each benchmark parameter within mode_range_min and mode_range_max, respectively. If left unspecified, the range of each parameter will be deduced from the training data.

A number of model parameters govern CP decomposition performance model optimization, including cell_spacing, ngrid_pts, response_transform, interp_map, max_num_sweeps, reg, and cp_rank. cell_spacing, ngrid_pts, and interp_map each take a comma-delimited list, the size of which equates to the number of benchmark parameters. cell_spacing=0,1 signifies that along the first and second benchark parameters, uniform spacing and geometric spacing is used to partition the ranges of the corresponding parameters, respectively. ngrid_pts then specifies the number of grid-points to place along the range of each parameter (including boundaries). interp_map specifies which tensor modes (equivalently dimensions of the underlying regular grid) to interpolate during inference time using the configured CP decomposition model. Users may set response_transform=0 to use raw execution data and response_transform=1 to apply a logarithmic transformation. cp_rank specifies the CP rank of the model, reg specifies the regularization parameter in the underlying objective function, and max_num_sweeps specifies the maximum number of sweeps of one of the alternating minimization methods (e.g., alternating least-squares) used to optimize the CP decomposition.

A complete list of runtime arguments is provided below:

Argument	Meaning	Default
training_file	Full path to csv file that stores training set	N/A
test_file	Full path to csv file that stores test set	N/A
output_file	Full path to csv file to write results	N/A
input_columns	Comma-delimited list of column indices corresponding to benchmark parameters	N/A
data_columns	Column index corresponding to execution times	N/A
training_set_size	Number of samples to use from specified training set	N/A
test_set_size	Number of samples to use from specified test set	N/A
training_set_split_percentage	Percentage of training-set samples to use for hyper-parameter selection	0
mode_range_min	Comma-delimited list of minimum values taken by each benchmark parameter	N/A
mode_range_max	Comma-delimited list of maximum values taken by each benchmark parameter	N/A
cell_spacing	Comma-delimited list specifying the spacing between grid-points (0: Uniform spacing, 1: Geometric spacing)	N/A
ngrid_pts	Comma-delimited list specifying the number of grid-points (including end-points) along each dimension	N/A
custom_grid_pts	Grid-point locations for any mode in order of modes with cell_spacing=2	N/A
response_transform	Whether or not to transform execution data (0: No transformation to execution data, 1: Logarithm transformation to execution data)	1
interp_map	Comma-delimited list specifying which tensor modes (equivalently grid dimensions) about which to interpolate (0: No interpolation, 1: Interpolate)	N/A
max_num_sweeps	Maximum number of sweeps of alternating minimization methods	100
reg	Regularization parameter	1e-4
cp_rank	Canonical-Polyadic tensor decomposition rank for use in interpolation setting	3
build_extrapolation_model	Signifies whether to build a separate model for extrapolation	1
max_spline_degree	Maximum spline degree for extrapolation model	3
sweep_tol	Error tolerance for alternating minimization method	1e-2
barrier_start	Interior-point method parameter	1e1
barrier_stop	Interior-point method parameter	1e-11
barrier_reduction_factor	Interior-point method parameter	8
tol_newton	Change (in factor matrix) tolerance within Newtons method	1e-3
max_num_newton_iter	Maximum number of iterations of Newtons method	40
cp_rank_for_extrapolation	Canonical-Polyadic tensor decomposition rank for use in extrapolation setting	1
print_model_parameters	Whether or not to print the elements of each factor matrix (0: don't print, 1: print)	0

Example for a 3-parameter kernel and 3-dimensional tensor:

python cpr.py --test_set_size 1000 --training_set_split_percentage 0.1 --interp_map 1,1,1 --max_num_sweeps 100 --reg 1e-5 --response_transform 1 --cp_rank 3 --training_set_size 65536 --training_file 'gemm-train.csv' --test_file 'gemm-test.csv' --output_file 'cpr-results.csv' --input_columns 0,1,2 --data_columns 3 --mode_range_min 32,32,32 --mode_range_max 4096,4096,4096 --cell_spacing 1,1,1 --ngrid_pts 4,4,4

Output data containing loss on training data, error metrics on test data, and model configuration execution times is written to output_file. All errors (besides loss) are with respect to the test set.

A complete list of the output data is provided below:

Output	Meaning
training_set_size	as specified in input
test_set_size	as specified in input
tensor_dim	as specified in input
ngrid_pts	as specified in input
cell_spacing	as specified in input
density	percentage of grid-cells that have at least one sample
response_transform	as specified in input
reg	as specified in input
nals_sweeps	as specified in input
cp_rank	as specified in input
interp_map	as specified in input
loss	mean squared error on training data
mlogq	arithmetic mean of log-absolute accuracy ratios
mlogq2	arithmetic mean of log-squared accuracy ratios
gmre	geometric mean of relative error
mape	mean absolute percentage error
smape	symmetric mean absolute percentage error
tensor_gen_time	time to generate tensor from data
model_config_time	time to configure model

Reproducibility

To facilitate reproducibility of the experimental results in our paper [https://arxiv.org/abs/2210.10184], we have curated the scripts necessary to generate all results. Unless stated otherwise, we used Python v2.7 to generate all results except for sparse grid regression, for which we used Python v3. Note that the subdirectory names match the corresponding figure in the paper.

Please replace all training and test files specified in these scripts with those found at [https://github.com/huttered40/app_ed]. Read the corresponding README for the exact file names.

Name		Name	Last commit message	Last commit date
Latest commit History 111 Commits
datasets/generation_scripts		datasets/generation_scripts
include		include
lib		lib
obj		obj
reproducibility		reproducibility
src		src
src_python		src_python
test		test
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
config.mk		config.mk

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cpr-perf-model

Build and Use

For Python users

Reproducibility

About

Releases

Packages

Languages

License

huttered40/cpr-perf-model

Folders and files

Latest commit

History

Repository files navigation

cpr-perf-model

Build and Use

For Python users

Reproducibility

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages