Lattice benchmark lbm #43

Closed
wants to merge 102 commits into from
102 commits
aacf68e
Adding exception for arrayOfStructure option for bGrid.
massimim Jun 9, 2023
18f2d72
Some documentation to bGrid.
massimim Jun 15, 2023
b81c423
bGrid: API documentation and refactoring of the template API.
massimim Jun 15, 2023
d82e985
Cleaning up naming for the BlockViewGrid
massimim Jun 15, 2023
9e29f8e
bGrid - introducing the concept of BlockView and refactoring the bitm…
massimim Jun 15, 2023
cdcdc0d
bGrid - fixing multi-GPU
massimim Jun 16, 2023
54b508d
Merge branch 'bGrid' into bGrid-newTemaplateAPI
massimim Jun 16, 2023
ea82dfc
Adding scripts
massimim Jun 16, 2023
cc536e8
Merge branch 'bGrid-newTemaplateAPI' into bGrid
massimim Jun 16, 2023
55af708
Benchmarks and scripts
massimim Jun 19, 2023
90a4ba9
Code documentation
massimim Jun 19, 2023
019db4d
Fixing grid spacing in bGrid.
massimim Jun 19, 2023
1790087
Merge remote-tracking branch 'origin/develop' into bGrid
massimim Jun 22, 2023
588b746
WIP
massimim Jun 22, 2023
9a87088
Fixing report filename for benchmarks scripts
massimim Jun 22, 2023
1168cc2
Adding halo option.
massimim Jun 23, 2023
0bdce94
Adding halo option.
massimim Jun 23, 2023
3dc808e
WIP
massimim Jun 23, 2023
ceab2a6
domain_neighbour_globalIdx for dGridSoA
massimim Jun 26, 2023
8197006
Merge branch 'lattice-benchmark-lbm' into dGridSOA
massimim Jun 26, 2023
13377a4
Testing block sizes on bGrid
massimim Jun 27, 2023
3a36f0c
Adding dGridSoA to the stencil tests
massimim Jun 28, 2023
a49b27a
WIP
massimim Jun 29, 2023
fde014d
Extending unit test for stencil to dGridSoA
massimim Jun 29, 2023
b0e74e6
WIP
massimim Jun 29, 2023
1dd5abc
WIP
massimim Jun 30, 2023
81b3526
WIP
massimim Jun 30, 2023
2a2caf7
WIP
massimim Jun 30, 2023
1030345
Adding documentation to ConstexprFor
massimim Jun 30, 2023
73b063e
WIP
massimim Jun 30, 2023
1169538
WIP
massimim Jun 30, 2023
3404f03
WIP
massimim Jun 30, 2023
a2ed8f6
Refactoring of the LBM benchmark
massimim Jul 4, 2023
0a56cf4
WIP - D3Q27
massimim Jul 5, 2023
c552094
WIP - D3Q27
massimim Jul 5, 2023
5f07bca
WIP - D3Q27
massimim Jul 5, 2023
2665122
WIP - D3Q27
massimim Jul 5, 2023
c4cc536
WIP - D3Q27
massimim Jul 6, 2023
06774be
Encoding and decoding tools for Morton and Hilbert curves.
massimim Jul 11, 2023
971afc7
Merge remote-tracking branch 'origin/develop' into lattice-benchmark-lbm
massimim Jul 11, 2023
b3897f0
WIP
massimim Jul 13, 2023
0c8d8cb
WIP
massimim Jul 18, 2023
3cc397c
Fixing space filling curves
massimim Jul 19, 2023
1e7c890
Adding space filling curve parameter to dGrid
massimim Jul 19, 2023
9e9dc40
Extending benchmark with space filling curve option.
massimim Jul 19, 2023
18ffd02
WIP
massimim Jul 19, 2023
199c946
Extending grid report capabilities.
massimim Jul 25, 2023
6ddabb3
Fixes to python script.
massimim Jul 25, 2023
7e158b6
WIP
massimim Jul 28, 2023
a43ec4e
WIP: new lbm benchmark
massimim Aug 1, 2023
b6142ef
WIP: new lbm benchmark
massimim Aug 1, 2023
de704d3
WIP
massimim Aug 2, 2023
406b41c
WIP: cleaning.
massimim Aug 29, 2023
d7da72b
WIP
massimim Aug 29, 2023
852eaf5
WIP
massimim Aug 29, 2023
389d8e1
Parametric Refactoring
massimim Aug 30, 2023
b8627f5
WIP: test with D3Q27
massimim Aug 30, 2023
6759005
D3Q27 tested
massimim Aug 31, 2023
6adff5a
WIP: refactoring CLI
massimim Aug 31, 2023
f599017
WIP
massimim Aug 31, 2023
0413392
WIP: CLI refactoring.
massimim Sep 1, 2023
6ff1aa6
WIP: KBC for D3Q27
massimim Sep 1, 2023
d0667a3
Pull method.
massimim Sep 4, 2023
0572e66
WIP: kbc
massimim Sep 4, 2023
7c537e8
Fix for kbc
massimim Sep 4, 2023
d363a04
WIP: AA
massimim Sep 4, 2023
64d3d30
WIP
massimim Sep 5, 2023
d203e9e
AA working for D3Q19 and bgk.
massimim Sep 5, 2023
abc4e28
Cleaning up LBM benchmarking
massimim Sep 10, 2023
cf19169
Cleaning up LBM benchmarking
massimim Sep 10, 2023
680b84e
cuda issues
massimim Sep 10, 2023
90c6ede
WIP: fixing nvcc bug.
massimim Sep 10, 2023
ad64173
Updating script.
massimim Sep 10, 2023
59e3161
Updating script.
massimim Sep 10, 2023
0112332
Updating script.
massimim Sep 10, 2023
6b896f9
Updating script.
massimim Sep 10, 2023
65d829b
Updating script.
massimim Sep 10, 2023
5133b11
Updating script.
massimim Sep 10, 2023
3f28bfd
Updating script.
massimim Sep 10, 2023
be81ec7
Updating script.
massimim Sep 10, 2023
cb6b437
Cleaning up for PR.
massimim Sep 11, 2023
e9d12c8
Cleaning up for PR.
massimim Sep 11, 2023
e4f43c4
Cleaning up for PR.
massimim Sep 11, 2023
3c7f092
Issue with nvcc fixed.
massimim Sep 11, 2023
74a0ae0
Fix for win compilation
massimim Sep 13, 2023
ea655c3
Merge branch 'fixingCompilerIssue' into lattice-benchmark-lbm
massimim Sep 13, 2023
2c474ed
Fixing CUDA C++ issues for D3Q19
massimim Sep 14, 2023
eea7cf8
Fixing CUDA C++ issues for D3Q19 - bgk
massimim Sep 15, 2023
f69c3b7
Adding remote write support to bGrid.
massimim Sep 15, 2023
ef494df
WIP
massimim Sep 18, 2023
ddf430a
Fixing lbm benchmark template initialization
massimim Sep 20, 2023
728c186
Dropping kernel bound mechanisms.
massimim Oct 9, 2023
45e82ba
Removing debugging command.
massimim Oct 9, 2023
a79ef8b
Fixing print messages.
massimim Oct 9, 2023
d758ab0
Merge branch 'lattice-benchmark-lbm' into disaggregated-dGRid
massimim Oct 10, 2023
7e23387
Lattice halo update
massimim Oct 10, 2023
207d252
Fixing issue with dSpan and dataView.
massimim Oct 10, 2023
b574b49
Fixing issue with dSpan and dataView.
massimim Oct 10, 2023
4e690c6
Merge branch 'disaggregated-dGRid' into lattice-benchmark-lbm
massimim Oct 10, 2023
2e533ab
Fixing CLI for lbm uniform.
massimim Oct 10, 2023
bef23c1
Fixing windows compilation
massimim Oct 11, 2023
b2235b4
Fixing windows compilation
massimim Oct 11, 2023
3 changes: 2 additions & 1 deletion benchmarks/CMakeLists.txt
@@ -1,4 +1,5 @@
cmake_minimum_required(VERSION 3.19 FATAL_ERROR)

add_subdirectory("lbm-lid-driven-cavity-flow")
add_subdirectory(lbm)
# add_subdirectory("lbm-lid-driven-cavity-flow")
# add_subdirectory("lbm-flow-over-sphere")
103 changes: 62 additions & 41 deletions benchmarks/lbm-lid-driven-cavity-flow/lbm-lid-driven-cavity-flow.py
@@ -4,9 +4,11 @@
GRID_LIST = "dGrid bGrid eGrid".split()
STORAGE_FP_LIST = "double float".split()
COMPUTE_FP_LIST = "double float".split()
OCC_LIST = "nOCC".split()
OCC_LIST = "nOCC sOCC".split()
HU_LIST = "huGrid huLattice".split()
CURVE_LIST = "sweep morton hilbert".split()
WARM_UP_ITER = 10
MAX_ITER = 100
MAX_ITER = 10000
REPETITIONS = 5

import subprocess
@@ -38,60 +40,79 @@ def countAll():
for COMPUTE_FP in COMPUTE_FP_LIST:
for DEVICE_SET in DEVICE_SET_LIST:
for GRID in GRID_LIST:
if STORAGE_FP == 'double' and COMPUTE_FP == 'float':
continue
for HU in HU_LIST:
for CURVE in CURVE_LIST:
if STORAGE_FP == 'double' and COMPUTE_FP == 'float':
continue
if STORAGE_FP == 'float' and COMPUTE_FP == 'double':
continue

counter += 1
counter += 1
return counter


SAMPLES = countAll()
counter = 0
command = './lbm-lid-driven-cavity-flow'
# command = 'echo'
with open(command + '.log', 'w') as fp:
for DEVICE_TYPE in DEVICE_TYPE_LIST:
DEVICE_SET_LIST = [DEVICE_ID_LIST[0]]
if DEVICE_TYPE == 'gpu':
for DEVICE in DEVICE_ID_LIST[1:]:
DEVICE_SET_LIST.append(DEVICE_SET_LIST[-1] + ' ' + DEVICE)
for OCC in OCC_LIST:
for DOMAIN_SIZE in DOMAIN_SIZE_LIST:
for STORAGE_FP in STORAGE_FP_LIST:
for COMPUTE_FP in COMPUTE_FP_LIST:
for DEVICE_SET in DEVICE_SET_LIST:
for DEVICE_SET in DEVICE_SET_LIST:
for OCC in OCC_LIST:
for DOMAIN_SIZE in DOMAIN_SIZE_LIST:
for STORAGE_FP in STORAGE_FP_LIST:
for COMPUTE_FP in COMPUTE_FP_LIST:
for GRID in GRID_LIST:
if STORAGE_FP == 'double' and COMPUTE_FP == 'float':
continue
for HU in HU_LIST:
for CURVE in CURVE_LIST:

if STORAGE_FP == 'double' and COMPUTE_FP == 'float':
continue
if STORAGE_FP == 'float' and COMPUTE_FP == 'double':
continue

parameters = []
parameters.append('--deviceType ' + DEVICE_TYPE)
parameters.append('--deviceIds ' + DEVICE_SET)
parameters.append('--grid ' + GRID)
parameters.append('--domain-size ' + DOMAIN_SIZE)
parameters.append('--warmup-iter ' + str(WARM_UP_ITER))
parameters.append('--repetitions ' + str(REPETITIONS))
parameters.append('--max-iter ' + str(MAX_ITER))
parameters.append(
'--report-filename ' + 'lbm-lid-driven-cavity-flow___' +
DEVICE_TYPE + '_' +
DEVICE_SET.replace(' ', '_') + '-' +
GRID + '_' +
DOMAIN_SIZE + '-' +
STORAGE_FP + '-' + COMPUTE_FP + '-' +
OCC + '-' +
HU + '-' +
CURVE)
parameters.append('--computeFP ' + COMPUTE_FP)
parameters.append('--storageFP ' + STORAGE_FP)
parameters.append('--curve ' + CURVE)

parameters = []
parameters.append('--deviceType ' + DEVICE_TYPE)
parameters.append('--deviceIds ' + DEVICE_SET)
parameters.append('--grid ' + GRID)
parameters.append('--domain-size ' + DOMAIN_SIZE)
parameters.append('--warmup-iter ' + str(WARM_UP_ITER))
parameters.append('--repetitions ' + str(REPETITIONS))
parameters.append('--max-iter ' + str(MAX_ITER))
parameters.append(
'--report-filename ' + 'lbm-lid-driven-cavity-flow___' +
DEVICE_TYPE + '_' + DOMAIN_SIZE + '_' +
STORAGE_FP + '_' + COMPUTE_FP + '_' +
DEVICE_SET.replace(' ', '_') + '_' + OCC)
parameters.append('--computeFP ' + COMPUTE_FP)
parameters.append('--storageFP ' + STORAGE_FP)
parameters.append('--benchmark')
parameters.append('--' + OCC)
parameters.append('--benchmark')
parameters.append('--' + OCC)
parameters.append('--' + HU)

commandList = []
commandList.append(command)
for el in parameters:
for s in el.split():
commandList.append(s)
commandList = []
commandList.append(command)
for el in parameters:
for s in el.split():
commandList.append(s)

fp.write("\n-------------------------------------------\n")
fp.write(' '.join(commandList))
fp.write("\n-------------------------------------------\n")
fp.flush()
subprocess.run(commandList, text=True, stdout=fp)
fp.write("\n-------------------------------------------\n")
fp.write(' '.join(commandList))
fp.write("\n-------------------------------------------\n")
fp.flush()
print(' '.join(commandList))
subprocess.run(commandList, text=True, stdout=fp)

counter += 1
printProgressBar(counter * 100.0 / SAMPLES, 'Progress')
counter += 1
printProgressBar(counter * 100.0 / SAMPLES, 'Progress')
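The nested loops above walk the Cartesian product of the benchmark options and skip every run where the storage and compute precisions differ. A minimal sketch of the same enumeration with `itertools.product` follows. The values of `DEVICE_SET_LIST` and `DOMAIN_SIZE_LIST` are placeholders, since the real lists are defined outside the hunk shown here, and the helper name `iter_configs` is hypothetical rather than part of the script.

```python
from itertools import product

# Option lists as they appear in the updated script.
GRID_LIST = "dGrid bGrid eGrid".split()
STORAGE_FP_LIST = "double float".split()
COMPUTE_FP_LIST = "double float".split()
OCC_LIST = "nOCC sOCC".split()
HU_LIST = "huGrid huLattice".split()
CURVE_LIST = "sweep morton hilbert".split()

# Placeholder values (assumption): the real lists live outside the hunk.
DEVICE_SET_LIST = ["0"]
DOMAIN_SIZE_LIST = ["64"]

AXES = {
    "device_set": DEVICE_SET_LIST,
    "occ": OCC_LIST,
    "domain_size": DOMAIN_SIZE_LIST,
    "storage_fp": STORAGE_FP_LIST,
    "compute_fp": COMPUTE_FP_LIST,
    "grid": GRID_LIST,
    "hu": HU_LIST,
    "curve": CURVE_LIST,
}


def iter_configs():
    """Yield one dict per benchmark run, skipping mixed-precision pairs."""
    for values in product(*AXES.values()):
        cfg = dict(zip(AXES.keys(), values))
        # Same effect as the two `continue` checks in the script:
        # only double/double and float/float survive.
        if cfg["storage_fp"] != cfg["compute_fp"]:
            continue
        yield cfg


configs = list(iter_configs())
print(len(configs))
```

Generating the configurations in one place means the total used for the progress bar and the set of runs actually launched cannot drift apart, which is a risk when `countAll()` and the run loop duplicate the same nesting by hand.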
15 changes: 15 additions & 0 deletions benchmarks/lbm-lid-driven-cavity-flow/src/CellType.h
@@ -22,13 +22,28 @@ struct CellType
classification = c;
wallNghBitflag = n;
}

NEON_CUDA_HOST_DEVICE explicit CellType(Classification c)
{
classification = c;
wallNghBitflag = 0;
}

// Converting to int to exportVti
operator int() const { return int(classification); }

template <int fwdRegIdx>
static auto isWall(const uint32_t& wallNghBitFlag)
-> bool
{
return wallNghBitFlag & (uint32_t(1) << fwdRegIdx);
}

auto setWall(int fwdRegIdx)
-> void
{
wallNghBitflag = wallNghBitflag | ((uint32_t(1) << fwdRegIdx));
}

uint32_t wallNghBitflag;
Classification classification;
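`isWall` and `setWall` treat `wallNghBitflag` as a bitmask indexed by `fwdRegIdx`. The bit arithmetic can be sketched in Python as below, masking to 32 bits to mimic `uint32_t`. The snake_case function names are illustrative, and reading a set bit as "the neighbour in that direction is a wall" is an interpretation of the identifiers, not something the diff states.

```python
UINT32_MASK = 0xFFFFFFFF  # emulate the width of uint32_t


def set_wall(flags: int, fwd_reg_idx: int) -> int:
    """Return flags with bit fwd_reg_idx set, as setWall does in place."""
    return (flags | (1 << fwd_reg_idx)) & UINT32_MASK


def is_wall(flags: int, fwd_reg_idx: int) -> bool:
    """Return True when bit fwd_reg_idx is set, as isWall<fwdRegIdx> does."""
    return bool(flags & (1 << fwd_reg_idx))


flags = 0
flags = set_wall(flags, 3)
flags = set_wall(flags, 5)
print(is_wall(flags, 3), is_wall(flags, 4), flags)
```

Setting a bit twice leaves the mask unchanged, so callers do not need to check `is_wall` before calling `set_wall`.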
72 changes: 44 additions & 28 deletions benchmarks/lbm-lid-driven-cavity-flow/src/Config.cpp
@@ -41,6 +42,7 @@ auto Config::toString() const -> std::string

s << "......... computeType " << c.computeType << std::endl;
s << "........... storeType " << c.storeType << std::endl;
s << "............... curve " << c.curve << std::endl;

s << ". ............... occ " << Neon::skeleton::OccUtils::toString(c.occ) << std::endl;
s << "....... transfer Mode " << Neon::set::TransferModeUtils::toString(c.transferMode) << std::endl;
@@ -60,43 +61,58 @@ auto Config::parseArgs(const int argc, char* argv[])
auto& config = *this;

auto cli =
(
clipp::required("--deviceType") & clipp::value("deviceType", config.deviceType) % "Device ids to use",
clipp::required("--deviceIds") & clipp::integers("gpus", config.devices) % "Device ids to use",
clipp::option("--grid") & clipp::value("grid", config.gridType) % "Could be dGrid, eGrid, bGrid",
clipp::option("--domain-size") & clipp::integer("domain_size", config.N) % "Voxels along each dimension of the cube domain",
clipp::option("--warmup-iter") & clipp::integer("warmup_iter", config.benchIniIter) % "Number of iteration for warm up. max_iter = warmup_iter + timed_iters",
clipp::option("--max-iter") & clipp::integer("max_iter", config.benchMaxIter) % "Maximum solver iterations",
clipp::option("--repetitions") & clipp::integer("repetitions", config.repetitions) % "Number of times the benchmark is run.",
clipp::option("--report-filename ") & clipp::value("keeper_filename", config.reportFile) % "Output perf keeper filename",

clipp::option("--computeFP") & clipp::value("computeFP", config.computeType) % "Could be double or float",
clipp::option("--storageFP") & clipp::value("storageFP", config.storeType) % "Could be double or float",

(
(clipp::option("--sOCC").set(config.occ, Neon::skeleton::Occ::standard) % "Standard OCC") |
(clipp::option("--nOCC").set(config.occ, Neon::skeleton::Occ::none) % "No OCC (on by default)")),
(
(clipp::option("--put").set(config.transferMode, Neon::set::TransferMode::put) % "Set transfer mode to PUT") |
(clipp::option("--get").set(config.transferMode, Neon::set::TransferMode::get) % "Set transfer mode to GET (on by default)")),
(
(clipp::option("--huLattice").set(config.stencilSemantic, Neon::set::StencilSemantic::streaming) % "Halo update with lattice semantic (on by default)") |
(clipp::option("--huGrid").set(config.stencilSemantic, Neon::set::StencilSemantic::standard) % "Halo update with grid semantic ")),
(
(clipp::option("--benchmark").set(config.benchmark, true) % "Run benchmark mode") |
(clipp::option("--visual").set(config.benchmark, false) % "Run export partial data")),

(
clipp::option("--vti").set(config.vti, true) % "Standard OCC")
(clipp::required("--deviceType") & clipp::value("deviceType", config.deviceType) % "Device type, e.g. gpu",
clipp::required("--deviceIds") & clipp::integers("gpus", config.devices) % "Device ids to use",
clipp::option("--grid") & clipp::value("grid", config.gridType) % "Could be dGrid, eGrid, bGrid",
clipp::option("--domain-size") & clipp::integer("domain_size", config.N) % "Voxels along each dimension of the cube domain",
clipp::option("--warmup-iter") & clipp::integer("warmup_iter", config.benchIniIter) % "Number of iterations for warm-up. max_iter = warmup_iter + timed_iters",
clipp::option("--max-iter") & clipp::integer("max_iter", config.benchMaxIter) % "Maximum solver iterations",
clipp::option("--repetitions") & clipp::integer("repetitions", config.repetitions) % "Number of times the benchmark is run.",
clipp::option("--report-filename ") & clipp::value("keeper_filename", config.reportFile) % "Output perf keeper filename",

clipp::option("--computeFP") & clipp::value("computeFP", config.computeType) % "Could be double or float",
clipp::option("--storageFP") & clipp::value("storageFP", config.storeType) % "Could be double or float",

clipp::option("--curve") & clipp::value("curve", config.curve) % "Could be sweep (the default), morton, or hilbert",
(
(clipp::option("--sOCC").set(config.occ, Neon::skeleton::Occ::standard) % "Standard OCC") |
(clipp::option("--nOCC").set(config.occ, Neon::skeleton::Occ::none) % "No OCC (on by default)")),
(
(clipp::option("--put").set(config.transferMode, Neon::set::TransferMode::put) % "Set transfer mode to PUT") |
(clipp::option("--get").set(config.transferMode, Neon::set::TransferMode::get) % "Set transfer mode to GET (on by default)")),
(
(clipp::option("--huLattice").set(config.stencilSemantic, Neon::set::StencilSemantic::streaming) % "Halo update with lattice semantic (on by default)") |
(clipp::option("--huGrid").set(config.stencilSemantic, Neon::set::StencilSemantic::standard) % "Halo update with grid semantic ")),
(
(clipp::option("--benchmark").set(config.benchmark, true) % "Run benchmark mode") |
(clipp::option("--visual").set(config.benchmark, false) % "Run export partial data")),

(
clipp::option("--vti").set(config.vti, true) % "Export vti file")

);


if (!clipp::parse(argc, argv, cli)) {
auto fmt = clipp::doc_formatting{}.doc_column(31);
std::cout << make_man_page(cli, argv[0], fmt) << '\n';
return -1;
}

if (config.curve == "sweep")
config.spaceCurve = Neon::domain::tool::spaceCurves::EncoderType::sweep;
if (config.curve == "morton")
config.spaceCurve = Neon::domain::tool::spaceCurves::EncoderType::morton;
if (config.curve == "hilbert")
config.spaceCurve = Neon::domain::tool::spaceCurves::EncoderType::hilbert;

if (config.curve != "sweep" && config.curve != "morton" && config.curve != "hilbert") {
auto fmt = clipp::doc_formatting{}.doc_column(31);
std::cout << config.curve << " is not a supported configuration" << std::endl;
std::cout << make_man_page(cli, argv[0], fmt) << '\n';
return -1;
}

helpSetLbmParameters();

return 0;
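The `--curve` handling above maps a string to an encoder type with three separate `if` statements and then rejects anything else in a fourth check. One way to keep the accepted names and the mapping in a single place is a lookup that fails on unknown input. The Python sketch below illustrates the idea only: `EncoderType` here is a stand-in for `Neon::domain::tool::spaceCurves::EncoderType`, and `parse_curve` is a hypothetical helper, not part of Neon.

```python
from enum import Enum


class EncoderType(Enum):
    """Stand-in for Neon's EncoderType; the values mirror the CLI strings."""
    sweep = "sweep"
    morton = "morton"
    hilbert = "hilbert"


def parse_curve(name: str) -> EncoderType:
    """Map a --curve argument to an encoder type, rejecting unknown names."""
    try:
        return EncoderType(name)
    except ValueError:
        # Same wording as the message printed by Config::parseArgs.
        raise ValueError(f"{name} is not a supported configuration") from None


print(parse_curve("morton"))
```

With a single lookup, adding a new curve only requires a new enum member, and a misspelled value such as `hilber` is rejected instead of silently keeping the default.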
46 changes: 24 additions & 22 deletions benchmarks/lbm-lid-driven-cavity-flow/src/Config.h
@@ -3,6 +3,7 @@
#include <string>
#include <vector>
#include "Neon/core/tools/clipp.h"
#include "Neon/domain/tools/SpaceCurves.h"
#include "Neon/skeleton/Skeleton.h"

template <typename ComputeType>
@@ -16,28 +17,29 @@ struct LbmParameters

struct Config
{
double Re = 100.; // Reynolds number
double ulb = 0.04; // Velocity in lattice units
int N = 160; // Number of nodes in x-direction
bool benchmark = false; // Run in benchmark mode ?
double max_t = 10.0; // Non-benchmark mode: Total time in dim.less units
int outFrequency = 200; // Non-benchmark mode: Frequency in LU for output of terminal message and profiles (use 0 for no messages)
int dataFrequency = 0; // Non-benchmark mode: Frequency in LU of full data dump (use 0 for no data dump)
int benchIniIter = 1000; // Benchmark mode: Number of warmup iterations
int benchMaxIter = 2000; // Benchmark mode: Total number of iterations
int repetitions = 1; // Benchmark mode: number of time the test is run
std::string deviceType = "gpu";
std::vector<int> devices = std::vector<int>(0); // Devices for the execution
std::string reportFile = "lbm-lid-driven-cavity-flow"; // Report file name
std::string gridType = "dGrid"; // Neon grid type
Neon::skeleton::Occ occ = Neon::skeleton::Occ::none; // Neon OCC type
Neon::set::TransferMode transferMode = Neon::set::TransferMode::get; // Neon transfer mode for halo update
Neon::set::StencilSemantic stencilSemantic = Neon::set::StencilSemantic::streaming;
bool vti = false; // Export vti file
std::string computeType = "double";
std::string storeType = "double";

LbmParameters<double> mLbmParameters;
double Re = 100.; // Reynolds number
double ulb = 0.04; // Velocity in lattice units
int N = 160; // Number of nodes in x-direction
bool benchmark = false; // Run in benchmark mode ?
double max_t = 10.0; // Non-benchmark mode: Total time in dim.less units
int outFrequency = 200; // Non-benchmark mode: Frequency in LU for output of terminal message and profiles (use 0 for no messages)
int dataFrequency = 0; // Non-benchmark mode: Frequency in LU of full data dump (use 0 for no data dump)
int benchIniIter = 1000; // Benchmark mode: Number of warmup iterations
int benchMaxIter = 2000; // Benchmark mode: Total number of iterations
int repetitions = 1; // Benchmark mode: number of times the test is run
std::string deviceType = "gpu";
std::vector<int> devices = std::vector<int>(0); // Devices for the execution
std::string reportFile = "lbm-lid-driven-cavity-flow"; // Report file name
std::string gridType = "dGrid"; // Neon grid type
Neon::skeleton::Occ occ = Neon::skeleton::Occ::none; // Neon OCC type
Neon::set::TransferMode transferMode = Neon::set::TransferMode::get; // Neon transfer mode for halo update
Neon::set::StencilSemantic stencilSemantic = Neon::set::StencilSemantic::streaming;
bool vti = false; // Export vti file
std::string computeType = "double";
std::string storeType = "double";
std::string curve = "sweep";
Neon::domain::tool::spaceCurves::EncoderType spaceCurve = Neon::domain::tool::spaceCurves::EncoderType::sweep;
LbmParameters<double> mLbmParameters;

auto toString()
const -> std::string;