# Text Swap AFU
This package contains RTL and C test code for an AFU that accelerates text
search and replace. It uses IBM's Coherent Accelerator Processor Interface
(CAPI).

To get started, if this is an initial clone of the repository, you should
initialize the submodules:

    git submodule update --init

After that is done, configure the project:

    ./waf configure [--hardware]

This checks for the required dependencies. If it does not succeed, please
install the missing components and re-run until it does. At the current
time, a common issue during the configure phase is a missing cxl.h, which
should be obtained from a working Power8-based system and placed somewhere
on the build path.
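
Before re-running configure, it can help to check whether cxl.h is already present on the machine. The sketch below is not part of this repository: `find_header` is a hypothetical helper name, and the directories in the usage line (including /usr/include/misc, where installed Linux kernel headers usually place cxl.h) are assumptions about typical locations, not the paths the waf script actually searches.

```sh
#!/bin/sh
# find_header NAME DIR...
# Hypothetical helper: print the first DIR/NAME that exists as a regular
# file and return 0; return 1 if NAME is in none of the given directories.
find_header() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

A possible invocation is `find_header cxl.h /usr/include/misc /usr/include /usr/local/include || echo "cxl.h not found"`.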
Once the configuration passes, you may build the C code with:

    ./waf

You may see build errors if you do this on a machine that does not have
a CAPI-enabled kernel. In that case you will have to copy the missing
files from a suitable machine to the current machine. If you have no
access to those files, you should contact IBM. Certain files (e.g. cxl.h)
are now fetched by the waf script over the internet, so you will need
network connectivity for that to work.

Hardware-based builds currently do not complete successfully on non-PPC
architectures.
## Running Simulations
To run a simulation, use the ./sim script. Currently only Cadence's ncsim tools
are supported. You may select a binary to run with the -e option (it defaults to
'textswap'), and you may pass additional arguments to this binary by placing them
after a '--'. For example, to simulate unittest with random data, run:

    ./sim -e unittest -- -R

There is also a regression script:

    ./run_tests

## Running on Hardware
To run on actual PowerPC hardware, you must configure the code with the
--hardware option:

    ./waf configure --hardware

Then you may rebuild the code and run the binaries normally:

    ./waf
    build/unittest -R

The regression script also works on hardware (when available):

    ./run_tests

## FPGA Build and Bitfiles
To build the FPGA rbf file to run against the C code, you will need to
work with IBM to obtain the PSL files. These files should then be placed
or symlinked in a folder called libs/psl_fpga. Then run

    make

in the fpga folder (you will need the Quartus tools installed on your
system). A pre-built bitfile that works with this codebase is available
on the releases page of this GitHub project at

https://github.com/sbates130272/capi-textswap/releases

To download the rbf file to the FPGA on the CAPI card, you will need
tools provided by IBM and/or Nallatech. Support for other cards is an
open issue for this project.
## Generate Datasets
You can generate datasets using

    ./build/gen_haystack -s <SIZE> [OUTPUT_FILE]

and you can run

    ./build/gen_haystack -h

to see the defaults. For example,

    ./build/gen_haystack -s 8G /mnt/nvme/demo.GoPower8.50.8G.dat

will generate an 8GiB dataset with the phrase "GoPower8" inserted in 50
random locations in a file located at /mnt/nvme/demo.GoPower8.50.8G.dat.
This file can be on any block IO device, such as an HDD or SSD.
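
For quick checks on a machine where the tools are not built, a tiny stand-in dataset can be handy. The sketch below is not how gen_haystack works: `make_haystack` is a hypothetical name, the filler byte is an arbitrary choice, and the needles are written at evenly spaced offsets rather than random ones so the result is reproducible. It is only meant for small files.

```sh
#!/bin/sh
# make_haystack FILE SIZE NEEDLE COUNT
# Hypothetical stand-in for a small test dataset (not gen_haystack itself):
# writes SIZE filler bytes to FILE, then overwrites COUNT evenly spaced
# positions with NEEDLE. Returns 1 if the needles would not fit.
make_haystack() {
    file=$1
    size=$2
    needle=$3
    count=$4
    [ "$count" -gt 0 ] || return 1
    stride=$((size / count))
    if [ "$stride" -le "${#needle}" ]; then
        echo "make_haystack: $count needles do not fit in $size bytes" >&2
        return 1
    fi
    # Filler is SIZE bytes of 'x' (assumes the needle is not all 'x').
    dd if=/dev/zero bs="$size" count=1 2>/dev/null | tr '\000' 'x' > "$file"
    i=0
    while [ "$i" -lt "$count" ]; do
        # conv=notrunc overwrites in place instead of truncating FILE.
        printf '%s' "$needle" |
            dd of="$file" bs=1 seek=$((i * stride)) conv=notrunc 2>/dev/null
        i=$((i + 1))
    done
}
```

For instance, after `make_haystack /tmp/demo.dat 4096 GoPower8 5`, counting matches with `grep -o GoPower8 /tmp/demo.dat | wc -l` gives a reference value to compare against.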
## Performance Testing
You can test how quickly you can read the datasets and count the
occurrences of the needle string using something like

    ./build/textswap -R -E 50 /mnt/nvme/demo.GoPower8.50.8G.dat -r 14 -c 8M -q 22

which should produce output like

    Transfer rate:
    8.00GiB in 2.9 s 2.72GiB/s
    Matches Found: 50 (Good)

Note that if all checks pass, the program returns exit code 0 (to
facilitate scripting). A failing test exits with a non-zero exit code.
You should experiment with the -r, -c and -q options to tune the
performance. Run

    ./build/textswap -h

for a complete list of command line options and defaults. Note that you
may need to run the above command as root or set your udev rules to
permit access to the CAPI device.
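
Because success is reported through the exit code, a wrapper script can branch on it directly. The following is a minimal sketch; `run_check` is a hypothetical helper name, and it relies only on the exit-code convention described above.

```sh
#!/bin/sh
# run_check COMMAND [ARGS...]
# Hypothetical helper: run COMMAND, print PASS or FAIL based on its exit
# code, and return that exit code to the caller.
run_check() {
    if "$@"; then
        echo "PASS: $*"
    else
        rc=$?
        echo "FAIL (exit $rc): $*"
        return "$rc"
    fi
}
```

It could be used as `run_check ./build/textswap -R -E 50 /mnt/nvme/demo.GoPower8.50.8G.dat -r 14 -c 8M -q 22`, with the command's own output appearing before the PASS or FAIL line.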
A word of warning: be sure to flush your caches before and between
performance test runs, or you will see unrealistically high performance
due to hitting the page cache. We normally install a simple script in
/usr/local/sbin called drop_caches, which consists of

    #!/bin/sh
    sync
    echo 3 > /proc/sys/vm/drop_caches

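
To make the flush hard to forget, repeated runs can be wrapped so the caches are dropped before every run. This is a sketch only: `cold_runs` is a hypothetical helper, and the `DROP_CACHES` variable is an assumption that defaults to the script location mentioned above (that script needs root to write to /proc/sys/vm/drop_caches).

```sh
#!/bin/sh
# Path to the cache-flushing script; override via the environment if it
# lives elsewhere (this variable is an assumption of the sketch).
DROP_CACHES=${DROP_CACHES:-/usr/local/sbin/drop_caches}

# cold_runs N COMMAND [ARGS...]
# Hypothetical helper: run COMMAND N times, flushing caches before each
# run. Stops at the first failure and returns that exit code.
cold_runs() {
    runs=$1
    shift
    n=1
    while [ "$n" -le "$runs" ]; do
        "$DROP_CACHES" || return $?
        "$@" || return $?
        n=$((n + 1))
    done
}
```

For example, `cold_runs 3 ./build/textswap -R -E 50 /mnt/nvme/demo.GoPower8.50.8G.dat` would give three cold-cache measurements.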
## CPU Utilization
A simple script called cpuperf.py lives in the scripts folder. It can be
run either in an adjacent shell or as a background task (using something
like nohup). By running

    ./scripts/cpuperf.py -C textswap -w 100 -s | tee cpuperf.log

the script will capture CPU and memory utilization using a system call
to ps once every 100ms while textswap is running. Omit the -s to
capture at all times. Refer to the ps manual page for specifics on
exactly what is being measured.
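
The sampling idea can be sketched in a few lines of shell. This is not cpuperf.py and does not reproduce its options: `sample_pid` is a hypothetical helper that polls a single process ID (rather than a command name) using the `ps -o pcpu= -o pmem= -p PID` form, and the whole-second interval is a simplification of the 100ms period above.

```sh
#!/bin/sh
# sample_pid PID INTERVAL_SECONDS MAX_SAMPLES
# Hypothetical helper: print "epoch-seconds %cpu %mem" for PID every
# INTERVAL_SECONDS, stopping after MAX_SAMPLES or once PID has exited.
sample_pid() {
    pid=$1
    interval=$2
    max=$3
    n=0
    while [ "$n" -lt "$max" ]; do
        # ps exits non-zero (or prints nothing) once the process is gone.
        line=$(ps -o pcpu= -o pmem= -p "$pid") || break
        [ -n "$line" ] || break
        echo "$(date +%s) $line"
        n=$((n + 1))
        sleep "$interval"
    done
}
```

One way to use it is to start the workload in the background and pass `$!` as the PID, for example `sample_pid $! 1 60 | tee cpuperf.log`.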
## Updates
This code is open-source, and we welcome patches and pull requests against
this codebase. You are under no obligation to submit code back to us,
but we hope you will ;-).