Gem5 is a very powerful simulation tool for computer architects.
However, detailed simulations of long-running applications can take too long to complete.
A widely used approach to overcome this problem is the SimPoint tool.
In this article I'll show you how to generate representative regions using the SimPoint tool
and run them with the Gem5 simulator (RISC-V ISA).
The SimPoint tool extracts representative regions from an application, each characterized by an instruction number and a weight. You specify the region length as a number of instructions, and the tool returns N regions of that length. The following steps need to be undertaken:
The first step to obtain SimPoint regions is the generation of the Basic Block Vector (BBV) file, which contains the per-interval basic block execution counts that the SimPoint analysis works on. Gem5 can generate the BBV file itself, but I found this method quite slow. Another approach is to use the QEMU emulator, which produces the BBV file quickly.
To generate the BBV file with QEMU, you need a plugin called qpoints. I had some problems using the official version of qpoints with the newest QEMU versions. In my GitHub account, I published a version of qpoints that is compatible with QEMU v6.2: https://github.com/mircomannino/qpoints/tree/qemu-v6.2.0.
To build the qpoints plugin, you can follow the instructions in the README of the GitHub repository. To generate the BBV file for an application, run the following command:
export OUT_DIR=./
export OUT_NAME=hello
export SIMPOINTS_INTERVAL=100000000 # 100M
qemu-riscv64 \
-plugin ./libbv.so,out_dir=${OUT_DIR},out_name=${OUT_NAME},simpoint_interval=${SIMPOINTS_INTERVAL} \
./hello \
arg-0 \
arg-1 \
arg-2
The previous command generates a BBV file for SimPoint in which each interval is 100 million instructions long. The example uses an application named "hello" with input arguments "arg-0", "arg-1", and "arg-2". The file we are interested in is hello.bb.gz.
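As a quick sanity check on the generated file, the following minimal Python sketch counts how many intervals the BBV file contains. It assumes the usual SimPoint frequency-vector layout, in which every interval is one line starting with "T" (for example "T:12:3400 :57:120"); the function name count_bbv_intervals is only illustrative and is not part of qpoints or SimPoint.

```python
import gzip


def count_bbv_intervals(path):
    """Count the interval records in a gzipped BBV file.

    Assumption: each interval is one line that starts with 'T',
    followed by ':<basic block id>:<count>' pairs.
    """
    with gzip.open(path, "rt") as bbv:
        return sum(1 for line in bbv if line.startswith("T"))
```

Calling count_bbv_intervals("hello.bb.gz") should return roughly the total number of executed instructions divided by the interval length.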
To generate SimPoints, you need to download and build the tool you can find at this link. Then, run the following command:
<path-to-simpoints-folder>/bin/simpoint \
-loadFVFile ./hello.bb.gz \
-inputVectorsGzipped \
-saveSimpoints ./out.simpoint \
-saveSimpointWeights ./out.weight
After running the previous command, we have two files:
out.simpoint and out.weight.
The first one contains the position of each representative region,
expressed as the index of its interval (multiplying the index by the interval
length gives the number of the first instruction of the region), while
the second one indicates the weight associated with each region.
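To inspect the two files together, here is a minimal Python sketch. It assumes SimPoint's usual output layout: each line of out.simpoint is "<interval index> <cluster id>", each line of out.weight is "<weight> <cluster id>", and the two files list the clusters in the same order. Under that assumption, the first instruction of a region is its interval index times the interval length. The function name load_simpoints is hypothetical.

```python
def load_simpoints(simpoint_path, weight_path, interval_length):
    """Return (first_instruction, weight) pairs sorted by first instruction.

    Assumptions: '<interval index> <cluster id>' lines in the simpoint file,
    '<weight> <cluster id>' lines in the weight file, same order in both.
    """
    with open(simpoint_path) as sp_file:
        sp_lines = [line.split() for line in sp_file if line.strip()]
    with open(weight_path) as w_file:
        w_lines = [line.split() for line in w_file if line.strip()]
    if len(sp_lines) != len(w_lines):
        raise ValueError("simpoint and weight files have different lengths")

    regions = []
    for sp_fields, w_fields in zip(sp_lines, w_lines):
        interval_index = int(sp_fields[0])
        weight = float(w_fields[0])
        regions.append((interval_index * interval_length, weight))
    return sorted(regions)
```

For the files generated above, the call would be load_simpoints("out.simpoint", "out.weight", 100_000_000).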
After obtaining the SimPoint regions, we can use Gem5 to take checkpoints at those regions. Fortunately, Gem5 provides support for SimPoints, so we only need to specify the location of the previously generated files and the number of warmup instructions we want to execute before entering the actual region.
The se.py configuration script has some input arguments for dealing with SimPoints. Run the following command, in the gem5 folder, to take checkpoints:
export SIMPOINTS_INTERVAL=100000000 #100M
export WARMUP_INTERVAL=50000000 #50M
export OUT_DIR=./cpt_from_simpoints
build/RISCV/gem5.opt \
--outdir=$OUT_DIR \
configs/example/se.py \
--cmd="hello" \
--options="arg-0 arg-1 arg-2" \
--take-simpoint-checkpoint=./out.simpoint,./out.weight,$SIMPOINTS_INTERVAL,$WARMUP_INTERVAL
After running the previous command, we have one folder for each checkpoint
in the cpt_from_simpoints folder.
Fortunately, the information regarding the simpoint interval
length and warmup length is conveniently embedded in the name of
each checkpoint folder. This eliminates the need to contend with
cumbersome instruction counts during the restoration of checkpoints.
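The values embedded in the folder names can be read back with a short Python sketch like the one below. The naming pattern in the regular expression (cpt.simpoint_<index>_inst_<start>_weight_<weight>_interval_<length>_warmup_<length>) is my assumption of how gem5 names these folders, so check it against the names actually present in cpt_from_simpoints. The helper names are hypothetical.

```python
import os
import re

# Assumed folder naming pattern, e.g.
# cpt.simpoint_00_inst_250000000_weight_0.250000_interval_100000000_warmup_50000000
CPT_NAME = re.compile(
    r"cpt\.simpoint_(\d+)_inst_(\d+)_weight_([\d.e+-]+)"
    r"_interval_(\d+)_warmup_(\d+)$"
)


def parse_checkpoint_name(name):
    """Return the fields encoded in a checkpoint folder name, or None."""
    match = CPT_NAME.match(name)
    if match is None:
        return None
    index, inst, weight, interval, warmup = match.groups()
    return {
        "index": int(index),
        "inst": int(inst),
        "weight": float(weight),
        "interval": int(interval),
        "warmup": int(warmup),
    }


def list_checkpoints(cpt_dir):
    """Parse every matching folder in cpt_dir, sorted by folder name."""
    parsed = (parse_checkpoint_name(n) for n in sorted(os.listdir(cpt_dir)))
    return [info for info in parsed if info is not None]
```

The list returned by list_checkpoints("./cpt_from_simpoints") gives the weight of each checkpoint without reopening out.weight, which is handy when combining the results later.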
Now, we have to run the simulation for each checkpoint. You can do that by running the following command:
export CPT_DIR=./cpt_from_simpoints
export OUT_DIR=./output_cpt-1
build/RISCV/gem5.opt \
--outdir=$OUT_DIR \
configs/example/se.py \
--cmd="hello" \
--options="arg-0 arg-1 arg-2" \
--cpu-type=O3CPU \
--restore-simpoint-checkpoint \
-r 1 \
--checkpoint-dir=$CPT_DIR \
--restore-with-cpu=MinorCPU
The previous command restores the checkpoint taken 50M instructions before the SimPoint region, runs the warmup with a MinorCPU, and then simulates the 100M-instruction region with an O3CPU.
You have to run the previous command for all the checkpoints. If you have, for instance, 4 checkpoints, you need to run the previous command 4 times, changing the -r value (i.e., -r 1, -r 2, -r 3, and -r 4).
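To avoid typing the command once per checkpoint, the argument lists can be generated with a small Python sketch. It only mirrors the command shown above, changing the -r value and the output folder; the function names and the output folder prefix are hypothetical choices.

```python
import shlex


def restore_command(index, cpt_dir="./cpt_from_simpoints",
                    out_prefix="./output_cpt-"):
    """Build the gem5 argument list that restores checkpoint number `index`."""
    return [
        "build/RISCV/gem5.opt",
        f"--outdir={out_prefix}{index}",
        "configs/example/se.py",
        "--cmd=hello",
        "--options=arg-0 arg-1 arg-2",
        "--cpu-type=O3CPU",
        "--restore-simpoint-checkpoint",
        "-r", str(index),
        f"--checkpoint-dir={cpt_dir}",
        "--restore-with-cpu=MinorCPU",
    ]


def all_restore_commands(num_checkpoints):
    """Return one command per checkpoint, numbered from 1 as -r expects."""
    return [restore_command(i) for i in range(1, num_checkpoints + 1)]


def as_shell_line(command):
    """Render an argument list as a single, correctly quoted shell line."""
    return shlex.join(command)
```

Each list returned by all_restore_commands(4) can be passed to subprocess.run(command, check=True) from the gem5 folder, or printed with as_shell_line to review the commands before launching them.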
Almost done! Now, we need to use the weight of each region in order to obtain the value of the statistics we are interested in.
Assume we want to compute the IPC of the benchmark. To do so, we sum the products of the IPC of each region and its corresponding weight, and then divide the result by the sum of all the weights. The formula can be summarized as follows:
ipc = (ipc_0*weight_0 + ipc_1*weight_1 + ... + ipc_N*weight_N) / (weight_0 + weight_1 + ... + weight_N)
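The formula translates directly into a few lines of Python. The sketch below also includes a helper that reads a statistic from a gem5 stats.txt file; the stat name to pass to it (for instance "system.switch_cpus.ipc") is an assumption that depends on the gem5 version and CPU model, so check the name in your own stats.txt. Both function names are hypothetical.

```python
def read_stat(stats_path, stat_name):
    """Return the last value reported for stat_name in a gem5 stats.txt file."""
    value = None
    with open(stats_path) as stats:
        for line in stats:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == stat_name:
                value = float(fields[1])
    if value is None:
        raise KeyError(stat_name)
    return value


def weighted_ipc(ipcs, weights):
    """Weighted average of per-region IPC values, as in the formula above."""
    if len(ipcs) != len(weights):
        raise ValueError("need exactly one weight per IPC value")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights sum to zero")
    return sum(ipc * w for ipc, w in zip(ipcs, weights)) / total_weight
```

For example, weighted_ipc([1.0, 2.0], [0.75, 0.25]) gives 1.25. Dividing by the sum of the weights keeps the result meaningful even when some checkpoints are missing and the remaining weights no longer add up to 1.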
In this brief article we saw how to use SimPoints with the Gem5 simulator. If you have any feedback or comments, please let me know by writing to the following e-mail: mannino@diism.unisi.it