Configuration guide
The core of the Feel++ benchmarking framework is its set of configuration files. Users must provide the following configuration files:
-
A complete system description, based on ReFrame’s configuration files.
-
A machine specific configuration, defining HOW to execute benchmarks.
-
A benchmark (or application) specific configuration, defining WHAT should be executed.
-
A figure description, containing information of what to display on the final reports.
These files, with the exception of the ReFrame configuration, support a special placeholder syntax that allows them to be updated dynamically during test execution. Additionally, multiple environments can be specified, including Apptainer containers.
Comments are supported in these JSON files.
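Standard JSON parsers reject comments, so they have to be removed before parsing. The following is a minimal sketch of one way to do so, assuming `//` line comments (the style used in the examples of this guide). The helper names `strip_line_comments` and `load_commented_json` are hypothetical and are not part of the framework.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip the comment up to (but not including) the end of the line.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def load_commented_json(text: str):
    """Parse JSON text that may contain // line comments."""
    return json.loads(strip_line_comments(text))


config = load_commented_json('''
{
    "output_directory": "/data/outputs", // This is a custom field
    "url": "http://example.org"
}
''')
print(config)
```

Tracking whether the scanner is inside a string keeps values such as URLs, which contain `//`, intact.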
1. System configuration
The system configuration files need to be placed under src/feelpp/benchmarking/reframe/config/machineConfigs, and are strictly ReFrame dependent. A single Python file should be provided per machine. Please follow ReFrame’s configuration file reference for precise settings.
Example configurations are provided for the Gaya machine and for a simple single node 8-core system.
Processor bindings and other launcher options should be specified as a resource under the desired partition, with the
|
2. Magic strings
Benchmarking configuration files support a special placeholder syntax, using double curly braces: {{placeholder}}.
This syntax is especially useful for:
-
Refactoring configuration fields.
-
Replacing with values from other configuration files, such as the machine config.
-
Making use of code variables modified at runtime, by having reserved keywords.
-
Fetching the values of defined parameters, which change at runtime.
To get a value of a field in the same file, the field path must be separated by dots. For example,
"field_a":{
"field_b":{
"field_c": "my value"
}
},
"example_placeholder": "{{field_a.field_b.field_c}}"
For replacing a value coming from the machine configuration, simply prepend any placeholder path with machine.
2.1. Reserved Keywords
The framework is equipped with the following reserved keywords for placeholders:
-
{{instance}}: Returns the hashcode of the current ReFrame test.
-
{{.value}}: The value keyword must be appended to a parameter name (e.g. {{parameters.my_param.value}}). It fetches the current value of a given runtime variable (such as a parameter). More information can be found in the Parameters section.
2.2. Nested placeholders
Nested placeholders are supported.
For example, let’s say you have a machine configuration containing
"platform":"builtin"
And a benchmark configuration:
"platforms":{
"builtin":"my_builtin_value",
"other":"my_other_value"
},
"nested_placeholder":"My platform dependent value is {{ platforms.{{machine.platform}} }}"
The nested_placeholder field will then take the value "My platform dependent value is my_builtin_value", because the machine config specifies that "platform" is "builtin". This changes if "platform" is set to "other".
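The substitution rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework’s implementation: the functions `lookup` and `resolve` are hypothetical, the reserved keywords ({{instance}} and .value) are not handled, and nested placeholders are assumed to be resolved from the innermost one outwards.

```python
import re

# Matches a placeholder that contains no other braces, i.e. an innermost one.
INNERMOST = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def lookup(tree: dict, dotted_path: str):
    """Follow a dot-separated path through nested dictionaries."""
    node = tree
    for key in dotted_path.split("."):
        node = node[key]
    return node


def resolve(text: str, benchmark: dict, machine: dict) -> str:
    """Replace placeholders in text until no placeholder is left."""

    def substitute(match):
        path = match.group(1)
        if path.startswith("machine."):
            # Paths prefixed with "machine." are read from the machine config.
            return str(lookup(machine, path[len("machine."):]))
        return str(lookup(benchmark, path))

    previous = None
    while previous != text:
        previous = text
        text = INNERMOST.sub(substitute, text)
    return text


machine = {"platform": "builtin"}
benchmark = {
    "platforms": {"builtin": "my_builtin_value", "other": "my_other_value"},
    "field_a": {"field_b": {"field_c": "my value"}},
}

print(resolve("{{field_a.field_b.field_c}}", benchmark, machine))
print(resolve(
    "My platform dependent value is {{ platforms.{{machine.platform}} }}",
    benchmark,
    machine,
))
```

Each pass only replaces placeholders that contain no further braces, so the inner `{{machine.platform}}` is substituted first and the resulting `{{ platforms.builtin }}` is resolved on the next pass.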
3. Machine configuration
The machine configuration file contains all information specific to the system on which benchmarks will run. It tells the application HOW benchmarks will run on that system. The framework supports multiple containers and environments, such as Apptainer and Spack; this information should be specified here.
The following table describes all supported fields.
Field name | Optional | type | description | Default |
---|---|---|---|---|
machine | No | str | The name of the machine. It needs to be the same as the name of the ReFrame configuration file. | |
execution_policy | Yes | str | Either 'async' or 'serial'. The way in which ReFrame will run the tests. | serial |
reframe_base_dir | No | str | The base directory of ReFrame’s stage and output directories. If it does not exist, it will be created. | |
reports_base_dir | No | str | The base directory where ReFrame reports will be saved to. | |
input_dataset_base_dir | Yes | str | The base directory where input data can be found (if applicable) | None |
output_app_dir | No | str | The base directory where the benchmarked application should write its outputs to. | |
targets | Yes | list[str] or str | Specifies in which partition, platform and prog_environment run benchmarks on. The syntax is | None |
platform | Yes | str | Platform to run the benchmark on, possible values are "Apptainer" and "builtin". Docker will soon be supported. | builtin |
partitions | Yes | List[str] | Partitions where the test can run on. Tests will run on the cartesian product of partitions and prog_environments, where environments are specified for the current partition on the ReFrame configuration. | [] |
prog_environments | Yes | List[str] | Environments where the test can run on. Test will run with this programming environment if it is specified on the current partition on the ReFrame configuration. | [] |
containers | Yes | Dict[str,Container] | Dictionary specifying container type "Apptainer" or "Docker" (not yet supported), and container related information. More details on the | {} |
The containers object is defined as follows:
Field name | Optional | type | description | Default value |
---|---|---|---|---|
cachedir | Yes | str | Directory where the pulled images will be cached on. | None |
tmpdir | Yes | str | Directory where temporary image files will be written. | None |
image_base_dir | No | str | Base directory where images can be found in the system | None |
options | Yes | List[str] | Options to add to the container execution command. | None |
Below, an example of a machine configuration file can be found, for a machine called "my_machine".
{
"machine": "my_machine",
"targets":"production:builtin:hpcx",
"execution_policy": "async",
"reframe_base_dir":"$PWD/build/reframe",
"reports_base_dir":"$PWD/reports/",
"input_dataset_base_dir":"$PWD/input/",
"output_app_dir":"$PWD/output/",
"containers":{
"apptainer":{
"image_base_dir":"/data/images/",
"options":[ "--sharens", "--bind /opt/:/opt/" ]
}
}
}
Let’s review step by step what the file defines.
-
"machine":"my_machine" indicates that the ReFrame config can be found as my_machine.py.
-
"targets":"production:builtin:hpcx" tells ReFrame to run tests only on the production partition, with the builtin platform and the hpcx programming environment.
-
"execution_policy":"async" tells ReFrame to run tests asynchronously on available resources.
-
"reframe_base_dir":"$PWD/build/reframe" means that ReFrame will use the build/reframe/stage/ and build/reframe/output/ folders of the current working directory for staging tests and for storing the benchmarked application’s standard output and errors.
-
"reports_base_dir":"$PWD/reports/" means that the ReFrame reports will be found under the reports/ folder of the current working directory.
-
"input_dataset_base_dir":"$PWD/input/" means that the framework should look for input somewhere under the input/ folder of the current working directory. The rest of the path is specified in the benchmark configuration.
-
"output_app_dir":"$PWD/output/" means that the benchmarked application should write its output files under the output/ folder of the current working directory. The rest of the path is specified in the benchmark configuration.
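As an illustration of how a target string such as "production:builtin:hpcx" breaks down, here is a small sketch. It assumes that a target is always made of exactly three colon-separated fields (partition, platform and programming environment), which is inferred from the example above; the `Target` type and `parse_target` function are hypothetical names, and any shorthand or default syntax the framework may accept is not covered.

```python
from typing import NamedTuple


class Target(NamedTuple):
    partition: str
    platform: str
    prog_environment: str


def parse_target(target: str) -> Target:
    """Split a 'partition:platform:prog_environment' string into its fields."""
    parts = target.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated fields, got {target!r}")
    return Target(*parts)


target = parse_target("production:builtin:hpcx")
print(target.partition, target.platform, target.prog_environment)
```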
Concerning containers:
-
The "apptainer" key indicates that the application COULD be benchmarked using Apptainer, not necessarily that it will. If the targets field specifies the apptainer platform, then this field is mandatory.
-
"image_base_dir":"/data/images/" indicates that the built Apptainer images can be found somewhere under the /data/images/ directory. The rest of the path is specified in the benchmark configuration.
-
"options":[ "--sharens", "--bind /opt/:/opt/" ] tells ReFrame to add these options to the Apptainer execution command. For example, mpiexec -n 4 apptainer exec --sharens --bind /opt/:/opt/ …. Only machine-related options should be specified here; more options can be defined in the benchmark configuration.
4. Benchmark configuration
Configuring a benchmark can be quite extensive, as this framework focuses on flexibility. For this reason, the documentation is divided into sections.
The base of the configuration file is shown below.
{
"executable": "",
"use_case_name": "",
"timeout":"",
"platforms":{},
"options": [],
"outputs":{},
"scalability":{},
"sanity":{},
"parameters":{}
}
4.1. Fields on JSON root
Users can add any field used for refactoring. For example, one can do the following.
"output_directory":"/data/outputs", // This is a custom field
"options":["--output {{output_directory}}"]
Mandatory and actionable fields are:
Field name | Optional | type | description | Default |
---|---|---|---|---|
executable | No | str | Path or name of the application executable. | |
use_case_name | No | str | Custom name given to the use case. Serves as an ID of the use case, must be unique across use cases. | |
timeout | No | str | Job execution timeout. Format: days-hours:minutes:seconds | |
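To make the timeout format concrete, the sketch below converts a days-hours:minutes:seconds string into a `timedelta`. The function name `parse_timeout` is hypothetical, and the example value "0-01:30:00" is only an illustration of the format stated in the table; how the framework itself validates the field is not described here.

```python
import re
from datetime import timedelta

# days-hours:minutes:seconds, e.g. "0-01:30:00"
TIMEOUT_PATTERN = re.compile(r"^(\d+)-(\d+):(\d+):(\d+)$")


def parse_timeout(value: str) -> timedelta:
    """Convert a 'days-hours:minutes:seconds' string into a timedelta."""
    match = TIMEOUT_PATTERN.match(value)
    if match is None:
        raise ValueError(f"timeout {value!r} is not in days-hours:minutes:seconds format")
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


print(parse_timeout("0-01:30:00").total_seconds())  # 5400.0
```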
4.2. Platforms
The platforms object lists all options and directories related to the benchmark execution for each supported platform. A platform being present in this object does not imply that it will be benchmarked; the object only lists all possible options.
The field is optional. If it is not provided, the builtin platform will be considered.
The syntax for the builtin platform is the following:
"platforms": {
"builtin":{
"input_dir":"",
"append_app_options":[]
}
}
-
append_app_options is in this case equivalent to the options field on the configuration root.
-
input_dir indicates the path of the directory where input files can be found.
The following example shows how to configure the Apptainer platform:
"platforms":{
"apptainer":{
"image":{
"name":"{{machine.containers.apptainer.image_base_dir}}/my_image.sif"
},
"input_dir":"/input_data/",
"options":["--home {{machine.output_app_dir}}"],
"append_app_options":["--my_custom_option_for_apptainer"]
}
}
For any container, the image field must be specified, specifically image.name, containing the path of the image. Pulling images is not yet supported.
In this case, input_dir represents the directory where input files will be found INSIDE the container.
The options field contains a list of all the options to include in the container execution. It is equivalent to the machine’s containers.apptainer.options field. However, users should only include application-dependent options in this list.
The append_app_options field lists all the options to add to the application execution. It does the same as the options field in the root of the file, but can be used for case handling.
4.3. Outputs
The outputs field lists all the files where application outputs are exported.
Field name | Optional | type | description | Default |
---|---|---|---|---|
filepath | No | str | Path of the file containing the outputs | |
format | No | str (csv, json) | Format of the output file | |
All columns or fields present in the output file will be considered. Outputs are added to the ReFrame report as performance variables.
Soon, the same syntax as scalability files will be used.
4.4. Scalability
Lists all the files where performance times can be found.
Field name | Optional | type | description | Default |
---|---|---|---|---|
directory | No | str | Common directory where scalability files can be found. Used for refactoring fields. | |
stages | No | list[Stage] | List of scalability file objects describing them. | |
Each stage object is described as follows:
Field name | Optional | type | description | Default |
---|---|---|---|---|
name | No | str | Prefix to add to the performance variables found in the file | |
filepath | No | str | partial filepath, relative to the | |
format | No | str | Format of the file. Supported values are (csv, json, tsv) | |
variables_path | Yes | str | Only if format is json. Defines where, in the JSON hierarchy, performance variables will be found. Supports the use of a single wildcard ( |
An example of the scalability field is found and explained below.
"scalability": {
"directory": "{{output_directory}}/{{instance}}/cem/",
"stages": [
{
"name":"custom_execution_name",
"filepath": "instances/np_{{parameters.nb_tasks.tasks.value}}/logs/execution_timers.json",
"format": "json",
"variables_path":"execution.*"
},
{
"name":"construction",
"filepath": "logs/timers.json",
"format": "json",
"variables_path":"*.constructor"
}
]
}
-
directory implies that scalability files can be found under {{output_directory}}/{{instance}}/cem/, where output_directory is defined above and instance is a reserved keyword containing the hashcode of the test.
There are two scalability files in the example. Let’s suppose the files are structured as follows:
-
logs/execution_timers.json
{ "execution":{ "step1":0.5, "step2":0.7, "step3":1.0 }, "postprocess":{...} }
-
logs/timers.json
{ "function1":{ "constructor":1.0, "init":0.1 }, "function2":{ "constructor":1.0, "init":0.1 } }
Then, by specifying "variables_path":"execution.*", performance variables will be custom_execution_name.step1, custom_execution_name.step2 and custom_execution_name.step3.
And by specifying "variables_path":"*.constructor" for the other file, performance variables will be construction.function1 and construction.function2.
Note how variables are prefixed with the value under name, and that the wildcard (*) determines the variable names.
Deeply nested and complex JSON scalability files are supported.
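The way a variables_path with a wildcard maps a JSON file to performance variables can be sketched as follows. This is an illustrative reading of the behaviour described in this section, not the framework’s code: `extract_variables` is a hypothetical function, and it assumes a dot-separated path containing exactly one `*` wildcard.

```python
def extract_variables(data: dict, variables_path: str, prefix: str) -> dict:
    """Collect the values matched by a dotted path containing one '*' wildcard.

    The key matched by the wildcard becomes the variable name, prefixed
    with the stage name.
    """
    keys = variables_path.split(".")
    if keys.count("*") != 1:
        raise ValueError("this sketch expects exactly one wildcard")
    results = {}

    def walk(node, remaining, wildcard_name):
        if not remaining:
            results[f"{prefix}.{wildcard_name}"] = node
            return
        if not isinstance(node, dict):
            return  # The path goes deeper than the data: nothing to collect.
        head, tail = remaining[0], remaining[1:]
        if head == "*":
            for name, child in node.items():
                walk(child, tail, name)
        elif head in node:
            walk(node[head], tail, wildcard_name)

    walk(data, keys, None)
    return results


execution_timers = {
    "execution": {"step1": 0.5, "step2": 0.7, "step3": 1.0},
    "postprocess": {},
}
timers = {
    "function1": {"constructor": 1.0, "init": 0.1},
    "function2": {"constructor": 1.0, "init": 0.1},
}

print(extract_variables(execution_timers, "execution.*", "custom_execution_name"))
print(extract_variables(timers, "*.constructor", "construction"))
```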
4.5. Sanity
The sanity field is used to validate the application execution.
The syntax is the following:
"sanity":{
"success":[],
"error":[]
}
-
The success field contains a list of patterns to look for in the standard output. If any of the patterns is not found, the test will fail.
-
The error field contains a list of patterns that must not appear in the standard output. If any of these patterns is found, the test will fail.
At the moment, only validating standard output is supported. It will soon be possible to specify custom log files.
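The two rules above can be summarised by the following sketch. It assumes that patterns are treated as regular expressions searched anywhere in the standard output, which is an assumption about the matching semantics rather than something stated in this guide; `check_sanity` and the sample output are hypothetical.

```python
import re


def check_sanity(stdout: str, success: list, error: list) -> bool:
    """Return True when every success pattern and no error pattern is found."""
    missing = [p for p in success if re.search(p, stdout) is None]
    found = [p for p in error if re.search(p, stdout) is not None]
    return not missing and not found


sanity = {
    "success": [r"converged in \d+ iterations"],
    "error": ["Segmentation fault", "NaN"],
}

good_output = "Mesh loaded\nSolver converged in 12 iterations\n"
bad_output = good_output + "Warning: NaN detected\n"

print(check_sanity(good_output, sanity["success"], sanity["error"]))
print(check_sanity(bad_output, sanity["success"], sanity["error"]))
```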
4.6. Parameters
The parameters field lists all parameters to be used in the test.
The cartesian product of the elements in this list will determine the benchmarks.
Parameters are accessible across the whole configuration file by using the syntax {{parameters.my_parameter.value}}.
Each parameter is described by a name and a generator.
Valid generators are:
-
linspace:
{ "name": "my_linspace_generator", "linspace":{ "min":2, "max":10, "n_steps":5 } }
The example will yield [2,4,6,8,10]. Min and max are inclusive.
-
geomspace:
{ "name": "my_geomspace_generator", "geomspace":{ "min":2, "max":1024, "n_steps":4 } }
The example will yield [2,16,128,1024]. Min and max are inclusive.
-
range:
{ "name": "my_range_generator", "range":{ "min":1, "max":5, "step":1 } }
The example will yield [1,2,3,4,5]. Min and max are inclusive.
-
geometric:
{ "name": "my_geometric_generator", "geometric":{ "start":1, "ratio":2, "n_steps":5 } }
The example will yield [1,2,4,8,16].
-
repeat:
{ "name": "my_repeat_generator", "repeat":{ "value":"a repeated value", "count":3 } }
The example will yield ["a repeated value", "a repeated value", "a repeated value"].
-
sequence:
Sequence accepts a list of items:
{
"name": "my_sequence_generator",
"sequence":[ 1, 2, 3, 4]
}
Sequence is the simplest generator. It will yield exactly the given list.
It accepts dictionaries as items, whose fields can then be accessed via the . separator.
-
zip and subparameters:
Parameters can contain subparameters, which can be accessed recursively via the . separator. Their purpose is to have parameters that depend on each other, without producing a cartesian product.
Additionally, parameters can be zipped together via the zip generator.
The zip generator takes a list of parameters and produces a list of Python dictionaries. Each parameter inside the list can then use any of the generators above.
{
"name": "my_zip_generator",
"zip":[
{
"name":"param1",
"sequence":[
{"val1":1,"val2":2},
{"val1":3,"val2":4},
{"val1":5,"val2":6}
]
},
{
"name":"param2",
"repeat":{
"value":"a repeated value",
"count":3
}
}
]
}
This example will yield [{'param1': {'val1': 1, 'val2': 2}, 'param2': 'a repeated value'}, {'param1': {'val1': 3, 'val2': 4}, 'param2': 'a repeated value'}, {'param1': {'val1': 5, 'val2': 6}, 'param2': 'a repeated value'}]
Zipped parameters need to have the same length.
-
Special parameters
There is one special parameter: nb_tasks. If included, its subparameters should follow some rules.
It accepts the exclusive_access subparameter, which defaults to true.
Either specify the tasks_per_node and tasks subparameters, OR specify the tasks_per_node and nodes subparameters, OR specify only the tasks subparameter.
Specifying tasks and nodes together is NOT currently supported.
The nb_tasks parameter and its subparameters are directly accessed by ReFrame.
Other parameters only have an impact on the application execution, meaning that they should be passed as options to the executable.
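To illustrate how generators, zipping and the cartesian product fit together, here is a plain Python sketch. None of these functions belong to the framework: `linspace`, `inclusive_range`, `geometric`, `repeat`, `zip_parameters` and `cartesian` are hypothetical helpers written from the descriptions in this section, and the parameter names in the usage example are made up.

```python
from itertools import product


def linspace(lo, hi, n_steps):
    """n_steps evenly spaced values; lo and hi are both included."""
    if n_steps == 1:
        return [lo]
    step = (hi - lo) / (n_steps - 1)
    return [lo + i * step for i in range(n_steps)]


def inclusive_range(lo, hi, step):
    """Integers from lo to hi (included), for a positive step."""
    return list(range(lo, hi + 1, step))


def geometric(start, ratio, n_steps):
    """n_steps values, each one multiplied by ratio with respect to the last."""
    return [start * ratio**i for i in range(n_steps)]


def repeat(value, count):
    """The same value, count times."""
    return [value] * count


def zip_parameters(named_values):
    """Combine equally long value lists element-wise into dictionaries."""
    lengths = {len(values) for values in named_values.values()}
    if len(lengths) > 1:
        raise ValueError("zipped parameters need to have the same length")
    return [dict(zip(named_values, combo)) for combo in zip(*named_values.values())]


def cartesian(parameters):
    """One dictionary per combination of the top-level parameter values."""
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in product(*parameters.values())]


parameters = {
    "my_zip": zip_parameters({
        "param1": inclusive_range(1, 3, 1),
        "param2": repeat("a repeated value", 3),
    }),
    "my_linspace": linspace(2, 10, 5),
}
benchmarks = cartesian(parameters)
print(len(benchmarks))  # 15
print(benchmarks[0])
```

The zipped parameter contributes 3 combined values instead of 3 x 3, so the total is 3 x 5 = 15 benchmark instances.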
5. Figures
In order to generate reports, the Feel++ benchmarking framework requires a figure description to specify what the website report page should contain.
These descriptions should be provided either in a dedicated JSON file containing only
{
"plots":[]
}
or by specifying the plots field in the benchmark configuration JSON file.
Each figure description should contain the following fields:
{
"title": "The figure title",
"plot_types": [], //List of figure types
"transformation": "", //Transformation type
"variables":[], // List of variables to consider
"names":[], //Respective labels for variables
"yaxis":{},
"xaxis":{},
"color_axis":{}, //Default: performance variables
"secondary_axis":{}
}
Figures will appear in the same order as they appear on the list.
Users can provide multiple plot_types in the same description field.
Only performance variables specified under the |
5.1. Axis
Each axis (with the exception of the yaxis) takes a parameter and a label field.
The yaxis will always contain the performance values, therefore only the label key should be specified.
The parameter field of each axis should correspond to a single dimension parameter specified in the benchmark configuration.
In the case of subparameters, the syntax should be the following: parameter.subparameter.
By default, the color axis will contain the performance variables, but this can be customized.
5.2. Transformations
The ReFrame report will be used to create a Master DataFrame, which will contain all performance variables and their respective values, as well as all parameters and environments.
To explain how transformation and plot types work, we can consider the following example.
Out[1]:
   performance_variable     value  ...  nb_tasks.exclusive_access      elements
0      computation_time  5.329330  ...                       True  1.000000e+09
1    communication_time  0.052584  ...                       True  1.000000e+09
5      computation_time  4.569710  ...                       True  7.000000e+08
6    communication_time  0.005627  ...                       True  7.000000e+08
10     computation_time  3.855820  ...                       True  4.000000e+08
11   communication_time  0.180105  ...                       True  4.000000e+08
15     computation_time  0.062597  ...                       True  1.000000e+08
16   communication_time  0.016100  ...                       True  1.000000e+08
20     computation_time  5.130160  ...                       True  1.000000e+09
21   communication_time  0.000502  ...                       True  1.000000e+09
25     computation_time  5.069410  ...                       True  7.000000e+08
26   communication_time  0.035590  ...                       True  7.000000e+08
30     computation_time  4.704580  ...                       True  4.000000e+08
31   communication_time  0.129757  ...                       True  4.000000e+08
35     computation_time  0.183174  ...                       True  1.000000e+08
36   communication_time  0.003264  ...                       True  1.000000e+08
40     computation_time  4.833400  ...                       True  1.000000e+09
41   communication_time  0.000021  ...                       True  1.000000e+09
45     computation_time  4.903710  ...                       True  7.000000e+08
46   communication_time  0.000027  ...                       True  7.000000e+08
50     computation_time  2.243160  ...                       True  4.000000e+08
51   communication_time  0.000032  ...                       True  4.000000e+08
55     computation_time  0.622329  ...                       True  1.000000e+08
56   communication_time  0.000032  ...                       True  1.000000e+08
[24 rows x 9 columns]
We can see that this dataframe contains the parameters:
-
environment
-
platform
-
nb_tasks.tasks
-
nb_tasks.exclusive_access
-
elements
-
performance_variable
By having this common structure, we can make use of transformation strategies to manipulate values depending on the desired output.
Strategies will depend on the figure axis. All strategies will create a pivot dataframe that will contain the parameter specified as color_axis as columns, secondary_axis as first level index and xaxis as second level index, as can be seen in the outputs shown for each strategy. Values of the dataframe will always be the values of the master dataframe.
As an example, we will consider the following axis:
"xaxis":{
"parameter":"nb_tasks.tasks",
"label":"Number of tasks"
},
"yaxis":{
"label":"Execution time (s)"
},
"secondary_axis":{
"parameter":"elements",
"label":"N"
},
"color_axis":{
"parameter":"performance_variable",
"label":"Performance variable"
}
Available strategies are:
-
performance
This strategy should be seen as the "base" strategy. No transformation, other than a pivot, is done. For the given example, it produces the following dataframe:
Out[1]:
performance_variable          communication_time  computation_time
elements      nb_tasks.tasks
1.000000e+08  1                         0.000032          0.622329
              2                         0.003264          0.183174
              4                         0.016100          0.062597
4.000000e+08  1                         0.000032          2.243160
              2                         0.129757          4.704580
              4                         0.180105          3.855820
7.000000e+08  1                         0.000027          4.903710
              2                         0.035590          5.069410
              4                         0.005627          4.569710
1.000000e+09  1                         0.000021          4.833400
              2                         0.000502          5.130160
              4                         0.052584          5.329330
-
relative_performance
The relative performance strategy computes the proportion of the total time that each color_axis variable takes, expressed as a percentage.
Out[1]:
performance_variable          communication_time  computation_time
elements      nb_tasks.tasks
1.000000e+08  1                         0.005142         99.994858
              2                         1.750716         98.249284
              4                        20.458213         79.541787
4.000000e+08  1                         0.001427         99.998573
              2                         2.684070         97.315930
              4                         4.462546         95.537454
7.000000e+08  1                         0.000551         99.999449
              2                         0.697160         99.302840
              4                         0.122985         99.877015
1.000000e+09  1                         0.000434         99.999566
              2                         0.009784         99.990216
              4                         0.977050         99.022950
The sum along the column axis will always be equal to 100.
-
speedup
The speedup strategy computes the speedup of the color_axis variables. The minimum of the xaxis values is taken as the base of the speedup.
For the example, this strategy will produce the following.
Out[1]:
performance_variable          communication_time  ...  half-optimal
elements      nb_tasks.tasks                      ...
1.000000e+08  1                         1.000000  ...           1.0
              2                         0.009804  ...           1.5
              4                         0.001988  ...           2.5
4.000000e+08  1                         1.000000  ...           1.0
              2                         0.000247  ...           1.5
              4                         0.000178  ...           2.5
7.000000e+08  1                         1.000000  ...           1.0
              2                         0.000759  ...           1.5
              4                         0.004798  ...           2.5
1.000000e+09  1                         1.000000  ...           1.0
              2                         0.041833  ...           1.5
              4                         0.000399  ...           2.5
[12 rows x 4 columns]
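The arithmetic behind the relative_performance and speedup strategies can be checked with a short sketch. It is written in plain Python rather than pandas to stay self-contained, and it is not the framework’s implementation: the `relative_performance` and `speedup` functions below are illustrative, and the `pivot` dictionary holds the rows of the example for elements = 1e8, keyed by number of tasks.

```python
# Rows of the "performance" pivot for elements = 1e8, keyed by nb_tasks.tasks.
pivot = {
    1: {"communication_time": 0.000032, "computation_time": 0.622329},
    2: {"communication_time": 0.003264, "computation_time": 0.183174},
    4: {"communication_time": 0.016100, "computation_time": 0.062597},
}


def relative_performance(pivot):
    """Share of each variable in the row total, as a percentage."""
    result = {}
    for x, row in pivot.items():
        total = sum(row.values())
        result[x] = {name: 100 * value / total for name, value in row.items()}
    return result


def speedup(pivot):
    """Time at the smallest x value divided by the time at each x value."""
    base = pivot[min(pivot)]
    return {
        x: {name: base[name] / value for name, value in row.items()}
        for x, row in pivot.items()
    }


for x, row in relative_performance(pivot).items():
    print(x, {name: round(value, 6) for name, value in row.items()})

for x, row in speedup(pivot).items():
    print(x, {name: round(value, 6) for name, value in row.items()})
```

Because the inputs are the rounded values printed in this guide, the results agree with the tables above only up to rounding.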
5.3. Plot types
Considering the same example axis as above, the software can generate the following figures:
-
scatter
-
stacked_bar
-
grouped_bar
-
table
5.4. Aggregations
Depending on the dashboard level that we are located at, it might be necessary to aggregate the data on the master dataframe.
For example, if we have all use cases, applications and machines on the dataframe, and we want to see how a certain use case performs on different machines, we can make use of the aggregations field to group the data accordingly.
"aggregations":[
{"column":"date","agg":"max"},
{"column":"applications","agg":"filter:my_app"},
{"column":"use_cases","agg":"filter:my_use_case"},
{"column":"performance_variable","agg":"sum"}
]
The previous example will first get only the latest benchmarks (by getting the maximum date), then it will filter the application and the use case to find applications and use cases that correspond to "my_app" and "my_use_case". And finally it will compute the sum of all performance variables for the remaining rows.
Users must provide a column and an aggregation function as a string.
Available aggregations are:
-
mean: Computes the mean of the column
-
sum: Computes the sum of the column
-
max: Computes the maximum of the column
-
min: Computes the minimum of the column
-
filter:value: Filters the column by value.
The order of the aggregations list is important.
5.5. Custom layouts
By providing the layout_modifiers field, users can pass custom layout options for rendering the figures.
These options correspond to the accepted layout reference for Plotly: Plotly layout reference
It accepts a nested dictionary, just as Plotly does.
For example, we could customize a figure to have its x-axis on a log scale.
"layout_modifiers":{
"xaxis":{
"type":"log"
}
}
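The effect of a nested dictionary of modifiers can be pictured as a recursive merge into an existing layout, where only the keys that are named get overridden. The sketch below illustrates that idea in plain Python; `deep_update` and the `default_layout` values are hypothetical, and whether the framework applies the modifiers exactly this way is an assumption.

```python
def deep_update(base: dict, modifiers: dict) -> dict:
    """Return a copy of base where nested keys from modifiers take precedence."""
    merged = dict(base)
    for key, value in modifiers.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical default layout of a figure.
default_layout = {
    "title": "My figure",
    "xaxis": {"title": "Number of tasks", "type": "linear"},
}
layout_modifiers = {"xaxis": {"type": "log"}}

print(deep_update(default_layout, layout_modifiers))
```

Only xaxis.type changes; the axis title and the figure title are left untouched.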