Benchmark configuration
Configuring a benchmark can be quite extensive, as this framework focuses on flexibility. For this reason, the documentation is divided into the main sections below.
The base of the configuration file is shown below.
{
"executable": "",
"use_case_name": "",
"timeout":"",
"platforms":{},
"options": [],
"outputs":{},
"scalability":{},
"sanity":{},
"parameters":{},
}
1. Fields on JSON root
Users can add any custom field and reuse it elsewhere in the configuration for refactoring. For example, one can do the following.
"output_directory":"/data/outputs" // This is a custom field
"options":["--output {{output_directory}}"]
Mandatory and actionable fields are:
| Field name | Optional | Type | Description | Default |
|---|---|---|---|---|
| `executable` | No | str | Path or name of the application executable. | |
| `use_case_name` | No | str | Custom name given to the use case. Serves as an ID of the use case; must be unique across use cases. | |
| `timeout` | No | str | Job execution timeout. Format: `days-hours:minutes:seconds` | |
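For instance, a minimal set of root fields could look like the following sketch; the executable name, use case name and timeout values are placeholders, not values expected by the framework.
"executable": "my_app",
"use_case_name": "my_use_case",
"timeout": "0-02:00:00"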
2. Platforms
The `platforms` object lists all options and directories related to the benchmark execution for each supported platform. A platform present in this object does not imply that it will be benchmarked; it rather lists all possible options.
The field is optional; if not provided, the builtin platform will be considered.
The syntax for the builtin platform is the following:
"platforms": {
"builtin":{
"input_dir":"",
"append_app_option":[]
}
}
- `append_app_options` is in this case equivalent to the `options` field on the configuration root.
- `input_dir` indicates the path of the directory where input files can be found.
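For illustration, a filled-in builtin platform could look like the sketch below; the input path and the appended option are placeholders.
"platforms": {
    "builtin": {
        "input_dir": "/data/my_use_case/inputs",
        "append_app_options": ["--verbose"]
    }
}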
The following example shows how to configure the Apptainer platform:
"platforms":{
"apptainer":{
"image":{
"name":"{{machine.containers.apptainer.image_base_dir}}/my_image.sif"
},
"input_dir":"/input_data/",
"options":["--home {{machine.output_app_dir}}"],
"append_app_options":["--my_custom_option_for_apptainer"]
}
}
For any container, the `image` field must be specified, specifically `image.name`, containing the path of the image. Pulling images is not yet supported.
In this case, `input_dir` represents the directory where input files will be found INSIDE the container.
The `options` field contains a list of all the options to include in the container execution. It is equivalent to the machine's `containers.apptainer.options` field. However, users should only include application-dependent options in this list.
The `append_app_options` field lists all the options to add to the application execution. It does the same as the `options` field at the root of the file, but can be used for case handling.
3. Outputs
The outputs
field lists all the files where application outputs are exported.
| Field name | Optional | Type | Description | Default |
|---|---|---|---|---|
| `filepath` | No | str | Path of the file containing the outputs | |
| `format` | No | str (csv, json) | Format of the output file | |
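As an illustration only, a CSV output file might be declared as in the sketch below; the path reuses the custom `output_directory` field from the root example, and the exact shape of `outputs` (a single descriptor, as in the root template, or a list of descriptors) should be checked against the template above.
"outputs": {
    "filepath": "{{output_directory}}/results.csv",
    "format": "csv"
}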
All columns or fields present in the output file will be considered. Outputs are added to the ReFrame report as performance variables.
Soon, the same syntax as scalability files will be used.
4. Scalability
The `scalability` field lists all the files where performance times can be found.
| Field name | Optional | Type | Description | Default |
|---|---|---|---|---|
| `directory` | No | str | Common directory where scalability files can be found. Used for refactoring fields. | |
| `stages` | No | list[Stage] | List of stage objects describing the scalability files. | |
Each stage object is described as follows:

| Field name | Optional | Type | Description | Default |
|---|---|---|---|---|
| `name` | No | str | Prefix to add to the performance variables found in the file | |
| `filepath` | No | str | Partial filepath, relative to the `directory` field | |
| `format` | No | str | Format of the file. Supported values are csv, json and tsv | |
| `variables_path` | Yes | str | Only if format is json. Defines where, in the JSON hierarchy, performance variables will be found. Supports the use of a single wildcard (`*`) | |
An example of the scalability field is shown and explained below.
"scalability": {
"directory": "{{output_directory}}/{{instance}}/cem/",
"stages": [
{
"name":"custom_execution_name",
"filepath": "instances/np_{{parameters.nb_tasks.tasks.value}}/logs/execution_timers.json",
"format": "json",
"variables_path":"execution.*"
},
{
"name":"construction",
"filepath": "logs/timers.json",
"format": "json",
"variables_path":"*.constructor"
}
]
}
- `directory` implies that scalability files can be found under `{{output_directory}}/{{instance}}/cem/`, where `output_directory` is defined above and `instance` is a reserved keyword containing the hashcode of the test.
There are two scalability files in the example. Let's suppose the files are built as follows:
- `logs/execution_timers.json`
{ "execution": { "step1": 0.5, "step2": 0.7, "step3": 1.0 }, "postprocess": {...} }
- `logs/timers.json`
{ "function1": { "constructor": 1.0, "init": 0.1 }, "function2": { "constructor": 1.0, "init": 0.1 } }
Then, by specifying `"variables_path":"execution.*"`, the performance variables will be `custom_execution_name.step1`, `custom_execution_name.step2` and `custom_execution_name.step3`.
And by specifying `"variables_path":"*.constructor"` for the other file, the performance variables will be `construction.function1` and `construction.function2`.
Note how variables are prefixed with the value under `name`, and that the wildcard (`*`) determines the variable names.
Deeply nested and complex JSON scalability files are supported.
5. Sanity
The `sanity` field is used to validate the application execution.
The syntax is the following:
"sanity":{
"success":[],
"error":[]
}
- The `success` field contains a list of patterns to look for in the standard output. If any of these patterns is not found, the test will fail.
- The `error` field contains a list of patterns that will make the test fail: if any of them is found in the standard output, the test fails.
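For example, a hypothetical check could require a completion message while failing on error markers; the patterns below are placeholders, not messages emitted by any particular application.
"sanity": {
    "success": ["Simulation completed"],
    "error": ["Segmentation fault", "ERROR"]
}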
At the moment, only validating standard output is supported. It will soon be possible to specify custom log files.
6. Parameters
The `parameters` field lists all the parameters to be used in the test.
The cartesian product of the values of these parameters determines the benchmarks that will be run.
Parameters are accessible across the whole configuration file by using the syntax `{{parameters.my_parameter.value}}`.
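For instance, assuming a custom parameter named `resolution` (hypothetical), its current value could be forwarded to the application through the root `options` field:
"options": ["--resolution {{parameters.resolution.value}}"]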
Each parameter is described by a name and a generator.
Valid generators are:
- `linspace`:
{ "name": "my_linspace_generator", "linspace":{ "min":2, "max":10, "n_steps":5 } }
The example will yield `[2,4,6,8,10]`. Min and max are inclusive.
- `geomspace`:
{ "name": "my_geomspace_generator", "geomspace":{ "min":2, "max":1024, "n_steps":4 } }
The example will yield `[2,16,128,1024]`. Min and max are inclusive.
- `range`:
{ "name": "my_range_generator", "range":{ "min":1, "max":5, "step":1 } }
The example will yield `[1,2,3,4,5]`. Min and max are inclusive.
- `geometric`:
{ "name": "my_geometric_generator", "geometric":{ "start":1, "ratio":2, "n_steps":5 } }
The example will yield `[1,2,4,8,16]`.
- `repeat`:
{ "name": "my_repeat_generator", "repeat":{ "value":"a repeated value", "count":3 } }
The example will yield `["a repeated value", "a repeated value", "a repeated value"]`.
- `sequence`:
{
    "name": "my_sequence_generator",
    "sequence": [1, 2, 3, 4]
}
Sequence is the simplest generator. It will yield exactly the given list.
It accepts dictionaries as items, which can then be accessed via the `.` separator.
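For instance, a hypothetical parameter named `mesh` could carry dictionary items:
{
    "name": "mesh",
    "sequence": [
        { "filename": "coarse.msh", "level": 1 },
        { "filename": "fine.msh", "level": 2 }
    ]
}
By analogy with the `{{parameters.nb_tasks.tasks.value}}` reference used in the scalability example, each item's fields would then be reachable as `{{parameters.mesh.filename.value}}` and `{{parameters.mesh.level.value}}`.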
- `zip` and subparameters:
Parameters can contain subparameters, which can be accessed recursively via the `.` separator. Their purpose is to have parameters that depend on each other, without producing a cartesian product.
Additionally, parameters can be zipped together via the `zip` generator.
The `zip` generator takes a list of parameters and produces a list of Python dictionaries. Each parameter inside the list can use any of the generators described above.
{
"name": "my_zip_generator",
"zip":[
{
"name":"param1",
"sequence":[
{"val1":1,"val2":2},
{"val1":3,"val2":4},
{"val1":5,"val2":6}
]
},
{
"name":"param2",
"repeat":{
"value":"a repeated value",
"count":3
}
}
]
}
This example will yield `[{'param1': {'val1': 1, 'val2': 2}, 'param2': 'a repeated value'}, {'param1': {'val1': 3, 'val2': 4}, 'param2': 'a repeated value'}, {'param1': {'val1': 5, 'val2': 6}, 'param2': 'a repeated value'}]`.
Zipped parameters need to have the same length.
- Special parameters:
There is one special parameter: `nb_tasks`. If included, it should follow these rules for its subparameters:
  - It accepts the `exclusive_access` subparameter, which defaults to `true`.
  - Either specify the `tasks_per_node` and `tasks` subparameters, OR specify the `tasks_per_node` and `nodes` subparameters, OR specify only the `tasks` subparameter.
  - Specifying `tasks` and `nodes` together is NOT currently supported.
The `nb_tasks` parameter and its subparameters are accessed directly by ReFrame. Other parameters only have an impact on the application execution, meaning that they should be passed as options to the executable.
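As a sketch, assuming `nb_tasks` is defined with a `sequence` generator of dictionaries (the task counts below are placeholders, and `exclusive_access` is omitted since it defaults to `true`):
{
    "name": "nb_tasks",
    "sequence": [
        { "tasks": 4, "tasks_per_node": 4 },
        { "tasks": 8, "tasks_per_node": 4 }
    ]
}
Its subparameters are then available as, for example, `{{parameters.nb_tasks.tasks.value}}`, as in the scalability example above.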