Figures
In order to generate reports, the Feel++ benchmarking framework requires a figure description to specify what the website report page should contain.
These descriptions should be provided either in a dedicated JSON file containing only the following structure:
{
"plots":[]
}
Or by specifying the plots field in the benchmark configuration JSON file.
Each figure description should contain the following fields:
{
"title": "The figure title",
"plot_types": [], //List of figure types
"transformation": "", //Transformation type
"variables":[], // List of variables to consider
"names":[], //Respective labels for variables
"yaxis":{},
"xaxis":{},
"color_axis":{}, //Default: performance variables
"secondary_axis":{}
}
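As an illustration, a complete description could look like the sketch below, written as a Python dictionary and serialized to JSON. The title, the selected plot types and transformation, and the labels in names are assumptions chosen for the example; the axes and variable names are the ones used in the example further down this page.

```python
import json

# Hypothetical figure description: the field names follow the structure
# above, the values are only examples.
figure = {
    "title": "Execution time by number of tasks",
    "plot_types": ["scatter", "table"],
    "transformation": "performance",
    "variables": ["computation_time", "communication_time"],
    "names": ["Computation", "Communication"],
    "xaxis": {"parameter": "nb_tasks.tasks", "label": "Number of tasks"},
    "yaxis": {"label": "Execution time (s)"},
    "secondary_axis": {"parameter": "elements", "label": "N"},
    "color_axis": {
        "parameter": "performance_variable",
        "label": "Performance variable",
    },
}

# A dedicated figure file contains only the "plots" list.
plots_file = {"plots": [figure]}
plots_json = json.dumps(plots_file, indent=2)
print(plots_json)
```

The printed JSON can be saved as the dedicated figure file, or the list can be placed under the plots field of the benchmark configuration.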
Figures will appear in the same order as they appear on the list.
Users can provide multiple plot_types in the same description field.
Only performance variables specified under the |
1. Axis
Each axis (with the exception of the yaxis) takes a parameter and a label field.
The yaxis will always contain the performance values, therefore only the label key should be specified.
The parameter field of each axis should correspond to a single-dimension parameter specified in the benchmark configuration.
In the case of subparameters, the syntax should be the following: parameter.subparameter.
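To picture how the dotted syntax relates to nested parameters, the sketch below flattens a nested dictionary into parameter.subparameter names. The flatten helper and the sample values are hypothetical and only illustrate the naming convention; they are not the framework's implementation.

```python
def flatten(params, prefix=""):
    """Flatten nested parameters into dotted names (hypothetical helper)."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into subparameters, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Illustrative parameter values for a single benchmark run.
run_parameters = {
    "nb_tasks": {"tasks": 4, "exclusive_access": True},
    "elements": 1e9,
}
flat_parameters = flatten(run_parameters)
print(sorted(flat_parameters))
```

With this convention, an axis that should follow the number of tasks would use "parameter": "nb_tasks.tasks".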
By default, the color axis will contain the performance variables, but this can be customized.
2. Transformations
The ReFrame report will be used to create a Master DataFrame, which will contain all performance variables and their respective values, as well as all parameters and environments.
To explain how transformation and plot types work, we can consider the following example.
Out[1]:
   performance_variable     value  ...  nb_tasks.exclusive_access      elements
0      computation_time  5.329330  ...                       True  1.000000e+09
1    communication_time  0.052584  ...                       True  1.000000e+09
5      computation_time  4.569710  ...                       True  7.000000e+08
6    communication_time  0.005627  ...                       True  7.000000e+08
10     computation_time  3.855820  ...                       True  4.000000e+08
11   communication_time  0.180105  ...                       True  4.000000e+08
15     computation_time  0.062597  ...                       True  1.000000e+08
16   communication_time  0.016100  ...                       True  1.000000e+08
20     computation_time  5.130160  ...                       True  1.000000e+09
21   communication_time  0.000502  ...                       True  1.000000e+09
25     computation_time  5.069410  ...                       True  7.000000e+08
26   communication_time  0.035590  ...                       True  7.000000e+08
30     computation_time  4.704580  ...                       True  4.000000e+08
31   communication_time  0.129757  ...                       True  4.000000e+08
35     computation_time  0.183174  ...                       True  1.000000e+08
36   communication_time  0.003264  ...                       True  1.000000e+08
40     computation_time  4.833400  ...                       True  1.000000e+09
41   communication_time  0.000021  ...                       True  1.000000e+09
45     computation_time  4.903710  ...                       True  7.000000e+08
46   communication_time  0.000027  ...                       True  7.000000e+08
50     computation_time  2.243160  ...                       True  4.000000e+08
51   communication_time  0.000032  ...                       True  4.000000e+08
55     computation_time  0.622329  ...                       True  1.000000e+08
56   communication_time  0.000032  ...                       True  1.000000e+08

[24 rows x 9 columns]
We can see that this dataframe contains the parameters:
- environment
- platform
- nb_tasks.tasks
- nb_tasks.exclusive_access
- elements
- performance_variable
By having this common structure, we can make use of transformation strategies to manipulate values depending on the desired output.
Strategies will depend on the figure axes. All strategies will create a pivot dataframe that will contain the parameter specified as color_axis as columns, xaxis as first level index and secondary_axis as second level index. Values of the dataframe will always be the values of the master dataframe.
As an example, we will consider the following axes:
"xaxis":{
"parameter":"nb_tasks.tasks",
"label":"Number of tasks"
},
"yaxis":{
"label":"Execution time (s)"
},
"secondary_axis":{
"parameter":"elements",
"label":"N"
},
"color_axis":{
"parameter":"performance_variable",
"label":"Performance variable"
}
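The pivot described above can be sketched with pandas for these axes. This is a minimal illustration on a hand-copied subset of the master dataframe, assuming the values live in a column named value; it is not the framework's actual code, and the index levels are ordered as stated in the description (xaxis first, secondary_axis second).

```python
import pandas as pd

# Subset of the master dataframe shown above (only the columns used here).
master = pd.DataFrame({
    "performance_variable": ["computation_time", "communication_time"] * 4,
    "value": [0.622329, 0.000032, 0.183174, 0.003264,
              2.243160, 0.000032, 4.704580, 0.129757],
    "nb_tasks.tasks": [1, 1, 2, 2, 1, 1, 2, 2],
    "elements": [1e8, 1e8, 1e8, 1e8, 4e8, 4e8, 4e8, 4e8],
})

# Axis parameters taken from the example description.
xaxis = "nb_tasks.tasks"
secondary_axis = "elements"
color_axis = "performance_variable"

# color_axis becomes the columns; xaxis and secondary_axis form the index.
pivot = master.pivot_table(
    values="value",
    index=[xaxis, secondary_axis],
    columns=color_axis,
)
print(pivot)
```

Each cell holds the value of one performance variable for one (nb_tasks.tasks, elements) pair; swapping the entries of the index list changes which axis forms the outer level.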
Available strategies are:
- performance
This strategy should be seen as the "base" strategy. No transformation, other than a pivot, is done. For the given example, it produces the following dataframe:
Out[1]:
performance_variable         communication_time  computation_time
elements     nb_tasks.tasks
1.000000e+08 1                         0.000032          0.622329
             2                         0.003264          0.183174
             4                         0.016100          0.062597
4.000000e+08 1                         0.000032          2.243160
             2                         0.129757          4.704580
             4                         0.180105          3.855820
7.000000e+08 1                         0.000027          4.903710
             2                         0.035590          5.069410
             4                         0.005627          4.569710
1.000000e+09 1                         0.000021          4.833400
             2                         0.000502          5.130160
             4                         0.052584          5.329330
- relative_performance
The relative performance strategy computes the proportion of the total time that each color_axis variable takes.
Out[1]:
performance_variable         communication_time  computation_time
elements     nb_tasks.tasks
1.000000e+08 1                         0.005142         99.994858
             2                         1.750716         98.249284
             4                        20.458213         79.541787
4.000000e+08 1                         0.001427         99.998573
             2                         2.684070         97.315930
             4                         4.462546         95.537454
7.000000e+08 1                         0.000551         99.999449
             2                         0.697160         99.302840
             4                         0.122985         99.877015
1.000000e+09 1                         0.000434         99.999566
             2                         0.009784         99.990216
             4                         0.977050         99.022950
The values are expressed as percentages, so the sum along the column axis will always be equal to 100.
- speedup
The speedup strategy computes the speedup of the color_axis variables. The minimum of the xaxis values is taken as the base of the speedup.
For the example, this strategy will produce the following.
Out[1]:
performance_variable         communication_time  ...  half-optimal
elements     nb_tasks.tasks                      ...
1.000000e+08 1                         1.000000  ...           1.0
             2                         0.009804  ...           1.5
             4                         0.001988  ...           2.5
4.000000e+08 1                         1.000000  ...           1.0
             2                         0.000247  ...           1.5
             4                         0.000178  ...           2.5
7.000000e+08 1                         1.000000  ...           1.0
             2                         0.000759  ...           1.5
             4                         0.004798  ...           2.5
1.000000e+09 1                         1.000000  ...           1.0
             2                         0.041833  ...           1.5
             4                         0.000399  ...           2.5

[12 rows x 4 columns]
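The relative_performance and speedup computations can be sketched with pandas as follows. This is a minimal illustration restricted to a single value of elements (1e8), with the numbers copied from the performance output above; the formulas are inferred from the printed results and are an assumption, not the framework's code. The reference columns that appear in the speedup output (such as half-optimal) are not reproduced.

```python
import pandas as pd

# Pivot for elements = 1e8: rows are nb_tasks.tasks, columns are the
# performance variables (values copied from the example output).
pivot = pd.DataFrame(
    {
        "communication_time": [0.000032, 0.003264, 0.016100],
        "computation_time": [0.622329, 0.183174, 0.062597],
    },
    index=pd.Index([1, 2, 4], name="nb_tasks.tasks"),
)

# relative_performance: share of each variable in the row total, in percent.
relative = 100 * pivot.div(pivot.sum(axis=1), axis=0)

# speedup: value at the smallest xaxis entry divided by the value at each
# xaxis entry, computed column by column.
base = pivot.loc[pivot.index.min()]
speedup = 1 / (pivot / base)

print(relative)
print(speedup)
```

With several values on the secondary axis, the same computations would be applied within each secondary-axis group.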
3. Plot types
Considering the same example axis as above, the software can generate the following figures:
- scatter
- stacked_bar
- grouped_bar
- table
4. Aggregations
Depending on the dashboard level that we are located at, it might be necessary to aggregate the data on the master dataframe.
For example, if we have all use cases, applications and machines on the dataframe, and we want to see how a certain use case performs on different machines, we can make use of the aggregations field to group the data accordingly.
"aggregations":[
{"column":"date","agg":"max"},
{"column":"applications","agg":"filter:my_app"},
{"column":"use_cases","agg":"filter:my_use_case"},
{"column":"performance_variable","agg":"sum"}
]
The previous example will first keep only the latest benchmarks (by getting the maximum date), then it will filter the applications and use_cases columns to keep the rows that correspond to "my_app" and "my_use_case". Finally, it will compute the sum of all performance variables for the remaining rows.
Users must provide a column and an aggregation function as a string.
Available aggregations are:
- mean: Computes the mean of the column
- sum: Computes the sum of the column
- max: Computes the maximum of the column
- min: Computes the minimum of the column
- filter:value: Filters the column by value.
The order of the aggregations list is important.
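The sequential nature of the aggregations list can be pictured with the pandas sketch below. The apply_aggregations helper, the value column name and the sample data are hypothetical. In particular, the way non-filter aggregations collapse a column (grouping on every other column and aggregating the values) is an assumption made for illustration and may differ from what the framework does.

```python
import pandas as pd


def apply_aggregations(df, aggregations, value_column="value"):
    """Apply each aggregation in list order (hypothetical helper)."""
    for spec in aggregations:
        column, agg = spec["column"], spec["agg"]
        if agg.startswith("filter:"):
            # Keep only the rows whose column equals the requested value.
            wanted = agg.split(":", 1)[1]
            df = df[df[column] == wanted]
            continue
        # Assumed semantics: collapse `column` by aggregating the values
        # over it, grouping on every other remaining column.
        keys = [c for c in df.columns if c not in (column, value_column)]
        if keys:
            df = df.groupby(keys, as_index=False)[value_column].agg(agg)
        else:
            df = pd.DataFrame({value_column: [df[value_column].agg(agg)]})
    return df


# Illustrative data: two use cases, two performance variables each.
df = pd.DataFrame({
    "use_cases": ["my_use_case", "my_use_case", "other", "other"],
    "performance_variable": ["computation_time", "communication_time"] * 2,
    "value": [4.0, 1.0, 7.0, 2.0],
})

result = apply_aggregations(df, [
    {"column": "use_cases", "agg": "filter:my_use_case"},
    {"column": "performance_variable", "agg": "sum"},
])
print(result)
```

Each step receives the dataframe produced by the previous one, which is why reordering the list can change the result.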
5. Custom layouts
By providing the layout_modifiers field, users can pass custom layout options for rendering the figures.
These options correspond to the Plotly layout reference.
The field accepts a nested dictionary, just as Plotly does.
For example, we could customize a figure to have its x-axis on a log scale.
"layout_modifiers":{
"xaxis":{
"type":"log"
}
}