There are three kinds of lies: lies, damn lies and benchmarks
This script can be found in the examples directory of the PyFoam distribution. It used to be called benchFoam.py and is now available as the command pyFoamBench.py.
From the examples directory of the PyFoam distribution, copy a configuration file (for instance data/default.cfg) to a local directory. The benchmark can then be run by passing that configuration file to the script.
An arbitrary number of configuration files can be specified. Configuration files can be found in the examples-directory of the PyFoam-source-distribution
The script then copies the specified cases from the $FOAM_TUTORIALS directory to the local directory, modifies them, and runs the solver on them. It records the wall-clock time and the CPU time and writes the information to a file. The speedup is calculated by comparing the wall-clock time to a reference time.
If the benchmark is specified to be parallel, the specified LAM machine is automatically booted before the benchmark and shut down after it.
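The timing and speedup step can be sketched roughly as follows. This is not the code of pyFoamBench.py: the function names are made up for illustration, and the speedup is assumed to be the reference time divided by the measured wall-clock time (so a value above 1 means "faster than the reference machine").

```python
import os
import subprocess
import sys
import time


def run_timed(command):
    """Run a command and return (wall_clock_seconds, cpu_seconds) of the child."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(command, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    # CPU time spent by child processes that have been waited for
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return wall, cpu


def speedup(baseline_seconds, wall_seconds):
    """Assumed definition: reference time divided by measured wall-clock time."""
    return baseline_seconds / wall_seconds


# Stand-in for a solver run: a short Python loop instead of an OpenFOAM solver
wall, cpu = run_timed([sys.executable, "-c", "sum(i * i for i in range(100000))"])
print("wall-clock: %.3fs  CPU: %.3fs" % (wall, cpu))
```

On systems that do not report CPU time for child processes, the second value may simply stay at zero.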
The script tries to determine the maximum memory used. Because the getrusage-system call is not correctly implemented on Linux-machines (and on Mac OS X, too) this feature is untested.
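For illustration, this is roughly how the peak memory of a child process can be queried through getrusage from Python. It is only a sketch using the standard resource module (Unix only), not the code of the script, and the helper name is hypothetical. Whether the reported number is meaningful depends on the platform, as noted above.

```python
import resource
import subprocess
import sys


def peak_child_memory(command):
    """Run a command, then return ru_maxrss for the waited-for children.

    The value is the maximum over all children waited for so far.
    The unit is platform dependent: kilobytes on Linux, bytes on Mac OS X.
    """
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Stand-in for a solver run
peak = peak_child_memory([sys.executable, "-c", "x = list(range(100000))"])
print("ru_maxrss reported for children:", peak)
```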
1.1 Format of the config File
An example for a config-file is given below:
[General]
name: default
parallel: no
nProcs: 2
machines: benchMachines
casesDirectory: ~/myBenchmarks

[AachenBomb with dieselFoam]
solver: dieselFoam
case: aachenBomb
additional: ["chemkin"]
prepare: [("blockMesh","")]
controlDict: [("endTime",1e-4),("writeInterval",0.5e-4),("writeCompression","compressed")]
baseline: 2894
weight: 1
filesToRemove: ["0/ft","0/fu"]
parallelOK: no

[Dam-break tutorial case]
solver: interFoam
case: damBreak
prepare: [("blockMesh",""),("setFields","")]
controlDict: [("endTime",0.5),("writeInterval",0.1)]
baseline: 106.38
weight: 3
blockSplit: (2,2,1)
parallelOK: yes

[HotRoom with buoyantFoam]
solver: buoyantFoam
case: hotRoom
utilities: ["setHotRoom"]
prepare: [("blockMesh",""),("setHotRoom","")]
controlDict: [("endTime",10),("deltaT",0.05),("writeInterval",100),("writeCompression","compressed")]
setInitial: [("p","floor","1e5"),("p","ceiling","1e5"),("p","fixedWalls","1e5")]
baseline: 826.375
weight: 1
blockSplit: 2
parallelOK: yes
decomposition: simple 1
The file is split into sections. Each section starts with a section name given in square brackets. The General-section has to be present. It specifies some general information about the benchmark:
- name: the name of the benchmark. This is used in the names of the output files and directories
- parallel: whether or not it is a parallel benchmark
- nProcs: the number of CPUs used for the parallel benchmark
- machines: name of the file used to boot the LAM machine
- casesDirectory: directory where the benchmark cases reside. If not set, the value of the $FOAM_TUTORIALS variable (where the standard tutorial cases reside) is used. In that directory the cases must be organized in the same way they are in $FOAM_TUTORIALS: a separate directory for every solver, in which the cases for that solver reside.
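Since the file follows the usual INI layout and the values look like Python literals, it can be read with the standard library. The following is a minimal sketch of that idea, not the parser used by the script; load_benchmark and parse_value are hypothetical names, and the sample text is a shortened copy of the example above.

```python
import ast
import configparser

EXAMPLE = '''
[General]
name: default
parallel: no
nProcs: 2
machines: benchMachines

[Dam-break tutorial case]
solver: interFoam
case: damBreak
prepare: [("blockMesh",""),("setFields","")]
baseline: 106.38
weight: 3
blockSplit: (2,2,1)
parallelOK: yes
'''


def parse_value(raw):
    """Interpret a value as a Python literal, falling back to the plain string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def load_benchmark(text):
    # interpolation=None keeps strings such as %case% untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of option names like nProcs
    parser.read_string(text)
    general = {k: parse_value(v) for k, v in parser["General"].items()}
    general["parallel"] = parser["General"].getboolean("parallel", fallback=False)
    cases = {}
    for section in parser.sections():
        if section == "General":
            continue
        options = {k: parse_value(v) for k, v in parser[section].items()}
        # yes/no values are not Python literals, so convert them explicitly
        options["parallelOK"] = parser[section].getboolean("parallelOK", fallback=False)
        cases[section] = options
    return general, cases


general, cases = load_benchmark(EXAMPLE)
print(general["name"], sorted(cases))
```

Using ast.literal_eval instead of eval means that only literals (numbers, strings, lists, tuples) are accepted, which is enough for the values shown in the example.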
Each of the other sections specifies a different benchmark case. The name of the Section is the name under which the benchmark will be known for screen-output. The options in the section are:
- the order in which the benchmarks will be executed. If unspecified or two numbers are the same the order will be unspecified
- solver: name of the solver to execute
- case: name of the tutorial case for that solver
- prepare: a list of commands to execute in order to prepare the case for running the benchmark. Each command is given by a pair: the first value is the name of the command, the second value holds the additional options that are inserted after the working directory and the case name (the usual calling convention for OpenFOAM utilities). If the string %case% appears in the second value, it is replaced with the name of the case directory.
- utilities: an optional list of utilities that have to be compiled in order to run the case. It is assumed that the sources of the utilities reside in the directory of the case (as is usual for the tutorial cases)
- controlDict: values that are to be changed in the standard controlDict of the tutorial case in order to change the running time of the case
- baseline: time it takes for the case to run on a reference machine
- weight: weight with which this case contributes to the overall speedup of the benchmark suite
- blockSplit: optional value that is used to resize the mesh in a blockMesh. If it is a scalar, each number of cells is multiplied by that value. If it is a triple, each direction is multiplied by the corresponding value.
- parallelOK: optional value that says whether or not this case can be run in parallel. If no value is set, it is assumed that the case cannot be run in parallel
- filesToRemove: optional value. A list of files that should be removed from the case before it is prepared
- decomposition: optional value. The way in which a case should be decomposed for parallel runs. Default is metis. The other valid value is simple plus a number (0, 1 or 2) that says which is the primary direction of decomposition
- setInitial: optional value. A list of triples that set initial values. The elements of the triple are:
  - the name of the field
  - the name of the boundary
  - the value
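Two of the options above are easier to understand with a small example: how a pair from the list of preparation commands might be turned into a command line, and how a scalar or triple mesh-resize value acts on the cell counts of a block. This is a sketch under assumptions, not the script's own code: the function names are invented, %case% is replaced with the plain case name here, and "-someOption" is a made-up option.

```python
import shlex


def prepare_commands(prepare, root, case):
    """Turn (command, options) pairs into argument lists.

    Follows the calling convention described above:
    command, working directory, case name, then the additional options.
    """
    commands = []
    for name, options in prepare:
        extra = shlex.split(options.replace("%case%", case))
        commands.append([name, root, case] + extra)
    return commands


def scale_cells(cells, block_split):
    """Scale the cell counts (nx, ny, nz) of a block by a scalar or a triple."""
    if isinstance(block_split, (int, float)):
        factors = (block_split,) * 3
    else:
        factors = tuple(block_split)
    return tuple(int(n * f) for n, f in zip(cells, factors))


for argv in prepare_commands([("blockMesh", ""), ("setFields", "-someOption %case%")],
                             "/tmp/bench", "damBreak"):
    print(" ".join(argv))
print(scale_cells((23, 8, 1), (2, 2, 1)))
```

With a triple like (2,2,1) a two-dimensional case keeps its single cell in the third direction, which is presumably why the dam-break example uses a triple instead of a scalar.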
Because the script measures wall-clock time, no other activity (other users, server tasks) should take place on the machine while the benchmark runs.
The script should work on all systems compatible with PyFoam. Some systems have shortcomings:
- Linux
  - Because getrusage is not correctly implemented, the maximum memory usage can only be approximated by a separate thread that monitors the memory usage. Because this thread only samples every 10 seconds (to keep the performance impact low), it might miss peaks in the usage. Debian-based Linuxes (Ubuntu for instance) seem to have similar problems to Mac OS X (no CPU time).
- Mac OS X
  - Threading seems to be strangely implemented. Therefore the wall-clock time may be some split seconds off. Also, no CPU time is available.
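The idea of a monitoring thread that samples the memory usage at fixed intervals can be sketched like this. It is an illustration only, not the thread used by the script: PeakSampler is a hypothetical class, and the function that reads the current usage is passed in from outside because how to read it differs between platforms.

```python
import threading
import time


class PeakSampler(threading.Thread):
    """Call read_usage() every `interval` seconds and remember the maximum."""

    def __init__(self, read_usage, interval=10.0):
        super().__init__(daemon=True)
        self._read_usage = read_usage
        self._interval = interval
        self._stop_event = threading.Event()
        self.peak = 0

    def run(self):
        while True:
            self.peak = max(self.peak, self._read_usage())
            # wait() returns True as soon as stop() has been called
            if self._stop_event.wait(self._interval):
                break

    def stop(self):
        self._stop_event.set()
        self.join()


# Fake readings instead of a real process, with a short interval for the demo
readings = iter([10, 50, 30])
sampler = PeakSampler(lambda: next(readings, 0), interval=0.01)
sampler.start()
time.sleep(0.5)
sampler.stop()
print("approximate peak:", sampler.peak)
```

Anything that happens between two samples is invisible to the thread, which is why a long interval keeps the overhead low but can miss short peaks.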
This describes the benchmark suite in the file standard.cfg distributed with PyFoam.
2.1 Choice of benchmark cases
The first criterion for the selection of the cases was that only standard tutorial cases are selected.
The next criterion was that the complete suite should run on the reference machine in one night. Because of this criterion most cases are not calculated for their whole duration. In addition to this, the cases should comfortably fit into a machine with 0.5 Gigabytes of memory.
The running time of the individual cases was adjusted to be
- more than a quarter of an hour (in order to keep the influence of the startup-procedure low)
- less than an hour (to meet the overall-time requirement)
For small cases the blockMesh is refined so that the case has at least 10k cells. (This is still too small to produce reasonable results for parallel benchmarks, but it should ensure that the whole simulation does not fit into the cache of the processor.)
2.2 The reference machine
Currently the reference machine in the distributed config-files is a Fedora 4 machine with a 1.8 GHz Pentium 4 and 1 Gigabyte of RAM. The installed OpenFOAM is version 1.2.
This machine was chosen because it is the slowest machine I currently have available.
2.3 A possible benchmark suite
The simulations in this suite were chosen to fit with the above requirements and give a cross-section of the available solvers in OpenFOAM:
|solver||case-name||Modification to original case||Memory||Features||Remarks/Problems|
|dieselFoam||aachenBomb||files are removed before running||275 MB||Lagrangian particles, chemical reactions with ChemKin||Parallel run fails for v1.2: the patch published here fixes this|
|dnsFoam||boxTurb16||Splitting the grid||45 MB||DNS||Solver is not parallel|
|bubbleFoam||bubbleColumn||-||11 MB||Two-phase solver||-|
|interFoam||damBreak||Splitting the grid||18 MB||Two-phase solver||-|
|rhoSonicFoam||forwardStep||Splitting the grid||25 MB||Super-sonic solver||-|
|buoyantFoam||hotRoom||pseudo-BCs are set, splitting the grid||48 MB||Heat transfer||-|
|engineFoam||kivaTest||-||49 MB||Mesh motion, combustion||-|
|Xoodles||pitzDaily3D||pseudo-BCs are set||468 MB||Combustion, LES||-|
|simpleFoam||pitzDaily||-||54 MB||Steady-state solver||-|
|sonicTurbFoam||prism||pseudo-BCs are set, grid is split||37 MB||Super-sonic, turbulent||-|
2.4 Specification/Publication of benchmarks
Three main categories should be specified when talking about a benchmark (if one of them changes you're benchmarking a different system):
- Hardware/Operating system
  - Most benchmarks only specify this category. Important info is:
    - CPU type and clock frequency
    - Operating system and version
- Compiler
  - Modern CPUs are nothing without a good compiler. If you use a recompiled version of OpenFOAM, specify the compiler (and its version) you're using and the compiler switches that are used for optimization
- OpenFOAM version
  - Algorithms get better. So benchmarking version 1.3 may give you significantly different results than version 1.2 on the same machine
In addition to this information specify who did this benchmark and when.
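Part of this information can be collected automatically. The sketch below is one possible way to do it and is not part of the script: benchmark_metadata is a hypothetical helper, and it assumes that the OpenFOAM environment has been sourced so that the variables WM_COMPILER, WM_COMPILE_OPTION and WM_PROJECT_VERSION are set. The clock frequency, the compiler version and the exact compiler switches are not covered and still have to be added by hand.

```python
import datetime
import os
import platform


def benchmark_metadata(who):
    """Collect a record describing the benchmarked system."""
    return {
        # Hardware/Operating system
        "cpu": platform.processor() or platform.machine(),
        "os": "%s %s" % (platform.system(), platform.release()),
        # Compiler (name and optimization setting only, as set by OpenFOAM)
        "compiler": os.environ.get("WM_COMPILER", "unknown"),
        "compile_option": os.environ.get("WM_COMPILE_OPTION", "unknown"),
        # OpenFOAM version
        "openfoam_version": os.environ.get("WM_PROJECT_VERSION", "unknown"),
        # Who ran the benchmark and when
        "who": who,
        "when": datetime.date.today().isoformat(),
    }


for key, value in benchmark_metadata("your name").items():
    print("%s: %s" % (key, value))
```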