Benchmarks standard v1
These benchmarks were generated with the benchFoam script, which is part of the PyFoam distribution, version 0.2.4. To compare with the results on this page, use the config file standard_v1.cfg. That file is not going to change anymore; standard.cfg is an ongoing effort and will eventually become standard_v2.cfg.
1 The suite
This is a short description of the cases in this benchmark suite. All the cases run with an unmodified OpenFOAM v1.2 in serial mode. Case 2 won't run in parallel. To run case 1 in parallel a patch has to be applied and parts of OpenFOAM have to be recompiled.
- Memory usage is rounded to the nearest whole number.
- Baseline times are rounded to the nearest whole number.
Case nr | Solver | Case name | Modification to original case | Memory usage | Baseline time | Remarks/Problems |
---|---|---|---|---|---|---|
1 | dieselFoam | aachenBomb | files are removed before running | 275 MB | 2894 s | Parallel run fails for v1.2: the patch published here fixes this |
2 | dnsFoam | boxTurb16 | Splitting the grid | 45 MB | 408 s | Solver is not parallel |
3 | bubbleFoam | bubbleColumn | | 11 MB | 410 s | |
4 | interFoam | damBreak | Splitting the grid | 18 MB | 1606 s | |
5 | rhoSonicFoam | forwardStep | Splitting the grid | 25 MB | 687 s | |
6 | buoyantFoam | hotRoom | pseudo-BCs are set, splitting the grid | 48 MB | 826 s | |
7 | engineFoam | kivaTest | | 49 MB | 1338 s | The time-step in the distributed tutorials is now a 10th of the time-step of the tutorial distributed with 1.2. A parameter was added to this case to reset it to the original size to make the results comparable. |
8 | Xoodles | pitzDaily3D | pseudo-BCs are set | 468 MB | 1106 s | |
9 | oodles | pitzDaily | | 25 MB | 880 s | |
10 | simpleFoam | pitzDaily | | 54 MB | 869 s | |
11 | sonicTurbFoam | prism | pseudo-BCs are set, grid is split | 37 MB | 650 s | |
2 Results overview
2.1 Serial results
The first column refers to a machine description below. The second column gives a brief summary of the machine, and the third column the OpenFOAM version used. Then follow the overall speedup and the speedups for the individual cases.
The benchmarks are ordered by the overall speedup in ascending order.
Machine name | Machine description | OF version | Overall speedup | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Case 9 | Case 10 | Case 11 | Remarks |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Apple iBook G4 | PPC G4 , 1.42 Ghz, 0.5 GB, Mac OS X 10.4 | 1.2 (Downloaded from Hrv) | 0.637 | 0.642 | 0.633 | 0.609 | 0.580 | 0.661 | 0.626 | 0.667 | 0.611 | 0.590 | 0.698 | 0.690 | |
Reference Machine | Pentium 4, 1.8 Ghz, 1GB, Linux | 1.2 Standard | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | |
Reference Machine | Pentium 4, 1.8 Ghz, 1GB, Linux | 1.3 Standard | 1.110 | 1.076 | 0.990 | 1.277 | 1.048 | 1.141 | 1.052 | 1.322 | 0.978 | 1.031 | 1.025 | 1.272 | |
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.2 Standard | 1.777 | 1.821 | 1.845 | 1.769 | 1.653 | 1.660 | 1.847 | 1.776 | 1.852 | 1.782 | 1.787 | 1.753 | MultiThreading enabled |
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.3 Standard | 2.078 | 2.004 | 1.933 | 2.370 | 1.849 | 2.132 | 2.083 | 2.508 | 1.818 | 1.979 | 1.822 | 2.361 | MultiThreading enabled |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.2 Standard | 2.447 | 2.539 | 2.673 | 1.958 | 2.150 | 2.531 | 2.775 | 1.976 | 2.585 | 2.589 | 2.447 | 2.690 | Process migration with OpenMOSIX |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.3 Standard | 3.273 | 4.904 | 2.955 | 2.620 | 2.460 | 3.193 | 3.197 | 4.786 | 2.776 | 2.890 | 2.486 | 3.739 | Process migration with OpenMOSIX |
Berta Dual-Opteron | Dual Opteron 248, 2.2 Ghz, 4GB, Linux | 1.2 Standard | 2.838 | 2.664 | 2.680 | 3.112 | 2.497 | 3.003 | 2.842 | 3.051 | 2.862 | 2.614 | 2.867 | 3.031 | |
Berta Dual-Opteron | Dual Opteron 248, 2.2 Ghz, 4GB, Linux | 1.3 Standard | 2.918 | 2.174 | 2.750 | 3.455 | 2.676 | 3.114 | 2.758 | 3.325 | 2.573 | 2.914 | 2.908 | 3.456 | |
Opti250 Dual-Opteron | Dual Opteron 250, 2.4 Ghz, 4GB, Linux | 1.2 Standard | 3.406 | 3.405 (849.8 s) | 3.482 (117.3 s) | 3.598 (114.0 s) | 2.726 (588.9 s) | 3.542 (194.0 s) | 3.408 (242.4 s) | 3.619 (369.7 s) | 3.263 (338.9 s) | 3.220 (273.2 s) | 3.546 (245.0 s) | 3.650 (177.9 s) | overall speedup stays the same if there is full load on the second CPU |
MessRechner DualCore | | 1.2 Standard | 3.673 | 3.409 (849.0 s) | 3.752 (108.8 s) | 3.981 (103.1 s) | 3.062 (524.5 s) | 3.779 (181.8 s) | 3.648 (226.6 s) | 3.916 (341.7 s) | 3.710 (298.2 s) | 3.432 (256.4 s) | 3.927 (221.3 s) | 3.786 (171.6 s) | results for one core only; if the benchmark runs on one core while another process runs on the second core, the overall speedup drops to 3.251 |
Opti252 Dual-Opteron | Dual Opteron 252, 2.6 Ghz, 4GB, Linux | 1.2 Standard | 4.044 | 3.811 (759.3 s) | 4.159 (98.2 s) | 4.278 (95.9 s) | 3.145 (510.5 s) | 4.202 (163.5 s) | 4.548 (181.7 s) | 4.209 (317.8 s) | 4.015 (275.4 s) | 3.785 (232.5 s) | 4.180 (207.9 s) | 4.143 (156.7 s) | |
Opti252 Dual-Opteron | Dual Opteron 252, 2.6 Ghz, 8GB, Linux | 1.4.1 compiled with gcc-4.2.1 | - | 4.416 (655.3 s) | 4.210 (97.0 s) | 5.263 (78.0 s) | 9.522 (168.7 s) | 5.292 (129.9 s) | 4.591 (180.0 s) | 6.078 (220.2 s) | error | 5.574 (157.9 s) | 4.348 (199.9 s) | 6.075 (106.9 s) | |
CentrinoDuo T2600 | CentrinoDuo, 2.16 Ghz, 2GB, Linux | 1.2 (Hrv's Zagreb Edition) | 3.220 | 3.036 | 2.432 | 2.877 | 2.745 | 3.417 | 3.444 | 3.139 | 3.100 | 3.369 | 4.272 | 3.587 | |
Opti285 Dual-core Opteron | Dual-core Opteron 285, 2.6 Ghz, 8GB, Linux | 1.4 compiled with gcc-4.1.0 | - | 4.286 (675.2 s) | 4.475 (91.3 s) | 5.555 (73.9 s) | 8.606 (186.6 s) | 5.139 (133.7 s) | 4.304 (192.0 s) | 5.836 (229.3 s) | Error | 5.801 (151.7 s) | 4.771 (182.1 s) | 5.490 (118.3 s) |
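The page does not spell out how the speedups are calculated. The following Python sketch assumes that a case's speedup is the baseline time on the reference machine divided by the measured time, and that the overall speedup is the arithmetic mean of the individual speedups. Both assumptions reproduce the Opti250 row (OpenFOAM 1.2) to within rounding. The function names are made up for this illustration and are not part of PyFoam.

```python
# Baseline times in seconds for cases 1..11, taken from the suite table
# (already rounded to whole numbers there).
BASELINE_S = [2894, 408, 410, 1606, 687, 826, 1338, 1106, 880, 869, 650]

# Measured times in seconds for Opti250 with OpenFOAM 1.2, taken from the
# serial results table.
OPTI250_S = [849.8, 117.3, 114.0, 588.9, 194.0, 242.4,
             369.7, 338.9, 273.2, 245.0, 177.9]


def case_speedups(baseline, measured):
    """Per-case speedup, assumed to be baseline time / measured time."""
    if len(baseline) != len(measured):
        raise ValueError("need one measured time per baseline time")
    return [b / m for b, m in zip(baseline, measured)]


def overall_speedup(speedups):
    """Overall speedup, assumed to be the arithmetic mean of the cases."""
    return sum(speedups) / len(speedups)


per_case = case_speedups(BASELINE_S, OPTI250_S)
overall = overall_speedup(per_case)

for nr, value in enumerate(per_case, start=1):
    print(f"case {nr:2d}: {value:.3f}")
print(f"overall: {overall:.3f}")
```

Small deviations in the third decimal place from the table are to be expected, because the baseline times listed above are rounded.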
2.2 Parallel results
Please note that some of the cases in the suite are very small and therefore may lead to speedups smaller than 1, especially on distributed-memory machines. These cases are too small to make a parallel machine 'look good'.
Machine name | Machine description | OF version | Nr of CPUs | Overall speedup | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Case 9 | Case 10 | Case 11 | Remarks |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.2 Standard | 1 | 1.777 | 1.821 | 1.845 | 1.769 | 1.653 | 1.660 | 1.847 | 1.776 | 1.852 | 1.782 | 1.787 | 1.753 | MultiThreading enabled |
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.2 Standard | 2 | 2.599 | 2.418 | | 2.450 | 2.906 | 2.372 | 2.366 | 2.537 | 2.587 | 3.065 | 2.666 | 2.622 | |
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.3 Standard | 1 | 2.078 | 2.004 | 1.933 | 2.370 | 1.849 | 2.132 | 2.083 | 2.508 | 1.818 | 1.979 | 1.822 | 2.361 | MultiThreading enabled |
George DualXeon | Dual Xeon, 2.8 Ghz, 4GB, Linux | 1.3 Standard | 2 | 3.062 (incomplete) | not working | | 3.213 | 3.403 | 2.794 | 2.585 | 3.397 | 2.474 | 3.516 | 2.751 | 3.422 | |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.2 Standard | 1 | 2.447 | 2.539 | 2.673 | 1.958 | 2.150 | 2.531 | 2.775 | 1.976 | 2.585 | 2.589 | 2.447 | 2.690 | Process migration with OpenMOSIX |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.2 Standard | 2 | Missing | | | | | | | | | | | | |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.2 Standard | 3 | 3.940 | 5.824 | | 0.498 | 0.980 | 3.781 | 4.152 | 4.692 | 6.639 | 1.649 | 5.212 | 5.974 | |
Waltons Cluster | Pentium 4, 3 Ghz, 2GB, Linux | 1.2 Standard | 4 | 4.608 | 7.074 | | 0.456 | 0.953 | 3.858 | 4.825 | 5.361 | 8.688 | 1.547 | 6.221 | 7.101 | |
Berta Dual-Opteron | Dual Opteron 248, 2.2 Ghz, 4GB, Linux | 1.2 Standard | 1 | 2.838 | 2.664 | 2.680 | 3.112 | 2.497 | 3.003 | 2.842 | 3.051 | 2.862 | 2.614 | 2.867 | 3.031 | |
Berta Dual-Opteron | Dual Opteron 248, 2.2 Ghz, 4GB, Linux | 1.2 Standard | 2 | 5.415 | 5.126 | | 5.013 | 4.630 | 5.803 | 5.275 | 6.000 | 5.218 | 5.367 | 5.517 | 6.198 | |
Opti250 Dual-Opteron | Dual Opteron 250, 2.4 Ghz, 4GB, Linux | 1.2 Standard | 1 | 3.406 | 3.405 | 3.482 | 3.598 | 2.726 | 3.542 | 3.408 | 3.619 | 3.263 | 3.220 | 3.546 | 3.650 | |
Opti250 Dual-Opteron | Dual Opteron 250, 2.4 Ghz, 4GB, Linux | 1.2 Standard | 2 | to be added | not working | | 6.175 | 5.444 | 7.170 | 6.452 | 6.953 | 6.761 | 6.396 | 7.288 | 7.800 | |
MessRechner DualCore | | 1.2 Standard | 1 core | 3.673 | 3.408 | 3.751 | 3.980 | 3.061 | 3.778 | 3.647 | 3.915 | 3.709 | 3.432 | 3.926 | 3.785 | |
MessRechner DualCore | | 1.2 Standard | 2 cores | 5.65 | 4.696 | | 6.229 | 5.072 | 6.236 | 5.360 | 6.152 | 5.037 | 5.672 | 6.199 | 5.808 | |
Opti252 Dual-Opteron | Dual Opteron 252, 2.6 Ghz, 4GB, Linux | 1.2 Standard | 1 | 4.044 | 3.811 | 4.159 | 4.278 | 3.145 | 4.202 | 4.548 | 4.209 | 4.015 | 3.785 | 4.180 | 4.143 | |
Opti252 Dual-Opteron | Dual Opteron 252, 2.6 Ghz, 4GB, Linux | 1.2 Standard | 2 | to be added | not working | | 6.883 | 6.019 | 8.501 | 9.216 | 8.399 | 7.752 | 7.503 | 8.892 | 9.065 | |
CentrinoDuo T2600 | CentrinoDuo, 2.16 Ghz, 2GB, Linux | 1.2 (Hrv's Zagreb Edition) | 1 core | 3.220 | 3.036 | 2.432 | 2.877 | 2.745 | 3.417 | 3.444 | 3.139 | 3.100 | 3.369 | 4.272 | 3.587 | |
CentrinoDuo T2600 | CentrinoDuo, 2.16 Ghz, 2GB, Linux | 1.2 (Hrv's Zagreb Edition) | 2 cores | 5.385 | 4.513 | | 4.542 | 4.817 | 5.756 | 5.435 | 5.198 | 4.832 | 6.147 | 6.540 | 6.067 | |
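All speedups in this table are relative to the reference machine, so they mix raw machine speed with parallel scaling. To isolate the scaling, one can divide a parallel speedup by the same machine's single-CPU speedup and then by the number of CPUs. The Python sketch below does this for case 1, which is the first individual value in each row; the function name and the notion of "parallel efficiency" are additions for this illustration and are not used by the benchmark suite itself.

```python
def parallel_efficiency(speedup_n, speedup_1, n_cpus):
    """Scaling efficiency of an n-CPU run relative to the 1-CPU run.

    Both speedups are taken relative to the same reference machine, so
    their ratio is the speedup gained purely from going parallel.
    A value of 1.0 means ideal scaling; a value below 1/n_cpus means the
    parallel run was slower than the serial run on the same machine.
    """
    if n_cpus < 1 or speedup_1 <= 0:
        raise ValueError("n_cpus must be >= 1 and speedup_1 must be > 0")
    return (speedup_n / speedup_1) / n_cpus


# Case 1 on Berta (OpenFOAM 1.2): 2.664 on 1 CPU, 5.126 on 2 CPUs.
berta_case1 = parallel_efficiency(5.126, 2.664, 2)

# Case 1 on the Waltons cluster (OpenFOAM 1.2): 2.539 on 1 CPU, 7.074 on 4 CPUs.
waltons_case1 = parallel_efficiency(7.074, 2.539, 4)

print(f"Berta, case 1, 2 CPUs:   {berta_case1:.2f}")
print(f"Waltons, case 1, 4 CPUs: {waltons_case1:.2f}")
```

With the numbers from the table this gives roughly 0.96 for Berta and roughly 0.70 for the Waltons cluster, which fits the remark above that distributed-memory machines scale less well on this suite.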
3 Machine description
3.1 Susi (gcds07)
aka gcds07
- Vendor
- none (Assembled machine)
- OS
- Linux - Fedora Core 4
- CPU
- Pentium 4 with 1.8 GHz
- Motherboard/Chipset
- to be inserted
- RAM
- 1 GB, have to check the type
- OpenFOAM
- 1.2, Original binary
- Compiler
- gcc 4.0.1 as provided with OpenFOAM
- Remarks
- The OpenFOAM installation and the data directories were on remote machines and mounted via NFS (the network is 100 Mbit). An OpenGroupware server was running on the machine, but the benchmarks ran at night when this server wasn't in use, so this shouldn't have an impact.
- Benchmarks provided by
- --Bgschaid 15:51, 24 Jan 2006 (CET)
3.2 bg's iBook
- Vendor
- Apple
- Type
- iBook G4
- OS
- Mac OS X 10.4 (Tiger)
- CPU
- PowerPC with 1.42 GHz
- Motherboard/Chipset
- not applicable
- RAM
- 0.5 GB
- OpenFOAM
- 1.2, binary downloaded from Hrv's site, so it should be a more advanced version than a stock 1.2
- Compiler
- gcc 4.0.1 as provided by Apple
- Remarks
- It's a notebook. I tried to turn off all the PowerSaving-options to increase performance
- Benchmarks provided by
- --Bgschaid 15:56, 24 Jan 2006 (CET)
3.3 George (gcdw98)
TODO
3.4 Waltons
TODO
3.5 Berta (gcdw50)
TODO
3.6 MessRechner (pc185)
aka pc185
- Vendor
- none (Assembled machine)
- OS
- Linux - Suse 10
- Motherboard/Chipset
- to be inserted
- RAM
- 4 GB, have to check the type
- OpenFOAM
- 1.2, Original binary
- Compiler
- gcc 4.0.1 as provided with OpenFOAM
- Benchmarks provided by
- --Jens 15:43, 6 Feb 2006 (CET)
3.7 Opti250
- Vendor
- Megware
- OS
- Linux - Suse 10
- CPU
- 2x AMD Opteron 250 (2.4 GHz)
- Motherboard/Chipset
- Thunder K8SRE (S2891)
- RAM
- 4 GB, have to check the type
- OpenFOAM
- 1.2, Original binary
- Compiler
- gcc 4.0.1 as provided with OpenFOAM
--Jens 09:55, 8 Feb 2006 (CET)
3.8 Opti252
- Vendor
- none (self-built machine)
- OS
- Linux - Ubuntu
- CPU
- 2x AMD Opteron 252 (2.6 GHz)
- Motherboard/Chipset
- Thunder K8WE (S2895)
- RAM
- 8 GB, have to check the type
- OpenFOAM
- 1.2, Original binary, 1.3, 1.4, 1.4.1
- Compiler
- gcc 4.0.1, gcc 4.2.1
- Benchmarks provided by
- --Jens 15:43, 6 Feb 2006 (CET)
3.9 CentrinoDuo T2600
- Vendor
- Dell
- OS
- Linux - Suse 10
- CPU
- 2x Intel Centrino (2.16 GHz)
- Motherboard/Chipset
- Intel 945PM
- RAM
- 2 GB, 677 MHz
- OpenFOAM
- 1.2, Hrv's Zagreb Edition
- Compiler
- gcc 4.0.2 as provided with OpenFOAM
- Benchmarks provided by
- --Marcus 17:00, 19 Feb 2006 (CET)
3.10 Opti285
- Vendor
- Supermicro
- OS
- Linux - Suse 10
- CPU
- 2-way Dual-Core AMD Opteron 285 (2.6 GHz)
- Motherboard/Chipset
- Supermicro H8DCE-HTe, nVidia(r) nForce Pro 2200 (CK804) /nVidia(r) nForce Pro 2050 (CKIO4) Chipset
- RAM
- 8 GB, DDR 400MHz
- OpenFOAM
- 1.4
- Compiler
- gcc 4.1.0
- Benchmarks provided by
- --Huiyu Feng Fhy 01:57, 31 Aug 2007 (CEST)
4 Comparing OpenFOAM-versions
This is ancient history, because using older OpenFOAM versions is not recommended. However, PyFoam distribution version 0.2.5 contains a modified version of standard_v1.cfg, called standard_v1_pre12.cfg, that makes it possible to run the benchmark suite on two older versions. The results on the reference machine are listed below; for each version, the first row gives the speedups and the second row the memory usage.
OF version | Overall speedup | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Case 9 | Case 10 | Case 11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1.2 Standard | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
1.2 Standard | | 275 MB | 45 MB | 11 MB | 18 MB | 25 MB | 48 MB | 49 MB | 468 MB | 25 MB | 54 MB | 37 MB |
1.1 Recompiled | 0.993 | 0.974 | 1.040 | 1.025 | 0.760 | 0.966 | 0.984 | 1.009 | 1.026 | 1.059 | 1.054 | 1.029 |
1.1 Recompiled | | 306 MB | 51 MB | 11 MB | 21 MB | 28 MB | 52 MB | 53 MB | 508 MB | 26 MB | 68 MB | 41 MB |
1.0 Recompiled | 1.009 | 0.989 | 1.072 | 1.035 | 0.737 | 0.978 | 1.053 | 1.023 | 1.046 | 1.066 | 1.066 | 1.034 |
1.0 Recompiled | | 310 MB | 50 MB | 11 MB | 20 MB | 28 MB | 50 MB | 54 MB | 507 MB | 25 MB | 60 MB | 41 MB |
- Versions 1.0 and 1.1 were recompiled, but the standard switches were used
- the tutorial cases delivered with the corresponding version were used
- in case 7 the deltaT had to be multiplied by 10 (otherwise the speedup compared with 1.2 would have been 0.11)
- anyone who wants to test the parallel versions is welcome to do so