Benchmarks standard v1

From OpenFOAMWiki
Revision as of 06:12, 3 September 2007 by Jens (Talk | contribs)

These benchmarks were generated with the benchFoam script, which is part of the PyFoam distribution, version 0.2.4. To compare your own results with those on this page, use the config file standard_v1.cfg. That file is frozen and will not change anymore; standard.cfg is an ongoing effort and will eventually become standard_v2.cfg.

Valid versions: OpenFOAM 1.2, OpenFOAM 1.3

1 The suite

This is a short description of the cases in this benchmark suite. All cases run in serial mode with an unmodified OpenFOAM v1.2. Case 2 cannot run in parallel. To run case 1 in parallel, a patch has to be applied and parts of OpenFOAM have to be recompiled.

  • Memory usage is rounded to the nearest whole number.
  • Baseline-times are rounded to the nearest whole number
Benchmark suite

| Case nr | Solver | Case name | Modification to original case | Memory usage | Baseline time | Remarks/Problems |
|---|---|---|---|---|---|---|
| 1 | dieselFoam | aachenBomb | files are removed before running | 275 MB | 2894 s | Parallel run fails for v1.2: the patch published here fixes this |
| 2 | dnsFoam | boxTurb16 | Splitting the grid | 45 MB | 408 s | Solver is not parallel |
| 3 | bubbleFoam | bubbleColumn | | 11 MB | 410 s | |
| 4 | interFoam | damBreak | Splitting the grid | 18 MB | 1606 s | |
| 5 | rhoSonicFoam | forwardStep | Splitting the grid | 25 MB | 687 s | |
| 6 | buoyantFoam | hotRoom | pseudo-BCs are set, splitting the grid | 48 MB | 826 s | |
| 7 | engineFoam | kivaTest | | 49 MB | 1338 s | OpenFOAM 1.3: The time-step in the distributed tutorials is now a 10th of the time-step of the tutorial distributed with 1.2. A parameter was added to this case to reset it to the original size to make the results comparable. |
| 8 | Xoodles | pitzDaily3D | pseudo-BCs are set | 468 MB | 1106 s | |
| 9 | oodles | pitzDaily | | 25 MB | 880 s | |
| 10 | simpleFoam | pitzDaily | | 54 MB | 869 s | |
| 11 | sonicTurbFoam | prism | pseudo-BCs are set, grid is split | 37 MB | 650 s | |

2 Results overview

2.1 Serial results

The first column refers to a machine description below. The second column gives a brief summary of the machine. The third column gives the OpenFOAM version used. These are followed by the overall speedup and the speedups for the individual cases.

The benchmarks are ordered by the overall speedup in ascending order.

Serial results (columns 1 to 11 are the individual speedups for cases 1 to 11)

| Machine name | Machine description | OF version | Overall speedup | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Apple iBook G4 | PPC G4, 1.42 GHz, 0.5 GB, Mac OS X 10.4 | 1.2 (Downloaded from Hrv) | 0.637 | 0.642 | 0.633 | 0.609 | 0.580 | 0.661 | 0.626 | 0.667 | 0.611 | 0.590 | 0.698 | 0.690 | |
| Reference Machine | Pentium 4, 1.8 GHz, 1 GB, Linux | 1.2 Standard | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | |
| Reference Machine | Pentium 4, 1.8 GHz, 1 GB, Linux | 1.3 Standard | 1.110 | 1.076 | 0.990 | 1.277 | 1.048 | 1.141 | 1.052 | 1.322 | 0.978 | 1.031 | 1.025 | 1.272 | |
| George DualXeon | Dual Xeon, 2.8 GHz, 4 GB, Linux | 1.2 Standard | 1.777 | 1.821 | 1.845 | 1.769 | 1.653 | 1.660 | 1.847 | 1.776 | 1.852 | 1.782 | 1.787 | 1.753 | MultiThreading enabled |
| George DualXeon | Dual Xeon, 2.8 GHz, 4 GB, Linux | 1.3 Standard | 2.078 | 2.004 | 1.933 | 2.370 | 1.849 | 2.132 | 2.083 | 2.508 | 1.818 | 1.979 | 1.822 | 2.361 | MultiThreading enabled |
| Waltons Cluster | Pentium 4, 3 GHz, 2 GB, Linux | 1.2 Standard | 2.447 | 2.539 | 2.673 | 1.958 | 2.150 | 2.531 | 2.775 | 1.976 | 2.585 | 2.589 | 2.447 | 2.690 | Process migration with OpenMOSIX |
| Waltons Cluster | Pentium 4, 3 GHz, 2 GB, Linux | 1.3 Standard | 3.273 | 4.904 | 2.955 | 2.620 | 2.460 | 3.193 | 3.197 | 4.786 | 2.776 | 2.890 | 2.486 | 3.739 | Process migration with OpenMOSIX |
| Berta Dual-Opteron | Dual Opteron 248, 2.2 GHz, 4 GB, Linux | 1.2 Standard | 2.838 | 2.664 | 2.680 | 3.112 | 2.497 | 3.003 | 2.842 | 3.051 | 2.862 | 2.614 | 2.867 | 3.031 | |
| Berta Dual-Opteron | Dual Opteron 248, 2.2 GHz, 4 GB, Linux | 1.3 Standard | 2.918 | 2.174 | 2.750 | 3.455 | 2.676 | 3.114 | 2.758 | 3.325 | 2.573 | 2.914 | 2.908 | 3.456 | |
| Opti250 Dual-Opteron | Dual Opteron 250, 2.4 GHz, 4 GB, Linux | 1.2 Standard | 3.406 | 3.405 (849.8 s) | 3.482 (117.3 s) | 3.598 (114.0 s) | 2.726 (588.9 s) | 3.542 (194.0 s) | 3.408 (242.4 s) | 3.619 (369.7 s) | 3.263 (338.9 s) | 3.220 (273.2 s) | 3.546 (245.0 s) | 3.650 (177.9 s) | overall speedup stays the same if there is full load on the second CPU |
| MessRechner DualCore | | 1.2 Standard | 3.673 | 3.409 (849.0 s) | 3.752 (108.8 s) | 3.981 (103.1 s) | 3.062 (524.5 s) | 3.779 (181.8 s) | 3.648 (226.6 s) | 3.916 (341.7 s) | 3.710 (298.2 s) | 3.432 (256.4 s) | 3.927 (221.3 s) | 3.786 (171.6 s) | results for one core only; if the benchmark runs on one core and another process on the second core, the overall speedup drops to 3.251 |
| Opti252 Dual-Opteron | Dual Opteron 252, 2.6 GHz, 4 GB, Linux | 1.2 Standard | 4.044 | 3.811 (759.3 s) | 4.159 (98.2 s) | 4.278 (95.9 s) | 3.145 (510.5 s) | 4.202 (163.5 s) | 4.548 (181.7 s) | 4.209 (317.8 s) | 4.015 (275.4 s) | 3.785 (232.5 s) | 4.180 (207.9 s) | 4.143 (156.7 s) | |
| Opti252 Dual-Opteron | Dual Opteron 252, 2.6 GHz, 4 GB, Linux | 1.4.1 compiled with gcc-4.2.1 | - | 4.416 (655.3 s) | 4.210 (97.0 s) | 5.263 (78.0 s) | 9.522 (168.7 s) | 5.292 (129.9 s) | 4.591 (180.0 s) | 6.078 (220.2 s) | error | 5.574 (157.9 s) | 4.348 (199.9 s) | 6.075 (106.9 s) | |
| CentrinoDuo T2600 | CentrinoDuo, 2.16 GHz, 2 GB, Linux | 1.2 (Hrv's Zagreb Edition) | 3.220 | 3.036 | 2.432 | 2.877 | 2.745 | 3.417 | 3.444 | 3.139 | 3.100 | 3.369 | 4.272 | 3.587 | |
| Opti285 Dual-core Opteron | Dual-core Opteron 285, 2.6 GHz, 8 GB, Linux | 1.4 compiled with gcc-4.1.0 | - | 4.286 (675.2 s) | 4.475 (91.3 s) | 5.555 (73.9 s) | 8.606 (186.6 s) | 5.139 (133.7 s) | 4.304 (192.0 s) | 5.836 (229.3 s) | Error | 5.801 (151.7 s) | 4.771 (182.1 s) | 5.490 (118.3 s) | |
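
The speedup values in the table above appear to be the baseline time on the reference machine divided by the measured time, and the overall speedup appears to be the arithmetic mean of the individual speedups. Neither formula is stated on this page; both are assumptions that are consistent with the published numbers (small deviations come from the rounded baseline times). A minimal Python sketch, with hypothetical function names:

```python
# Baseline times in seconds for cases 1..11 (from the "Benchmark suite" table).
BASELINE_S = [2894, 408, 410, 1606, 687, 826, 1338, 1106, 880, 869, 650]


def speedup(baseline_s, measured_s):
    """Assumed definition: baseline time divided by measured time."""
    return baseline_s / measured_s


def overall_speedup(speedups):
    """Assumed definition: arithmetic mean of the individual speedups."""
    return sum(speedups) / len(speedups)


# Opti250, case 1: measured 849.8 s against a baseline of 2894 s.
opti250_case1 = speedup(BASELINE_S[0], 849.8)

# Apple iBook G4: individual speedups as listed in the serial results.
ibook = [0.642, 0.633, 0.609, 0.580, 0.661, 0.626,
         0.667, 0.611, 0.590, 0.698, 0.690]
ibook_overall = overall_speedup(ibook)

print(f"Opti250 case 1: {opti250_case1:.2f}")
print(f"iBook overall:  {ibook_overall:.3f}")
```

Under these assumptions the sketch reproduces the listed values (3.405 for Opti250 case 1 and 0.637 for the iBook overall) to within rounding.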

2.2 Parallel results

Please note that some of the cases in the suite are very small, too small to make a parallel machine 'look good'. They may therefore lead to speedups smaller than 1, especially on distributed-memory machines.

Parallel results (columns 1 to 11 are the individual speedups for cases 1 to 11; case 2 is not parallel)

| Machine name | Machine description | OF version | Nr of CPUs | Overall speedup | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| George DualXeon | Dual Xeon, 2.8 GHz, 4 GB, Linux | 1.2 Standard | 1 | 1.777 | 1.821 | 1.845 | 1.769 | 1.653 | 1.660 | 1.847 | 1.776 | 1.852 | 1.782 | 1.787 | 1.753 | MultiThreading enabled |
| | | | 2 | 2.599 | 2.418 | | 2.450 | 2.906 | 2.372 | 2.366 | 2.537 | 2.587 | 3.065 | 2.666 | 2.622 | |
| George DualXeon | Dual Xeon, 2.8 GHz, 4 GB, Linux | 1.3 Standard | 1 | 2.078 | 2.004 | 1.933 | 2.370 | 1.849 | 2.132 | 2.083 | 2.508 | 1.818 | 1.979 | 1.822 | 2.361 | MultiThreading enabled |
| | | | 2 | 3.062 (incomplete) | not working | | 3.213 | 3.403 | 2.794 | 2.585 | 3.397 | 2.474 | 3.516 | 2.751 | 3.422 | |
| Waltons Cluster | Pentium 4, 3 GHz, 2 GB, Linux | 1.2 Standard | 1 | 2.447 | 2.539 | 2.673 | 1.958 | 2.150 | 2.531 | 2.775 | 1.976 | 2.585 | 2.589 | 2.447 | 2.690 | Process migration with OpenMOSIX |
| | | | 2 | Missing | | | | | | | | | | | | |
| | | | 3 | 3.940 | 5.824 | | 0.498 | 0.980 | 3.781 | 4.152 | 4.692 | 6.639 | 1.649 | 5.212 | 5.974 | |
| | | | 4 | 4.608 | 7.074 | | 0.456 | 0.953 | 3.858 | 4.825 | 5.361 | 8.688 | 1.547 | 6.221 | 7.101 | |
| Berta Dual-Opteron | Dual Opteron 248, 2.2 GHz, 4 GB, Linux | 1.2 Standard | 1 | 2.838 | 2.664 | 2.680 | 3.112 | 2.497 | 3.003 | 2.842 | 3.051 | 2.862 | 2.614 | 2.867 | 3.031 | |
| | | | 2 | 5.415 | 5.126 | | 5.013 | 4.630 | 5.803 | 5.275 | 6.000 | 5.218 | 5.367 | 5.517 | 6.198 | |
| Opti250 Dual-Opteron | Dual Opteron 250, 2.4 GHz, 4 GB, Linux | 1.2 Standard | 1 | 3.406 | 3.405 | 3.482 | 3.598 | 2.726 | 3.542 | 3.408 | 3.619 | 3.263 | 3.220 | 3.546 | 3.650 | |
| | | | 2 | to be added | not working | | 6.175 | 5.444 | 7.170 | 6.452 | 6.953 | 6.761 | 6.396 | 7.288 | 7.800 | |
| MessRechner DualCore | | 1.2 Standard | 1 core | 3.673 | 3.408 | 3.751 | 3.980 | 3.061 | 3.778 | 3.647 | 3.915 | 3.709 | 3.432 | 3.926 | 3.785 | |
| | | | 2 cores | 5.65 | 4.696 | | 6.229 | 5.072 | 6.236 | 5.360 | 6.152 | 5.037 | 5.672 | 6.199 | 5.808 | |
| Opti252 Dual-Opteron | Dual Opteron 252, 2.6 GHz, 4 GB, Linux | 1.2 Standard | 1 | 4.044 | 3.811 | 4.159 | 4.278 | 3.145 | 4.202 | 4.548 | 4.209 | 4.015 | 3.785 | 4.180 | 4.143 | |
| | | | 2 | to be added | not working | | 6.883 | 6.019 | 8.501 | 9.216 | 8.399 | 7.752 | 7.503 | 8.892 | 9.065 | |
| CentrinoDuo T2600 | CentrinoDuo, 2.16 GHz, 2 GB, Linux | 1.2 (Hrv's Zagreb Edition) | 1 core | 3.220 | 3.036 | 2.432 | 2.877 | 2.745 | 3.417 | 3.444 | 3.139 | 3.100 | 3.369 | 4.272 | 3.587 | |
| | | | 2 cores | 5.385 | 4.513 | | 4.542 | 4.817 | 5.756 | 5.435 | 5.198 | 4.832 | 6.147 | 6.540 | 6.067 | |
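
One way to read the parallel results is to relate each parallel speedup to the single-CPU speedup of the same machine, which gives the scaling over the serial run and a parallel efficiency. This calculation is not made on this page; the sketch below is only an illustration with hypothetical function names, using the case 1 values listed above.

```python
def scaling(serial_speedup, parallel_speedup):
    """Speedup of the parallel run relative to the 1-CPU run on the same machine."""
    return parallel_speedup / serial_speedup


def efficiency(serial_speedup, parallel_speedup, n_cpus):
    """Scaling divided by the number of CPUs; 1.0 would be ideal scaling."""
    return scaling(serial_speedup, parallel_speedup) / n_cpus


# Case 1 (dieselFoam), values taken from the parallel results.
berta_2 = efficiency(2.664, 5.126, 2)     # Berta, 2 CPUs
waltons_4 = efficiency(2.539, 7.074, 4)   # Waltons, 4 CPUs

print(f"Berta, 2 CPUs:   {berta_2:.2f}")
print(f"Waltons, 4 CPUs: {waltons_4:.2f}")
```

A parallel speedup that is lower than the single-CPU value of the same machine, as for some of the small cases on the Waltons cluster, gives a scaling below 1 in this sketch.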

3 Machine description

3.1 Susi (gcds07)

aka gcds07

Vendor
none (Assembled machine)
OS
Linux - Fedora Core 4
CPU
Pentium 4 with 1.8 GHz
Motherboard/Chipset
to be inserted
RAM
1 GB, have to check the type
OpenFOAM
1.2, Original binary
Compiler
gcc 4.0.1 as provided with OpenFOAM
Remarks
The OpenFOAM installation and the data directories were on remote machines and mounted via NFS (the network is 100 Mbit). An OpenGroupware server was running on the machine, but the benchmarks ran at night when this server wasn't in use, so this shouldn't have had an impact.
Benchmarks provided by
--Bgschaid 15:51, 24 Jan 2006 (CET)

3.2 bg's iBook

Vendor
Apple
Type
iBook G4
OS
Mac OS X 10.4 (Tiger)
CPU
PowerPC with 1.42 GHz
Motherboard/Chipset
not applicable
RAM
0.5 GB
OpenFOAM
1.2, binary downloaded from Hrv's site, so it should be a more advanced version than a stock 1.2
Compiler
gcc 4.0.1 as provided by Apple
Remarks
It's a notebook. I tried to turn off all the power-saving options to increase performance.
Benchmarks provided by
--Bgschaid 15:56, 24 Jan 2006 (CET)

3.3 George (gcdw98)

TODO

3.4 Waltons

TODO

3.5 Berta (gcdw50)

TODO

3.6 MessRechner (pc185)

aka pc185

Vendor
none (Assembled machine)
OS
Linux - Suse 10
Motherboard/Chipset
to be inserted
RAM
4 GB, have to check the type
OpenFOAM
1.2, Original binary
Compiler
gcc 4.0.1 as provided with OpenFOAM
Benchmarks provided by
--Jens 15:43, 6 Feb 2006 (CET)

3.7 Opti250

Vendor
Megware
OS
Linux - Suse 10
CPU
2x AMD Opteron 250 (2.4 GHz)
Motherboard/Chipset
Thunder K8SRE (S2891)
RAM
4 GB, have to check the type
OpenFOAM
1.2, Original binary
Compiler
gcc 4.0.1 as provided with OpenFOAM

Benchmarks provided by
--Jens 09:55, 8 Feb 2006 (CET)

3.8 Opti252

Vendor
Self-built
OS
Linux - Ubuntu
CPU
2x AMD Opteron 252 (2.6 GHz)
Motherboard/Chipset
Thunder K8WE (S2895)
RAM
8 GB, have to check the type
OpenFOAM
1.2, Original binary, 1.3, 1.4, 1.4.1
Compiler
gcc 4.0.1, gcc 4.2.1
Benchmarks provided by
--Jens 15:43, 6 Feb 2006 (CET)

3.9 CentrinoDuo T2600

Vendor
Dell
OS
Linux - Suse 10
CPU
2x Intel Centrino (2.16 GHz)
Motherboard/Chipset
Intel 945PM
RAM
2 GB, 677 MHz
OpenFOAM
1.2, Hrv's Zagreb Edition
Compiler
gcc 4.0.2 as provided with OpenFOAM
Benchmarks provided by
--Marcus 17:00, 19 Feb 2006 (CET)

3.10 Opti285

Vendor
Supermicro
OS
Linux - Suse 10
CPU
2-way Dual-Core AMD Opteron 285 (2.6 GHz)
Motherboard/Chipset
Supermicro H8DCE-HTe, nVidia(r) nForce Pro 2200 (CK804) /nVidia(r) nForce Pro 2050 (CKIO4) Chipset
RAM
8 GB, DDR 400MHz
OpenFOAM
1.4
Compiler
gcc 4.1.0
Benchmarks provided by

--Huiyu Feng Fhy 01:57, 31 Aug 2007 (CEST)

4 Comparing OpenFOAM-versions

Valid versions: OpenFOAM 1.0, OpenFOAM 1.1

This is ancient history, because using older OpenFOAM versions is not recommended. However, PyFoam distribution version 0.2.5 contains a modified version of standard_v1.cfg, called standard_v1_pre12.cfg, that makes it possible to run the benchmark suite on two older versions. The results on the reference machine are:

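Different versions (columns 1 to 11 are the individual cases; the row below each version gives the memory usage)

| OF version | Overall speedup | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.2 Standard | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| | | 275 MB | 45 MB | 11 MB | 18 MB | 25 MB | 48 MB | 49 MB | 468 MB | 25 MB | 54 MB | 37 MB |
| 1.1 Recompiled | 0.993 | 0.974 | 1.040 | 1.025 | 0.760 | 0.966 | 0.984 | 1.009 | 1.026 | 1.059 | 1.054 | 1.029 |
| | | 306 MB | 51 MB | 11 MB | 21 MB | 28 MB | 52 MB | 53 MB | 508 MB | 26 MB | 68 MB | 41 MB |
| 1.0 Recompiled | 1.009 | 0.989 | 1.072 | 1.035 | 0.737 | 0.978 | 1.053 | 1.023 | 1.046 | 1.066 | 1.066 | 1.034 |
| | | 310 MB | 50 MB | 11 MB | 20 MB | 28 MB | 50 MB | 54 MB | 507 MB | 25 MB | 60 MB | 41 MB |
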
  • Versions 1.0 and 1.1 were recompiled, but the standard switches were used
  • the tutorial cases delivered with the corresponding version were used
    • in case 7 the deltaT had to be multiplied by 10 (otherwise the speedup compared with 1.2 would have been 0.11)
  • anyone who wants to test the parallel versions is welcome to do so