rp2 benchmark notes

Pasted benchmark results from the following tools.

  1. unixbench
  2. stream

unixbench

Results from a general-purpose system benchmark suite.

GitHub - kdlucas/byte-unixbench: Automatically exported from code.google.com/p/byte-unixbench

pi@raspberrypi:~/git/byte-unixbench/UnixBench$ ./Run
make all
make[1]: Entering directory '/home/pi/git/byte-unixbench/UnixBench'
Checking distribution of files
./pgms exists
./src exists
./testdir exists
./tmp exists
./results exists
make[1]: Leaving directory '/home/pi/git/byte-unixbench/UnixBench'
sh: 1: 3dinfo: not found

   #    #  #    #  #  #    #          #####   ######  #    #   ####   #    #
   #    #  ##   #  #   #  #           #    #  #       ##   #  #    #  #    #
   #    #  # #  #  #    ##            #####   #####   # #  #  #       ######
   #    #  #  # #  #    ##            #    #  #       #  # #  #       #    #
   #    #  #   ##  #   #  #           #    #  #       #   ##  #    #  #    #
    ####   #    #  #  #    #          #####   ######  #    #   ####   #    #

   Version 5.1.3                      Based on the Byte Magazine Unix Benchmark

   Multi-CPU version                  Version 5 revisions by Ian Smith,
                                      Sunnyvale, CA, USA
   January 13, 2011                   johantheghost at yahoo period com

Use of uninitialized value in printf at ./Run line 1378.
Use of uninitialized value in printf at ./Run line 1380.
Use of uninitialized value in printf at ./Run line 1378.
Use of uninitialized value in printf at ./Run line 1380.
Use of uninitialized value in printf at ./Run line 1378.
Use of uninitialized value in printf at ./Run line 1380.
Use of uninitialized value in printf at ./Run line 1378.
Use of uninitialized value in printf at ./Run line 1380.
Use of uninitialized value in printf at ./Run line 1588.
Use of uninitialized value in printf at ./Run line 1590.
Use of uninitialized value in printf at ./Run line 1588.
Use of uninitialized value in printf at ./Run line 1590.
Use of uninitialized value in printf at ./Run line 1588.
Use of uninitialized value in printf at ./Run line 1590.
Use of uninitialized value in printf at ./Run line 1588.
Use of uninitialized value in printf at ./Run line 1590.

1 x Dhrystone 2 using register variables 1 2 3 4 5 6 7 8 9 10

1 x Double-Precision Whetstone 1 2 3 4 5 6 7 8 9 10

1 x Execl Throughput 1 2 3

1 x File Copy 1024 bufsize 2000 maxblocks 1 2 3

1 x File Copy 256 bufsize 500 maxblocks 1 2 3

1 x File Copy 4096 bufsize 8000 maxblocks 1 2 3

1 x Pipe Throughput 1 2 3 4 5 6 7 8 9 10

1 x Pipe-based Context Switching 1 2 3 4 5 6 7 8 9 10

1 x Process Creation 1 2 3

1 x System Call Overhead 1 2 3 4 5 6 7 8 9 10

1 x Shell Scripts (1 concurrent) 1 2 3

1 x Shell Scripts (8 concurrent) 1 2 3

4 x Dhrystone 2 using register variables 1 2 3 4 5 6 7 8 9 10

4 x Double-Precision Whetstone 1 2 3 4 5 6 7 8 9 10

4 x Execl Throughput 1 2 3

4 x File Copy 1024 bufsize 2000 maxblocks 1 2 3

4 x File Copy 256 bufsize 500 maxblocks 1 2 3

4 x File Copy 4096 bufsize 8000 maxblocks 1 2 3

4 x Pipe Throughput 1 2 3 4 5 6 7 8 9 10

4 x Pipe-based Context Switching 1 2 3 4 5 6 7 8 9 10

4 x Process Creation 1 2 3

4 x System Call Overhead 1 2 3 4 5 6 7 8 9 10

4 x Shell Scripts (1 concurrent) 1 2 3

4 x Shell Scripts (8 concurrent) 1 2 3

========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)

System: raspberrypi: GNU/Linux
OS: GNU/Linux -- 4.1.19-v7+ -- #858 SMP Tue Mar 15 15:56:00 GMT 2016
Machine: armv7l (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: ARMv7 Processor rev 5 (v7l) (0.0 bogomips)

CPU 1: ARMv7 Processor rev 5 (v7l) (0.0 bogomips)

CPU 2: ARMv7 Processor rev 5 (v7l) (0.0 bogomips)

CPU 3: ARMv7 Processor rev 5 (v7l) (0.0 bogomips)

13:50:07 up 4 min, 2 users, load average: 0.24, 0.25, 0.14; runlevel 3

1 CPU (1 parallel copy)

------------------------------------------------------------------------
Benchmark Run: Sat Apr 16 2016 13:50:07 - 14:18:14
4 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables        2991922.6 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                      433.5 MWIPS (9.9 s, 7 samples)
Execl Throughput                                440.3 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks         75382.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           21726.8 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        196338.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                              168226.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                  32154.4 lps   (10.0 s, 7 samples)
Process Creation                               1225.7 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   1087.0 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    302.9 lpm   (60.1 s, 2 samples)
System Call Overhead                         425012.3 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0    2991922.6    256.4
Double-Precision Whetstone                       55.0        433.5     78.8
Execl Throughput                                 43.0        440.3    102.4
File Copy 1024 bufsize 2000 maxblocks          3960.0      75382.6    190.4
File Copy 256 bufsize 500 maxblocks            1655.0      21726.8    131.3
File Copy 4096 bufsize 8000 maxblocks          5800.0     196338.2    338.5
Pipe Throughput                               12440.0     168226.3    135.2
Pipe-based Context Switching                   4000.0      32154.4     80.4
Process Creation                                126.0       1225.7     97.3
Shell Scripts (1 concurrent)                     42.4       1087.0    256.4
Shell Scripts (8 concurrent)                      6.0        302.9    504.8
System Call Overhead                          15000.0     425012.3    283.3
                                                                   ========
System Benchmarks Index Score                                         172.2
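
The INDEX column in the table above is consistent with RESULT / BASELINE x 10, and the final score is consistent with the geometric mean of the twelve indices. The following is a minimal sketch that recomputes both from the pasted numbers. The names `rows`, `index`, `indices` and `score` are made up for this note and are not part of UnixBench.

```python
import math

# (baseline, result) pairs copied from the 1-copy index table above.
rows = {
    "Dhrystone 2 using register variables": (116700.0, 2991922.6),
    "Double-Precision Whetstone": (55.0, 433.5),
    "Execl Throughput": (43.0, 440.3),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 75382.6),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 21726.8),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 196338.2),
    "Pipe Throughput": (12440.0, 168226.3),
    "Pipe-based Context Switching": (4000.0, 32154.4),
    "Process Creation": (126.0, 1225.7),
    "Shell Scripts (1 concurrent)": (42.4, 1087.0),
    "Shell Scripts (8 concurrent)": (6.0, 302.9),
    "System Call Overhead": (15000.0, 425012.3),
}


def index(baseline, result):
    """Per-test index: result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


indices = {name: index(b, r) for name, (b, r) in rows.items()}

# Overall score, assumed here to be the geometric mean of the indices.
score = math.exp(sum(math.log(v) for v in indices.values()) / len(indices))

for name, value in indices.items():
    print(f"{name:<40s} {value:8.1f}")
print(f"{'System Benchmarks Index Score':<40s} {score:8.1f}")
```

Because the score is a geometric mean, one weak test (here Whetstone and context switching, both around 80) pulls the total down less than it would in an arithmetic mean.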

4 CPUs (4 parallel copies)

------------------------------------------------------------------------
Benchmark Run: Sat Apr 16 2016 14:18:14 - 14:46:31
4 CPUs in system; running 4 parallel copies of tests

Dhrystone 2 using register variables       11549641.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     1736.4 MWIPS (9.9 s, 7 samples)
Execl Throughput                               1153.9 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        114481.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks           32409.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks        304039.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                              644726.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 116551.8 lps   (10.0 s, 7 samples)
Process Creation                               2499.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   2384.2 lpm   (60.1 s, 2 samples)
Shell Scripts (8 concurrent)                    312.8 lpm   (60.5 s, 2 samples)
System Call Overhead                        1603673.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   11549641.4    989.7
Double-Precision Whetstone                       55.0       1736.4    315.7
Execl Throughput                                 43.0       1153.9    268.4
File Copy 1024 bufsize 2000 maxblocks          3960.0     114481.0    289.1
File Copy 256 bufsize 500 maxblocks            1655.0      32409.5    195.8
File Copy 4096 bufsize 8000 maxblocks          5800.0     304039.7    524.2
Pipe Throughput                               12440.0     644726.5    518.3
Pipe-based Context Switching                   4000.0     116551.8    291.4
Process Creation                                126.0       2499.5    198.4
Shell Scripts (1 concurrent)                     42.4       2384.2    562.3
Shell Scripts (8 concurrent)                      6.0        312.8    521.3
System Call Overhead                          15000.0    1603673.4   1069.1
                                                                   ========
System Benchmarks Index Score                                         411.2
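
To see how well each test scales from 1 to 4 parallel copies, the two result columns can be divided. The following is a rough sketch using only the numbers pasted above. `NCPUS`, `results`, `speedups` and `overall` are names chosen for this note, and "efficiency" here simply means speedup divided by the number of CPUs.

```python
NCPUS = 4  # "4 CPUs in system" according to the run header

# (result with 1 copy, result with 4 copies), copied from the two runs above.
results = {
    "Dhrystone 2 using register variables": (2991922.6, 11549641.4),
    "Double-Precision Whetstone": (433.5, 1736.4),
    "Execl Throughput": (440.3, 1153.9),
    "File Copy 1024 bufsize 2000 maxblocks": (75382.6, 114481.0),
    "File Copy 256 bufsize 500 maxblocks": (21726.8, 32409.5),
    "File Copy 4096 bufsize 8000 maxblocks": (196338.2, 304039.7),
    "Pipe Throughput": (168226.3, 644726.5),
    "Pipe-based Context Switching": (32154.4, 116551.8),
    "Process Creation": (1225.7, 2499.5),
    "Shell Scripts (1 concurrent)": (1087.0, 2384.2),
    "Shell Scripts (8 concurrent)": (302.9, 312.8),
    "System Call Overhead": (425012.3, 1603673.4),
}

# Speedup of the 4-copy run over the 1-copy run, per test.
speedups = {name: four / one for name, (one, four) in results.items()}

# Ratio of the two overall index scores (411.2 vs 172.2).
overall = 411.2 / 172.2

for name, s in speedups.items():
    print(f"{name:<40s} x{s:5.2f}  efficiency {s / NCPUS:6.1%}")
print(f"{'System Benchmarks Index Score':<40s} x{overall:5.2f}")
```

In these numbers the CPU-bound tests (Dhrystone, Whetstone) scale close to 4x, the file copy tests only about 1.5x, and Shell Scripts (8 concurrent) barely changes, presumably because a single copy of that test already runs eight scripts at once.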

stream

Memory speed benchmark.

MEMORY BANDWIDTH: STREAM BENCHMARK PERFORMANCE RESULTS

wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c

pi@raspberrypi:~/git/stream$ gcc -O3 -march=armv7-a -mtune=cortex-a7 -mfpu=neon stream.c
pi@raspberrypi:~/git/stream$ ./a.out
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 10000000 (elements), Offset = 0 (elements)
Memory per array = 76.3 MiB (= 0.1 GiB).
Total memory required = 228.9 MiB (= 0.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 107861 microseconds.
   (= 107861 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:            2094.0     0.076502     0.076409     0.076677
Scale:           1061.7     0.150778     0.150696     0.150874
Add:              776.2     0.314904     0.309195     0.326431
Triad:            605.6     0.404288     0.396310     0.409284
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
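
The "Best Rate" column can be cross-checked from the array size and the "Min time" column. The sketch below assumes the usual STREAM accounting: Copy and Scale touch two arrays per iteration (one read, one write), Add and Triad touch three (two reads, one write), and MB means 10^6 bytes. The names `kernels`, `best_rate_mb_s` and `rates` are made up for this note.

```python
ARRAY_ELEMENTS = 10_000_000  # "Array size = 10000000 (elements)"
BYTES_PER_ELEMENT = 8        # "This system uses 8 bytes per array element."

# kernel -> (arrays touched per iteration, min time in seconds from the table)
kernels = {
    "Copy": (2, 0.076409),
    "Scale": (2, 0.150696),
    "Add": (3, 0.309195),
    "Triad": (3, 0.396310),
}


def best_rate_mb_s(arrays, min_time):
    """Bytes moved per iteration divided by the best time, in MB/s (10^6 bytes)."""
    bytes_moved = arrays * BYTES_PER_ELEMENT * ARRAY_ELEMENTS
    return 1.0e-6 * bytes_moved / min_time


rates = {name: best_rate_mb_s(n, t) for name, (n, t) in kernels.items()}

for name, rate in rates.items():
    print(f"{name + ':':<8s} {rate:10.1f} MB/s")
```

The recomputed rates agree with the reported 2094.0, 1061.7, 776.2 and 605.6 MB/s to within rounding of the printed times.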