Introducing the STHbench.sh benchmark script

Jeggs101

Well-Known Member
Dec 29, 2010
1,482
222
63
Worked on this a bit tonight (in a standalone script to make testing easier):

Code:
#! /bin/bash

wget -N http://www.craftychess.com/crafty-23.4.zip

unzip -o crafty-23.4.zip

cd crafty-23.4/

export target=LINUX
export CFLAGS="-Wall -pipe -O3 -fomit-frame-pointer $CFLAGS"
export CXFLAGS="-Wall -pipe -O3 -fomit-frame-pointer"
export LDFLAGS="$LDFLAGS -lstdc++"
make crafty-make

echo $? > ~/install-exit-status
chmod +x crafty
./crafty bench end
./crafty bench mt=4

cd ..
The mt=4 run is not working because the binary is not compiled for SMP. I can't for the life of me get that working.
 

Jeggs101

WARNING! I still need to find a way to hide all of the build output when you make this (maybe someone can help, please), but this is OpenSSL:

Code:
#!/bin/bash

# (c) 2013-2014 ServeTheHome.com and ServeThe.biz
# Special thanks to nitrobass24 and mir for contributions on the script
# Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html

wget -N http://www.openssl.org/source/openssl-1.0.1e.tar.gz

tar -zxvf openssl-1.0.1e.tar.gz
mkdir openssl_/
cd openssl-1.0.1e/
./config --prefix=$HOME/openssl_/ no-zlib
make
echo $? > ~/test-exit-status
make install
cd ..
rm -rf openssl-1.0.1e/
rm -rf openssl_/lib/

./openssl_/bin/openssl speed rsa4096
 

Chuckleb

Moderator
Mar 5, 2013
1,017
330
83
Minnesota
OpenSSL revised to remove noise and run in one directory

How's this revision? You don't need a temporary directory since you shouldn't need to install, and I hate forcing things to be written to $HOME when users may want to run in tmp or somewhere else. I wasn't sure what you do for cleanup, so I removed that part. It should squelch most of the noise.

--

Code:
#!/bin/bash

# (c) 2013-2014 ServeTheHome.com and ServeThe.biz
# Special thanks to nitrobass24 and mir for contributions on the script
# Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html

wget -N http://www.openssl.org/source/openssl-1.0.1e.tar.gz

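# Redirect stdout first, then point stderr at it, so both end up in logfile
tar -zxvf openssl-1.0.1e.tar.gz >> logfile 2>&1
cd openssl-1.0.1e/
./config no-zlib >> logfile 2>&1
make >> logfile 2>&1
# Record the exit status of make (unescaped, so $? is actually expanded)
echo $? > ~/test-exit-status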

./apps/openssl speed rsa4096
 

Patrick

Administrator
Staff member
Dec 21, 2010
11,815
4,767
113
Thanks Jeggs101 and Chuckleb! Will give both a run. Easier to test these scripts than the bigger one.

Here is another question: does it make sense to keep crafty in there?
 

nitrobass24

Moderator
Dec 26, 2010
1,082
126
63
TX
I'd get rid of Crafty altogether, but that's just me.

If there is ever a time when you don't want "noise", just append >> /dev/null 2>&1 (the order matters: with 2>&1 >> /dev/null, stderr still goes to the terminal).
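
Redirections are applied left to right, so where 2>&1 sits relative to the /dev/null redirect changes what gets silenced. A minimal sketch of the difference (the noisy function is made up purely for illustration and stands in for a chatty build step):

```shell
#!/bin/bash
# Hypothetical stand-in for a noisy command: one line on stdout, one on stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# 2>&1 first: stderr is pointed at the *current* stdout, and only afterwards
# is stdout sent to /dev/null. The stderr line still gets through.
wrong_order=$( noisy 2>&1 >> /dev/null )

# stdout first, then stderr duplicated onto it: both are discarded.
right_order=$( noisy >> /dev/null 2>&1 )

echo "2>&1 >> /dev/null leaves: '${wrong_order}'"
echo ">> /dev/null 2>&1 leaves: '${right_order}'"
```

The same ordering applies when the target is a logfile instead of /dev/null.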
 

TangoWhiskey9

Active Member
Jun 28, 2013
402
59
28
OK, this site has given me a lot, so here's a contribution for STREAM:

Code:
#!/bin/bash
wget -N http://www.cs.virginia.edu/stream/FTP/Code/stream.c
cc stream.c -O3 -march=native -fopenmp -o stream-me
./stream-me
Hey, I was also thinking you can use this to get CPU information and figure out whether the machine is running under a hypervisor. It would be helpful if you ever make the web viewer.

Code:
lscpu
Looks like that leaves:
  1. pybench
  2. 7zip
  3. Memcached (one that nitrobass found)
  4. ... maybe a nginx or apache?
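
On the lscpu idea: lscpu prints a "Hypervisor vendor" line when it detects virtualization, so a script could pull that field out. A rough sketch, assuming that field name is present on the target distro (the function name hypervisor_from_lscpu is made up for this example):

```shell
#!/bin/bash
# Reads lscpu-style "Key: value" text on stdin and prints the hypervisor
# vendor, or "none" when no "Hypervisor vendor" line is present.
hypervisor_from_lscpu() {
    awk -F': *' '
        /^[[:space:]]*Hypervisor vendor:/ { vendor = $2 }
        END { print (vendor == "" ? "none" : vendor) }
    '
}

# Only query the real machine when lscpu is actually installed.
if command -v lscpu >/dev/null 2>&1; then
    echo "Hypervisor: $(lscpu | hypervisor_from_lscpu)"
fi
```

Reading from stdin keeps the parsing separate from the lscpu call, so saved lscpu output from other machines can be checked the same way.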
 

Patrick

Awesome guys. Give me a few minutes to update and rev a Dev008 test script.
 

Patrick

OK I uploaded Dev008. It is on the first post and is currently running through tests.
 

Chuckleb

Heh, only reason I dumped to logfile was in case things broke ;). Habit.

Regarding crafty: I've been playing with it and it has been rather annoying. For our cluster benchmarks, everybody likes to run Linpack, but it's finicky to get good numbers because you can optimize by adjusting the problem size to the machine. I was just playing with the NAS Parallel Benchmarks (NPB) suite, and that's rather simple to include and build. The hardest part is that you are supposed to register with the official repo before you can download it. So if you wanted to host a copy of the tgz and a couple of config files, that could be a good replacement. An FT or LU benchmark with size W could be quick. Size S is way too small.

I could get a script for that built up this evening for folks to try.
 

Patrick

Crafty appears to work single-threaded in the current version. Very much agree on Linpack. There is a huge margin of difference between an ARM CPU or Atom and a Xeon E5.

I am game for giving whatever you script a try. Rumor has it you will be getting some extra cold weather in Minnesota this weekend :)
 

Chuckleb

We are going to have wind chills between -50 and -60, which is colder than what my researchers in Antarctica have right now. The governor closed all K-12 schools; the University is still open, though!
 

Chuckleb

Here are the NPB tests. I have it detect physical cores vs. total "cores", which can include hyperthreaded cores. Many of these tests run better on real cores than on HT cores. Not sure which ones to use, but I chose BT and FT with size "A" for speed.

Example results:
AMD X2 4200+ (2 cores)
bt.A.x in 74s with 2274 Mop/s total

Dual E5-2665 (16 physical cores)
bt.A.x in 9.5s with 17718 Mop/s total

Dual E5-2665 (32 cores with HT on)
run 1: bt.A.x in 12.5s with 13419 Mop/s total
run 2: bt.A.x in 32.4s with 5190 Mop/s total

Getting consistently good numbers is hard without pinning memory and cores. The more cores you have, the more the processes drift.

Lastly, should probably find a better home for the tgz if this is useful. :)


Code:
#!/bin/bash


# (c) 2013-2014 ServeTheHome.com and ServeThe.biz
# Special thanks to nitrobass24 and mir for contributions on the script
# Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html


wget https://dl.dropboxusercontent.com/u/124184/NPB3.3.1.tar.gz


tar -zxvf NPB3.3.1.tar.gz
cd NPB3.3.1/NPB3.3-OMP/


# Use the provided makefile definitions
cp config/NAS.samples/make.def.gcc_x86 config/make.def


# Define which tests to build
echo "ft A" >> config/suite.def
#echo "mg A" >> config/suite.def
#echo "sp A" >> config/suite.def
#echo "lu A" >> config/suite.def
echo "bt A" >> config/suite.def
#echo "is A" >> config/suite.def
#echo "ep A" >> config/suite.def
#echo "cg A" >> config/suite.def
#echo "ua A" >> config/suite.def
#echo "dc A" >> config/suite.def


make suite


# Determine number of physical cores (not hyperthread) and set OMP to cores value
procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
cores=$((procs*pcores))


export OMP_NUM_THREADS=$cores


bin/bt.A.x
bin/ft.A.x
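
The procs*pcores calculation above relies on /proc/cpuinfo having "physical id" and "cpu cores" lines, which some ARM boards and VMs may not expose. As a hedged alternative, the sketch below counts unique "physical id"/"core id" pairs and falls back to the number of "processor" entries when those fields are missing. The function name count_physical_cores is invented for this example, and the fallback policy is an assumption, not something the script above does:

```shell
#!/bin/bash
# Prints the number of physical cores described by a cpuinfo-format file
# (defaults to /proc/cpuinfo). Counts unique "physical id"/"core id" pairs;
# if the file has no such fields, falls back to counting "processor" entries.
count_physical_cores() {
    local file=${1:-/proc/cpuinfo}
    awk -F': *' '
        /^physical id/ { pkg = $2 }
        /^core id/     { if (!seen[pkg "/" $2]++) cores++ }
        /^processor/   { logical++ }
        END { print (cores > 0 ? cores : logical + 0) }
    ' "$file"
}

# Only touch the real file where it exists (i.e. on Linux).
if [ -r /proc/cpuinfo ]; then
    export OMP_NUM_THREADS=$(count_physical_cores)
    echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
fi
```

Passing a file name as the first argument makes it possible to try the function against a saved copy of /proc/cpuinfo from another machine.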
 

Patrick

Thanks Chuckleb!!!!!!

OK hosting the tar.gz now

Code:
#!/bin/bash


# (c) 2013-2014 ServeTheHome.com and ServeThe.biz
# Special thanks to Chuckleb for contributions on the script
# Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html


wget http://forums.servethehome.com/pjk/NPB3.3.1.tar.gz


tar -zxvf NPB3.3.1.tar.gz
cd NPB3.3.1/NPB3.3-OMP/


# Use the provided makefile definitions
cp config/NAS.samples/make.def.gcc_x86 config/make.def


# Define which tests to build
echo "ft A" >> config/suite.def
#echo "mg A" >> config/suite.def
#echo "sp A" >> config/suite.def
#echo "lu A" >> config/suite.def
echo "bt A" >> config/suite.def
#echo "is A" >> config/suite.def
#echo "ep A" >> config/suite.def
#echo "cg A" >> config/suite.def
#echo "ua A" >> config/suite.def
#echo "dc A" >> config/suite.def


make suite


# Determine number of physical cores (not hyperthread) and set OMP to cores value
procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
cores=$((procs*pcores))


export OMP_NUM_THREADS=$cores


bin/bt.A.x
bin/ft.A.x
Will try to run this tonight.

For anyone that wants to just wget and run this script:
Code:
wget http://forums.servethehome.com/pjk/npb.sh
Also, it looks like you will need to install gfortran (sudo apt-get install gfortran).
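
Since a missing gfortran only shows up partway through the build, a pre-flight check at the top of the script could catch it earlier. A small sketch; the missing_tools helper is hypothetical, and the tool list is an assumption based on what the NPB script calls or needs:

```shell
#!/bin/bash
# Prints the name of every command in the argument list that is not on PATH
# and returns non-zero if at least one is missing.
missing_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list for the NPB script; adjust as needed.
if ! missing_tools wget tar make gcc gfortran; then
    echo "Install the tools listed above before running the NPB script." >&2
fi
```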
 

Patrick

Runs fairly quickly. Should it be doing a Class A on an 8-core Xeon E5?
 

Chuckleb

I picked class A because it is a small problem size, so it runs in a relatively short time on Atoms and older-generation systems that have limited memory or slow CPUs. It appears A->C is 4x the size per step, then D->F is 16x the size per step. S and W were too small (way too quick). I don't mind either way; let the community decide. I should run a big test to see how long it takes.
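
To put the step sizes in perspective, here is the arithmetic for how large each class is relative to A, taking the rough 4x and 16x figures above at face value. These are approximations, not exact problem sizes, and npb_relative_size is just an illustrative helper:

```shell
#!/bin/bash
# Approximate problem size of an NPB class relative to class A, using the
# rough rule quoted above: 4x per step for A->B->C, 16x per step for C->D->E->F.
npb_relative_size() {
    case "$1" in
        A) echo 1 ;;
        B) echo 4 ;;
        C) echo $((4 * 4)) ;;
        D) echo $((4 * 4 * 16)) ;;
        E) echo $((4 * 4 * 16 * 16)) ;;
        F) echo $((4 * 4 * 16 * 16 * 16)) ;;
        *) echo "unknown class: $1" >&2; return 1 ;;
    esac
}

for class in A B C D E F; do
    echo "class $class ~ $(npb_relative_size "$class")x class A"
done
```

So moving from A to B or C is a modest bump, while D and beyond grow very quickly. Trying a different class in the script is a matter of writing, for example, "bt B" instead of "bt A" into config/suite.def and running the matching bin/bt.B.x.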
 

Chuckleb

I find STREAM to be a very fussy test. It's really dependent on memory layout on multiprocessor/multicore systems, and on some systems you get higher numbers when you don't use all the cores, because the cores compete for the memory bus. This especially comes into play when you have any sort of hyperthreading turned on. I've modified your example to check for cores and to bind to physical cores, not virtual cores. This consistently gives me the same results on every run on my E5.

E5-2665 (GOMP_CPU_AFFINITY=0-15 and OMP_NUM_THREADS=16)
Copy: 25777
Scale: 24840
Add: 27739
Triad: 28496

E5-2665 (unbound, defaults to 32 threads)
Copy: 8471
Scale: 8120
Add: 11683
Triad: 12378

E5-2665 (OMP_NUM_THREADS=16)
Copy: 19890
Scale: 18397
Add: 21637
Triad: 21883





Code:
#!/bin/bash
wget -N http://www.cs.virginia.edu/stream/FTP/Code/stream.c
gcc stream.c -O3 -march=native -fopenmp -o stream-me


# Determine number of physical cores (not hyperthread) and set OMP to cores value
procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
cores=$((procs*pcores))




export OMP_NUM_THREADS=$cores
export GOMP_CPU_AFFINITY=0-$((cores-1))
echo $GOMP_CPU_AFFINITY


./stream-me
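
One caveat with GOMP_CPU_AFFINITY=0-$((cores-1)): it assumes the first N logical CPUs sit on N different physical cores, which holds on my E5 but may not on machines that number hyperthread siblings next to each other. A hedged sketch that builds the list from lscpu -p output instead, keeping the first logical CPU seen on each core (the function name one_cpu_per_core is made up for this example):

```shell
#!/bin/bash
# Reads "CPU,CORE,SOCKET" lines (the format printed by
# `lscpu -p=CPU,CORE,SOCKET`) on stdin and prints a comma-separated list
# holding the first logical CPU seen for each physical core.
one_cpu_per_core() {
    awk -F, '
        /^#/ || NF < 3 { next }        # skip header comments and blank lines
        !seen[$2 "," $3]++ {           # first logical CPU on this core
            list = list sep $1
            sep = ","
        }
        END { print list }
    '
}

# Only query the real machine when lscpu is installed.
if command -v lscpu >/dev/null 2>&1; then
    cpus=$(lscpu -p=CPU,CORE,SOCKET | one_cpu_per_core)
    if [ -n "$cpus" ]; then
        export GOMP_CPU_AFFINITY=$cpus
        echo "GOMP_CPU_AFFINITY=$GOMP_CPU_AFFINITY"
    fi
fi
```

OMP_NUM_THREADS would still be set to the physical core count as in the script above, so the number of threads matches the number of CPUs in the list.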
 

Jeggs101

STREAM produces strange results. Patrick, even the PTS one you use only does 12-14 GB/s on the C2000 and E3 v3. It should be about 2x that.