Introducing the STHbench.sh benchmark script

Discussion in 'Processors and Motherboards' started by Patrick, Oct 6, 2013.

  1. Jeggs101

    Jeggs101 Well-Known Member

    Joined:
    Dec 29, 2010
    Messages:
    1,461
    Likes Received:
    214
    Worked on this a bit tonight (in a standalone script to make testing easier):

    Code:
    #! /bin/bash
    
    wget -N http://www.craftychess.com/crafty-23.4.zip
    
    unzip -o crafty-23.4.zip
    
    cd crafty-23.4/
    
    export target=LINUX
    export CFLAGS="-Wall -pipe -O3 -fomit-frame-pointer $CFLAGS"
    export CXFLAGS="-Wall -pipe -O3 -fomit-frame-pointer"
    export LDFLAGS="$LDFLAGS -lstdc++"
    make crafty-make
    
    echo $? > ~/install-exit-status
    chmod +x crafty
    ./crafty bench end
    ./crafty bench mt=4
    
    cd ..
    
    The mt=4 run is not working because crafty is not compiled for SMP. I can't for the life of me get that working.
     
    #81
  2. Jeggs101

    Jeggs101 Well-Known Member

    WARNING! Need to find a way to not show all of the debug text when you make this (maybe someone can help plz) but this is OpenSSL:

    Code:
    #!/bin/bash
    
    # (c) 2013-2014 ServeTheHome.com and ServeThe.biz
    # Special thanks to nitrobass24 and mir for contributions on the script
    # Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html
    
    wget -N http://www.openssl.org/source/openssl-1.0.1e.tar.gz
    
    tar -zxvf openssl-1.0.1e.tar.gz
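    mkdir -p openssl_/
    # Install into the openssl_/ directory created above, wherever the script is run
    prefix="$PWD/openssl_"
    cd openssl-1.0.1e/
    ./config --prefix="$prefix" no-zlib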
    make
    echo $? > ~/test-exit-status
    make install
    cd ..
    rm -rf openssl-1.0.1e/
    rm -rf openssl_/lib/
    
    ./openssl_/bin/openssl speed rsa4096
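
    On hiding the build text: a minimal sketch of one way to do it, assuming it is fine to keep the output in a log file. The function name run_quiet and the file name build.log are made up for illustration; the point is that the exit status of the build step is still available afterwards.

```shell
#!/bin/bash
# run_quiet LOGFILE CMD [ARGS...]
# Runs CMD with stdout and stderr appended to LOGFILE, keeps the terminal quiet,
# and returns CMD's exit status. On failure, prints the last lines of the log.
run_quiet() {
    local log=$1
    shift
    "$@" >> "$log" 2>&1
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAILED ($status): $*" >&2
        tail -n 20 "$log" >&2
    fi
    return "$status"
}

# Hypothetical usage in the OpenSSL build (build.log is an illustrative name):
#   run_quiet build.log ./config no-zlib
#   run_quiet build.log make
#   echo $? > ~/test-exit-status
```

    Because run_quiet returns the wrapped command's status, the echo $? line right after it still records whether make succeeded.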
     
    #82
  3. Chuckleb

    Chuckleb Moderator

    Joined:
    Mar 5, 2013
    Messages:
    1,017
    Likes Received:
    330
    OpenSSL revised to remove noise and run in one directory

    How's this revision? You wouldn't need a temporary directory since you shouldn't need to install and I hate forcing things to be written to $HOME when users may want to run in tmp or somewhere else. I wasn't sure what you do for cleanup so I removed that region. It should squelch most of the noise.

    --

    Code:
    
    #!/bin/bash
    
    # (c) 2013-2014 ServeTheHome.com and ServeThe.biz
    # Special thanks to nitrobass24 and mir for contributions on the script
    # Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html
    
    wget -N http://www.openssl.org/source/openssl-1.0.1e.tar.gz
    
    # Redirect stdout first, then point stderr at it, so both end up in the log
    tar -zxvf openssl-1.0.1e.tar.gz >> logfile 2>&1
    cd openssl-1.0.1e/
    ./config no-zlib >> logfile 2>&1
    make >> logfile 2>&1
    echo $? > ~/test-exit-status
    
    ./apps/openssl speed rsa4096
    
    
     
    #83
  4. Patrick

    Patrick Administrator
    Staff Member

    Joined:
    Dec 21, 2010
    Messages:
    11,497
    Likes Received:
    4,440
    Thanks Jeggs101 and Chuckleb! Will give both a run. Easier to test these scripts than the bigger one.

    Here is another question, does it make sense to keep crafty in there?
     
    #84
  5. nitrobass24

    nitrobass24 Moderator

    Joined:
    Dec 26, 2010
    Messages:
    1,081
    Likes Received:
    125
    I'd get rid of Crafty altogether, but that's just me.

    If there is any time where you don't want "noise", just append > /dev/null 2>&1 to the command.
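
    A minimal sketch of why the order of the two redirections matters when silencing a command. The function names noisy, leaky and silent are made up for the demonstration.

```shell
#!/bin/bash
# Emits one line on stdout and one on stderr, like a noisy build step.
noisy() {
    echo "to-stdout"
    echo "to-stderr" >&2
}

# Redirections are processed left to right.
# Here stderr is duplicated onto the *current* stdout (terminal or pipe)
# before stdout is pointed at /dev/null, so stderr still gets through.
leaky()  { noisy 2>&1 >> /dev/null; }

# Here stdout goes to /dev/null first, then stderr follows it there.
silent() { noisy >> /dev/null 2>&1; }
```

    Running leaky still shows the stderr line on the terminal, while silent shows nothing.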
     
    #85
  6. TangoWhiskey9

    TangoWhiskey9 Active Member

    Joined:
    Jun 28, 2013
    Messages:
    402
    Likes Received:
    59
    OK site has given me a lot so here's a contribution for STREAM:

    Code:
    #!/bin/bash
    wget -N http://www.cs.virginia.edu/stream/FTP/Code/stream.c
    cc stream.c -O3 -march=native -fopenmp -o stream-me
    ./stream-me
    
    Hey was also thinking you can use this to get CPU information AND figure out if the machine is in a Hypervisor. Would be helpful if you do ever make the web viewer.

    Code:
    lscpu
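
    A rough sketch of how the hypervisor check could be scripted, assuming lscpu prints a "Hypervisor vendor:" line when it detects a virtual machine (possibly indented on newer versions). The function name hypervisor_from_lscpu is made up for illustration.

```shell
#!/bin/bash
# Reads lscpu-style "Key: value" text on stdin and prints the hypervisor
# vendor, or "none" when no "Hypervisor vendor" line is present.
hypervisor_from_lscpu() {
    awk -F': *' '
        $1 ~ /^ *Hypervisor vendor$/ { print $2; found = 1; exit }
        END { if (!found) print "none" }
    '
}

# On a real machine (only runs if lscpu is installed):
if command -v lscpu > /dev/null 2>&1; then
    LC_ALL=C lscpu | hypervisor_from_lscpu
fi
```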
    
    Looks like that leaves:
    1. pybench
    2. 7zip
    3. Memcached (one that nitrobass found)
    4. ... maybe a nginx or apache?
     
    #86
  7. Patrick

    Patrick Administrator
    Staff Member

    Awesome guys. Give me a few minutes to update and rev a Dev008 test script.
     
    #87
  8. TangoWhiskey9

    TangoWhiskey9 Active Member

    I like Chuckleb's version. The only thing I'd change is to send the output to /dev/null instead of logfile.
     
    #88
  9. Patrick

    Patrick Administrator
    Staff Member

    OK I uploaded Dev008. It is on the first post and is currently running through tests.
     
    #89
  10. Chuckleb

    Chuckleb Moderator

    Heh, only reason I dumped to logfile was in case things broke ;). Habit.

    Regarding crafty: I've been playing with it and it has been rather annoying. For our cluster benchmarks, everybody likes to run Linpack, but it's finicky to get good numbers since you can optimize by adjusting the problem size to the machine. I was just playing with the NAS Parallel Benchmark suite, and that's rather simple to include and build. The hardest part is that you are supposed to register with the official repo before downloading. So if you wanted to host a copy of the tgz and a couple of config files, that could be a good replacement. An FT or LU benchmark with a size W could be quick. The size S is way too small.

    I could get a script for that built up this evening for folks to try.
     
    #90
  11. Patrick

    Patrick Administrator
    Staff Member

    Crafty appears to work only single-threaded in the current version. Very much agree on Linpack. An ARM CPU or Atom compared to a Xeon E5 is a huge margin of difference.

    I am game for giving whatever you script a try. Rumor has it you will be getting some extra cold weather in Minnesota this weekend :)
     
    #91
  12. Chuckleb

    Chuckleb Moderator

    We are going to have wind chills between -50 and -60, which is colder than what my researchers in Antarctica have right now. The governor closed all K-12 schools; the University is still open though!
     
    #92
  13. Chuckleb

    Chuckleb Moderator

    Here are the NPB tests. I have it detect physical cores vs total "cores", which can include hyperthreaded cores. Many of these tests run better on real cores than HT cores. Not sure which ones to use, but I chose BT and FT with size "A" for speed.

    Example results:
    AMD X2 4200+ (2 cores)
    bt.A.x in 74s with 2274 Mop/s total

    Dual E5-2665 (16 physical cores)
    bt.A.x in 9.5s with 17718 Mop/s total

    Dual E5-2665 (32 cores with HT on)
    run 1: bt.A.x in 12.5s with 13419 Mop/s total
    run 2: bt.A.x in 32.4s with 5190 Mop/s total

    Getting consistent good numbers is hard without pinning memory and cores. The more you have, the more processes drift.
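
    Given that run-to-run spread, a small sketch for collecting the result from several runs and reporting the lowest and highest. It assumes each run prints a line containing "Mop/s total" that ends with the number; the function names mops_values and min_max are made up for illustration.

```shell
#!/bin/bash
# Reads benchmark output on stdin and prints the last field of every line
# containing "Mop/s total" (assumed to be the numeric result).
mops_values() {
    awk '/Mop\/s total/ { print $NF }'
}

# Reads one number per line on stdin and prints "min max" with two decimals.
min_max() {
    awk 'NR == 1     { min = max = $1 + 0 }
         $1 + 0 < min { min = $1 + 0 }
         $1 + 0 > max { max = $1 + 0 }
         END { if (NR) printf "%.2f %.2f\n", min, max }'
}

# Hypothetical usage: three runs of the same binary.
#   for i in 1 2 3; do bin/bt.A.x; done | mops_values | min_max
```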

    Lastly, should probably find a better home for the tgz if this is useful. :)


    Code:
    
    #!/bin/bash
    
    
    # (c) 2013-2014 ServeTheHome.com and ServeThe.biz
    # Special thanks to nitrobass24 and mir for contributions on the script
    # Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html
    
    
    wget https://dl.dropboxusercontent.com/u/124184/NPB3.3.1.tar.gz
    
    
    tar -zxvf NPB3.3.1.tar.gz
    cd NPB3.3.1/NPB3.3-OMP/
    
    
    # Use the provided makefile definitions
    cp config/NAS.samples/make.def.gcc_x86 config/make.def
    
    
    # Define which tests to build
    echo "ft A" >> config/suite.def
    #echo "mg A" >> config/suite.def
    #echo "sp A" >> config/suite.def
    #echo "lu A" >> config/suite.def
    echo "bt A" >> config/suite.def
    #echo "is A" >> config/suite.def
    #echo "ep A" >> config/suite.def
    #echo "cg A" >> config/suite.def
    #echo "ua A" >> config/suite.def
    #echo "dc A" >> config/suite.def
    
    
    make suite
    
    
    # Determine number of physical cores (not hyperthread) and set OMP to cores value
    procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
    pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
    cores=$((procs*pcores))
    
    
    export OMP_NUM_THREADS=$cores
    
    
    bin/bt.A.x
    bin/ft.A.x
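
    The procs*pcores calculation gives 0 when /proc/cpuinfo has no "physical id" or "cpu cores" lines, which may be the case on ARM boards and in some VMs. A hedged alternative sketch that counts unique physical id / core id pairs and falls back to nproc; the function name count_physical_cores is made up for illustration.

```shell
#!/bin/bash
# Counts physical cores from /proc/cpuinfo-style text on stdin by counting
# unique "physical id" / "core id" pairs. Prints 0 if the fields are absent.
count_physical_cores() {
    awk -F': *' '
        /^physical id/ { pkg = $2 }
        /^core id/     { seen[pkg "," $2] = 1 }
        END { n = 0; for (k in seen) n++; print n }
    '
}

cores=0
if [ -r /proc/cpuinfo ]; then
    cores=$(count_physical_cores < /proc/cpuinfo)
fi
# Fall back to the logical CPU count when the fields are missing.
if [ "$cores" -lt 1 ] && command -v nproc > /dev/null 2>&1; then
    cores=$(nproc)
fi
echo "physical cores: $cores"
```

    The resulting value can be exported as OMP_NUM_THREADS the same way as in the script above.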
    
    
    
     
    #93
  14. sean

    sean Member

    Joined:
    Sep 26, 2013
    Messages:
    53
    Likes Received:
    25
    GitHub! GitHub would be a perfect spot for hosting the script. Then you could also do the curl URL | bash thing that's so hip now.
     
    #94
  15. Patrick

    Patrick Administrator
    Staff Member

    Thanks Chuckleb!!!!!!

    OK hosting the tar.gz now

    Code:
    #!/bin/bash
    
    
    # (c) 2013-2014 ServeTheHome.com and ServeThe.biz
    # Special thanks to Chuckleb for contributions on the script
    # Find more information at http://forums.servethehome.com/processors-motherboards/2519-introducing-sthbench-sh-benchmark-script.html
    
    
    wget http://forums.servethehome.com/pjk/NPB3.3.1.tar.gz
    
    
    tar -zxvf NPB3.3.1.tar.gz
    cd NPB3.3.1/NPB3.3-OMP/
    
    
    # Use the provided makefile definitions
    cp config/NAS.samples/make.def.gcc_x86 config/make.def
    
    
    # Define which tests to build
    echo "ft A" >> config/suite.def
    #echo "mg A" >> config/suite.def
    #echo "sp A" >> config/suite.def
    #echo "lu A" >> config/suite.def
    echo "bt A" >> config/suite.def
    #echo "is A" >> config/suite.def
    #echo "ep A" >> config/suite.def
    #echo "cg A" >> config/suite.def
    #echo "ua A" >> config/suite.def
    #echo "dc A" >> config/suite.def
    
    
    make suite
    
    
    # Determine number of physical cores (not hyperthread) and set OMP to cores value
    procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
    pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
    cores=$((procs*pcores))
    
    
    export OMP_NUM_THREADS=$cores
    
    
    bin/bt.A.x
    bin/ft.A.x
    
    Will try to run this tonight.

    For anyone that wants to just wget and run this script:
    Code:
    wget http://forums.servethehome.com/pjk/npb.sh
    Also it looks like you will need to install gfortran (sudo apt-get install gfortran)
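
    A minimal sketch of a pre-flight check that reports missing tools before the build starts. The function name missing_cmds is made up, and the list of commands is an assumption based on what the script calls plus gfortran.

```shell
#!/bin/bash
# Prints each named command that is not found on PATH, one per line.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
    return 0
}

# Hypothetical check before "make suite" (command list is an assumption):
missing=$(missing_cmds wget tar make gcc gfortran)
if [ -n "$missing" ]; then
    echo "Install these first:" $missing >&2
fi
```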
     
    #95
  16. Patrick

    Patrick Administrator
    Staff Member

    Runs fairly quickly. Should it be doing a Class A on an 8 core Xeon E5?
     
    #96
  17. Chuckleb

    Chuckleb Moderator

    I picked class A because it was a small run size so it could run relatively short on Atoms and older generation systems that have limited memory or slow CPUs. It appears A->C is 4x size per step, then D->F is 16x size per step. S and W were too small (way too quick). I don't mind either way, let the community decide. I should run a big test to see how long it runs.
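
    If the class is left open, one way to make it switchable is to generate the suite.def lines from a variable. This is only a sketch: suite_lines and NPB_CLASS are made-up names, and it relies on the same "benchmark class" line format and bin/<name>.<class>.x naming that the script already uses.

```shell
#!/bin/bash
# Prints suite.def lines ("<benchmark> <class>") for one class and any
# number of benchmark names.
suite_lines() {
    local class=$1 bench
    shift
    for bench in "$@"; do
        echo "$bench $class"
    done
}

# Hypothetical usage: NPB_CLASS is an illustrative variable, defaulting to A.
#   suite_lines "${NPB_CLASS:-A}" ft bt >> config/suite.def
#   make suite
#   bin/ft."${NPB_CLASS:-A}".x
#   bin/bt."${NPB_CLASS:-A}".x
```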
     
    #97
  18. Chuckleb

    Chuckleb Moderator

    I find STREAM to be a very fussy test. It's really dependent on memory layout on multiprocessor/multicore systems, and on some systems you get higher numbers when you don't throw all the cores at the memory bus. This especially comes into play when you have any sort of hyperthreading turned on. I've modified your example to check for cores and to bind to physical cores, not virtual cores. This consistently gives me the same results on every run on my E5.

    E5-2665 (GOMP_CPU_AFFINITY=0-15 and OMP_NUM_THREADS=16)
    Copy: 25777
    Scale: 24840
    Add: 27739
    Triad: 28496

    E5-2665 (unbound, defaults to 32 threads)
    Copy: 8471
    Scale: 8120
    Add: 11683
    Triad: 12378

    E5-2665 (OMP_NUM_THREADS=16)
    Copy: 19890
    Scale: 18397
    Add: 21637
    Triad: 21883





    Code:
    #!/bin/bash
    wget -N http://www.cs.virginia.edu/stream/FTP/Code/stream.c
    gcc stream.c -O3 -march=native -fopenmp -o stream-me
    
    
    # Determine number of physical cores (not hyperthread) and set OMP to cores value
    procs=$(grep "physical id" /proc/cpuinfo | sort -u | wc -l)
    pcores=$(grep "cpu cores" /proc/cpuinfo |sort -u |cut -d":" -f2)
    cores=$((procs*pcores))
    
    
    
    
    export OMP_NUM_THREADS=$cores
    export GOMP_CPU_AFFINITY=0-$((cores-1))
    echo $GOMP_CPU_AFFINITY
    
    
    ./stream-me
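
    The 0-$((cores-1)) range assumes the first N logical CPUs are N different physical cores, which may not hold on every system. A hedged sketch that builds the list from lscpu -p=CPU,CORE,SOCKET output instead, picking the first logical CPU of each core; the function name first_thread_per_core is made up for illustration.

```shell
#!/bin/bash
# Reads "lscpu -p=CPU,CORE,SOCKET" output on stdin and prints the first
# logical CPU of every physical core as a space-separated list.
first_thread_per_core() {
    awk -F, '
        /^#/ { next }
        {
            key = $2 "," $3
            if (!(key in seen)) {
                seen[key] = 1
                list = (list == "" ? $1 : list " " $1)
            }
        }
        END { print list }
    '
}

# Hypothetical usage before running STREAM:
#   cpus=$(lscpu -p=CPU,CORE,SOCKET | first_thread_per_core)
#   export GOMP_CPU_AFFINITY="$cpus"
#   set -- $cpus; export OMP_NUM_THREADS=$#
```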
    
     
    #98
  19. Jeggs101

    Jeggs101 Well-Known Member

    STREAM produces strange results. Patrick, even the PTS one you use only does 12-14GBps on the C2000 and E3v3. Should be like 2x that.
     
    #99
  20. Chuckleb

    Chuckleb Moderator

    What kind of numbers are you getting, and what is your configuration? E.g. DDR3-1600 dual channel per chip, etc.
     
    #100
