IvyBridge Xeon E5-2697v2 experiences

Discussion in 'Processors and Motherboards' started by Andreas, Sep 26, 2013.

  1. Andreas

    Andreas New Member

    Hadn't been here for a while.

    A week ago I received my two E5-2697v2 CPUs (12c, 24t, 2.7 GHz, 130 watt).
    Thought there might be some interest in sharing my experience with these.
    When comparing to SB Xeon, I am referring to the E5-2687W I got a year ago.

    Packaging:
    The 12-core IB Xeons are a bit larger than the E5-2687W chip. No issues in the Asus m7B; the socket is spacious enough to cope with this. I am not sure whether the narrow LGA-2011 sockets might have issues with these, so I recommend checking if you are going this route.

    Power:
    At idle, the IB Xeon consumes more power than the SB Xeon (about 20 watts more, probably due to the larger core count).
    Under higher load (Stanford folding), the newer chip consumes about 40 watts less than my SB Xeons.

    Performance:
    On average, 30-35% better than my SB Xeons. Of course, the ratio varies with the instruction mix.

    Best case I had observed was 70%. There are most likely 2 possible root causes for this large deviation:
    1) heavy dependencies on single instructions which had been significantly improved
    2) Superlinear acceleration through the larger L3-cache, moving a bigger chunk of the working dataset up the memory hierarchy.
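As a rough sanity check on the 30-35% average, the sketch below computes the gain you would expect from core count times base clock alone. It assumes the E5-2687W is an 8-core part at 3.1 GHz base (a spec not stated in this thread) and ignores turbo, IPC changes and cache effects; `naive_speedup` is a hypothetical helper name, not part of any benchmark suite.

```shell
#!/bin/sh
# naive_speedup CORES_NEW GHZ_NEW CORES_OLD GHZ_OLD
# Prints the percentage gain expected from (cores x base clock) only.
naive_speedup() {
    awk -v cn="$1" -v fn="$2" -v co="$3" -v fo="$4" \
        'BEGIN { printf "%.1f\n", ((cn * fn) / (co * fo) - 1) * 100 }'
}

# E5-2697v2 (12 cores, 2.7 GHz) vs. E5-2687W (assumed 8 cores, 3.1 GHz)
naive_speedup 12 2.7 8 3.1
```

Under these assumptions the result is about 30.6%, which is in line with the observed average; a 70% best case therefore needs an additional explanation such as the two listed above.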

    Compatibility:
    works well with Asus Z9PE-D16 (BIOS 5105)
    does not start with (my) Asus Z9PE-D8 (BIOS 5103), although the CPU is listed as compatible with this BIOS version (support call is open)

    cheers,
    Andy
  2. Patrick

    Patrick Administrator

    Andy - great information! Any interest in running Ubuntu benchmarks? :) Would love to get those for the test set to compare with the other processors we have looked at.
  3. Andreas

    Andreas New Member

    Patrick,
    if they are ready to run, I'm happy to do it. If I need to become a specialist first, my time budget might be a barrier.

    Let me know,
    Andy
  4. Patrick

    Patrick Administrator

    Andy - fairly easy:
    1. Install Ubuntu 64-bit server with OpenSSH
    2. Here are the commands to use:
    Update/ Install required packages

    sudo apt-get update && sudo apt-get upgrade -y && sudo apt-get install build-essential libx11-dev libglu-dev hardinfo crafty phoronix-test-suite -y

    hardinfo
    hardinfo | less | tee hardinfo.txt

    Ubuntu UnixBench 5.1.3

    wget https://byte-unixbench.googlecode.com/files/UnixBench5.1.3.tgz && tar -zxvf UnixBench5.1.3.tgz && cd UnixBench && time make && wget http://forums.servethehome.com/pjk/fix-limitation.patch && patch Run fix-limitation.patch &&./Run dhry2reg whetstone-double syscall pipe context1 spawn execl shell1 shell8 shell16 && cd ..

    c-ray 1.1
    wget http://www.futuretech.blinkenlights.nl/depot/c-ray-1.1.tar.gz && tar -zxvf c-ray-1.1.tar.gz && cd c-ray-1.1 && make && cat scene | ./c-ray-mt -t 32 -s 7500x3500 2>&1 > foo.ppm | tee c-ray1.txt && cat sphfract | ./c-ray-mt -t 32 -s 1920x1200 -r 8 2>&1 > foo.ppm | tee c-ray2.txt && cd ..

    crafty
    crafty bench

    (you need to press Ctrl-X or Ctrl-C to get out)
    PTS Tests
    phoronix-test-suite benchmark pts/stream pts/compress-7zip pts/openssl pts/pybench

    Menu key sequence: y n n 5 n

    Then it is pretty much just capturing the data points

    I do need to get this fully scripted, but the above is basically a few commands to copy/ paste in putty

    Argh, the forums are parsing the links. If you want to e-mail me, I can send you the commands and a spreadsheet template to copy/paste numbers into. Basically it takes the copy/paste of 6 commands, so you do not need to be an expert.
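Until the scripted version exists, capturing the data points can be made less manual with a small wrapper. This is only a sketch under assumptions: `run_step` and the `results` directory are hypothetical names, not part of the script mentioned in this thread, and each benchmark command is simply passed through unchanged.

```shell
#!/bin/sh
# Directory for the captured logs (hypothetical default; override via RESULTS_DIR).
RESULTS_DIR="${RESULTS_DIR:-results}"

# run_step NAME CMD [ARGS...]
# Runs one command and saves its stdout and stderr to $RESULTS_DIR/NAME.txt,
# while still showing the output on the terminal.
run_step() {
    name="$1"
    shift
    mkdir -p "$RESULTS_DIR"
    echo "== $name =="
    "$@" 2>&1 | tee "$RESULTS_DIR/$name.txt"
}

# Harmless demonstration; a real run would pass a benchmark command instead,
# e.g.: run_step crafty crafty bench
run_step demo echo hello
```

Each step then leaves one text file behind, which can be pasted into a spreadsheet afterwards.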
  5. Patrick

    Patrick Administrator

    BTW Andy - someone is scripting this over the weekend for us. By early next week it should be as easy as running a wget for the script and then executing the script.
  6. Andreas

    Andreas New Member

    ok, thanks Patrick.
    I'll wait then for the script to simplify my life :)

    Here is some folding performance info (& vs. E5-2687W)
    IvyBridge Xeons: Folding performance - [H]ard|Forum

    Astonishing performance reaching into 4P territory, best performance/watt of all systems.

    rgds,
    Andy
  7. Patrick

    Patrick Administrator

    Script delivered. Installing Ubuntu in a VM to give it a try.
  8. Patrick

    Patrick Administrator

  9. phoenix

    phoenix New Member

  10. Patrick

    Patrick Administrator

    Do you have a set of specific commands to download and run lmbench the way you want? If so, let me know and I can add them to the benchmarking script.
  11. phoenix

    phoenix New Member

    Below is a script (I don't know how to attach files) that will write the results of lat_mem_rd to ./lmbench-3.0-a9/lat_mem_rd_core.txt
    It works on my i7-4800MQ laptop with Fedora 19 and gcc 4.8.1, but I'm not sure what the core numbers will be on the 2697 system (especially with HT on). If you are interested, you could see the efficiency of the memory controller with NUMA on and off from various cores, as in http://software.intel.com/en-us/forums/topic/333444 (e.g. numactl --cpunodebind=0 --membind=1 ), which also has comparisons for Westmere and Sandy Bridge-EP. The structure of the triple ring bus and 2 memory controllers might make for more weirdness than we've seen on past Intel CPUs.

    #!/bin/bash

    # Download lmbench; the URL is quoted so that '&' is not treated as a shell operator
    wget -O lmbench-3.0-a9.tgz 'http://downloads.sourceforge.net/pr....0-a9/&ts=1380742845&use_mirror=softlayer-dal'
    sleep 10
    tar -xvf lmbench-3.0-a9.tgz
    cd lmbench-3.0-a9
    cd src
    make lmbench
    sleep 10
    cd ..
    # Pin lat_mem_rd to each listed core in turn; it reports on stderr, hence 2>>
    for core in 0 3 4 7 8 11
    do
        echo "core $core:" >> lat_mem_rd_core.txt
        taskset -c "$core" bin/x86_64-linux-gnu/lat_mem_rd -t 1024 2>> lat_mem_rd_core.txt
    done
    Last edited: Oct 2, 2013
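The NUMA comparison mentioned above (local versus remote memory) could be laid out as a matrix of numactl invocations. The sketch below only prints the commands rather than running them, so it works without numactl or lmbench installed; `numa_matrix` is a hypothetical helper, and the node count of 2 is an assumption for a dual-socket board.

```shell
#!/bin/sh
# numa_matrix NODES BINARY
# Prints one numactl command line per (CPU node, memory node) pair.
numa_matrix() {
    nodes="$1"
    bin="$2"
    cpu=0
    while [ "$cpu" -lt "$nodes" ]; do
        mem=0
        while [ "$mem" -lt "$nodes" ]; do
            echo "numactl --cpunodebind=$cpu --membind=$mem $bin -t 1024"
            mem=$((mem + 1))
        done
        cpu=$((cpu + 1))
    done
}

# Assumed 2 NUMA nodes (one per socket); binary path as used in the script above.
numa_matrix 2 bin/x86_64-linux-gnu/lat_mem_rd
```

Pairs where the two node numbers match measure local memory; the others measure access across the socket interconnect.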
  12. Andreas

    Andreas New Member

    Patrick,
    thanks for packaging it up.
    I was travelling for the last few days (US, Portugal, Switzerland, Denmark) but am now back.
    I'll report as soon as I have something to report - Please gimme some time, the inbox is quite full :)
    Andy
  13. Andreas

    Andreas New Member

    Patrick,
    I started the script on 2 machines:

    1) Dual E5-2697v2
    2) Quad E5-4650

    How long will this run approx? minutes or hours?

    BTW, one quick observation.
    I have a power meter connected to the Quad Xeon machine:
    The computational density of the benchmark is very low (at least the part I watched). The board has an idle consumption of 135 watt (at the wall).
    With the multi-CPU version benchmarks (dhrystone, whetstone) the board consumes 150-155 watt
    Pipe throughput = 160-170 watt

    To compare: With FAH cores, the board is usually in the 610 watt region. With Linpack = 650 watt
    Last edited: Oct 2, 2013
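To put the wall-power readings above in proportion, one option is to express each as a share of the range between idle (135 watt) and Linpack (650 watt). This is a minimal sketch using only the figures quoted in the post; `load_fraction` is a hypothetical helper, and treating Linpack as the upper bound is an assumption.

```shell
#!/bin/sh
# load_fraction IDLE_W PEAK_W MEASURED_W
# Prints how far MEASURED_W sits between idle and peak, as a percentage.
load_fraction() {
    awk -v idle="$1" -v peak="$2" -v w="$3" \
        'BEGIN { printf "%.1f\n", (w - idle) / (peak - idle) * 100 }'
}

# Upper readings for dhrystone/whetstone (155), pipe throughput (170), FAH (610)
for watts in 155 170 610; do
    echo "$watts W: $(load_fraction 135 650 "$watts")% of the idle-to-Linpack range"
done
```

On these numbers the early UnixBench stages use only a few percent of the available power range, while folding sits above 90%.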
  14. Patrick

    Patrick Administrator

    Disk tests are turned off which helps a lot.

    My guess is under an hour on that hardware. On a Raspberry Pi it does take many hours though. There are some variables such as the amount of time it takes to download packages which can add a bit of time.

    phoenix - will give that one a try. The core thing is a bit of a quandary since this is meant for 1 - 64+ core systems.
  15. Andreas

    Andreas New Member

    Good, because I disassembled the 4p rig last week. Only the boot disk remained.
    That's what it looked like:
    [​IMG]

    [​IMG]
    Last edited: Oct 2, 2013
  16. Andreas

    Andreas New Member

    as it churns along
    16x concurrent shell scripts = 300 watt, 87% idle
    64x dhrystone 2 using register = 512 watt, 100% load
    64x DP whetstone = 490 watt, 100% load
    64x system call = 470 watt
    64x pipe throughput = 570 watt, 93% sys
    64x pipe based context switches = 580 watt, 93% sys
    64x process creation = 400 watt, 42% sys, 57% idle
    64x Execl throughput = 340 watt, 22% sys, 76% idle
    64x shell scripts (1 concurrent) = 390 watt, 6% user, 29% sys, 65% idle
    64x shell scripts (8 concurrent) = 390 watt, 7% user, 30% sys, 63% idle
    64x shell scripts (16 concurrent) = 390 watt, 7% user, 31% sys, 62% idle
    c-ray = 13183 milliseconds
    (unable to open book file, book is disabled)
    Last edited: Oct 2, 2013
  17. Patrick

    Patrick Administrator

    Not too bad at all! Love the pics of the old 4P. UnixBench is by FAR the longest component of the suite.
  18. Andreas

    Andreas New Member

    Patrick, sent you a PM
    Last edited: Oct 2, 2013
  19. phoenix

    phoenix New Member

    The core ID argument to taskset could depend on the cores detected by the system, for instance via /sys/devices/system/cpu/online.
    If it would be easiest to just run it over all detected cores, the '-t' argument to lat_mem_rd could be lowered to 256 to reduce the runtime.

    Thanks for including it
    Last edited: Oct 2, 2013
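The suggestion above could be sketched as follows. The kernel reports online CPUs as a list of ranges such as `0-3,8-11`, so the list has to be expanded into individual core IDs before it can drive the loop. `expand_cpu_list` is a hypothetical helper, and the loop only prints the taskset commands instead of running them.

```shell
#!/bin/sh
# expand_cpu_list LIST
# Expands a kernel CPU list such as "0-3,8-11" into one core ID per line.
expand_cpu_list() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # A single ID like "5" has no range end, so fall back to the start.
        seq "$first" "${last:-$first}"
    done
}

ONLINE_FILE=/sys/devices/system/cpu/online
if [ -r "$ONLINE_FILE" ]; then
    for core in $(expand_cpu_list "$(cat "$ONLINE_FILE")"); do
        # Dry run: print the command that would be executed for this core.
        echo "taskset -c $core bin/x86_64-linux-gnu/lat_mem_rd -t 256"
    done
fi
```

With this in place the fixed core list in the lmbench script would no longer need to be edited per machine, at the cost of a longer run on systems with many cores.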
  20. Patrick

    Patrick Administrator

    Wow! 14s for the "complex" c-ray render on the 2P. Not bad at all!

    phoenix - let me see what I can do.
