
Quanta LB6M 10GBE Switch and VSAN Setup

Discussion in 'VMware, VirtualBox, Citrix' started by alex1002, Mar 6, 2017.

  1. alex1002

    alex1002 Member

    Joined:
    Apr 9, 2013
    Messages:
    470
    Likes Received:
    17
    Good Day,
    I need some help with this setup and VSAN. I am testing a 4-node VSAN cluster. I have one dedicated switch, an LB6M, and each server has its own 10GbE port for VSAN.

    For some reason, every 2-3 minutes one of the nodes becomes part of network partition 2 and a new group 2 is created.

    Under network partition there's group 1 with all four hosts in it. Every 2-3 minutes one host moves itself to network partition 2, and I see that a new group 2 is created.

    Can anyone please help me with this switch and VSAN?

    Thank you
     
    #1
  2. alex1002

    alex1002 Member

    No advice whatsoever? The VSAN stays fine for a few moments, or even hours, then when I try to add VMs to the VSAN datastore it times out and I see the network partition group has been moved to another group, group 2.
     
    #2
  3. T_Minus

    T_Minus Moderator

    Joined:
    Feb 15, 2015
    Messages:
    5,720
    Likes Received:
    1,093
    #3
  4. Emulsifide

    Emulsifide Active Member

    Joined:
    Dec 1, 2014
    Messages:
    193
    Likes Received:
    80
    I'm not familiar with the LB6M, but I have used an LB4M, so I'm assuming the configuration environment is the same (except for the fact that the interconnects themselves are significantly different).

    I'm also not familiar with network partitioning. What is the purpose of it?

    VSAN works great on a flat network with all VMware functionality (VSAN, vMotion, Management, VM Data) separated into VLANs. Use a vDs switch across the nodes with network resource pools and NIOC to shape your traffic per the VSAN guidelines white paper:

    https://www.vmware.com/content/dam/...n/virtual-san-6.2-design-and-sizing-guide.pdf
     
    #4
  5. alex1002

    alex1002 Member

    I think my switch config is messed up. VMware requires multicast for the VSAN traffic/switch. I'm not sure what to use on the LB6M to do this, but looking at the manual, it is the same as the LB4M.


    Multicast is a network requirement for Virtual SAN. Multicast is used to
    discover ESXi hosts participating in the cluster as well as to keep track
    of changes within the cluster. It is mandatory to ensure that multicast
    traffic is allowed between all the nodes participating in a Virtual SAN
    cluster.
    Multicast performance is also important, so one should ensure a high
    quality enterprise switch is used. If a lower-end switch is used for Virtual
    SAN, it should be explicitly tested for multicast performance, as unicast
    performance is not an indicator of multicast performance. Multicast
    performance can be tested by the Virtual SAN Health Service. While
    IPv6 is supported verify multicast performance as older networking
    gear may struggle with IPv6 multicast performance.
     
    #5
  6. Emulsifide

    Emulsifide Active Member

    If you suspect multicast is not working, do the multicast performance test under Monitor, Virtual SAN, Proactive tests. Your VSAN health check should also be screaming at you about multicast with some major alarms.
     
    #6
  7. alex1002

    alex1002 Member

  8. Emulsifide

    Emulsifide Active Member

    That just shows the alarm and how it's triggered. Let's see a screenshot of your VSAN health status overall. Expand anything that has a major alarm, like this (which is available when you select your VSAN cluster on the left-hand side of the vCenter client):

    upload_2017-3-8_10-18-54.png
     
    #8
  9. alex1002

    alex1002 Member

    Sorry. Here you go with the network errors I have seen: Failed.JPG
     
    #9
  10. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    1,254
    Likes Received:
    147
    It's in your screenshot, way down within the proactive tests ;)

    And is it always the same host or different ones?
    Have you checked whether the multicast config is the same on all nodes?
    Have you run large pings on your VSAN vmk? There were some MTU errors in your screenshot too.
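
For reference, a "large ping" here means an unfragmented ping sized to the MTU under test. The Python sketch below only does the arithmetic and builds the command line. It assumes plain IPv4 (20-byte header, no options) plus the 8-byte ICMP header, and the vmk1 interface name and 192.168.2.12 address are placeholders, not values from this thread.

```python
# Size a don't-fragment ping so that it exactly fills one IPv4 packet at a given MTU.
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption: no IP options in use)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Return the largest ICMP payload that fits in a single IPv4 packet of size mtu."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an IPv4 ICMP echo")
    return payload


def vmkping_command(vmk: str, target_ip: str, mtu: int) -> str:
    """Build a vmkping command line with don't-fragment (-d) and a payload sized to mtu."""
    return f"vmkping -I {vmk} -d -s {max_ping_payload(mtu)} {target_ip}"


if __name__ == "__main__":
    # Placeholder interface and address; substitute the real vSAN vmk and a peer's vSAN IP.
    for mtu in (1500, 9000):
        print(vmkping_command("vmk1", "192.168.2.12", mtu))
```

If the 8972-byte ping fails while the 1472-byte one succeeds, something along the path is not passing jumbo frames.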
     
    #10
  11. Emulsifide

    Emulsifide Active Member

    Thanks for the screenshot. I understand now!! By network partition, you mean the VSAN cluster itself has split into two separate partitions because the error timeout has elapsed. This means the VSAN is treating one or more nodes as failed and is now ignoring them completely. In your left-hand view in vCenter, do you have one or more nodes that are failing to communicate with another node?

    Are you using jumbo frames? The MTU issue that @Rand__ has brought up is definitely of concern. What does your virtual network infrastructure look like? Are you using vSwitches or vDs?
     
    #11
  12. alex1002

    alex1002 Member

    Hi guys,
    This is the issue: all of a sudden the network health is back to normal. Then after 5-10 minutes, or even 4 hours, it goes back to failed.
    Network.JPG
     
    #12
  13. alex1002

    alex1002 Member

    I have a dedicated network on each host with its own controller for VSAN, and a static IP on each. MTU is 9000 on the ports and also on the switch ports.
     
    #13
  14. Emulsifide

    Emulsifide Active Member

    Strange. Try backing everything down to 1500 MTU and see if the problem stops.
     
    #14
  15. Rand__

    Rand__ Well-Known Member

    I had similar behavior when I was using the witness appliance a while back, but you said you are on 4 physical hosts...
    Do they all have the same patch level?

    There's still the question of whether it's always the same host or if they are moving/random.
     
    #15
  16. DaSaint

    DaSaint Active Member

    Joined:
    Oct 3, 2015
    Messages:
    184
    Likes Received:
    36
    Yeah, this sounds like a jumbo frames type of issue... Are you on 1500 MTU? Ensure that all VMkernel adapters used for vSAN have 1500 MTU, because a mismatch could cause what you are seeing... An easy way to check is from the command line.

    Start with these to see how they are configured...

    Check the adapters and what MTU they are reading:
    esxcli network nic list
    or
    esxcfg-nics -l

    Check the VMkernel adapters on all hosts and see what MTU they are reading:
    esxcli network ip interface list
    or
    esxcfg-vmknic -l

    The next thing I would check is that multicast isn't getting hit, as vSAN uses multicast to communicate with the other nodes as well.
    Do this on all hosts... Typically, if there's a multicast type of issue, it's in the physical switch.

    Check your vSAN multicast settings:
    esxcli vsan network list

    Also check whether your switch has IGMP Querier turned on for the VLAN/ports you are using. See for example
    Quanta LB6M (10GbE) -- Discussion
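
One way to spot a mismatch once the MTU values from the commands above have been written down per host. This is a rough Python sketch, not a VMware tool: the host names and MTU values are made up for illustration, and the values are assumed to be copied by hand from the esxcfg-vmknic -l output on each host.

```python
from collections import Counter
from typing import Dict


def find_mtu_mismatches(vmk_mtus: Dict[str, int]) -> Dict[str, int]:
    """Return the hosts whose vSAN VMkernel MTU differs from the most common value."""
    if not vmk_mtus:
        return {}
    # The most common MTU is treated as the intended one; everything else is flagged.
    majority_mtu, _count = Counter(vmk_mtus.values()).most_common(1)[0]
    return {host: mtu for host, mtu in vmk_mtus.items() if mtu != majority_mtu}


if __name__ == "__main__":
    # Hypothetical values, one entry per host's vSAN VMkernel adapter.
    observed = {"esx1": 9000, "esx2": 9000, "esx3": 1500, "esx4": 9000}
    for host, mtu in find_mtu_mismatches(observed).items():
        print(f"{host}: MTU {mtu} differs from the other hosts")
```

The same comparison applies to the physical NIC and vSwitch MTUs, since the smallest value along the path is the one that matters.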
     
    #16
  17. alex1002

    alex1002 Member

    (FASTPATH Routing) #show igmpsnooping

    Admin Mode..................................... Disable
    Multicast Control Frame Count.................. 0
    IGMP Router-Alert check........................ Disabled
    Interfaces Enabled for IGMP Snooping........... None
    VLANs enabled for IGMP snooping................ None


    I can't find a way to enable it. Please advise.
    I tried: ip igmpsnooping interfacemode
     
    #17
  18. alex1002

    alex1002 Member

    Here's my switch config:

    !System Description "Quanta LB6M, 1.2.0.14, Linux 2.6.21.7"
    !System Software Version "1.2.0.14"
    !System Up Time "7 days 6 hrs 41 mins 55 secs"
    !Additional Packages FASTPATH QOS
    !Current SNTP Synchronized Time: Not Synchronized
    !
    network protocol none
    vlan database
    vlan 2
    vlan routing 2 1
    exit
    configure
    ip routing
    aaa authentication enable "enableList" enable
    line console
    exit
    line telnet
    exit
    line ssh
    exit
    spanning-tree configuration name "60-EB-69-BA-BF-72"
    !
    set igmp
    interface 0/1
    mtu 9216
    vlan pvid 2
    vlan participation include 2
    vlan tagging 2
    exit
    interface 0/2
    mtu 9216
    vlan pvid 2
    vlan participation include 2
    vlan tagging 2
    exit
    interface 0/3
    mtu 9216
    vlan pvid 2
    vlan participation include 2
    vlan tagging 2
    exit
    interface 0/4
    mtu 9216
    vlan pvid 2
    vlan participation include 2
    vlan tagging 2
    exit
    interface 0/5
    mtu 9216
    exit
    interface 0/6
    mtu 9216
    exit
    interface 0/7
    mtu 9216
    exit
    interface 0/8
    mtu 9216
    exit
    interface 0/9
    mtu 9216
    exit
    interface 0/10
    mtu 9216
    exit
    interface 0/11
    mtu 9216
    exit
    interface 0/12
    mtu 9216
    exit
    interface 0/13
    mtu 9216
    exit
    interface 0/14
    mtu 9216
    exit
    interface 0/15
    mtu 9216
    exit
    interface 0/16
    mtu 9216
    exit
    interface 0/17
    mtu 9216
    exit
    interface 0/18
    mtu 9216
    exit
    interface 0/19
    mtu 9216
    exit
    interface 0/20
    mtu 9216
    exit
    interface 0/21
    mtu 9216
    exit
    interface 0/22
    mtu 9216
    exit
    interface 0/23
    mtu 9216
    exit
    interface 0/24
    mtu 9216
    exit
    interface 0/25
    mtu 9216
    exit
    interface 0/26
    mtu 9216
    exit
    interface 0/27
    mtu 9216
    exit
    interface 0/28
    mtu 9216
    exit
    interface 2/1
    routing
    exit
    router rip
    exit
    router ospf
    exit
    exit
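
The dump is long, so here is a rough Python sketch that groups ports by identical settings, which makes it easier to see which ports differ from the rest. It assumes the flat "interface ... exit" layout shown in this post and skips comment lines and pager prompts. It is not a full FASTPATH config parser, and the function names are made up for illustration.

```python
from typing import Dict, List, Tuple


def parse_interfaces(config_text: str) -> Dict[str, List[str]]:
    """Map each 'interface X' block to the list of commands found inside it."""
    interfaces: Dict[str, List[str]] = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines, '!' comment lines, and '--More--' pager prompts.
        if not line or line.startswith("!") or line.startswith("--More--"):
            continue
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            interfaces[current] = []
        elif line == "exit":
            current = None  # leaves the current block, if any
        elif current is not None:
            interfaces[current].append(line)
    return interfaces


def group_by_settings(interfaces: Dict[str, List[str]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group interface names that share exactly the same command list."""
    groups: Dict[Tuple[str, ...], List[str]] = {}
    for name, commands in interfaces.items():
        groups.setdefault(tuple(commands), []).append(name)
    return groups


if __name__ == "__main__":
    # A shortened excerpt in the same layout as the config above.
    sample = "interface 0/1\nmtu 9216\nvlan pvid 2\nexit\ninterface 0/5\nmtu 9216\nexit\n"
    for settings, ports in group_by_settings(parse_interfaces(sample)).items():
        print(", ".join(ports), "->", "; ".join(settings))
```

Run against the full dump, this should show ports 0/1 to 0/4 in one group with the VLAN 2 settings and the remaining ports in groups of their own.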
     
    #18
  19. alex1002

    alex1002 Member

    Age Time (seconds)............................. 1200
    Response Time (seconds)........................ 1
    Retries........................................ 4
    Cache Size..................................... 6144
    Dynamic Renew Mode ............................ Disable
    Total Entry Count Current / Peak .............. 0 / 0
    Static Entry Count Configured / Active / Max .. 0 / 0 / 128

    IP Address      MAC Address       Interface      Type     Age
    --------------- ----------------- -------------- -------- -----------
     
    #19
  20. alex1002

    alex1002 Member

    Ouch :( I'm a failure. I wish I could figure this one out. Looks like the switch cannot ping any of the hosts via their VSAN IPs.


     
    #20
