Diskless Boot Over Infiniband help

renderfarmer

Member
Feb 22, 2013
249
1
18
New Jersey
Hi all.

I'm interested in setting up diskless boot on my IB network.

If I were using Ethernet the options that I'm aware of for diskless boot are iSCSI and AoE.

Is AoE valid on an IB network?

Are there other options I should explore?
 

dba

Moderator
Feb 20, 2012
1,478
181
63
San Francisco Bay Area, California, USA
iSCSI works for "remote" disk boot. I also remember that Oracle uses PXE over IB on Linux. Google "Infiniband Enabled Diskless PXE" if that's the style of diskless boot you are after.

 

renderfarmer

Thanks, dba. Yeah, I guess what I'm really after is a remote boot. That Oracle how-to is a pretty good first step to a proof of concept.

I have 5 render slaves that only do one thing; render. By the end of the year I'll have 9. They require little disk space beyond what the OS needs to be installed into. I would like them to be diskless and boot remotely from my file server.

The goal is to be able to administer one image and then copy that image as many times as is needed for each slave to boot from.
This is how PXE booting via vBlade using AoE is done.
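For what it's worth, the "one master image, one copy per slave" idea can be sketched in a few lines of Python. This is only a sketch, assuming the master is a single raw image file on the file server. The function name, the `.img` suffix, and the `render01`-style node names are made up for illustration, and nothing here is specific to vBlade, AoE, or iSCSI.

```python
import shutil
from pathlib import Path


def clone_image(master: Path, out_dir: Path, nodes: list[str]) -> list[Path]:
    """Copy one master image to a separate image file per node."""
    out_dir.mkdir(parents=True, exist_ok=True)
    clones = []
    for node in nodes:
        target = out_dir / f"{node}.img"  # hypothetical naming scheme
        shutil.copyfile(master, target)   # full byte-for-byte copy
        clones.append(target)
    return clones


# Example call (paths are hypothetical):
# clone_image(Path("/srv/images/master.img"), Path("/srv/images/nodes"),
#             ["render01", "render02", "render03"])
```

Each resulting file would then be exported to exactly one slave, so the master only has to be maintained once and re-cloned after changes.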

I guess what I'm asking is whether iSCSI is the best solution for IPoIB remote boot or whether there are alternatives that I should consider. Thanks!
 

renderfarmer

Might want to consider using the regular LAN ports for your iSCSI/PXE boot so you aren't affecting latency on your IB network during renders.

Thanks. I will. But adding a second network to my farm would be a last resort. Rendering isn't very dependent on latency. The only reason I'm on a high speed network is because some of the files I work with are pushing 2GB. But they only get transferred to the software once and then they just sit in RAM getting worked on for an hour or so until an image is passed back to the file server. Very low impact.
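To put rough numbers on "very low impact", here is a back-of-the-envelope calculation. It is a sketch under stated assumptions: a 2 GB file, a one-hour render, and ideal line rates with no protocol overhead. The link rates in the table are assumptions (4x SDR and 4x QDR InfiniBand carry 8 and 32 Gb/s of data after 8b/10b encoding), and real IPoIB throughput will be noticeably lower.

```python
def transfer_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Ideal time to move size_bytes over a link of the given data rate."""
    return size_bytes * 8 / rate_bits_per_s


FILE_BYTES = 2e9        # "pushing 2GB"
RENDER_SECONDS = 3600   # "an hour or so"

# Assumed data rates in bits per second (idealised, no protocol overhead).
RATES = {
    "1 GbE": 1e9,
    "IB SDR 4x": 8e9,
    "IB QDR 4x": 32e9,
}

for name, rate in RATES.items():
    t = transfer_seconds(FILE_BYTES, rate)
    share = 100 * t / RENDER_SECONDS
    print(f"{name}: {t:.1f} s per file, {share:.2f}% of a one-hour render")
```

Even at the slowest of these rates the transfer is a fraction of a percent of the render time, which is why boot traffic sharing the same fabric is unlikely to matter for this workload.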

That site and the AoE equivalent by the same author are what inspired me to try to get this to work.

I just can't figure out whether iSCSI and AoE are the only two viable options and how IB factors into deciding which is better.
 

dba

If PXE isn't your cup of tea then iSCSI may be your only good option. My understanding is that AoE does not work over non-Ethernet transports.

 

renderfarmer

I have no problem using PXE whatsoever. So far it's the only thing that I've been able to understand well. Since I'm using Mellanox products, PXE is a necessary part of any remote boot solution: "FlexBoot enables remote boot over Ethernet, Boot over Ethernet (BoE), Boot over InfiniBand (BoIB) or Boot over iSCSI (Bo-iSCSI)".

Is it true that iSCSI requires two separate network adapters? One for the iSCSI boot and another for normal network traffic?
 

nitrobass24

Moderator
Dec 26, 2010
1,083
127
63
TX
No, it does not require separate adapters for iSCSI. It is typical in a 1GbE network to separate storage/iSCSI traffic for performance/security reasons.
In a 10Gb IB network, performance is probably not an issue based on what you mentioned above. I was assuming that, since it's more or less a cluster, the nodes needed low-latency access to each other; this does not sound like that.

I would always choose iSCSI over AoE because iSCSI is baked into Windows by default.
 

dba

I can confirm what nitrobass24 is saying: iSCSI will co-exist with other TCP/IP traffic just fine. In fact, I'm a big fan of "converged" networking - run one (or two) big physical connection(s) (10GigE or IB) to each server and then chop that up into multiple virtual connections - one for management, a few for VM traffic, another for cluster heartbeat, a big one for iSCSI, one more for VM replication, etc.

 

renderfarmer

Thanks guys, that's really helpful. I'm actually switching to QDR or maybe even FDR (if I can afford it) this summer. So it'll be even less of an issue. Render nodes only fetch jobs and referenced assets from a file server (or several file servers if it's a really big facility). The render nodes don't ever interact with each other.

So now that I know I'm doing FlexBoot off iSCSI I can focus my research more on how to do this.

Would either of you know why, in that disklesswindows how-to website, they install the CCBoot client on each VM? It's the only part that I'm not clear on.
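As a starting point for that research: iSCSI boot firmware is usually told which target to attach through the DHCP root-path option, and the string format for it is defined in RFC 4173 as `iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>`. The sketch below just builds that string per node. The server address, the IQN prefix, and the node names are all hypothetical, and whether a given FlexBoot version takes its target from root-path should be checked against its own documentation.

```python
def iscsi_root_path(server: str, target_iqn: str, lun: int = 0,
                    port: int = 3260, protocol: int = 6) -> str:
    """Build an RFC 4173 root-path string.

    Defaults follow the RFC: protocol 6 (TCP), port 3260, LUN 0.
    The LUN field is written in hexadecimal.
    """
    return f"iscsi:{server}:{protocol}:{port}:{lun:x}:{target_iqn}"


def node_iqn(node: str, base: str = "iqn.2013-02.com.example") -> str:
    """Hypothetical naming scheme: one iSCSI target per render node."""
    return f"{base}:{node}"


# Hypothetical file server address and node names.
for node in ["render01", "render02"]:
    print(node, iscsi_root_path("192.168.10.1", node_iqn(node)))
```

With one target per node, each slave gets its own root-path value from DHCP while the server side still only has one master image to maintain.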
 

wuffers

New Member
Dec 24, 2012
19
0
0
Did you get your FlexBoot working? I just started looking into this as I will have a few VM hosts that I want to boot from SAN.

I have the FlexBoot firmware and some instructions on getting a patched isc-dhcp server going to give the HBAs a chance to get an IP. Haven't gotten that far yet. :)
 

renderfarmer

Not yet, but that's next on my list. Since I've never done it before, and from what I can tell the hard part is at the Windows end, I'm going to get it working using Ethernet first.
 

RimBlock

Member
Sep 18, 2011
788
8
18
Singapore
I have been digging into this but without the Ethernet constraints. It seems that it may not be possible currently.

I did, however, come across this guide here, which uses PXE and iPXE to boot a boot a boot :).

The idea is to use PXE to get an iPXE image with InfiniBand support, which can then be used to boot an image over InfiniBand. The setup is pretty similar to the Mellanox FlexBoot (without firmware flashing).
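The two-stage part is easier to see as logic than as prose. In the usual iPXE chainloading setup, the DHCP server hands the plain PXE ROM an iPXE binary, and once iPXE is running it identifies itself with the user class "iPXE" and gets a boot script instead, which is what stops the loop. The Python below only models that decision and is not the guide's actual configuration; the file name and URL are hypothetical.

```python
from typing import Optional

# Hypothetical names: an iPXE build with InfiniBand support, and the
# script that iPXE should fetch once it is running.
IPXE_BINARY = "ipxe-ib.pxe"
IPXE_SCRIPT = "http://192.168.10.1/boot.ipxe"


def boot_filename(user_class: Optional[str]) -> str:
    """Pick what the DHCP server should offer as the boot file.

    Stage 1: the NIC's PXE ROM sends no iPXE user class -> send the iPXE binary.
    Stage 2: iPXE asks again with user class "iPXE"     -> send the boot script.
    """
    if user_class == "iPXE":
        return IPXE_SCRIPT
    return IPXE_BINARY


print(boot_filename(None))    # first request, from the PXE ROM
print(boot_filename("iPXE"))  # second request, from iPXE itself
```

In a real setup this branch lives in the DHCP server configuration rather than in a script, but the two requests and the two answers are the same.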

The downside is that it is fairly complicated with multiple points of failure.

Personally I am now going to look at using a small SSD to boot from and mount various directories over Infiniband on boot.

RB
 

wuffers

Thanks for that link, I hadn't seen it before in my searches.

I'm debating whether to go mirrored SSD for boot or Boot from SAN. Pros/cons for both but since budget is limited I'd rather try to test this out before going out to buy 5 sets of drives. I'll let you know how far I get lol.