Hey,
I am lucky to have the privilege of playing with a machine like this, and I would like to load up all the GPUs. My first instinct is to run some kind of rendering in Blender using CUDA.
Maybe the cards just don't want to do this, I don't know, but I have one hell of a headache here.
I have installed Pop!_OS with the NVIDIA drivers. This works, and I can use nvidia-smi to see all 8 GPUs on the GPU board. I am running NVIDIA driver 575.57.08 with CUDA toolkit 12.9.
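Since the goal is a cooling evaluation, the nvidia-smi output can also be logged per GPU. Below is a minimal sketch, assuming nvidia-smi is on the PATH and that the query fields listed in `FIELDS` are supported by this driver; the helper names `parse_gpu_csv` and `query_gpus` are made up for illustration.

```python
import csv
import io
import subprocess

# Properties requested from nvidia-smi via --query-gpu (assumed to be
# supported by the installed driver).
FIELDS = ["index", "name", "temperature.gpu", "power.draw", "utilization.gpu"]


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output from nvidia-smi into dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != len(FIELDS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(FIELDS, (v.strip() for v in rec))))
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU (hypothetical helper)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling `query_gpus()` in a loop with a short sleep would give a rough temperature and power log while the GPUs are under load. All values are kept as strings, so convert them before plotting.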
I installed Blender and tried running it with sudo, where I ran into problems with XDG_RUNTIME_DIR. I put all the GPUs in a group called video, and my user (called power) is also in that group.
The environment variables PATH and LD_LIBRARY_PATH are set to the correct values according to, funnily enough, an AI. I didn't use AI for much and it could be wrong, but I don't think the environment variables are what's stopping me here.
Blender still refuses to see the CUDA devices; I get an error from CudaInit: "Unknown CUDA error value".
Running deviceQuery from the toolkit shows me "system not yet initialized", and I think this is my problem. It might just be that these A100s don't want to do what I want.
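To take Blender and the toolkit samples out of the picture, the driver's init call can be checked directly. This is a minimal sketch, assuming the driver library is installed as `libcuda.so.1`; it calls `cuInit(0)` from the CUDA driver API through ctypes and maps a handful of result codes to names. The table is deliberately partial, and `describe` and `try_cu_init` are hypothetical helper names.

```python
import ctypes

# A few CUDA driver API result codes (CUresult). Not a complete list.
CU_RESULTS = {
    0: "CUDA_SUCCESS",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    802: "CUDA_ERROR_SYSTEM_NOT_READY (system not yet initialized)",
    999: "CUDA_ERROR_UNKNOWN",
}


def describe(code):
    """Return a readable name for a CUresult code, if it is in the table."""
    return CU_RESULTS.get(code, f"unlisted CUresult {code}")


def try_cu_init(lib_name="libcuda.so.1"):
    """Call cuInit(0); return its result code, or None if the library
    cannot be loaded (the library name is an assumption)."""
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError:
        return None
    cuda.cuInit.argtypes = [ctypes.c_uint]
    cuda.cuInit.restype = ctypes.c_int
    return cuda.cuInit(0)
```

Running `describe(try_cu_init())` as the normal user (not sudo) should show whether the failure happens at driver initialization, before Blender is involved at all. Check for `None` first, since that means the library itself was not found.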
The end goal is just to evaluate the cooling needed for a production environment. The OS does not matter here; I just want to load up the GPUs.
Maybe I need to load some kind of AI model onto them. Can I do that in Pop!_OS? Is there a quick and easy way, with limited setup, for me to just load up the GPUs?
This is the first time I have had an SXM board in hand. I have multiple PCIe GPUs, but I have never loaded them up.
I was considering running a Windows VM on each GPU with FurMark, but the GPUs probably don't want to do that either.
The coolest thing would be to have an AI model make a picture for me that I can hang up at the workplace, but I don't know the setup needed or whether Pop!_OS can run any AI model. I have no experience with this.
I have some Linux experience, but it is limited, so I might be a lost cause who needs to be spoon-fed the solution.
Any clues here? I would be very grateful!