So, I just secured a NIB A4000 Ada SFF...


splifingate

Member
Oct 7, 2023
99
58
18
I finally have a NIB PNY A4000 Ada SFF sitting on my desk!

Reading the stories, here, this seems like dipping 1/32 of my pinky toe in the pool, but . . .

My existing H/W is a functioning Optiplex 3070 SFF (with a PCIe extender). Current OS is Win11.

I also have an Apevia PFC500W, and the connectors necessary to make the conversion.

Additionally, I have a NIB HP HSTNS-PR17 arriving next week (that project will be fun!).

I've been entertaining going the n3rdware-cooler route.

(ultimately, I'm thinking I'll go with AM5 and Void/Debian/etc.)

The hardware details are trivial compared to deciding just how I want to approach my first attempts at finding the best software setup for local inference.

Thoughts and impressions?
 

MiniKnight

Well-Known Member
Mar 30, 2012
3,078
979
113
NYC
Sweet card. I'd try everything before trying the n3rdware route since the stock coolers are good.

I think you've got a good setup that's worth trying now and then tweaking as necessary. I'm planning on getting new AI boxes every 12 to 18 months now, since hardware gets much better.
 

splifingate

Member
MiniKnight said:
Sweet card. I'd try everything before trying the n3rdware route since the stock coolers are good.

I think you've got a good setup that's worth trying now and then tweaking as necessary. I'm planning on getting new AI boxes every 12 to 18 months now since hardware gets much better
Thank you for your thoughts.

I was originally inspired by Allan Witt's "AI Assistant with Willow and LLM" and "RTX 4000 SFF Ada for LLM", and I had this little Dell just sitting here...

Of course, the x16 slot is smack-up-against the PSU (hence my thinking about the n3rdware slimline-HS). I have found PSU options that take that concern off the table in the near term. I was also entertaining getting a MinisForum MS-01/-A1, and that route requires converting the A4000 from 2-slot to 1-slot.

Hardware is not really an issue: the performance of everything is just so incredibly fantastic these days!

My current pause is that I'm not that familiar with (or particularly keen on) using Windows, and most of his workflows involve it. They also involve Docker, and I'm really not familiar with that either.

I have a good working understanding of Debian variants, but I'm really trying to avoid the whole systemd thing (if I can).

All that being said, what software works for you?
 

splifingate

Member
Getting up-to-speed (slowly), and I found a vid by Wendell:


Looks like the simplest route for me is to roll with Ubuntu/Debian for the time being.

I guess it's about time I explored Containers....
 

splifingate

Member
Seems I got another non-functional unit off Teh Bay . . . "twice-bit, thrice-shy", and all that.

I will seek-out other avenues....
 

RimBlock

Active Member
Sep 18, 2011
869
31
28
Singapore
I got an A4500 from the same online market and it has been pretty good (it's essentially a 3080 with 20GB VRAM at only around 240W). Mine was a Dell server pull, and the pic was of the actual card rather than a stock/generic photo. Seller seemed to be a small video studio. Price was around $1k, but prices seem to have risen since then.

I run an AI stack on vSphere 7 (ESXi, w/ PCIe passthrough) -> RHEL 9 VM (free with the RH Developer programme) -> Docker containers via docker compose (NVIDIA version with GPU container passthrough where required).

My current stack consists of the following containers:
  • Ollama (LLM inference AI - chat).
  • Open WebUI (web interface for Ollama).
  • Stable Diffusion via Automatic1111 & ComfyUI (generative AI - pics).
  • n8n (no-code automation).
  • Redis (AI agent memory).
  • Postgres DB (AI agent tool - structured data).
  • Qdrant (AI tool - vector store - unstructured data).
  • Traefik (reverse proxy - needed for n8n).
It works quite well, although getting the AI agent to use the tools it should in order to answer or generate the desired output is tricky. Getting it to do so accurately and without fabrication is a major challenge.
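For anyone new to compose, here is a minimal sketch of how the Ollama + Open WebUI portion of a stack like this can be wired together. Service and volume names are illustrative assumptions, not my exact config, and the GPU section assumes the NVIDIA Container Toolkit is installed on the Docker host:

```yaml
# docker-compose.yml -- illustrative sketch only
services:
  ollama:
    image: ollama/ollama              # official Ollama image
    volumes:
      - ollama:/root/.ollama          # persist pulled models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]     # GPU passthrough via NVIDIA Container Toolkit
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                   # UI reachable at http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # compose service name as hostname
    depends_on:
      - ollama
volumes:
  ollama:
```

`docker compose up -d` brings both services up; you can then pull a model with something like `docker exec -it <ollama-container> ollama pull <model>`. The other containers (n8n, Redis, Postgres, Qdrant, Traefik) slot into the same file as additional services on the shared network.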

Feel free to ask away if you would like some clarity on any of that.
 

splifingate

Member
Thank you.

I purchased the DGX Spark at MC on release (my Reservation email didn't arrive).

It's going to be a hot minute before I can blend all these ideas/strategies together . . .