Running LLMs locally on your GPU requires a lot of VRAM, which can drive up the cost of a rig sharply these days. Amid the ongoing AI boom, the best value lies in older, often forgotten silicon that's still capable, which is exactly what YouTuber Hardware Haven found. He took an Nvidia V100 server GPU with an SXM interface, which mounts much like a socketed processor, and converted it to a standard PCIe interface that plugs into a consumer motherboard. It ended up performing quite well for its age (and cost), even against modern SKUs.

The contraption begins with an Nvidia Tesla V100 AI GPU that uses the SXM2 socket and is designed for rack-scale deployments. The SXM interface is a mezzanine-based connector that mounts GPUs flat against a specialized baseboard, similar to a CPU socket, and the GPU is then screwed down to the baseboard.…