GPU Passthrough for Home Servers

Quick Recommendation

For Plex/Jellyfin hardware transcoding: Get an Intel N100 mini PC with Quick Sync — it handles 10+ simultaneous transcodes at 8W. No discrete GPU needed.

For AI inference (Ollama, LocalAI): NVIDIA GeForce RTX 3060 12GB (~$200 used) is the best value — 12GB VRAM runs 7B-13B parameter models. Pair with Proxmox GPU passthrough.

For gaming VM + server on one box: NVIDIA RTX 4060 or AMD RX 7600 passed through to a Windows VM on Proxmox, while Linux containers run on the host.

Use Cases for GPUs in Home Servers

| Use Case | GPU Required? | Best Option |
|---|---|---|
| Plex/Jellyfin transcoding | No — Intel Quick Sync is better | Intel iGPU (N100, 12th gen+) |
| AI/LLM inference (Ollama) | Yes — VRAM matters most | NVIDIA RTX 3060 12GB |
| Stable Diffusion / ComfyUI | Yes — VRAM + compute | NVIDIA RTX 3090 24GB |
| Gaming VM (single-player) | Yes — passthrough to VM | NVIDIA RTX 4060 / AMD RX 7600 |
| Security camera AI (Frigate) | No — Intel iGPU or Coral TPU | Google Coral USB ($30) |
| Video encoding (FFmpeg) | Optional — GPU acceleration helps | NVIDIA NVENC (any GTX/RTX) |

Why Intel Quick Sync Beats Discrete GPUs for Transcoding

Intel’s integrated GPU has dedicated hardware video encode/decode blocks that are more efficient than discrete GPUs for transcoding:

| Method | Power Draw | Simultaneous 4K Transcodes | Cost |
|---|---|---|---|
| Intel N100 Quick Sync | 6-10W | 10-15 | $150 (whole PC) |
| Intel 12th gen Quick Sync | 15-25W | 20+ | $200-300 (whole PC) |
| NVIDIA RTX 3060 NVENC | 30-80W | 5-8 (software limit) | $200 (GPU alone) |
| CPU transcoding (8 cores) | 80-150W | 2-3 | Varies |

NVIDIA also software-limits concurrent NVENC sessions on consumer GPUs (5 on older drivers, raised to 8 with 2023 drivers; the cap can be removed with a driver patch, but that's janky). Intel has no such limit.
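As a concrete example, here is a hedged sketch of a fully hardware-accelerated transcode on an Intel iGPU using FFmpeg's VA-API path (filenames are placeholders; assumes the `intel-media-va-driver` package and an FFmpeg build with VA-API support; `renderD128` is typically the iGPU's render node):

```shell
# Decode 4K HEVC, scale to 1080p, and encode H.264 entirely on the iGPU
ffmpeg -vaapi_device /dev/dri/renderD128 \
  -hwaccel vaapi -hwaccel_output_format vaapi \
  -i input-4k.mkv \
  -vf 'scale_vaapi=w=1920:h=1080' \
  -c:v h264_vaapi -b:v 8M \
  -c:a copy output-1080p.mkv
```

Keeping frames in VA-API surfaces end to end (`-hwaccel_output_format vaapi` plus `scale_vaapi`) avoids copying frames back to system RAM, which is where most of the efficiency comes from.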

GPU Recommendations by Use Case

AI/LLM Inference

VRAM is the bottleneck. A 7B-8B parameter model (Llama 3 8B, Mistral 7B) needs ~4-5GB VRAM quantized. A 13B model needs ~8-10GB. A 70B model needs ~40GB.
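Those figures follow from a simple rule of thumb, sketched below as a hypothetical helper (not part of Ollama): parameters-in-billions × bits-per-weight ÷ 8, plus roughly 20% overhead for KV cache and activations.

```shell
# Rough VRAM estimate in GB for a quantized model (rule of thumb, not exact)
estimate_vram_gb() {
  awk -v p="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", p * bits / 8 * 1.2 }'
}

estimate_vram_gb 7 4    # 7B at 4-bit quantization -> 4.2
estimate_vram_gb 13 4   # 13B at 4-bit             -> 7.8
estimate_vram_gb 70 4   # 70B at 4-bit             -> 42.0
```

Higher-precision weights scale accordingly: the same 7B model at 8-bit needs roughly twice the VRAM, which is why quantized models dominate home setups.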

| GPU | VRAM | Performance (tokens/sec on 7B) | Power | Used Price |
|---|---|---|---|---|
| RTX 3060 12GB | 12 GB | ~35 t/s | 170W TDP | ~$200 |
| RTX 3090 24GB | 24 GB | ~55 t/s | 350W TDP | ~$600 |
| RTX 4060 8GB | 8 GB | ~40 t/s | 115W TDP | ~$280 |
| RTX 4090 24GB | 24 GB | ~90 t/s | 450W TDP | ~$1,600 |
| Tesla P40 24GB | 24 GB | ~25 t/s | 250W TDP | ~$150 |

Best value: RTX 3060 12GB. The 12GB VRAM is unusual for its tier — most xx60 cards have 6-8GB. This runs 7B-13B models comfortably. Used prices around $200 make it the clear winner for home AI.

Budget pick: Tesla P40. 24GB VRAM for ~$150 used. Slower compute than consumer cards, no video output (data center card), needs active cooling (add a $20 fan shroud). But 24GB VRAM for $150 is unbeatable for running larger models.

Gaming VM

| GPU | Performance Tier | Power | Price |
|---|---|---|---|
| RTX 4060 | 1080p/1440p high | 115W | ~$280 |
| RX 7600 | 1080p high | 165W | ~$230 |
| RTX 4070 | 1440p/4K medium | 200W | ~$480 |
| RX 7800 XT | 1440p/4K medium | 263W | ~$420 |

AMD cards work well for passthrough, but some models suffer from the reset bug (the card can't reinitialize after a VM shutdown; the vendor-reset kernel module works around it). NVIDIA is more straightforward to pass through, but consumer GPUs have historically needed the hypervisor hidden from the driver (see the NVIDIA-specific notes below).

Video Processing (FFmpeg/Handbrake)

Any NVIDIA GTX/RTX card with NVENC handles hardware-accelerated video encoding. The RTX 3060 is again the sweet spot — NVENC quality on 30-series is excellent, and the card is cheap used.
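A minimal NVENC sketch with FFmpeg (filenames are placeholders; `hevc_nvenc` requires an NVIDIA card and an FFmpeg build with NVENC enabled):

```shell
# Hardware HEVC encode on NVENC; presets run p1 (fastest) through p7 (best quality),
# and -cq sets a constant-quality target instead of a fixed bitrate
ffmpeg -hwaccel cuda -i input.mkv \
  -c:v hevc_nvenc -preset p5 -cq 28 \
  -c:a copy output.mkv
```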

GPU Passthrough with Proxmox

GPU passthrough (VFIO/IOMMU) gives a VM direct access to the physical GPU — near-native performance.

Requirements

  1. CPU with IOMMU support: Intel VT-d or AMD-Vi
  2. Motherboard with IOMMU groups: Check your IOMMU groups — the GPU must be in its own group or with devices you can also pass through
  3. Two GPUs (for gaming VM): One for Proxmox console (iGPU or second dGPU), one for passthrough. For headless AI, the host doesn’t need a display.
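To check requirement 2, a commonly used snippet that lists every PCI device by IOMMU group (needs `pciutils`; run on the host after IOMMU is enabled):

```shell
#!/bin/sh
# List PCI devices grouped by IOMMU group; the GPU and its HDMI audio
# function should ideally be alone in their group.
for d in /sys/kernel/iommu_groups/*/devices/*; do
  group=$(basename "$(dirname "$(dirname "$d")")")
  printf 'IOMMU group %s: ' "$group"
  lspci -nns "$(basename "$d")"
done | sort -V
```

If the GPU shares a group with unrelated devices (SATA controllers, NICs), try a different PCIe slot; everything in a group must be passed through together.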

Setup Steps (Proxmox)

1. Enable IOMMU in BIOS:

  • Intel: Enable VT-d
  • AMD: Enable IOMMU/AMD-Vi

2. Enable IOMMU in Proxmox boot parameters:

# Edit GRUB config
nano /etc/default/grub

# Intel:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# AMD:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

update-grub
reboot
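Note that Proxmox installed on ZFS boots via systemd-boot rather than GRUB; there the parameters go in /etc/kernel/cmdline followed by `proxmox-boot-tool refresh`. Either way, confirm IOMMU actually came up after the reboot:

```shell
# Intel logs DMAR lines ("DMAR: IOMMU enabled"); AMD logs AMD-Vi lines
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
```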

3. Load VFIO modules:

# Add to /etc/modules
echo -e "vfio\nvfio_iommu_type1\nvfio_pci\nvfio_virqfd" >> /etc/modules
# On kernel 6.2+ (Proxmox 8), vfio_virqfd is built into vfio and can be omitted

4. Blacklist GPU drivers on the host:

# /etc/modprobe.d/blacklist.conf
echo -e "blacklist nouveau\nblacklist nvidia\nblacklist radeon\nblacklist amdgpu" >> /etc/modprobe.d/blacklist.conf

5. Bind GPU to VFIO:

# Find GPU PCI IDs
lspci -nn | grep -i nvidia
# Example output: 01:00.0 VGA: NVIDIA [10de:2504]
#                 01:00.1 Audio: NVIDIA [10de:228e]

# /etc/modprobe.d/vfio.conf
echo "options vfio-pci ids=10de:2504,10de:228e disable_vga=1" >> /etc/modprobe.d/vfio.conf

update-initramfs -u
reboot
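After this reboot, verify the card is actually bound to vfio-pci rather than a display driver (PCI address from the lspci step above):

```shell
# "Kernel driver in use:" should read vfio-pci for both the VGA and audio functions
lspci -nnk -s 01:00
```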

6. Add GPU to VM in Proxmox:

  • VM → Hardware → Add → PCI Device
  • Select your GPU
  • Check “All Functions” and “Primary GPU” (for gaming VMs)
  • Set machine type to q35, BIOS to OVMF (UEFI)

NVIDIA-Specific Passthrough Notes

NVIDIA consumer drivers used to detect virtual machines and refuse to load (Error 43). Drivers from R465 (2021) onward officially allow GeForce passthrough; for older drivers, the fix:

# In the VM's .conf file (/etc/pve/qemu-server/<vmid>.conf)
# Add these CPU flags:
cpu: host,hidden=1,flags=+pcid
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'

This hides the hypervisor from the NVIDIA driver.

LXC Container GPU Access (No Passthrough)

For Docker workloads like Jellyfin transcoding or Ollama, you don’t need full passthrough. Mount the GPU device into an LXC container:

# In the LXC container config (/etc/pve/lxc/<id>.conf)
# Major 226 and /dev/dri are the Intel/AMD render nodes; NVIDIA cards
# expose separate /dev/nvidia* device nodes instead
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir

This shares the GPU between the host and containers — multiple containers can use it simultaneously (unlike passthrough, which dedicates the GPU to one VM).
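For NVIDIA cards the same idea applies, but the device nodes differ. A hedged sketch (the nvidia-uvm major number is assigned dynamically, so check `ls -l /dev/nvidia*` on your host; the container also needs the same NVIDIA driver version as the host):

```shell
# /etc/pve/lxc/<id>.conf — NVIDIA variant (assumes host driver is installed)
lxc.cgroup2.devices.allow: c 195:* rwm
# 510 is a common major for nvidia-uvm, but it is dynamic; verify on your host
lxc.cgroup2.devices.allow: c 510:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
```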

Power Consumption

| GPU | Idle | Typical Load | Max (TDP) | Annual Idle Cost (at $0.12/kWh) |
|---|---|---|---|---|
| Intel N100 iGPU | 0W (part of CPU) | 3-5W | 10W | $0 |
| RTX 3060 12GB | 15W | 80-120W | 170W | $16 |
| RTX 3090 24GB | 25W | 200-300W | 350W | $26 |
| RTX 4060 8GB | 10W | 70-100W | 115W | $11 |
| Tesla P40 | 30W | 150-200W | 250W | $32 |
| No GPU (CPU transcode) | 0W | 60-120W (CPU) | Varies | Varies |
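The idle-cost figures are simple arithmetic, sketched here as a hypothetical helper (watts × 8760 hours/year ÷ 1000 × rate; the table above uses $0.12/kWh):

```shell
# Annual electricity cost in dollars for a constant power draw
annual_cost() {
  awk -v w="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", w * 8760 / 1000 * rate }'
}

annual_cost 15 0.12   # RTX 3060 at 15W idle -> 16
annual_cost 30 0.12   # Tesla P40 at 30W idle -> 32
```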

GPUs draw significant idle power even when not processing. Consider powering down the GPU when not in use (supported on some Linux setups with nvidia-smi or echo auto > /sys/bus/pci/devices/0000:01:00.0/power/control).
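To see what your card actually draws at idle, nvidia-smi can query power directly, and on many cards the power cap can be lowered (both are standard nvidia-smi options; supported limits vary by card):

```shell
# Current draw and configured limit, as CSV
nvidia-smi --query-gpu=name,power.draw,power.limit --format=csv
# Lower the power cap to 140W (needs root; resets on reboot)
nvidia-smi -pl 140
```

Capping an RTX 3060 to ~140W typically costs only a few percent of inference throughput while cutting load power noticeably.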

Which GPU for Which Server?

| Server Type | GPU Recommendation | Why |
|---|---|---|
| Intel N100 Mini PC | None (use iGPU) | No PCIe slot, iGPU is enough |
| Dell OptiPlex Micro/SFF | Low-profile RTX A2000 | SFF only fits low-profile cards |
| Dell OptiPlex MT | RTX 3060 | Full-height PCIe x16 |
| DIY NAS Build | Usually none | NAS workloads don't need GPU |
| Enterprise Server (R730) | RTX 3090 or Tesla P40 | Full-size PCIe, adequate PSU |
| Proxmox cluster | RTX 3060 per node | GPU passthrough to VMs |

FAQ

Can I use one GPU for multiple VMs?

Not with standard passthrough — a GPU is dedicated to one VM at a time. SR-IOV (Single Root I/O Virtualization) splits a physical GPU into virtual GPUs, but consumer GPUs don’t support it. NVIDIA A-series and Intel Data Center GPUs support SR-IOV. For home use, the workaround is sharing via LXC containers instead of VMs.

Do I need a GPU for Plex hardware transcoding?

No. Intel Quick Sync (iGPU) is the best option for Plex/Jellyfin transcoding. It’s faster, more power-efficient, and has no concurrent session limit. Even a $150 N100 mini PC handles 10+ simultaneous 4K-to-1080p transcodes.
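To confirm Quick Sync is usable on a given box before committing, check the render node and the advertised encode profiles (`vainfo` comes from the libva-utils package):

```shell
# The iGPU's render node must exist (usually card0 plus renderD128)
ls -l /dev/dri/
# Encode-capable profiles show VAEntrypointEncSlice (or EncSliceLP)
vainfo | grep -i entrypointenc
```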

Can I mine crypto with a GPU in my server?

You can, but it's not profitable for most GPUs in 2026 given electricity costs. An RTX 3060 mining Ethereum Classic earns roughly $0.30/day while consuming $0.40/day in electricity (at $0.12/kWh). Not recommended.

What about AMD GPUs for AI?

AMD ROCm support has improved significantly but is still behind NVIDIA CUDA in the AI/ML ecosystem. Ollama supports ROCm for AMD RX 6000/7000 series. If you’re running Ollama specifically, AMD works. For broader AI workloads (ComfyUI, training), NVIDIA is the safer bet.
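If you go the AMD route, one commonly needed tweak: some RX 6000-series cards aren't on ROCm's official support list and need a gfx version override. HSA_OVERRIDE_GFX_VERSION is a real ROCm environment variable; 10.3.0 suits RDNA2 cards like the RX 6600/6700, and other generations need other values:

```shell
# Make ROCm treat the card as gfx1030 so Ollama can use it
HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
```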

Can I use a GPU from a laptop (MXM/soldered)?

Generally no. MXM modules are technically removable, but desktop motherboards have no MXM slot, and soldered laptop GPUs can't be reused at all. If you need a low-power GPU, look at the NVIDIA T400/T600 (30-40W TDP, low-profile, passive cooling options).