{"id":764,"date":"2025-05-03T09:18:48","date_gmt":"2025-05-03T09:18:48","guid":{"rendered":"https:\/\/www.gpu4host.com\/blog\/?p=764"},"modified":"2025-05-03T11:40:56","modified_gmt":"2025-05-03T11:40:56","slug":"amd-gpu-passthrough-issue","status":"publish","type":"post","link":"https:\/\/www.gpu4host.com\/blog\/amd-gpu-passthrough-issue\/","title":{"rendered":"AMD GPU passthrough issue"},"content":{"rendered":"<h2 class=\"wp-block-heading\"><strong>Resolving AMD GPU Passthrough Problems After VM Restart<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Setting up a GPU server with NVIDIA or AMD graphics in a virtualized environment can significantly boost performance for artificial intelligence, machine learning, and high-quality rendering workloads. However, users of AMD GPU passthrough in virtual machines (VMs) often face a frustrating challenge: after the VM restarts, the GPU driver either fails to load or the system doesn\u2019t detect the GPU at all.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you have run into this, don\u2019t worry; you are not alone. 
This guide walks through a fix for the AMD GPU passthrough issue after a VM restart, so that your <a href=\"https:\/\/www.infinitivehost.com\/gpu-dedicated-server\" target=\"_blank\" rel=\"noopener\">GPU dedicated server<\/a> keeps delivering high performance.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Whether you&#8217;re using GPU4HOST, managing GPU hosting environments, or running complex GPU clusters for AI workloads, this article takes you through a practical fix.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Understanding the AMD GPU Passthrough Issue<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The AMD GPU passthrough issue mainly occurs in virtualized environments such as Proxmox or KVM\/QEMU when a VM is configured to use a dedicated AMD GPU. After restarting the VM:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The AMD driver may not initialize correctly.<\/li>\n\n\n\n<li>The VM may hang or crash during boot.<\/li>\n\n\n\n<li>You may see a black screen or no video output.<\/li>\n\n\n\n<li>lspci shows the GPU, but the operating system fails to bind the driver.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This issue is common with AMD Radeon GPUs passed through to Windows and Linux VMs using VFIO (Virtual Function I\/O). Unlike NVIDIA GPUs, AMD GPUs can behave quite differently under passthrough because of reset bugs and driver quirks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Why Does the AMD GPU Passthrough Issue Happen?<\/strong><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>GPU Reset Bug<\/strong>: Some AMD cards, mainly consumer-grade ones, lack a proper hardware reset mechanism. 
Once initialized by the host or a virtual machine, they may not reset properly after a reboot.<\/li>\n\n\n\n<li><strong>Driver State Issue<\/strong>: After a virtual machine reboot, the AMD GPU may retain previous state data that conflicts with the VM\u2019s fresh initialization procedure.<\/li>\n\n\n\n<li><strong>Improper VFIO Binding<\/strong>: If the VFIO drivers don\u2019t properly unbind and rebind during the reboot cycle, the AMD GPU passthrough issue occurs.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step-by-Step Guide to Fixing the AMD GPU Passthrough Issue After VM Reboot<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img fetchpriority=\"high\" decoding=\"async\" width=\"768\" height=\"288\" src=\"https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Step-by-Step-Guide-for-AMD-GPU-Passthrough-Issue-After-VM-Reboot.webp\" alt=\"AMD GPU passthrough issue\" class=\"wp-image-768\" srcset=\"https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Step-by-Step-Guide-for-AMD-GPU-Passthrough-Issue-After-VM-Reboot.webp 768w, https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Step-by-Step-Guide-for-AMD-GPU-Passthrough-Issue-After-VM-Reboot-300x113.webp 300w, https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Step-by-Step-Guide-for-AMD-GPU-Passthrough-Issue-After-VM-Reboot-480x180.webp 480w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s troubleshoot the issue step by step. 
All the steps below were tested on a <a href=\"https:\/\/www.gpu4host.com\/\">GPU server<\/a> running the Proxmox and QEMU\/KVM hypervisors.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 1: Enable ACS &amp; IOMMU in BIOS<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Make sure that your BIOS settings are configured correctly:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Enable IOMMU &amp; SR-IOV.<\/li>\n\n\n\n<li>For AMD CPUs, enable SVM (Secure Virtual Machine).<\/li>\n\n\n\n<li>For Intel CPUs, enable VT-d.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This provides the hardware-level isolation required for GPU passthrough.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 2: Use the Latest Linux Kernel &amp; VFIO Modules<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Update your host system:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-dba165da31aac6a7b0412420d8da3924 wp-block-paragraph\" style=\"color:#18cc00\">sudo apt update &amp;&amp; sudo apt full-upgrade<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Install a current kernel and make sure that the VFIO modules are loaded at boot by listing them in \/etc\/modules (on kernel 6.2 and later, vfio_virqfd is part of the vfio module and can be left out):<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-909d36fc41cf477a3e80a4f2ddc00120 wp-block-paragraph\" style=\"color:#18cc00\">vfio<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-8369a52cd22ce4457c21dee94304cf2b wp-block-paragraph\" style=\"color:#18cc00\">vfio_iommu_type1<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-497500dcc390807a74cf2879627f294b wp-block-paragraph\" style=\"color:#18cc00\">vfio_pci<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-4107a9e0308557055df0d2e21b203262 wp-block-paragraph\" style=\"color:#18cc00\">vfio_virqfd<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 3: Identify Your AMD GPU &amp; Bind It to VFIO<\/strong><\/h3>\n\n\n\n<p 
class=\"wp-block-paragraph\">Use lspci to locate your AMD GPU:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-30504774eccdf0fb71095bfbb9cbf9d1 wp-block-paragraph\" style=\"color:#18cc00\">lspci | grep VGA<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then get the vendor and device IDs for its PCI address (0a:00.0 in this example):<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-5ce61434cd139cd78258f41af4d83c45 wp-block-paragraph\" style=\"color:#18cc00\">lspci -n -s 0a:00.0<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-b0edeb9a28ddbbd97ee182221bbd5944 wp-block-paragraph\" style=\"color:#18cc00\">Edit \/etc\/modprobe.d\/vfio.conf:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-2f6556a56b3d9778def47efbd36889ca wp-block-paragraph\" style=\"color:#18cc00\">options vfio-pci ids=1002:67df,1002:aaf0<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Replace 1002:67df and 1002:aaf0 with your own GPU and audio device IDs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 4: Prevent the Host from Grabbing the GPU<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Blacklist the radeon and amdgpu drivers:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-8dac7cdc0de476fdfe513b3536899e93 wp-block-paragraph\" style=\"color:#18cc00\">echo &quot;blacklist radeon&quot; &gt;&gt; \/etc\/modprobe.d\/blacklist.conf<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-3d6ddc5860ec60bb32a7af5df93ff1c6 wp-block-paragraph\" style=\"color:#18cc00\">echo &quot;blacklist amdgpu&quot; &gt;&gt; \/etc\/modprobe.d\/blacklist.conf<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Update initramfs:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-590f1cc2cf2b71f6d77b8ba12a47cf32 wp-block-paragraph\" style=\"color:#18cc00\">update-initramfs -u<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Restart your GPU server.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 5: Patch GPU Reset Bug (if required)<\/strong><\/h3>\n\n\n\n<p 
class=\"wp-block-paragraph\">Some AMD cards cannot be reset reliably without a patch. Use the vendor-reset module:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-0952ddc3f8a457bcacff72bf18588c93 wp-block-paragraph\" style=\"color:#18cc00\">git clone https:\/\/github.com\/gnif\/vendor-reset<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-4e8eea8a8a2fc71e40ed8a938bcffd30 wp-block-paragraph\" style=\"color:#18cc00\">cd vendor-reset<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-81e7401c8c4fc3763793c392cdda962e wp-block-paragraph\" style=\"color:#18cc00\">make<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-6818693003e78bc638b538ed66136055 wp-block-paragraph\" style=\"color:#18cc00\">sudo make install<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Load it at boot:<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-cfd1f4c8aa88583bd18581763b6e3dab wp-block-paragraph\" style=\"color:#18cc00\">echo &quot;vendor-reset&quot; &gt;&gt; \/etc\/modules<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This helps AMD GPUs reset correctly after a VM reboot, which is necessary in cases where the AMD GPU passthrough issue persists.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 6: Add Proper VM Arguments for Passthrough<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Edit your VM configuration (for example, in Proxmox):<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-5b302636cebf68b8c05dac7fd95d17d6 wp-block-paragraph\" style=\"color:#18cc00\">hostpci0: 0a:00.0,x-vga=on,pcie=1<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-392e36f084b37e1f08c11f7edd44ce0e wp-block-paragraph\" style=\"color:#18cc00\">machine: q35<\/p>\n\n\n\n<p class=\"has-text-color has-link-color wp-elements-14c92165ba5e2b9c05726c5d2fe38fb0 wp-block-paragraph\" style=\"color:#18cc00\">cpu: host,hidden=1,flags=+pcid<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Also, make sure that:<\/p>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>x-vga=on is set only if you want display output from the GPU.<\/li>\n\n\n\n<li>romfile is set if you&#8217;re passing through the host&#8217;s primary GPU.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Step 7: Power Cycle Between Reboots<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Because affected AMD GPUs often don\u2019t reset on restart, a complete power-off and power-on cycle may be needed to &#8220;clear&#8221; the GPU\u2019s memory and reset its state.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you run a <a href=\"https:\/\/www.gpu4host.com\/gpu-cluster\">GPU cluster<\/a> or GPU hosting node, consider scripting VM reboots so that they also reboot the host, as a temporary workaround.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Additional Tips for Production GPU Servers<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"768\" height=\"288\" src=\"https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Additional-Tips-for-Production-GPU-Servers.webp\" alt=\"AMD GPU passthrough issue\" class=\"wp-image-769\" srcset=\"https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Additional-Tips-for-Production-GPU-Servers.webp 768w, https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Additional-Tips-for-Production-GPU-Servers-300x113.webp 300w, https:\/\/www.gpu4host.com\/blog\/wp-content\/uploads\/2025\/05\/Additional-Tips-for-Production-GPU-Servers-480x180.webp 480w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Whether you use NVIDIA or AMD GPU configurations, these best practices improve stability:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Use a Dedicated GPU for Passthrough<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Avoid using your host\u2019s primary GPU for passthrough. 
Instead, use separate AMD or <a href=\"https:\/\/www.gpu4host.com\/nvidia-a100-rental\">NVIDIA A100<\/a> GPUs in GPU dedicated server deployments.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Separate GPU Audio Function<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Always bind both the GPU and its associated audio device to VFIO.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Check GPU Health with Tools<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">On AMD: Use radeontop, sensors, and journalctl logs.<br>On NVIDIA: Use nvidia-smi to monitor AI workloads.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Why This Matters for GPU4HOST Clients<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">At GPU4HOST, our technicians manage these hardware-level GPU issues so you don\u2019t have to. But for clients who manage their own VMs with AMD GPU passthrough, knowing how to troubleshoot reboot problems is necessary to get the most out of your GPU servers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you are running:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>AI-based model training on an AI GPU<\/li>\n\n\n\n<li>Machine learning inference with containerized tasks<\/li>\n\n\n\n<li>High-quality rendering in a GPU cluster<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This fix helps your GPU dedicated server run reliably after a reboot.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The AMD GPU passthrough issue usually occurs after a VM reboot and can be very frustrating, but with the right approach (BIOS tuning, driver isolation, and vendor-reset), your GPU server can recover and reboot reliably.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Applying this fix helps achieve stable GPU passthrough for AMD cards in production GPU hosting environments. 
While NVIDIA GPU configurations, such as the NVIDIA A100, generally have better reset support, AMD can still offer high performance once configured correctly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Use this guide to harden your virtualized environment and unlock smooth GPU passthrough with <a href=\"https:\/\/www.gpu4host.com\/\">GPU4HOST<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Resolving AMD GPU Passthrough Problems After VM Restart Setting up a GPU server with NVIDIA or AMD graphics in a virtualized setting can significantly boost performance for artificial intelligence, machine learning, and high-quality rendering workloads. However, users of AMD GPU passthrough in virtual machines (VMs) often face a frustrating challenge: after [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":771,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-764","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/posts\/764","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/comments?post=764"}],"version-history":[{"count":2,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/posts\/764\/revisions"}],"predecessor-version":[{"id":770,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/posts\/764\/revisions\/770"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:
\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/media\/771"}],"wp:attachment":[{"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/media?parent=764"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/categories?post=764"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gpu4host.com\/blog\/wp-json\/wp\/v2\/tags?post=764"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}