If you've spent hours staring at a torch.cuda.is_available() that stubbornly returns False, or wrestled with cryptic CUDA version mismatch errors, you're in good company. PyTorch and CUDA installation issues are among the most common stumbling blocks in AI development — but once you understand the root causes, they're surprisingly straightforward to fix.
Common Symptoms and Error Patterns
Symptom A: GPU Not Detected After Installation
import torch
print(torch.cuda.is_available())
# → False (should be True)
Symptom B: CUDA Version Mismatch at Runtime
RuntimeError: CUDA error: no kernel image is available for execution on the device
UserWarning: CUDA initialization: CUDA unknown error
Symptom C: pip Install Fails Entirely
ERROR: Could not find a version that satisfies the requirement torch==2.5.0+cu121
ERROR: No matching distribution found for torch==2.5.0+cu121
Symptom D: Import Errors at Python Runtime
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
OSError: CUDA_HOME environment variable is not set.
Root Cause Analysis — Why These Errors Happen
Cause 1: Python Version Incompatibility
PyTorch ships pre-built wheels for specific Python versions. If you're running Python 3.13 or another very recent release, the corresponding PyTorch wheels may not yet be available.
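One quick way to catch this before installing is to compare the running interpreter against the range your target PyTorch release supports. The sketch below is illustrative only: `ASSUMED_MIN`, `ASSUMED_MAX`, and `python_in_range` are hypothetical names, and the range itself is a placeholder to replace with the versions listed on the official install page.

```python
import sys

# Hypothetical range for illustration only; replace it with the versions
# the official install page lists for the PyTorch release you want.
ASSUMED_MIN = (3, 9)
ASSUMED_MAX = (3, 12)


def python_in_range(version=None, low=ASSUMED_MIN, high=ASSUMED_MAX):
    """Return True if (major, minor) falls inside the inclusive range."""
    if version is None:
        version = sys.version_info[:2]
    return low <= tuple(version[:2]) <= high


current = sys.version_info[:2]
status = "inside" if python_in_range() else "outside"
print(f"Python {current[0]}.{current[1]} is {status} the assumed range")
```

If the interpreter falls outside the range, creating an environment with an older Python is usually simpler than waiting for new wheels.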
Cause 2: CUDA Driver vs. CUDA Toolkit Mismatch
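The NVIDIA GPU driver and the CUDA toolkit (nvcc) are two separate things, and they play different roles. The version shown in nvidia-smi is the maximum CUDA version your driver supports; your PyTorch CUDA build must be at or below that version. The pre-built PyTorch wheels ship with their own CUDA runtime libraries, so a system-wide toolkit mainly matters when you compile custom CUDA extensions, in which case its version should match the PyTorch build.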
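The "at or below" rule can be expressed as a small comparison. This is a minimal sketch, assuming both versions are plain dotted strings such as "12.4"; `parse_version` and `build_is_supported` are hypothetical helper names, not part of PyTorch.

```python
def parse_version(text):
    """Turn a string such as '12.4' into a comparable tuple (12, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def build_is_supported(driver_max_cuda, build_cuda):
    """True if the PyTorch CUDA build is at or below the driver's maximum."""
    return parse_version(build_cuda) <= parse_version(driver_max_cuda)


print(build_is_supported("12.4", "12.1"))  # driver newer than the build
print(build_is_supported("11.8", "12.1"))  # driver older than the build
```

Comparing integer tuples rather than raw strings or floats avoids misordering versions such as 12.10 and 12.9.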
Cause 3: Missing or Incorrect Environment Variables
If CUDA_HOME or LD_LIBRARY_PATH aren't properly set, the Python runtime won't be able to locate the required CUDA libraries — even if they're installed.
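A short check like the one below can show whether these variables are set and whether the directories they name exist. It is a sketch under stated assumptions: `cuda_env_report` is a hypothetical helper, and it only inspects the variables, without confirming that the right libraries live in those directories.

```python
import os


def cuda_env_report(env=None):
    """Summarize CUDA_HOME and LD_LIBRARY_PATH from a mapping of variables."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME", "")
    lib_dirs = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    return {
        "cuda_home_set": bool(cuda_home),
        "cuda_home_exists": bool(cuda_home) and os.path.isdir(cuda_home),
        # Entries on the library path that do not exist on disk.
        "lib_dirs_missing": [p for p in lib_dirs if not os.path.isdir(p)],
    }


for key, value in cuda_env_report().items():
    print(f"{key}: {value}")
```

Passing a plain dictionary instead of the real environment makes it easy to try the check against a configuration you are about to apply.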
Cause 4: Mixed Python Environments
When your system Python and a virtual environment (venv/conda) get mixed together, packages end up installed in unexpected locations. This is a very common hidden cause.
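To see which interpreter is actually running, it can help to print a few facts from inside Python itself. The sketch below assumes a standard venv, where `sys.prefix` differs from `sys.base_prefix`; conda environments do not follow that rule, so the `CONDA_DEFAULT_ENV` variable is printed as a separate hint. `in_virtual_env` is a hypothetical helper name.

```python
import os
import sys


def in_virtual_env(prefix=None, base_prefix=None):
    """A venv changes sys.prefix while sys.base_prefix keeps the base install."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Inside a venv: {in_virtual_env()}")
# Set by `conda activate`; absent when no conda environment is active.
print(f"Conda environment: {os.environ.get('CONDA_DEFAULT_ENV')}")
```

Running pip as `python -m pip install ...` ties the install to the interpreter shown here, which avoids a stray `pip` on PATH writing packages somewhere else.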
Cause 5: Stale pip Cache
An outdated pip cache can interfere with dependency resolution, causing installs to pull in wrong or incompatible versions.
Step-by-Step Solutions
Step 1: Audit Your Current Environment
Before making any changes, gather the full picture.
# Check Python version
python --version
python3 --version
# Check NVIDIA driver and CUDA version
nvidia-smi
# Check CUDA toolkit (nvcc)
nvcc --version
# Check currently installed PyTorch
pip show torch
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
The "CUDA Version" shown in the top-right corner of nvidia-smi output indicates the maximum CUDA version your driver supports. Make sure your PyTorch CUDA build is at or below that version.
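If you would rather read that value programmatically than by eye, the header can be parsed with a regular expression. This is a rough sketch that assumes the header contains the text "CUDA Version:" followed by a dotted number; `max_cuda_from_smi` and `read_driver_cuda` are hypothetical helper names.

```python
import re
import shutil
import subprocess


def max_cuda_from_smi(smi_output):
    """Extract the 'CUDA Version' field from nvidia-smi's header, if present."""
    match = re.search(r"CUDA Version:\s*(\d+(?:\.\d+)*)", smi_output)
    return match.group(1) if match else None


def read_driver_cuda():
    """Run nvidia-smi when it is on PATH; return None otherwise."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    return max_cuda_from_smi(result.stdout)


print(f"Driver's maximum CUDA version: {read_driver_cuda()}")
```

A result of `None` means either that nvidia-smi is not installed or that its output did not match the assumed format, so treat it as "unknown" rather than "no GPU".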
Step 2: Completely Uninstall the Existing PyTorch
If you suspect a version mismatch, wipe the slate clean.
pip uninstall torch torchvision torchaudio -y
pip cache purge
Step 3: Reinstall PyTorch with the Correct CUDA Build
The most reliable approach is to use the command generated by the official PyTorch install page — select your OS, package manager, and CUDA version, and run the resulting command exactly as shown.
# CUDA 12.1 build (example)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# CUDA 11.8 build
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# CPU-only (no GPU)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
Step 4: Set Environment Variables (Linux/macOS)
# Find where nvcc lives
which nvcc
# → /usr/local/cuda/bin/nvcc
# Set CUDA_HOME accordingly
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
export PATH=$CUDA_HOME/bin:$PATH
To make these permanent, add them to your shell config:
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export PATH=$CUDA_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
Step 5: Use conda for Reliable Multi-Version Management (Recommended)
If you need different CUDA versions across projects, conda is your best friend — it manages the CUDA toolkit itself, so you're not at the mercy of your system-level CUDA installation.
# Create a new conda environment with Python 3.11
conda create -n pytorch_env python=3.11 -y
conda activate pytorch_env
# Install PyTorch with CUDA 12.1 via conda
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -y
Verification — Confirming the Fix Worked
Run this script to validate your GPU is properly recognized:
import torch
# Show versions
print(f"PyTorch version: {torch.__version__}")
# Check CUDA availability
print(f"CUDA available: {torch.cuda.is_available()}")
# Show GPU details
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")
    # Actually run a tensor operation on GPU
    x = torch.randn(3, 3).cuda()
    print(f"Tensor device: {x.device}")
    print("✅ GPU is working correctly!")
else:
    print("❌ CUDA not available. Please revisit the steps above.")
Expected output when everything is working:
PyTorch version: 2.5.0+cu121
CUDA available: True
GPU: NVIDIA GeForce RTX 4090
CUDA version: 12.1
Tensor device: cuda:0
✅ GPU is working correctly!
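The `+cu121` suffix in the PyTorch version string and the reported CUDA version 12.1 describe the same build. The sketch below decodes that suffix from a version string alone. It assumes the last digit of the tag is the minor version and the rest is the major version, which holds for the cu118 and cu121 builds used in this article but is not guaranteed for every tag; `cuda_from_torch_version` is a hypothetical helper name.

```python
import re


def cuda_from_torch_version(version):
    """Map '2.5.0+cu121' to '12.1'; return None for builds with no cuXXX tag."""
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumption: the last digit is the minor version, the rest is the major.
    return f"{digits[:-1]}.{digits[-1]}"


print(cuda_from_torch_version("2.5.0+cu121"))  # 12.1
print(cuda_from_torch_version("2.5.0+cu118"))  # 11.8
print(cuda_from_torch_version("2.5.0+cpu"))    # None
```

When PyTorch is importable, `torch.version.cuda` reports this directly; the parser is mainly useful when all you have is a version string, such as the output of `pip show torch`.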
Using Antigravity IDE to Streamline the Process
Antigravity's integrated terminal makes it easy to run these diagnostic commands without switching contexts. If you hit an error, paste the full error message into the Antigravity chat and ask for help — the AI agent can often pinpoint the exact cause and suggest the right fix in seconds.
For putting PyTorch to work in real AI applications, check out Building Custom AI Pipelines with LangChain and Antigravity, which covers production-ready pipeline architectures. If you're interested in optimizing model performance after getting your environment set up, AI Model Quantization with Antigravity shows how to cut model size by 75% while preserving accuracy.
Prevention — Best Practices to Avoid Future Issues
1. Always isolate environments per project
# Using venv
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\activate # Windows
# Using conda
conda create -n myproject python=3.11
conda activate myproject
2. Pin your versions in requirements.txt
torch==2.5.0+cu121
torchvision==0.20.0+cu121
torchaudio==2.5.0+cu121
--extra-index-url https://download.pytorch.org/whl/cu121
3. After any NVIDIA driver update, re-verify PyTorch compatibility
Run nvidia-smi to check the new max CUDA version, then confirm your PyTorch build is still compatible.
4. Consider Docker for production-grade reproducibility
NVIDIA's official PyTorch Docker images (nvcr.io/nvidia/pytorch) bundle a pre-configured CUDA environment, eliminating most setup headaches and ensuring your dev environment matches production.
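As a rough illustration of practice 2, a pinned requirements file can be checked so that every package carries the same CUDA tag, since mixing a `+cu121` torch with a `+cu118` torchvision is an easy mistake to make. The helpers `local_tags` and `pins_are_consistent` are hypothetical, and the parser handles only simple `name==version+tag` lines, not the full requirements syntax.

```python
def local_tags(requirement_lines):
    """Collect the '+tag' suffix of each pinned 'name==version+tag' line."""
    tags = {}
    for raw in requirement_lines:
        line = raw.strip()
        # Skip blanks, comments, and pip options such as --extra-index-url.
        if not line or line.startswith(("#", "-")) or "==" not in line:
            continue
        name, version = line.split("==", 1)
        tags[name.strip()] = version.strip().partition("+")[2] or None
    return tags


def pins_are_consistent(requirement_lines):
    """True if every pinned package carries the same local version tag."""
    return len(set(local_tags(requirement_lines).values())) <= 1


pins = [
    "torch==2.5.0+cu121",
    "torchvision==0.20.0+cu121",
    "torchaudio==2.5.0+cu121",
    "--extra-index-url https://download.pytorch.org/whl/cu121",
]
print(local_tags(pins))
print(pins_are_consistent(pins))
```

A check like this could run in CI before installation, so a mismatched pin fails fast instead of surfacing later as a runtime error.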
Looking back
PyTorch and CUDA installation errors almost always come down to version mismatches or misconfigured environment variables. The reliable fix is a three-step process: audit your current environment, cleanly uninstall the problematic packages, then reinstall using the exact command from the official PyTorch website.
Using conda to isolate environments per project is the single most effective prevention strategy — it keeps dependencies from bleeding across projects and makes your setup reproducible. Combine that with Antigravity IDE's agent capabilities for faster error diagnosis, and GPU-powered AI development becomes much less of a headache.
For a deep dive into building production-grade AI systems with Python, Programming PyTorch for Deep Learning by Ian Pointer is an excellent resource that walks through environment setup and GPU-accelerated training end to end.