
Docking is throughput work. GPUs make sense when you prepare clean inputs and run many replicas in parallel. This guide shows how to run GPU docking, batch thousands of ligands, and track cost per 10k ligands.
On most GPU rental platforms your job runs inside a container image. Two paths work well:
A) Use a CUDA template + mount docking tools (simple)
B) Bring your own image (fastest to reuse)
Set NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=compute,utility, mount your data under /work, and the container waits (or starts the batch). You do not need Docker-in-Docker: your provider passes the host driver into your container.
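For local smoke-testing of path B, the same setup looks roughly like this (a launch-config sketch; `my-docking-image` is a placeholder, and on a managed provider you normally just point the template at your image instead of running Docker yourself):

```shell
# Assumed local test invocation; adjust image name and mount to your setup.
docker run --rm --gpus all \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -v "$PWD:/work" -w /work \
  my-docking-image:latest bash dock_batch.sh ligands jobs 8
```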
Directory layout
/work
├── receptor/
│ ├── protein.pdbqt
│ └── grid/ # maps & config
├── ligands/
│ ├── L000001.pdbqt
│ ├── L000002.pdbqt
│ └── ...
└── jobs/
└── (outputs here)
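The skeleton above can be staged in one command; `/tmp/work` here is a stand-in for your actual mount point (use `/work` on the instance):

```shell
# Stage the directory skeleton; ROOT is a stand-in for your /work mount.
ROOT=${ROOT:-/tmp/work}
mkdir -p "$ROOT/receptor/grid" "$ROOT/ligands" "$ROOT/jobs"
```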
Receptor
Ligands
This skeleton runs ligands in small chunks, keeps the GPU fed, and writes logs you can parse later. Replace the DOCK_CMD with your tool’s exact call (AutoDock‑GPU, Vina‑GPU, Uni‑Dock, etc.).
#!/usr/bin/env bash
set -euo pipefail
LIG_DIR=${1:-ligands}
OUT_DIR=${2:-jobs}
N_PAR=${3:-8} # concurrent processes per GPU (tune)
mkdir -p "$OUT_DIR"
# Edit this to your docking command. Example placeholders:
#   AutoDock-GPU style: autodock_gpu --ffile receptor/grid/maps.fld --lfile <ligand.pdbqt> \
#     --nrun 20 --resnam <out_prefix>
#   Vina style: vina --receptor receptor/protein.pdbqt --ligand <ligand.pdbqt> \
#     --center_x ... --center_y ... --center_z ... --size_x ... --size_y ... --size_z ... \
#     --exhaustiveness 8 --out <out.pdbqt>
DOCK_CMD() { :; }
export -f DOCK_CMD
# A plain for-loop (not ls | while) keeps the background jobs in this shell,
# so the final wait actually sees them.
for lig in "$LIG_DIR"/*.pdbqt; do
  base=$(basename "$lig" .pdbqt)
  out="$OUT_DIR/$base"
  mkdir -p "$out"
  (
    DOCK_CMD "$lig" "$out" >"$out/run.log" 2>&1 || echo "FAIL $base" >> "$OUT_DIR/failures.txt"
  ) &
  # simple throttle: cap concurrent processes at N_PAR (wait -n needs bash >= 4.3)
  if (( $(jobs -rp | wc -l) >= N_PAR )); then wait -n; fi
done
wait
echo "Done. Outputs in $OUT_DIR/"
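The per-ligand run.log files and failures.txt make a quick post-batch tally easy. `summarize_batch` is a hypothetical helper (not part of any docking tool) that assumes the exact layout the script above writes:

```shell
#!/usr/bin/env bash
# summarize_batch DIR — count per-ligand output dirs and subtract failures.txt entries.
summarize_batch() {
  local dir=$1
  local total fails
  total=$(( $(find "$dir" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l) ))
  if [ -f "$dir/failures.txt" ]; then
    fails=$(( $(wc -l < "$dir/failures.txt") ))
  else
    fails=0
  fi
  echo "total=$total fails=$fails ok=$((total - fails))"
}
summarize_batch "${1:-jobs}"
```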
Tuning N_PAR
Watch nvidia-smi for memory and utilization. If your instance has N GPUs, launch one batch per GPU and pin each process group:
# Split the library first so each GPU docks a disjoint subset
# GPU 0
CUDA_VISIBLE_DEVICES=0 bash dock_batch.sh ligands/gpu0 jobs/gpu0 8 &
# GPU 1
CUDA_VISIBLE_DEVICES=1 bash dock_batch.sh ligands/gpu1 jobs/gpu1 8 &
wait
Keep outputs separated by GPU for easier auditing.
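Per-GPU paths imply the library has already been split. One way to do that, assuming a flat ligands/ folder, is a round-robin split via symlinks (`split_ligands` is a hypothetical helper; symlinks keep the originals untouched):

```shell
# split_ligands SRC NGPU — symlink *.pdbqt round-robin into SRC/gpu0..SRC/gpu(N-1)
split_ligands() {
  local src=$1 ngpu=$2 i=0 lig dst
  for lig in "$src"/*.pdbqt; do
    [ -e "$lig" ] || continue          # skip if the glob matched nothing
    dst="$src/gpu$((i % ngpu))"
    mkdir -p "$dst"
    ln -sf "$(readlink -f "$lig")" "$dst/"
    i=$((i + 1))
  done
}
split_ligands ligands 2
```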
Record only what matters:
inputs: N_ligands, receptor, grid (center/size), exhaustiveness/seed
hardware: GPU model/VRAM, CUDA, driver
code: docking tool version + flags
metrics: ligands/hour, failures, wall_hours
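ligands/hour falls straight out of a count and a wall-clock duration; this awk one-liner is plain arithmetic, no docking-tool API involved:

```shell
# ligands_per_hour N_LIGANDS WALL_SECONDS — throughput from your own bookkeeping.
ligands_per_hour() {
  awk -v n="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", n / (secs / 3600) }'
}
ligands_per_hour 2400 7200   # 2400 ligands in 2 h → 1200.0
```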
Cost per 10k ligands
cost_per_10k = price_per_hour × wall_hours × (10_000 / N_ligands)
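The formula reduces to one awk call; the price and counts below are illustrative, not a quote:

```shell
# cost_per_10k PRICE_PER_HOUR WALL_HOURS N_LIGANDS — the formula above, verbatim.
cost_per_10k() {
  awk -v p="$1" -v h="$2" -v n="$3" 'BEGIN { printf "%.2f\n", p * h * (10000 / n) }'
}
cost_per_10k 1.50 4 25000   # $1.50/h × 4 h × (10000/25000) → 2.40
```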
Report ligands/hour and cost per 10k per GPU. Keep the full command line in your Methods.
GPU OOM
Lower N_PAR, reduce exhaustiveness, or move to a larger‑VRAM GPU.
All jobs fail fast
Bad grid or incompatible flags. Re‑run a single ligand with verbose logging and compare to a known‑good CPU run.
Weird scores or poses
Input prep differences (protonation/tautomers/charges). Normalize your prep and test on a small curated set.
GPU idle
N_PAR too low or tool has long CPU pre/post stages. Parallelize prep or pre‑stage grids.
hardware:
  gpu: "<model> (<VRAM> GB)"
  driver: "<NVIDIA driver>"
  cuda: "<CUDA version>"
software:
  image: "<your docking image or CUDA template>"
  tool: "AutoDock-GPU <ver> | Vina-GPU <ver> | Uni-Dock <ver>"
inputs:
  receptor: "protein.pdbqt"
  grid: { center: [<x>,<y>,<z>], size: [<sx>,<sy>,<sz>] }
  ligands: "N=<…> from <path>"
run:
  batch:
    script: "dock_batch.sh"
    N_PAR: <int>
    flags: "<exhaustiveness, seed, etc.>"
outputs:
  ligands_per_hour: "<…>"
  wall_hours: "<…>"
  cost_per_10k: "<…>"
  failures: "<count> (<file path>)"
Start a GPU instance with a CUDA-ready template (e.g., Ubuntu 24.04 LTS / CUDA 12.6) or your own docking image. Enjoy flexible per-second billing with custom templates and the ability to start, stop, and resume your sessions at any time. Unsure about FP64 requirements? Contact support to help you select the ideal hardware profile for your computational needs.