
Deploying AlphaFold on AWS cloud infrastructure


Jonathan Alles

EVOBYTE Digital Biology

Introduction

If you’ve tried running AlphaFold on a workstation, you’ve likely experienced the “GPU out of memory” message or watched a progress bar inch along while databases sprawl across your SSD. Moving this workload to AWS changes the experience: you can provision the right GPU for the job, keep terabytes of sequence databases warm and shared, and scale from a single user to a queue of requests without rewriting the science. In this guide, you’ll see what AlphaFold actually needs under the hood, how those needs translate into AWS resources, and a practical workflow you can replicate—from a quick single-node run to a multi-user setup that feels like a service.

Before we dive in, a quick note on dates and versions. As of April 6, 2026, the open-source AlphaFold v2 inference pipeline remains the standard for self-hosted runs. AlphaFold 3 has an inference repo and additional access requirements; the architecture you’ll design here still applies, but the packaging and inputs differ. We’ll focus on AlphaFold v2 for clarity, then call out where you might alter things for newer models.

What AlphaFold needs: code, models, and genetic databases

AlphaFold’s code does the neural-network inference, but most of the runtime is spent in feature generation: building multiple sequence alignments (MSAs) and finding templates. That step draws on several genetic and structural databases, plus model parameter files. The official AlphaFold repository documents both the “full” and “reduced” database sets and ships a script, download_all_data.sh, that pulls everything into a directory you point to at runtime. In the full setup you’ll see BFD, MGnify, PDB70, PDB mmCIFs and PDB seqres (for multimer), UniRef30 (formerly UniClust30), UniProt, and UniRef90. The cumulative footprint is hefty: roughly 556 GB to download and about 2.6 TB uncompressed, so fast storage matters. The reduced preset drops to a smaller bundle (notably small_bfd) and trades a bit of accuracy for faster downloads and lower disk use, which is often ideal for pilot runs and CI checks.
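Because the download can silently fail partway through, it helps to verify the directory layout before launching jobs. Here is a small sanity check; the subdirectory names follow the AlphaFold v2 README’s full-database layout, so adjust the list if your version or preset differs.

```shell
#!/usr/bin/env bash
# Sanity-check a database directory after download_all_data.sh finishes.
# Expected names follow the AlphaFold v2 full_dbs layout -- an assumption;
# older releases used uniclust30 instead of uniref30, for example.
check_dbs() {
  local data_dir=$1 missing=0
  for d in params bfd mgnify pdb70 pdb_mmcif uniref30 uniprot uniref90; do
    if [ ! -d "$data_dir/$d" ]; then
      echo "missing: $d"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all expected databases present"
  fi
}

# Demo against a scratch directory with only two of the expected folders:
demo=$(mktemp -d)
mkdir -p "$demo/params" "$demo/bfd"
check_dbs "$demo"
```

Run it against your real data directory (for example `check_dbs /data/af_download_data`) before the first prediction, and again after any database refresh.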

That same repo includes the Dockerfile you’ll build, the run_docker.py wrapper that wires GPUs and volumes, and explicit flags you’ll pass at runtime: a FASTA path, a max_template_date that gates which PDB entries can be used, a data directory for the databases, and an output directory for results. Because the container relies on the host NVIDIA driver, you’ll want the NVIDIA Container Toolkit installed on the instance. If you prefer Singularity/Apptainer for HPC clusters, that path is also documented, but Docker on a GPU-enabled EC2 host is the fastest way to start.

Mapping AlphaFold to AWS building blocks

Translating AlphaFold’s needs to cloud resources is simpler when you break it down by function—compute for inference and MSAs, storage for databases and outputs, and a container runtime that can see your GPUs.

On compute, you pick an EC2 GPU instance that balances VRAM, cost, and availability. For long sequences or multimer predictions where memory headroom keeps you out of trouble, Hopper-class p5 instances with NVIDIA H100 GPUs are the most capable option on AWS and scale up to eight GPUs per host. For steady, cost-sensitive inference, G-family instances are attractive; the g6 line brings NVIDIA L4 GPUs and is widely available, with sizes that fit single-GPU workloads comfortably. In practice, many teams use a g6 size for routine monomer jobs and reserve p5 for the occasional large or urgent target.
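That routing decision can live in a few lines of shell. The residue thresholds and instance types below are illustrative assumptions, not AWS or DeepMind guidance; benchmark against your own sequence mix before adopting them.

```shell
# Toy heuristic for routing a target to an instance type by size.
# Thresholds and instance names are illustrative assumptions only.
pick_instance() {
  local residues=$1 preset=${2:-monomer}
  if [ "$preset" = "multimer" ] || [ "$residues" -gt 1500 ]; then
    echo "p5.48xlarge"   # H100s for complexes and very long chains
  elif [ "$residues" -gt 800 ]; then
    echo "g6.12xlarge"   # multi-L4 size for extra VRAM headroom
  else
    echo "g6.4xlarge"    # single L4 for routine monomers
  fi
}

pick_instance 350            # -> g6.4xlarge
pick_instance 1200           # -> g6.12xlarge
pick_instance 600 multimer   # -> p5.48xlarge
```

A dispatcher like this keeps the expensive Hopper capacity reserved for the jobs that actually need it.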

Storage is where most of the architecture decisions live. If you’re the only user, a large gp3 EBS volume attached to a single instance is enough; you download databases once to that volume, then snapshot it to reuse later. For groups and pipelines, a shared, high-throughput filesystem makes life easier. Amazon FSx for Lustre mounts like POSIX storage on many instances at once and integrates natively with Amazon S3, so you can keep a durable copy of the AlphaFold databases in a bucket and mirror them to FSx for performance. That way, your compute nodes see a fast, shared path for MSAs and templates, while S3 remains the long-term system of record.

Finally, the runtime. You can start from the AWS Deep Learning AMIs, which already include recent CUDA drivers and the NVIDIA Container Toolkit. That saves setup time and avoids the most common GPU-container pitfalls. On these AMIs, the Docker build for AlphaFold is a one-liner, and nvidia-smi works in containers out of the box.

A quick word on hardware expectations. AlphaFold’s neural network is GPU-hungry, and MSA generation can be CPU- and RAM-hungry, especially for longer sequences. If you routinely fold sequences in the 600–1500 residue range or run multimer, plan for larger VRAM; for very long chains, the AlphaFold repository also documents unified-memory settings (TF_FORCE_UNIFIED_MEMORY and XLA_PYTHON_CLIENT_MEM_FRACTION) that let JAX spill beyond physical GPU memory into host RAM. On Hopper-class GPUs, you’ll have plenty of headroom; on L4-sized GPUs, you may prefer the reduced_dbs setting or split work by sequence length.

A minimal, repeatable deployment on EC2

Imagine a bench scientist drops a 780-residue enzyme on your desk and asks for overnight structures. You don’t need a whole platform yet; you need a reliable, repeatable single-node run. Here’s the flow teams use to go from zero to first prediction, then capture it for reuse.

Start by launching a GPU instance in your preferred Region. A g6.4xlarge is a reasonable default for monomer predictions; if you expect very long sequences, jump to a p5.2xlarge. Choose a Deep Learning Base GPU AMI so CUDA, drivers, and nvidia-container-toolkit are ready. Add a gp3 EBS data volume—3 TB aligns with the full database footprint and leaves room for outputs. After boot, mount the data volume, clone the AlphaFold repo, and kick off the database download script in the background. This step pulls the model parameters and all required databases, which can take hours depending on bandwidth.
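The boot-to-download sequence above can be sketched as a short script. The device name, mount point, and repository URL are assumptions to check against your own instance (run lsblk to find the data volume); by default the script only prints each command, and setting APPLY=1 executes it.

```shell
#!/usr/bin/env bash
# One-time bootstrap sketch: format and mount the data volume, fetch the
# AlphaFold repo, and start the database download. Device /dev/nvme1n1 and
# mount point /data are assumptions. Preview by default; APPLY=1 to execute.
set -euo pipefail
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

run sudo mkfs -t xfs /dev/nvme1n1
run sudo mkdir -p /data
run sudo mount /dev/nvme1n1 /data
run git clone https://github.com/google-deepmind/alphafold.git
# Drop the reduced_dbs argument for the full set (budget ~3 TB). Run this
# step under nohup or tmux in practice -- it can take hours.
run alphafold/scripts/download_all_data.sh /data/af_download_data reduced_dbs
```

The preview-by-default pattern is deliberate: you can paste the script on a fresh instance, read exactly what it will do, then rerun with APPLY=1.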

When the download finishes, build the Docker image, verify that nvidia-smi works inside a container, and run your first job by pointing at your FASTA and the database directory. The wrapper script handles GPU visibility, mounts the databases and output path, and sets the network flags. When you’re done, create an EBS snapshot of the data volume; that one action trims your next cold start to minutes because you can just attach a volume from the snapshot to a fresh instance.
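Those three follow-up actions — build, GPU smoke test, snapshot — fit the same preview-first pattern. The image tag matches the repo’s documented Docker build; the volume ID is a placeholder you must substitute, and the script only prints commands unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Build, verify GPU visibility, and snapshot the data volume.
# VOLUME_ID is a hypothetical placeholder -- substitute your own.
# Preview by default; set APPLY=1 to actually run the commands.
set -euo pipefail
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

run docker build -f docker/Dockerfile -t alphafold .
# Quick check that containers can see the GPU before a long job:
run docker run --rm --gpus all alphafold nvidia-smi
# Capture the populated database volume for fast cold starts later:
run aws ec2 create-snapshot \
  --volume-id "$VOLUME_ID" \
  --description "alphafold-dbs-$(date +%Y-%m-%d)" \
  --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=alphafold-dbs}]'
```

Tagging the snapshot with a date makes it easy to tell later which database vintage a given volume was restored from.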

Here’s a minimal command you’ll recognize the moment you log in. It’s short for a reason—you don’t need more than this to run a monomer with reduced databases and get a feel for end-to-end timing.

python3 docker/run_docker.py \
  --fasta_paths=/data/input/target.fasta \
  --max_template_date=2024-12-31 \
  --data_dir=/data/af_download_data \
  --output_dir=/data/output/alphafold_run \
  --db_preset=reduced_dbs \
  --model_preset=monomer

The outputs include ranked PDB files, JSON with prediction confidences, and MSAs you can cache. If you prefer the full database set, drop the db_preset flag and make sure your storage can handle the size and I/O. The AWS Machine Learning Blog has a step-by-step walkthrough of this exact pattern, including volume setup and snapshots, which makes it a handy checklist the first time you do it. (aws.amazon.com)

Multi-user architecture on AWS

Once colleagues see fast turnarounds, requests multiply. At that point, two questions define your design. How will different users submit jobs, and how will you avoid redownloading or duplicating databases for each node?

A neat scaling path is to lift the single-node layout into a shared, batch-friendly blueprint. Put the AlphaFold databases in an S3 bucket as your durable copy, then expose them to compute through an FSx for Lustre filesystem mounted on all GPU nodes. FSx’s S3 integration lets you import the bucket’s directory tree when you create the filesystem and export changes back later, so a database refresh is one admin action followed by a remount across the fleet. Users never think about where the data live; they just see a path like /fsx/alphafold/databases.

On the compute side, use AWS Batch with an EC2 compute environment that allows your chosen GPU families. Your job definition simply wraps the same AlphaFold container you built earlier, mounts the FSx path read-only, and writes results to an S3 prefix or a project-specific EBS volume. Batch manages queuing and capacity for you, so one user’s 20-job array doesn’t starve everyone else. You can sprinkle in cost controls—Spot for g6 where queues are flexible and On-Demand p5 for urgent or long sequences. The operational win here is consistency: the very same Docker image, database path, and flags work whether a single scientist runs a job or your LIMS sends a hundred.
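A job definition for this pattern might look like the following sketch. The ECR image URI, resource sizes, and FSx paths are placeholders; Ref::fasta is Batch’s parameter-substitution syntax, and the command assumes the container’s entrypoint is the run_alphafold launcher from the repo’s Dockerfile.

```json
{
  "jobDefinitionName": "alphafold-monomer",
  "type": "container",
  "parameters": { "fasta": "/fsx/alphafold/input/target.fasta" },
  "containerProperties": {
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/alphafold:latest",
    "command": [
      "--fasta_paths=Ref::fasta",
      "--data_dir=/fsx/alphafold/databases",
      "--output_dir=/fsx/alphafold/output",
      "--db_preset=reduced_dbs",
      "--model_preset=monomer",
      "--max_template_date=2024-12-31"
    ],
    "resourceRequirements": [
      { "type": "GPU", "value": "1" },
      { "type": "VCPU", "value": "16" },
      { "type": "MEMORY", "value": "65536" }
    ],
    "volumes": [
      { "name": "dbs", "host": { "sourcePath": "/fsx/alphafold/databases" } },
      { "name": "out", "host": { "sourcePath": "/fsx/alphafold/output" } }
    ],
    "mountPoints": [
      { "sourceVolume": "dbs", "containerPath": "/fsx/alphafold/databases", "readOnly": true },
      { "sourceVolume": "out", "containerPath": "/fsx/alphafold/output", "readOnly": false }
    ]
  }
}
```

You would register this once with `aws batch register-job-definition --cli-input-json file://jobdef.json`, and each submission then overrides only the FASTA path via `--parameters fasta=...`.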

If Batch feels like overkill and you’re already Kubernetes-first, the same pattern works on Amazon EKS. Mount FSx for Lustre via its CSI driver, expose a small submission service that accepts a FASTA file and run parameters, and schedule Pods on GPU node groups. Either way, the data path is identical, and the knobs you tune are GPU size, queue policy, and Spot strategy.

For persistence and portability, many teams pair the shared FSx layer with an S3 mirror. This takes two short commands: copy in when you hydrate a new filesystem and copy out before you tear it down. It’s the same idea as an EBS snapshot but for a shared filesystem and with object-store economics.

# Sync databases from S3 to FSx
aws s3 sync s3://my-alphafold-dbs/ /fsx/alphafold/databases/ --only-show-errors

# Later, export changes back to S3 (for example, after a DB refresh)
aws s3 sync /fsx/alphafold/databases/ s3://my-alphafold-dbs/ --delete --only-show-errors

Two operational tips close the loop. First, standardize environment variables like DATA_DIR and OUTPUT_DIR across shells, Batch job definitions, and notebooks; this keeps both humans and automation from guessing paths. Second, codify refreshes. AlphaFold’s README documents how to update UniProt, UniRef, MGnify, and PDB seqres in place, so put that into a maintenance runbook or a small pipeline that cuts over atomically and logs the database versions used. That single source of truth pays off when you compare results months apart. (github.com)
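The first tip can be as small as one profile fragment shipped to every node and mirrored into Batch job definitions. The paths and variable names below are conventions for this setup, not anything AlphaFold itself mandates.

```shell
# /etc/profile.d/alphafold.sh -- one source of truth for paths (illustrative
# conventions; adjust to your filesystem layout).
export DATA_DIR=/fsx/alphafold/databases
export OUTPUT_DIR=/fsx/alphafold/output
export AF_REPO=/opt/alphafold
```

With this in place, notebooks, cron jobs, and humans at a shell all resolve the same locations, and a database relocation becomes a one-file change.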

Summary / Takeaways

Running AlphaFold well is mostly about getting the boring parts right. The code is polished and containerized; the heavy lifting is in placing terabytes of databases on fast, shared storage and picking the right GPU for sequence length and throughput. On AWS, the smallest useful setup is a single GPU EC2 instance with a large EBS volume; you download once, snapshot once, and reuse forever. The easiest way to scale to multiple users is to shift databases to S3 plus FSx for Lustre and let AWS Batch schedule containerized jobs on g6 or p5 nodes. Throughout, start with the reduced databases to move quickly, then flip to the full set for production predictions. And because hardware and drivers are a common stumbling block, favor Deep Learning AMIs so Docker sees the GPU on day one.

If you’re ready to try this, launch one GPU instance, run the download script, and fold a familiar protein with the reduced preset. When the results look right, capture the volume as a snapshot. Your second instance will come online in minutes, and you’ll feel exactly why the cloud is such a good fit for this workload.
