Compute - GPU and CPU rental

GPU and CPU power that performs exactly as promised, every single run.

Launch RTX 4090, RTX 5090, RTX 6000-series, or vCPU instances for AI inference, open-source and open-weight models, fine-tuning experiments, notebooks, rendering, batch jobs, APIs, and development environments. Get fixed GPU rates, per-second billing, regional deployment paths, and infrastructure built for quality at the right price.

Launch and instance Talk to sales

RTX 4090 from €-/hr

RTX 5090 from €-/hr

RTX 6000-series for enterprise

Predictable pricing

Per-second billing

Templates and OS images

SSH access

Team organizations

Public Compute API

France, UAE, and USA deployment paths

Workload guidance

Tell us what you want to run.
We'll point you to the right instance.

Choose the right path before you spend a euro:

Teams already running compute on Hivenet

Researchers, AI teams, studios, and industry groups run their GPU and CPU workloads on Compute with Hivenet because performance holds, pricing stays predictable, and they keep control of the environment.

Hivenet makes our work much easier. With Jupyter notebooks, fast access to GPUs, and reliable infrastructure across regions, we’ve been able to speed up our research on green enzymes for industrial use. It feels like a real step forward in compute platforms.

Jupyter notebooks, fast GPU access, green enzymes research.

Joseph Heenan

CEO @ Proteineer

The possibility of easily obtaining instances with graphics cards for a really cheap price. I use it a few times a month and it's really perfect.

Nicolas B

CEO small business (on G2)

What we like about Hivenet is that it matches how we think about AI: sustainable, efficient, and grounded in Europe. The distributed model gives us security, the option to choose European models strengthens our sovereignty message, and small details like pausing instances or fair pricing just make it practical for us to use day to day.

Sustainable, European, sovereignty, fair pricing.

Pablo Fernández

CEO @ ArtinLeap

We've moved over from AWS and GCP to hive. It's a great way to save on costs.

Sam Arrington

(on Trustpilot)

Hivenet’s GPUs have been key to scaling our AI work. They let us run advanced models smoothly, so our interactions with students stay fast and responsive. That reliability has made a real difference for us.

GPU reliability for scaling AI work with students.

Artem Gorelov

CEO @ Mytutor

Super quick setup. No BS/straightforward pricing. Fast cold starts. Reliable machines.

Anonymous

(on Trustpilot)

←

→

A strong home for open-source AI workloads.

Open-source models give teams a way off the closed-API treadmill. Hivenet gives those models the infrastructure they deserve: the right GPU for their class, full control over the serving stack, and pricing that makes running your own model the obvious call. This is the workload Compute was built for — summarization, extraction, classification, RAG, support automation, code assistance, and internal tools, on the open-source models you actually ship.

Take full control of the serving stack on Compute, or hand it off to a managed endpoint with Hivenet Inference. Either way, you run the open-source models you choose on hardware that fits them.

Qwen workloads

Test Qwen model classes against your prompts, latency target, context length, and cost-performance needs. Smaller Qwen models can fit efficient GPU workflows, while larger classes need testing before production.

DeepSeek distilled models

Run distilled DeepSeek workloads where the model size fits RTX 4090 or RTX 5090 hardware. Treat larger reasoning workloads as benchmark candidates before production use.

Llama workloads

Run Llama-class workloads for RAG, summarization, internal tools, assistants, experiments, and model-serving tests. Choose the instance based on model size, precision, concurrency, and latency target.

Mistral, Falcon, Gemma, Phi, and smaller models

Use right-sized GPU capacity for smaller and mid-sized model families where throughput, latency, and cost control matter more than maximum model size.

Anyone can quote a GPU price. We deliver the whole outcome.

Compute with Hivenet is built for teams that need predictable performance, the right workload fit, and real operational control. If all you want is the rock-bottom marketplace rate, a spot-style GPU provider may look cheaper until your job gets preempted, your costs swing, and your engineers lose a day. If you need predictable pricing, regional deployment, repeatable environments, and a clean path from experiment to production, this is the decision Hivenet was built for.

Quality at the right price

Fixed, published GPU rates mean you know the spend before you run. Per-second billing keeps short jobs honest, charged for the time they actually use. Predictable performance at the right price, not the cheapest number on a marketplace.

The right instance, before you waste a run

Match the model, batch size, latency target, and operating model to the right hardware, instead of gambling on a GPU name and paying for the mistake.

Repeatable workloads

Templates, OS images, SSH workflows, and your own stack take you from notebooks and tests to repeatable production workloads without rebuilding every time.

Sovereign and compliant

Deploy suitable workloads across available regions, including France, the UAE, and the USA, on enterprise-grade infrastructure operated by Hivenet end-to-end, with the control and exit path to keep the workload yours.

Real help on serious workloads

Talk to Hivenet when the workload needs a human: instance choice, model fit, production setup, or migration. No ticket-queue limbo.

Ready for your whole team

Organizations, shared billing, role-based access, and the Public Compute API for when the workload is bigger than one person clicking through a console.

Run any workload, big or small.

Whatever the size, there's a right path for it. Start with vCPU for general-purpose compute, step up to RTX 4090 for testing and research, RTX 5090 for specialized AI throughput, and RTX 6000-series for enterprise-scale work. Small experiment or production deployment, the platform fits the job.

Workload

Recommended path

Why

Web app, API, dev database, or background service

vCPU

General-purpose workloads usually do not need GPU acceleration

Batch scripts or preprocessing

vCPU or GPU

Start with vCPU unless the job uses CUDA or parallel acceleration

Jupyter, PyTorch, or model experiments

RTX 4090 or RTX 5090

GPU acceleration helps with ML workflows and iterative testing

ComfyUI, Stable Diffusion, Flux, or rendering

RTX 4090 or RTX 5090

Image and rendering workloads benefit directly from GPU acceleration

Sub-13B model serving

RTX 4090

Strong cost-performance for smaller open-weight models

Sub-30B model serving

RTX 5090

More VRAM and memory bandwidth for stronger single-GPU throughput

70B-class model testing

8× RTX 5090 host

Possible with tensor parallelism, but latency and throughput should be tested

Enterprise-scale production workloads

RTX 6000-series

Enterprise headroom for larger model classes and sustained serving

Managed OpenAI-compatible endpoint

Hivenet Inference

Compute gives you an instance; Inference gives you a managed API endpoint

Swipe left to see more

Right-size the instance. Maximize compute per euro.

Compute with Hivenet hands you enterprise-grade infrastructure with full control of the instance, the environment, and the stack. Each GPU has a job.

RTX 4090 · The testing and research GPU

The cost-efficient workhorse for testing, research, and dev.

Reach for RTX 4090 when the workload is smaller, cost-sensitive, or still in development. It is the GPU your team prototypes on, runs research experiments on, and develops against before scaling up. It works best on Hivenet because per-second billing and predictable rates make iterative testing cheap to repeat.

Specs

24 GB GDDR6X VRAM

About 1 TB/s memory bandwidth

Ada Lovelace Tensor Cores

From €0.40/hr

Per-second billing

Best for

Sub-13B inference

Llama 3.1 8B

Mistral 7B

Qwen 7B/14B

Phi-4

Fine-tuned 7B-class models

ComfyUI and image generation

Cost-efficient GPU development

Launch RTX 4090

RTX 5090 · The specialized AI GPU

Specialized single-GPU throughput for demanding AI.

Reach for RTX 5090 when you need more single-GPU headroom, stronger throughput, and support for larger practical model classes. It is the specialist for production inference and high-concurrency serving. Best on Hivenet because you get this class of GPU at predictable rates and can benchmark against bare metal.

Specs

32 GB GDDR7 VRAM

1.79 TB/s memory bandwidth

5th-gen Tensor Cores

PCIe 5.0

From €0.75/hr

Per-second billing

Best for

Sub-30B inference

High-concurrency small and medium models

Qwen, Llama, Mistral, Gemma, Phi, and distilled DeepSeek workloads

Rendering and creative pipelines

CUDA-heavy experiments

HPC-style pipelines that fit the hardware profile

Launch RTX 5090

RTX 6000-series · The enterprise GPU

Enterprise-grade headroom for production-scale workloads.

Reach for the RTX 6000-series when production scale, larger model classes, or enterprise deployments need more headroom than a single consumer-class GPU. It is the enterprise tier for teams running serious, sustained workloads. Best on Hivenet because you get enterprise capacity with the same predictable pricing and end-to-end control.

Specs

32 GB GDDR7 VRAM

1.79 TB/s memory bandwidth

5th-gen Tensor Cores

PCIe 5.0

Per-second billing

Best for

Enterprise production deployments

Larger model classes

Sustained high-utilization serving

Demanding multi-tenant or regulated workloads

Talk to sales

vCPU · General-purpose compute

General-purpose compute, no GPU premium.

Reach for vCPU when your workload does not need GPU acceleration.

Specs

CPU-only instances

Flexible vCPU options

Fixed RAM, disk, and bandwidth options

Per-second billing

Simple setup for everyday compute work

Configuration-based pricing

Best for

Web apps

APIs

Development environments

Testing databases

Automation

CI/CD

Preprocessing

Background services

Launch vCPU

Match the open-source model to the right GPU.

Use the fit table to match common open-source model classes to RTX 4090, RTX 5090, or RTX 6000-series capacity before you launch.

Model or workload class

Fit on RTX 4090

Fit on RTX 5090

Notes

Llama 3.1 8B

Strong fit

Good for efficient inference and development

Mistral 7B

Strong fit

Good for low-cost inference and experiments

Qwen 7B/14B

Strong fit

Good fit for smaller open-weight workloads

Phi-4

Strong fit

Good for smaller model workflows

Qwen 32B-class workloads

Test fit

Strong fit

RTX 5090 gives more headroom

Mistral Small 3 24B

Test fit

Stronger fit

Better suited to RTX 5090 for practical serving

DeepSeek-R1 distilled 7B/8B/14B

Strong fit

Good distilled-model candidates

DeepSeek-R1 distilled 32B

Test fit

Stronger fit

Benchmark before production use

70B-class models

Not single-GPU fit

Multi-GPU only

Requires tensor parallelism and latency testing

Full DeepSeek V3

Not a fit

Requires larger frontier-class infrastructure

Kimi K2 production serving

Not a fit

Requires larger frontier-class infrastructure

Swipe left to see more

Ask us to check your workload

Clear rates. Billed by the second. Priced for the outcome.

Some jobs run for minutes. Some run overnight. Some sit idle while a test takes longer than expected. Per-second billing helps you pay for what you actually use, with predictable performance at the right price.

RTX 4090

Starting price

€ - /h

Per-second billing

→

Sub-13B inference, fine-tuning, image generation, cost-efficient GPU work

RTX 5090

Starting price

€ - /h

Per-second billing

→

Sub-30B inference, demanding GPU jobs, strongest single-GPU performance on Compute

vCPU

Starting price

€ - /h

Per-second billing

→

Web APIs, dev databases, CI/CD, background services, general-purpose compute

No long commitment required. Use prepaid credits, stop instances when you do not need active compute, and terminate instances when you are done and no longer need local storage.

Compare pricing

Need an instance or a managed endpoint?

Use Compute with Hivenet when you want GPU or CPU infrastructure and control over the stack. Use Hivenet Inference when you want an OpenAI-compatible managed endpoint without having to operate the inference layer yourself.

I want root-level control over an instance

Compute with Hivenet

I want to run vLLM, TGI, SGLang, llama.cpp, PyTorch, or Docker myself

Compute with Hivenet

I want an OpenAI-compatible endpoint without operating the stack

Hivenet Inference

I want help building a private AI system on my data

Hivenet Private AI

I want CPU-only compute for general-purpose work

Compute with Hivenet vCPU

Build it once. Use it forever.

Set up the environment one time, save it as a template, and relaunch the exact same workload whenever you need it. No rebuilding from scratch, no drift between runs. Choose a container or virtual machine, pick a region, select GPU or vCPU, choose an OS or template, add your SSH key, configure HTTPS, TCP, or UDP if needed, and connect.

SSH into a VM

Run Docker

Open a Jupyter notebook

Serve a model with vLLM

Expose an API over HTTPS

Run PyTorch experiments

Use ComfyUI

Run background jobs

Reuse templates for repeat workloads

# Connect to your instance
ssh ubuntu@your-instance

# Run your stack in a container
docker run --gpus all -p 8000:8000 your-image

# Or serve an open source model with vLLM
python -m vllm.entrypoints.openai.api_server \
--model Qwen/Qwen2.5-7B-Instruct \
--port 8000

# Relaunch the same environment, every time
curl -X POST https://api.hivenet.example/v1/instances \
-H "Authorization: Bearer $HIVENET_API_KEY" \
-d '{"template": "my-vllm-template", "gpu": "rtx5090", "region": "fr"}'

Start a Compute instance

Built for real teams and real automation.

Compute with Hivenet supports the way technical teams actually work: shared access, shared billing, role separation, and programmatic control for when console-only workflows stop being enough.

Team access without shared logins

Create an organization, invite teammates, assign roles, and switch between personal and organization workspaces.

Shared billing

Keep organization credits separate from personal credits and let the right person manage top-ups and payment methods.

Role-based access

Separate who owns billing, who manages members and resources, and who creates and operates instances.

Public Compute API

Create, start, stop, terminate, list, tag, and update instances programmatically. Manage SSH keys, billing, organization workflows, and quota requests through versioned API paths.

Built for automation

Use request IDs, machine-readable errors, pagination, rate-limit signaling, and the OpenAPI specification to connect Compute to scripts, CI/CD workflows, and internal tools.

Control the instance, not just the billing plan.

Use SSH, templates, OS images, GPU or vCPU instances, region choices, and per-second billing to run workloads your way. For teams with larger or production workloads, Hivenet can help review fit, architecture, and migration path before you commit.

Here is why you should trust us

Predictable GPU rates

Per-second billing

France, UAE, and USA regions

SSH access

Templates and OS images

GPU and vCPU options

Clear distinction between raw compute and managed inference

Workload guidance for larger deployments

FAQ

Common questions

Bring the workload.
‍Get the instance that nails it.

Start with vCPU for general-purpose compute, RTX 4090 for testing and research, RTX 5090 for specialized AI, or RTX 6000-series for enterprise-scale workloads. Talk to us if you want help choosing the right path the first time.

Launch an instance Compare pricing Talk to an engineer

GPU and CPU power that performs exactly as promised, every single run.

Tell us what you want to run.We'll point you to the right instance.

General compute

Testing and research

Specialized AI throughput

Enterprise workloads

Managed API endpoint

Teams already running compute on Hivenet

Jupyter notebooks, fast GPU access, green enzymes research.

Sustainable, European, sovereignty, fair pricing.

GPU reliability for scaling AI work with students.

A strong home for open-source AI workloads.

Qwen workloads

DeepSeek distilled models

Llama workloads

Mistral, Falcon, Gemma, Phi, and smaller models

Anyone can quote a GPU price. We deliver the whole outcome.

Quality at the right price

The right instance, before you waste a run

Repeatable workloads

Sovereign and compliant

Real help on serious workloads

Ready for your whole team

Run any workload, big or small.

Right-size the instance. Maximize compute per euro.

RTX 4090 · The testing and research GPU

The cost-efficient workhorse for testing, research, and dev.

RTX 5090 · The specialized AI GPU

Specialized single-GPU throughput for demanding AI.

RTX 6000-series · The enterprise GPU

Enterprise-grade headroom for production-scale workloads.

vCPU · General-purpose compute

General-purpose compute, no GPU premium.

Match the open-source model to the right GPU.

Clear rates. Billed by the second. Priced for the outcome.

RTX 4090

RTX 5090

vCPU

Need an instance or a managed endpoint?

I want root-level control over an instance

I want to run vLLM, TGI, SGLang, llama.cpp, PyTorch, or Docker myself

I want an OpenAI-compatible endpoint without operating the stack

I want help building a private AI system on my data

I want CPU-only compute for general-purpose work

Build it once. Use it forever.

Built for real teams and real automation.

Team access without shared logins

Shared billing

Role-based access

Public Compute API

Built for automation

Control the instance, not just the billing plan.

Common questions

Bring the workload.‍Get the instance that nails it.

30% Off Hivenet Plans!

Tell us what you want to run.
We'll point you to the right instance.

Bring the workload.
‍Get the instance that nails it.