# Resourcing

> Resource requirements for deploying Onyx

## Resourcing Overview

| Resource | Minimum                     | Preferred                             |
| -------- | --------------------------- | ------------------------------------- |
| CPU      | 4 vCPU                      | 8+ vCPU                               |
| RAM      | 10 GB                       | 16+ GB                                |
| Disk     | 32 GB + \~2.5x indexed data | 500 GB for organizations \<5000 users |

<Warning>
  Vespa (the vector database used by Onyx) does not allow writes once disk usage hits 75%.
</Warning>
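
The two disk rules above can be sketched as a short script. This is a minimal illustration, not part of Onyx: the function names and the 5% warning margin are hypothetical, and the path you check should be the volume where Vespa actually stores its data.

```python
import shutil

BASE_DISK_GB = 32          # fixed minimum from the table above
DATA_MULTIPLIER = 2.5      # ~2.5x the size of the indexed data
VESPA_WRITE_CUTOFF = 0.75  # Vespa stops accepting writes at 75% disk usage


def minimum_disk_gb(indexed_data_gb: float) -> float:
    """Minimum disk from the table: 32 GB + ~2.5x indexed data."""
    return BASE_DISK_GB + DATA_MULTIPLIER * indexed_data_gb


def disk_usage_fraction(path: str = "/") -> float:
    """Fraction of the filesystem holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def writes_at_risk(fraction_used: float, margin: float = 0.05) -> bool:
    """True when usage is within `margin` of the 75% cutoff, or past it."""
    return fraction_used >= VESPA_WRITE_CUTOFF - margin


if __name__ == "__main__":
    print(f"Minimum disk for 100 GB of indexed data: {minimum_disk_gb(100):.0f} GB")
    used = disk_usage_fraction("/")
    print(f"Disk used: {used:.0%}, writes at risk: {writes_at_risk(used)}")
```

For 100 GB of indexed data, the estimate comes to 282 GB. The usage check only reports how close the disk is to the cutoff; it does not free any space.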

## Local Deployment (Docker)

You can control the resources available to Docker in the **Resources** section of the Docker Desktop settings menu.

<Info>
  Old, unused Docker images often take up sizeable disk space. To clean them up,
  run `docker system prune --all`. Note that this command also removes stopped containers,
  unused networks, and build cache.
</Info>

## Cloud Providers (AWS, GCP, etc.)

For small to mid-scale deployments, we recommend deploying Onyx to a single instance in your cloud provider of choice.

When choosing an instance, aim for the Preferred resources in the Resourcing Overview table above.

| Provider     | Recommended Instance Type                       |
| ------------ | ----------------------------------------------- |
| AWS          | `m7g.xlarge`                                    |
| GCP          | `e2-standard-4` or `e2-standard-8`              |
| Azure        | `D4s_v3`                                        |
| DigitalOcean | Meet the preferred resources in the table above |

<Accordion title="Vespa on older CPUs">
  **Vespa requires Haswell (2013) or later CPUs.**

  For older CPUs, use the `vespaengine/vespa-generic-intel-x86_64` image in your Docker Compose file.
  This generic image is slower.

  For more details, see [Vespa CPU Support](https://docs.vespa.ai/en/cpu-support.html).
</Accordion>

## Container-Specific Resourcing

For more efficient scaling, you can dedicate resources to each Onyx container using Kubernetes or AWS EKS.

See the [Onyx Helm chart](https://github.com/onyx-dot-app/onyx/blob/main/deployment/helm/charts/onyx/values.yaml)
`values.yaml` for our default requests and limits.

| Component                | CPU        | Memory  |
| ------------------------ | ---------- | ------- |
| `api_server`             | 1          | 2 Gi    |
| `background`             | 2          | 8 Gi    |
| `indexing_model_server`  | 2          | 4 Gi    |
| `inference_model_server` | 2          | 4 Gi    |
| `postgres`               | 2          | 2 Gi    |
| `vespa`                  | >= 4       | >= 8 Gi |
| `nginx`                  | 250m (1/4) | 128 Mi  |

<Info>
  The `vespa` recommendation is the minimum for a production deployment. With 50GB of documents,
  we recommend at least 10 CPU and 20Gi of memory.
</Info>

Altogether, this comes out to a total available node size of at least \~14 CPU and \~30GB of memory.
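
As a rough check of that total, the sketch below sums the values from the table. It assumes the `vespa` row is taken at its stated minimums (4 CPU, 8 Gi) and only handles the quantity suffixes that appear in the table (`m`, `Mi`, `Gi`); the helper names are hypothetical.

```python
# Per-component values copied from the table above.
# For vespa, the ">=" minimums (4 CPU, 8 Gi) are used.
COMPONENTS = {
    "api_server": ("1", "2Gi"),
    "background": ("2", "8Gi"),
    "indexing_model_server": ("2", "4Gi"),
    "inference_model_server": ("2", "4Gi"),
    "postgres": ("2", "2Gi"),
    "vespa": ("4", "8Gi"),
    "nginx": ("250m", "128Mi"),
}


def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes-style CPU quantity ("2", "250m") to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


def parse_memory_gi(quantity: str) -> float:
    """Convert a Kubernetes-style memory quantity ("2Gi", "128Mi") to Gi."""
    if quantity.endswith("Gi"):
        return float(quantity[:-2])
    if quantity.endswith("Mi"):
        return float(quantity[:-2]) / 1024
    raise ValueError(f"unsupported memory unit: {quantity}")


total_cpu = sum(parse_cpu(cpu) for cpu, _ in COMPONENTS.values())
total_memory_gi = sum(parse_memory_gi(mem) for _, mem in COMPONENTS.values())

print(f"CPU: {total_cpu} cores, memory: {total_memory_gi} Gi")
```

The raw sums are 13.25 CPU and 28.125 Gi, which the figures above round up to leave a little headroom on the node.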

## How Resource Requirements Scale

The main driver of resource requirements is the number of indexed documents.
This primarily affects the index component of Onyx (a [Vespa](https://vespa.ai/) vector database),
which is responsible for storing the vectorized documents and handling search requests.

<Note>
  Vespa's resource requirements scale linearly with the document count.
</Note>

Based on our experience with large-scale deployments, in addition to the previously mentioned minimums, Vespa needs:

* \~3GB of memory for each additional 1GB of documents
* \~1 CPU for each additional 2GB of documents

These are our rough estimates. Other factors that may affect resource requirements include:

* The embedding model
* Whether you have quantization and dimensional reduction enabled

### Resourcing Example

For a deployment with 10GB of text content, your `index` component will need:

* CPU: 4 + 10 \* 0.5 = 9 cores
* Memory: 4 + 10 \* 3 = 34GB
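
The arithmetic above can be sketched as a small helper. The function and class names are hypothetical, and the base of 4 CPU and 4 GB is taken from this example rather than from any Onyx tooling. Like the rules it encodes, the result is a rough estimate.

```python
from dataclasses import dataclass

CPU_PER_GB = 0.5      # ~1 CPU for each additional 2 GB of documents
MEMORY_GB_PER_GB = 3  # ~3 GB of memory for each additional 1 GB of documents


@dataclass
class VespaEstimate:
    cpu_cores: float
    memory_gb: float


def estimate_vespa(
    documents_gb: float,
    base_cpu: float = 4,        # base CPU used in the example above
    base_memory_gb: float = 4,  # base memory used in the example above
) -> VespaEstimate:
    """Linear estimate: base resources plus a per-GB increment."""
    return VespaEstimate(
        cpu_cores=base_cpu + CPU_PER_GB * documents_gb,
        memory_gb=base_memory_gb + MEMORY_GB_PER_GB * documents_gb,
    )


if __name__ == "__main__":
    estimate = estimate_vespa(10)
    print(f"CPU: {estimate.cpu_cores} cores, memory: {estimate.memory_gb} GB")
```

For 10GB of documents this returns 9 cores and 34GB, matching the figures above. Pass different `base_cpu` and `base_memory_gb` values if your baseline differs.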

If deploying on a single instance, this would be *in addition to* the base requirements. Overall, that would take us to >= 13 CPU and >= 50GB of memory.

Given these requirements, an `m7g.4xlarge` or a `c5.9xlarge` EC2 instance would be appropriate.
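
A minimal sketch of that instance check follows. The vCPU and memory figures are hard-coded assumptions based on AWS's published instance sizes and should be re-checked against the AWS documentation before you rely on them. The helper names are hypothetical, and GB and GiB are treated as interchangeable for this rough comparison.

```python
from typing import List, NamedTuple


class Instance(NamedTuple):
    name: str
    vcpu: int
    memory_gib: int


# Assumed sizes; verify against AWS's instance type pages.
CANDIDATES = [
    Instance("m7g.xlarge", 4, 16),
    Instance("m7g.4xlarge", 16, 64),
    Instance("c5.9xlarge", 36, 72),
]


def fits(instance: Instance, required_cpu: float, required_memory_gb: float) -> bool:
    """True when the instance meets both the CPU and memory requirement."""
    return instance.vcpu >= required_cpu and instance.memory_gib >= required_memory_gb


def suitable_instances(required_cpu: float, required_memory_gb: float) -> List[str]:
    """Names of candidate instances that satisfy the requirement."""
    return [i.name for i in CANDIDATES if fits(i, required_cpu, required_memory_gb)]


if __name__ == "__main__":
    # Requirement from the example: >= 13 CPU and >= 50GB of memory.
    print(suitable_instances(13, 50))
```

Under these assumed sizes, `m7g.4xlarge` and `c5.9xlarge` both pass, while the smaller `m7g.xlarge` does not.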

If deploying with Kubernetes or AWS EKS, this would give the following per-component resource allocation. The table allots `vespa` 10 CPU, slightly above the 9-core estimate:

| Component                | CPU | Memory |
| ------------------------ | --- | ------ |
| `api_server`             | 1   | 2 Gi   |
| `background`             | 2   | 8 Gi   |
| `indexing_model_server`  | 2   | 4 Gi   |
| `inference_model_server` | 2   | 4 Gi   |
| `postgres`               | 2   | 4 Gi   |
| `vespa`                  | 10  | 34 Gi  |

Total available node size: \~20 CPU and \~60GB of memory.

## Next Steps

<CardGroup cols={2}>
  <Card title="Guide: Deploy Onyx Locally" icon="microchip" href="/deployment/local/docker">
    Deploy Onyx locally with Docker.
  </Card>

  <Card title="Guide: Deploy on AWS" icon="microchip" href="/deployment/cloud/aws/ec2">
    Deploy Onyx on an EC2 instance.
  </Card>
</CardGroup>
