Cluster Details & Setup
Ellf lets you plug in your own data processing cluster, hosted locally or on your cloud infrastructure and under your control. The cluster hosts all data, models and custom code you use and create, serves the annotation interface, and performs actions like model training and data analysis. This means that no sensitive data ever leaves your servers or passes through ours.
The clusters CLI lets you set up and launch your own cluster without requiring specific infrastructure or DevOps expertise. It also lets you manage your cluster and check its status. In the UI, clicking Cluster Setup will show details and configuration options. In the top bar of the app, you will also see the cluster status.
Installation
Under the hood, the cluster uses Kubernetes and can run locally on your machine, or in the cloud with your own cloud provider, like AWS, GCP or Azure.
Use Ellf to set up the cluster for you
If you’ve connected Ellf to your coding assistant, it will be able to set up your cluster for you, either locally on your machine or on a cloud account you control. This is also helpful if you’re running into infrastructure issues unrelated to Ellf.
Installing the cluster locally
Setting up the cluster on your local machine is the quickest and easiest option to get started. Under the hood, it uses K3s, a lightweight Kubernetes distribution.
Prerequisites
- a Linux machine (bare metal or VM) with at least 4 GB RAM and 20 GB disk
- K3s installed:

  ```
  $ curl -sfL https://get.k3s.io | sh -
  ```

- Helm installed:

  ```
  $ curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  ```

- the Ellf CLI installed and authenticated
Once K3s is running, make sure your kubeconfig is accessible:
```
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $USER ~/.kube/config
```

If your system defaults to the K3s kubeconfig path instead of ~/.kube/config, set the KUBECONFIG variable in your shell profile:

```
export KUBECONFIG=~/.kube/config
```

Verify that kubectl can reach the cluster:

```
$ kubectl get nodes
✔ your-machine   Ready   control-plane   …
```

Register, configure and deploy
Next, you can use the infra CLI to register your cluster with Ellf. For a local cluster, use localhost as the domain and k3s as the cloud provider.
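The exact invocation isn't shown at this point; based on the register command used later on this page for cloud clusters, a local registration might look like this (the cluster name is a placeholder):

```shell
# Assumed form, mirroring the "ellf infra register" command shown
# later on this page for cloud clusters; the name is a placeholder.
$ ellf infra register \
    --name local-cluster \
    --domain localhost \
    --cloud-provider k3s
```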
This writes a cluster-creds.json file with the credentials needed for the next steps. Next, install cert-manager for TLS certificate management:
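The install command isn't shown here; based on the setup invocation used later on this page for custom domains, a sketch (flags assumed) could be:

```shell
# Assumed form, based on the "ellf infra setup" invocation used later
# on this page; --acme-email is omitted because local clusters can't
# use Let's Encrypt. Verify flags against the CLI's --help output.
$ ellf infra setup --cert-manager
```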
K3s ships with Traefik pre-installed, so you don’t need to pass --traefik. The CLI detects this automatically and skips Traefik installation if it’s already present.
Generate the Helm values file from your credentials:
```
$ ellf infra init-values \
    --creds cluster-creds.json \
    --cloud-provider k3s \
    --domain localhost
```

Review the generated values.yaml and update the PostgreSQL password, then deploy:
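The deploy command isn't shown here; since this page mentions `infra deploy` in the AWS section, deployment plausibly looks like this (any additional flags, such as a values-file path, are not documented here):

```shell
# "ellf infra deploy" is referenced elsewhere on this page; extra
# flags (e.g. a path to values.yaml) are an assumption.
$ ellf infra deploy
```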
Once the deployment completes, verify the cluster is healthy:
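Since the Ellf namespace isn't specified on this page, a provider-agnostic sketch is to check that all pods are running with kubectl:

```shell
# Lists pods across all namespaces; healthy components should show
# STATUS "Running". The Ellf namespace name is not documented here.
$ kubectl get pods --all-namespaces
```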
Installing the cluster in the cloud
For a flexible, stable and production-ready workflow, you typically want to host the cluster in the cloud and share it across your organization. This requires you to have access to a project in your cloud provider account, with permission to install new resources.
Google Cloud (GCP)
The CLI includes Terraform support for provisioning GCP infrastructure. First, create a GCP project and authenticate:
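A typical sequence with the standard gcloud CLI might be (the project ID is a placeholder):

```shell
# Standard gcloud commands; "my-ellf-project" is a placeholder.
$ gcloud projects create my-ellf-project
$ gcloud config set project my-ellf-project
$ gcloud auth login
# Terraform reads application-default credentials:
$ gcloud auth application-default login
```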
Then register and deploy the cluster following the same steps as for a local installation, using gcp as the cloud provider and your cluster’s public domain as the --domain.
```
$ ellf infra register \
    --name my-cluster \
    --domain cluster.yourdomain.com \
    --cloud-provider gcp
```
For GCP clusters with a public domain, you can use Let’s Encrypt for automatic TLS certificates by passing --acme-email during setup and when generating values:
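Combining the flags described above with the commands shown elsewhere on this page, a sketch might be:

```shell
# Assumed combination of flags documented on this page; verify
# against the CLI's --help output.
$ ellf infra setup \
    --traefik \
    --cert-manager \
    --acme-email you@example.com
$ ellf infra init-values \
    --creds cluster-creds.json \
    --cloud-provider gcp \
    --domain cluster.yourdomain.com \
    --acme-email you@example.com
```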
AWS
Coming soon: AWS Terraform modules are still under development. In the meantime, you can deploy to an existing EKS cluster by using aws as the cloud provider with infra register and infra deploy.
TLS & Remote Access
If your cluster is running on a machine without a public domain (e.g. a workstation on your local network), you’ll need TLS and an SSH tunnel to access it from another machine. Modern browsers require HTTPS for features like service workers, the clipboard API and secure WebSocket connections.
Setting up TLS
The CLI can create a self-signed certificate authority (CA) and issue a TLS certificate for your cluster using cert-manager. This is the recommended approach for local clusters where Let’s Encrypt isn’t an option (ACME HTTP-01 challenges require a publicly reachable domain).
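The exact subcommand isn't shown on this page; given the `tls --status` command documented below, it is plausibly along these lines (the flag is an assumption):

```shell
# Hypothetical invocation; only "ellf infra tls --status" is
# confirmed by this page.
$ ellf infra tls --setup
```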
This creates a CA and leaf certificate via cert-manager and updates the broker ingress to serve HTTPS. It then prints the commands you need to run to trust the CA on your machine.
To get the trust commands in a format you can pipe directly to your shell:
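A hypothetical shape for this, with the flag name being an assumption not confirmed by this page:

```shell
# Hypothetical flag; check the CLI's --help for the real name.
$ ellf infra tls --trust-commands | sh
```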
You can check the current TLS status at any time:
```
$ ellf infra tls --status
✔ ClusterIssuer: prodigy-teams-local-ca
✔ Certificate: prodigy-teams-tls (expires 2027-03-24)
✔ Ingress TLS: prodigy-teams-tls
```

Accessing the cluster over SSH
To access a cluster running on a remote machine from your laptop, open an SSH tunnel that forwards the HTTPS port:
```
$ ssh -N -L 8443:localhost:443 your-workstation
```

Then point your local CLI at the tunnel:

```
$ ellf config set-cluster-host
```

You can now open https://localhost:8443 in your browser and use the CLI normally. The self-signed CA you installed ensures the browser trusts the certificate without warnings.
One-step remote setup
Moving chat assistant data to your cluster
When you start using Ellf, the in-app chat logs and artifacts that enable handover to your local coding assistant are stored in our database. This lets users get started with project planning immediately, before setting up and installing their cluster. However, once your cluster is running, you can migrate the assistant data.
Coming soon: This feature is still under construction.
Worker classes
By default, the cluster is set up to provide the workers base, small, medium and gpu, with the following specs. If you want to define custom workers, you can add them to the config.yml before launching your cluster. The machine types are specific to what’s available via your cloud provider.
|  | base | small | medium | gpu |
|---|---|---|---|---|
| Machine type (GCP) | n2-standard-2 | n2-standard-2 | g1-small | n1-standard-2 |
| Cores per job | 2 | 2 | 2 | 2 |
| Memory per job | 1024 MB | 1024 MB | 1024 MB | 1024 MB |
| Max. memory per job | 4096 MB | 4096 MB | 4096 MB | 4096 MB |
| GPU | null | null | null | nvidia-tesla-t4 |
| Preemptible | false | false | false | false |
Note on preemptible instances
Preemptible VMs are instances that your cloud provider may terminate at any point to reclaim compute, which generally makes them cheaper. However, for workers running annotation tasks, it's important that the instances are not preemptible – otherwise, your running tasks may stop at random, your annotators won't be able to continue working and, in the worst case, you'll lose data and important state.
Customizing your cluster
Using a custom domain
Because the cluster needs to be available via HTTPS over the internet, it needs to be assigned a domain. By default, Ellf assigns it a subdomain of our domain ellf.run, e.g. yourorg.ellf.run, but you can also use your own custom domain or subdomain instead.
You’ll need access to manage your domain’s DNS and create an A record pointing your domain or desired subdomain to the public IP of your cloud project. The specifics of how and where to do this depend on your domain registrar or DNS management solution. You don’t need to worry about the SSL certificate and HTTPS – this is all taken care of automatically when you launch the cluster.
| yourdomain.com | Record | Example Value | Resulting Cluster Domain |
|---|---|---|---|
| @ | A | 34.160.5.141 | yourdomain.com |
| cluster | A | 34.160.5.141 | cluster.yourdomain.com |
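You can verify that the record has propagated with a standard DNS lookup (the domain is a placeholder):

```shell
# Standard DNS lookup; the output should show the public IP
# you configured in the A record.
$ dig +short cluster.yourdomain.com
```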
Once the DNS record is in place, register the cluster with your custom domain and deploy with Let’s Encrypt for automatic certificate management:
```
$ ellf infra register \
    --name my-cluster \
    --domain cluster.yourdomain.com \
    --cloud-provider gcp

$ ellf infra setup \
    --traefik \
    --cert-manager \
    --acme-email you@example.com
```

Generate the values file with your domain and ACME email, then deploy. The SSL certificate will be provisioned automatically on first deploy.
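Following the init-values and deploy commands used earlier on this page, this step might look like this (flags are assumed from the rest of the page; the earlier GCP section notes that --acme-email is also passed when generating values):

```shell
# Assumed form; flags mirror the local init-values invocation,
# plus --acme-email as described in the GCP section.
$ ellf infra init-values \
    --creds cluster-creds.json \
    --cloud-provider gcp \
    --domain cluster.yourdomain.com \
    --acme-email you@example.com
$ ellf infra deploy
```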
Using multiple clusters (advanced usage)
For most use cases, a single data processing cluster is sufficient and keeps your setup simple. However, for advanced use cases, it’s possible to connect more than one cluster to your Ellf organization. This gives you more control over where and how your data is stored, including using different cloud providers for different data, projects and privacy requirements. It also lets you manage permissions separately, e.g. to give users and developers access only to certain data.
To add a second cluster, run infra register with a different name and domain. Each cluster gets its own credentials file and values configuration:
```
$ ellf infra register \
    --name us-east-cluster \
    --domain us-east.yourdomain.com \
    --cloud-provider aws
```

After successful setup, you should now see a dropdown menu for the cluster status in the top bar, which lets you switch between the available clusters.