
Platform

When working on NLP projects, you typically need to run lots of things: data processing, annotation, training, evaluation, testing and many more iterative steps. This is hard to solve with only a coding assistant. Ellf comes with a platform that makes it easy to run and monitor your different processes, locally and in the cloud, and collaborate on them with your team.

Projects

Projects let you group tasks, actions and agents together within your organization, and manage user access and permissions on a per-project basis. For example, if you’re working on a new model or component, you would typically create a project for it and create all related tasks and actions within that project. You can manage projects in the web UI or via the ellf projects CLI.

Within a project, you can create, run and manage tasks, actions and agents:

Annotate entities (STARTED)
  Named Entity Recognition (ner) · Annotations: 2452 · Annotators: 3

Train a pipeline (STOPPED)
  Train a spaCy pipeline (train) · Executions: 42 · Last execution: 2 hours ago

Auto-labeler for NER (STARTED)
  Gemini Annotation Agent (gemini_agent)

The projects dropdown at the top shows all available projects and lets you switch between them, edit the current project and create a new one. Projects require a human-readable name and description shown across the app.

You can also create projects on the CLI via ellf projects create. When successful, it will output the details of the project that was created.

Creating a project
$ ellf projects create "My first project" "Getting started with Ellf"
============================ Project ============================
id            838e401c-f070-44de-b6a7-3db840e9dc7c
created       2026-01-16 10:25:22.511856+00:00
updated       2026-01-16 10:25:22.511863+00:00
org_id        8b428d0e-b44a-40b3-8886-53a9018f2c81
name          My first project
description   Getting started with Ellf

Once your project is created, it will show up in the overview in the UI, as well as the list of projects in ellf projects list. You can also get detailed info for a given project or delete a project.

If you’re working on the CLI, you can use the ellf config command to configure the default project, task, action or agent so you don’t have to keep repeating the name or ID in the commands. For example:

Setting the default project
$ ellf config project "My first project"
✔ Default project set as 838e401c-f070-44de-b6a7-3db840e9dc7c

You’ll then be able to leave out the --project argument on other commands and it will be picked up automatically. You can do the same to set the current working task, action or agent.

Projects also contain project plans created by Ellf’s project planning module. These are Markdown documents that are continuously updated and outline the end-to-end development plan, including step-by-step workflows and higher-level strategic goals. Ellf refers to the project plan across its different modules and also uses it to keep the in-app chat and your local coding assistant in sync.

PROJECT_PLAN.md

NLP Project Plan: Fraud Report Classifier

Problem Statement

Build a pipeline that processes analyst-written fraud investigation summaries and produces four outputs: fraud type (multi-class, 6 labels), affected product, urgency level, and legal escalation flag. Urgency and legal escalation are derived from business rules applied downstream of model predictions — not model outputs themselves. Starting from scratch on annotation.

Pipeline Overview

Analyst report text
  ├─→ [Fraud Type Classifier]        ← supervised textcat, 6-class exclusive
  ├─→ [Social Engineering Detector]  ← supervised textcat, binary
  ├─→ [Product Extractor]            ← rules (PhraseMatcher on known product names)
  ├─→ [Amount Extractor]             ← regex / MONEY NER for dollar amounts
  └─→ [Business Logic Layer]         ← NOT a model
        ├─→ urgency: fraud_type × amount × product → policy thresholds
        └─→ escalate: urgency == HIGH → legal team routing

Key architecture decision: Urgency and legal escalation are policy decisions, not language-understanding tasks. Internal thresholds ($100K, insider fraud type) are encoded as rules owned by the fraud team, not embedded in model weights.
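For illustration, the policy layer above can be sketched as plain rules. The $100K threshold and the high-risk type set here are placeholders for values the fraud team would own, not part of the plan itself:

```python
# Sketch of the business logic layer: policy rules, not a model.
# Threshold and high-risk types are illustrative placeholders.
HIGH_RISK_TYPES = {"insider_fraud"}
AMOUNT_THRESHOLD = 100_000  # $100K policy threshold

def urgency(fraud_type, amount):
    """Derive urgency from the predicted fraud type and extracted amount."""
    if fraud_type in HIGH_RISK_TYPES or amount >= AMOUNT_THRESHOLD:
        return "HIGH"
    return "NORMAL"

def escalate(fraud_type, amount):
    """Route to the legal team when urgency is HIGH."""
    return urgency(fraud_type, amount) == "HIGH"
```

Because the thresholds live in code owned by the fraud team, changing policy never requires retraining a model.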

Components

| Component | Approach | Rationale |
| --- | --- | --- |
| Fraud type | Supervised textcat, 6-class exclusive | Core NLP task; domain-specific; needs training |
| Social engineering vector | Supervised textcat, binary | Method flag separate from outcome type; binary is fast to annotate |
| Product extraction | PhraseMatcher rules | Analyst reports name products explicitly; rules are fast and auditable |
| Amount extraction | Regex / spaCy MONEY entity | Structured format; rules are sufficient |
| Urgency | Business logic rules | Thresholds ($100K, fraud type) are policy, not language |
| Legal escalation | Business logic rules | Derived from urgency; policy-owned |
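As one illustration of the "rules are sufficient" row, amount extraction could start as a small regex. The pattern and the `extract_amount` helper are hypothetical; spaCy's MONEY entities would cover messier formats:

```python
import re

# Illustrative pattern for explicit dollar amounts like "$120,000" or
# "$100K"; spaCy MONEY NER would handle the long tail of formats.
AMOUNT_RE = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})*|\d+)\s*([KkMm])?")

def extract_amount(text):
    """Return the first dollar amount in the text as a float, or None."""
    match = AMOUNT_RE.search(text)
    if match is None:
        return None
    value = float(match.group(1).replace(",", ""))
    if match.group(2):  # scale suffix, e.g. $100K or $2M
        value *= 1_000 if match.group(2).lower() == "k" else 1_000_000
    return value
```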

Data Strategy

  • Source: Analyst-written fraud investigation summaries (free text, high quality, domain-consistent language)
  • No existing labels — annotating from scratch
  • insider_fraud and insurance_fraud estimated at 10–15% each — random sampling sufficient

Annotation Plan

Fraud type classifier

  • Recipe: textcat.correct with LLM pre-annotation
  • Labels: account_takeover, money_laundering, card_fraud, application_fraud, insider_fraud, insurance_fraud (exclusive)
  • Target volume: 400–500 annotated examples
  • Evaluation split: Set aside ~100 examples before annotation starts (document-level split)
  • Pilot first: Annotate 50–75 manually (no LLM). Fix schema before scaling.

Social engineering vector

  • Recipe: textcat.binary
  • Label: social_engineering_vector (true/false)
  • Pass: Second pass after fraud type annotation is stable
  • Target volume: 200–300 examples

Schema decision

  • social_engineering is a method, not an outcome — removed from fraud type labels
  • Added as a separate binary attribute to avoid confusable label pairs

Evaluation Strategy

Test set

  • Hold out ~100 examples before any annotation begins (document-level split)
  • Never used in training; kept constant as the ground truth benchmark
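The document-level split can be as simple as a seeded shuffle over document IDs before any annotation tooling touches the data. A sketch, with the function name and defaults chosen here for illustration:

```python
import random

def holdout_split(doc_ids, n_test=100, seed=42):
    # Document-level split: shuffle IDs once with a fixed seed, peel off
    # the test set, and never annotate or train on those documents.
    rng = random.Random(seed)
    ids = list(doc_ids)
    rng.shuffle(ids)
    return ids[:n_test], ids[n_test:]
```

The fixed seed makes the split reproducible, so the same ~100 documents stay held out across the whole project.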

Metrics

  • Per-class F1 for all 6 fraud types — do not rely on macro average alone
  • insider_fraud and insurance_fraud tracked separately
  • Confusion matrix — expected confusables: card_fraud ↔ account_takeover, application_fraud ↔ money_laundering
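As a sketch, per-class F1 can be computed from parallel lists of gold and predicted labels with the stdlib alone, which makes it easy to track insider_fraud and insurance_fraud separately:

```python
from collections import Counter

def per_class_f1(gold, pred):
    # Precision/recall/F1 per label from parallel label lists; the macro
    # average alone can hide weak minority classes.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    scores = {}
    for label in set(gold) | set(pred):
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        scores[label] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return scores
```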

Baselines

  • Most-frequent-class baseline before any model evaluation
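The most-frequent-class baseline is a few lines and sets the floor any trained model has to beat (helper name chosen here for illustration):

```python
from collections import Counter

def majority_baseline(train_labels, eval_labels):
    # Accuracy of always predicting the most common training label.
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(label == majority for label in eval_labels) / len(eval_labels)
```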

Training curves

  • Train on 25/50/75/100% after each annotation batch
  • Rising at 100% → annotate more; flat → investigate schema or architecture
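One way to run the curve, assuming an already shuffled example list: train on nested prefixes at each fraction, so every larger training set contains the smaller ones and differences reflect data volume rather than sampling luck:

```python
def curve_slices(examples, fractions=(0.25, 0.5, 0.75, 1.0)):
    # Nested prefixes of the shuffled training data; train a model on
    # each slice and compare scores on the fixed held-out test set.
    return [examples[: int(len(examples) * f)] for f in fractions]
```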

Memorisation check

  • Train on pilot examples, evaluate on those same examples — must be near-perfect

Roadmap

| Phase | What | Output |
| --- | --- | --- |
| 1 — Pilot | Read reports, write guidelines, annotate manually | Stable schema + 75 examples |
| 2 — Baseline | Train first model, memorisation check | Go/no-go on schema |
| 3 — Scale annotation | LLM-assisted textcat.correct to 400–500 examples | Training dataset |
| 4 — Train & evaluate | Full training run, per-class F1, error analysis | v1 fraud type model |
| 5 — SE vector pass | Binary annotation pass + training | v1 SE vector model |
| 6 — Rules layer | Product PhraseMatcher, amount regex, urgency/escalation rules | Complete pipeline |

Next Steps

  1. Hold out ~100 docs as the test set before touching any annotation tool
  2. Read 50–75 reports manually to validate the schema
  3. Write one-page annotation guidelines with label definitions and tiebreaker rules
  4. Annotate the pilot batch using the textcat recipe
  5. Train a quick baseline and run the memorisation check
  6. If training curves are rising, scale with LLM-assisted annotation
  7. Build the rules layer (product PhraseMatcher, amount regex) in parallel

Decision Log

  • Urgency: Business logic. Thresholds ($100K, insider fraud type) are policy rules.
  • Legal escalation: Business logic derived from urgency. Policy-owned.
  • Product extraction: PhraseMatcher rules first. Analyst reports name products explicitly.
  • Fraud type: Supervised textcat, 6-class exclusive. Core NLP task.
  • Social engineering: Binary vector flag, not a fraud type label. It's a method, not an outcome.
  • Evaluation-first: 100-doc test set held out before annotation begins.

Chat with Ellf

The platform includes a chat widget that lets you talk to Ellf from anywhere in the app. The agent can perform common actions and navigate the app for you. Typing @ lets you reference available resources, including tasks, actions, agents, assets, datasets and registered users in your organization.

Assign @Agents/Gemini to @|
  ↳ suggestions: Tasks/Text classification · Tasks/Annotate entities · Agents/GPT-5 · Datasets/legal-contracts · Datasets/medical-records · Assets/Documents

Things the in-app chat can do include:

  • answer questions about project planning and NLP strategy
  • review the project and suggest next steps, improvements or ideas
  • pre-populate the form for creating new tasks, actions and agents
  • start and stop tasks, actions and agents
  • assign agents and users to tasks
  • compute statistics about data and running jobs
  • navigate to specific pages and sub-pages

When you ask the agent to create, run or delete things for you, it will ask for an additional confirmation. Under the hood, it will perform actions under your user account, which means that the agent only ever has access to what the user can access.

Performing actions via the chat with confirmation

Start @Actions/Company Reports Training and show me the logs.

→ Start job: Company Reports Training (838e401c-f070-44de-b6a7-3db840e9dc7c)

Handover to coding assistant

At some point during a conversation, the work may need to move back to local development and writing code. In that case, you can hand the conversation over to your coding assistant, like Claude Code, via the /ellf-todo command.

Handover from in-app chat to Claude Code

Can you implement this?

Sure! Here’s the handoff summary:

Request ID: 523060d8-b2db-4fd5-b541-5ee119007501
Status: planned
Description: Write loader.py for documents from data warehouse

In your local Claude Code, run /ellf-todo to pick it up – it will load our full conversation context automatically.

✳︎Claude Code
› /ellf-todo
✔ Successfully updated coding request 523060d8-b2db-4fd5-b541-5ee119007501
⏺︎Context summary: Phase 1 of an e-commerce aspect extraction project. The task is to write loader.py for the internal data warehouse API that segments sentences and yields {id, text, date} dictionaries.

This will load the conversation and continue it in your coding assistant with the ability to create files and develop code. By default, it will pick up and use the latest conversation, but you can also call the command explicitly with the conversation ID or select from the available past conversations.


Annotation

Ellf ships with our popular annotation solution Prodigy for efficient data labelling, and brings it into a collaborative cloud environment. When you start a task, Ellf will spin up the annotation server using one of the available recipes for NLP tasks like span annotation, text classification or relation extraction. You can then assign human annotators as well as automated annotation agents, and view their progress in the app and on the CLI using the tasks command. The annotation process runs entirely on your cluster, so your data stays fully private and under your control.

Annotate entities (STARTED)
  Named Entity Recognition (ner)

Annotate categories (STOPPED)
  Text Classification (textcat)

The app shows you an overview of your available tasks, running and completed annotation sessions, as well as the created data. To create annotations yourself, click the Annotate now button on the task page. Annotators assigned to a task can access only the annotation UI.

Ellf’s data annotation module also makes your coding assistant proficient at annotation best practices and helps you design your label scheme, structure your annotation task and configure the right annotation interface. The Prodigy module knows Prodigy’s developer API and helps you implement fully custom workflows and interfaces.

  • Prodigy documentation
  • Annotation interfaces


User management

Organizations are your top-level account for Ellf and include your team members and projects. You’d typically have one organization for your company, although it’s also possible to have multiple orgs for different groups and departments if you need more fine-grained access control. To invite users to your organization, click Settings → Invite in the platform.

Roles and Permissions

Ellf provides three basic roles for users:

  • Administrator: has access to everything and can manage the organization.
  • Developer: has access to everything needed for working with and customizing Ellf.
  • Annotator: only has access to the annotator dashboard and annotation UI for projects and tasks they’re added to.

Permissions by role (Administrator / Developer / Annotator):

  • Manage organization and billing
  • Invite developers and annotators
  • Access projects dashboard
  • Access annotator dashboard
  • Manage all projects
  • Manage project they’re in
  • Create, manage and run tasks, actions and agents
  • Upload data and code to cluster
  • Manage data and code on cluster
  • Add new clusters
  • Create annotations

About developer access to the cluster

Developers need to be able to upload code to the cluster and run it, so there’s no point in restricting their access within the cluster, since this could be circumvented by code anyway. If you need more fine-grained permissions for different cluster resources, you can set up and connect multiple clusters.

The in-app chat agent will perform all actions as the currently authenticated user, so it is only able to do what the user has access to.

Read next
Cluster Details & Setup

from the makers of spaCy and Prodigy

© 2026 Explosion