promptel vs DSPy: Declarative Specification vs Programmatic Compilation
A practical comparison of promptel and DSPy for prompt engineering — when to use a declarative specification language, when to use a programmatic compiler, and how they compose.
Why we get asked this question
Of all the comparisons we field, promptel vs DSPy is the one we hear most often. They both touch the same problem — making prompts first-class engineering artefacts rather than ad-hoc strings — but they attack it from different angles and arrive at very different artefacts. If you have an hour to evaluate, this post is designed to save you most of it.
The 30-second version: promptel is a specification language; DSPy is a Python framework. They are complementary, not competing. Most production teams we talk to end up using both.
What promptel is
promptel is a declarative, typed, portable prompt specification language. You write a prompt as a YAML file that declares the input schema, the output schema, the model parameters, the constraints, and the prompt body:
# summarise.prompt.yaml
kind: Prompt
version: "1.0"
name: summarise
description: "Produce a 3-bullet summary of input text."
input:
  type: object
  properties:
    text:
      type: string
      minLength: 50
      description: "The text to summarise."
output:
  type: object
  required: [bullets]
  properties:
    bullets:
      type: array
      items: { type: string, maxLength: 120 }
      minItems: 3
      maxItems: 3
models:
  - name: openai-gpt4o
    provider: openai
    model: gpt-4o
    temperature: 0.3
  - name: anthropic-sonnet
    provider: anthropic
    model: claude-3-5-sonnet
    temperature: 0.3
body: |
  You are a precise summariser.
  Produce exactly 3 bullets, each ≤ 120 chars.
  No preamble, no explanation.
That file is the artefact. It can be diffed, reviewed in a PR, versioned independently of the application code, type-checked against a schema, and run against any of the listed models without code changes. The companion tool, blogus, extracts embedded prompts from your existing codebase and converts them into promptel specifications.
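To make "type-checked against a schema" concrete, here is a minimal sketch in plain Python of checking a model response against the output block of the spec above. It is not promptel's validator: the function name validate_summary_output is hypothetical, and it hand-codes only the constraints that appear in summarise.prompt.yaml (a required bullets array of exactly three strings, each at most 120 characters).

```python
# Hypothetical helper, not part of promptel: hand-checks the constraints
# declared in the `output:` block of summarise.prompt.yaml.

def validate_summary_output(payload: object) -> list[str]:
    """Return a list of violations; an empty list means the payload conforms."""
    if not isinstance(payload, dict):
        return ["output must be an object"]
    if "bullets" not in payload:
        return ["missing required field: bullets"]

    bullets = payload["bullets"]
    if not isinstance(bullets, list):
        return ["bullets must be an array"]

    errors: list[str] = []
    if len(bullets) != 3:  # minItems: 3 and maxItems: 3
        errors.append(f"bullets must contain exactly 3 items, got {len(bullets)}")
    for i, bullet in enumerate(bullets):
        if not isinstance(bullet, str):
            errors.append(f"bullets[{i}] must be a string")
        elif len(bullet) > 120:  # maxLength: 120
            errors.append(f"bullets[{i}] exceeds 120 characters")
    return errors
```

In a real project a general-purpose JSON Schema validator would do the same job without hand-written checks; the sketch only shows that the spec carries enough information to reject a malformed response mechanically.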
What DSPy is
DSPy is a Python framework for programming, rather than specifying, LLM behaviour. You write a Python module that describes the computation as a sequence of LLM calls, and a DSPy optimiser (historically called a “teleprompter”, the term used in the rest of this post) tunes the prompts and the few-shot examples empirically against a metric you provide:
import dspy

# Assumes a language model has already been configured for DSPy, and that
# summary_quality, eval_set, and val_set are defined elsewhere.


class Summarise(dspy.Signature):
    """Produce a 3-bullet summary of input text."""

    text: str = dspy.InputField()
    bullets: list[str] = dspy.OutputField()


summariser = dspy.ChainOfThought(Summarise)
teleprompter = dspy.MIPROv2(metric=summary_quality)
optimised = teleprompter.compile(
    summariser,
    trainset=eval_set,
    valset=val_set,
)
The output is also an artefact — but it’s a Python module with optimised prompts and demos baked in, not a declarative file. The artefact is what you ship, but it’s not what you diff, review, or hand to a non-engineer.
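The summary_quality metric in the snippet above is not defined in this post. As a rough sketch of what one could look like: DSPy metrics are ordinary Python callables that receive an example, a prediction, and an optional trace, and return a score. The version below is an assumption for illustration only; it scores format compliance (three bullets, each at most 120 characters) and says nothing about whether the summary is faithful to the input.

```python
# Hypothetical metric for illustration; a production metric would also judge
# content quality, not just format.

def summary_quality(example, prediction, trace=None) -> float:
    """Score in [0, 1]: how well `prediction.bullets` matches the 3x120 format."""
    bullets = getattr(prediction, "bullets", None)
    if not isinstance(bullets, list) or not bullets:
        return 0.0

    # Half the score for having exactly three bullets.
    count_score = 0.5 if len(bullets) == 3 else 0.0

    # The other half for the fraction of bullets that are non-empty strings
    # within the 120-character limit.
    within_limit = sum(
        1 for b in bullets if isinstance(b, str) and 0 < len(b.strip()) <= 120
    )
    length_score = 0.5 * within_limit / len(bullets)

    return count_score + length_score
```

A format-only metric like this is cheap to compute, which matters because the optimiser calls it many times; it is usually combined with a content check before being trusted.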
The eleven dimensions that matter
| Dimension | promptel | DSPy |
|---|---|---|
| What you write | A YAML file | Python code |
| What the artefact is | A spec file checked into git | A compiled Python module |
| Who can review it | Engineers, PMs, legal — anyone who can read YAML | Engineers only |
| Where the optimisation happens | Outside the spec — the spec is a contract, the optimiser is separate | Inside the framework — the compiler runs at build time |
| Portability | Run on any model with a promptel runtime (currently JS, Python WIP) | Tied to Python and the DSPy runtime |
| Version control story | Git diff of a YAML file is the prompt diff | Git diff of a Python module whose prompts are templated strings is hard to read |
| Type safety | Schema-enforced inputs and outputs | Optional via dspy.Signature, but enforcement is weaker |
| Telemetry | Bring your own (we recommend perishable for token proxying) | Built-in DSPy tracing, but exporting requires work |
| Multi-provider | Native: the same spec runs against each entry under models: | Supported, but each provider needs an explicit configuration |
| Bootstrap / few-shot learning | Manual; you bring your own examples | Native — the teleprompter generates and selects demos |
| Cost optimisation | Bring your own (route-switch plugs in) | Indirect: the teleprompter optimises the metric you supply, so cost has to be folded into that metric |
When to use which
Use promptel when:
- The prompt is a contract — it crosses an organisational boundary (engineer ↔ PM, vendor ↔ customer) and needs to be diffable, reviewable, and portable.
- You are writing prompts that downstream teams (or customers) will run against their own models. The spec is the API.
- You are regulated (GDPR, EU AI Act, HIPAA, SOX) and need the prompt itself to be auditable. promptel + mpl gives you a tamper-evident prompt + audit trail combo.
- You need the prompt to be portable across providers without rewriting the integration code.
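As an illustration of the portability point, the sketch below expands one spec into one request per models entry. The spec is shown as an already-parsed dict, and build_requests and the request shape are hypothetical stand-ins rather than the promptel runtime's API; the point is only that the calling code never names a provider.

```python
# Hypothetical sketch: the dict mirrors a parsed summarise.prompt.yaml and the
# request shape is invented for illustration, not taken from promptel.

SPEC = {
    "name": "summarise",
    "models": [
        {"name": "openai-gpt4o", "provider": "openai",
         "model": "gpt-4o", "temperature": 0.3},
        {"name": "anthropic-sonnet", "provider": "anthropic",
         "model": "claude-3-5-sonnet", "temperature": 0.3},
    ],
    "body": "You are a precise summariser.\n"
            "Produce exactly 3 bullets, each ≤ 120 chars.\n"
            "No preamble, no explanation.\n",
}


def build_requests(spec: dict, text: str) -> list[dict]:
    """Return one provider-neutral request per entry under `models:`."""
    return [
        {
            "target": entry["name"],
            "provider": entry["provider"],
            "model": entry["model"],
            "temperature": entry["temperature"],
            "system": spec["body"],
            "input": {"text": text},
        }
        for entry in spec["models"]
    ]


planned = build_requests(SPEC, "Some long article text ...")
for request in planned:
    print(request["target"], "->", request["provider"], request["model"])
```

Adding a third provider in this sketch means adding one more entry to the models list; build_requests and its callers stay unchanged.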
Use DSPy when:
- You are optimising prompts empirically against a metric and need the compiler to generate few-shot demos.
- You are doing research on prompt optimisation itself and want to compare teleprompters (MIPROv2, COPRO, etc.).
- Your prompt logic is programmatic — branches, retries, tool calls — and the spec is too rigid.
- You are prototyping and want to iterate fast without writing schema files.
Use both when:
- You want the spec in promptel (so it can be reviewed, diffed, audited) and the optimisation in DSPy (so the teleprompter can find the best prompts and demos for that spec). We do this internally: promptel is the artefact that ships, DSPy is the build tool that optimises the body: field before it is committed.
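A minimal sketch of that hand-off, under stated assumptions: the optimised instruction text and demos have already been pulled out of the compiled DSPy program (how to do that depends on your DSPy version and is not shown), the spec is a parsed dict, and render_body and with_optimised_body are hypothetical helpers rather than promptel or DSPy functions.

```python
import copy


def render_body(instructions: str, demos: list[dict]) -> str:
    """Join the optimised instructions and few-shot demos into one body string."""
    parts = [instructions.strip()]
    for demo in demos:
        bullets = "\n".join(f"- {b}" for b in demo["bullets"])
        parts.append(f"Example input:\n{demo['text']}\nExample output:\n{bullets}")
    return "\n\n".join(parts) + "\n"


def with_optimised_body(spec: dict, instructions: str, demos: list[dict]) -> dict:
    """Return a copy of the spec in which only `body` has changed."""
    updated = copy.deepcopy(spec)
    updated["body"] = render_body(instructions, demos)
    return updated


spec = {"name": "summarise", "version": "1.0", "body": "You are a precise summariser.\n"}
new_spec = with_optimised_body(
    spec,
    instructions="Summarise the text as exactly 3 bullets of at most 120 characters.",
    demos=[{"text": "A long article ...", "bullets": ["First.", "Second.", "Third."]}],
)
```

Because only the body field changes, the git diff of the spec after a re-optimisation run stays confined to that field, which keeps the review focused on the prompt text itself.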
A workflow that uses both
The team at Skelf Research uses promptel for everything that crosses a human boundary, and DSPy for everything that benefits from automated optimisation. The pattern is:
- Spec the prompt in promptel. Write the input schema, the output schema, the model list, the constraints. This is what the PM, the reviewer, and the auditor see.
- Compile the body with DSPy. Use the teleprompter to find the best body and few-shot demos against a labelled eval set. The output of compilation is the content of the body: field.
- Commit both. The promptel spec goes in prompts/summarise.prompt.yaml. The eval set, the teleprompter config, and the metric live in prompts/summarise.eval/ and are also checked in.
- Run anywhere. The same promptel spec runs on OpenAI, Anthropic, or a local llama.cpp deployment, with no code changes.
- Audit with mpl. If you need tamper-evident audit trails, run the spec through mpl-proxy and you get a cryptographic record of every invocation, its inputs, its outputs, and the model that served it.
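For intuition about what "tamper-evident" means in the last step, the sketch below chains invocation records with SHA-256 so that editing any earlier record invalidates its hash and breaks verification. This is a generic hash chain written for illustration; it is an assumption, not a description of how mpl or mpl-proxy actually stores records, and the field names are invented.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(log: list[dict], record: dict) -> None:
    """Append a record whose hash covers both its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain from the start; any edited record breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append_record(log, {"prompt": "summarise", "model": "gpt-4o", "input": "...", "output": "..."})
append_record(log, {"prompt": "summarise", "model": "claude-3-5-sonnet", "input": "...", "output": "..."})
```

Calling verify(log) returns True for the untouched log and False once any stored record is altered, which is the property an auditor relies on.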
Things people get wrong
- They treat them as alternatives. They aren’t. promptel is the artefact; DSPy is the build tool.
- They use DSPy for compliance. DSPy is great for optimisation but its audit story is weak. If you need compliance, use mpl on top of whatever prompts you ship, whether they came from promptel or DSPy.
- They use promptel for prototyping. The schema overhead is real; for throwaway code, raw strings are fine. Reserve promptel for prompts that will outlive the afternoon.
- They assume “declarative” means “simple”. promptel schemas can express complex things — JSON-Schema-style composition, conditional fields, format constraints. The first week is overhead; the second month is leverage.
What to read next
- Formalising Prompts as First-Class Research Objects — the promptel thesis in full
- Prompt Lifecycle Management: From Extraction to Deployment — the blogus side of the workflow
- Intelligent LLM Routing: Spending Compute Where It Matters — the routing layer that costs the spec across providers
- DSPy documentation — for the teleprompter side
- The promptel repository: github.com/Skelf-Research/promptel