YAML Specification
Overview
Dagu workflows are defined using YAML files. Each file represents a DAG (Directed Acyclic Graph) that describes your workflow steps and their relationships.
Basic Structure
# Workflow metadata
description: "What this workflow does"
tags: [production, etl] # Optional: for organization
# Scheduling
schedule: "0 * * * *" # Optional: cron expression
# Execution control
maxActiveRuns: 1 # Max concurrent runs
maxActiveSteps: 10 # Max parallel steps
timeoutSec: 3600 # Workflow timeout (seconds)
# Parameters
params:
- KEY: default_value
- ANOTHER_KEY: "${ENV_VAR}"
# Environment variables
env:
- VAR_NAME: value
- PATH: ${PATH}:/custom/path
# Workflow steps (type: graph requires explicit depends)
type: graph
steps:
- name: step-name # Optional
command: echo "Hello"
depends: previous-step # Optional
# Lifecycle handlers
handlerOn:
success:
command: notify-success.sh
failure:
command: cleanup-on-failure.sh
Root Fields
Metadata Fields
| Field | Type | Description | Default |
|---|---|---|---|
name | string | Workflow name | Filename without extension |
description | string | Human-readable description | - |
tags | array | Tags for categorization | [] |
group | string | Group name for organization | - |
Scheduling Fields
| Field | Type | Description | Default |
|---|---|---|---|
schedule | string/array | Cron expression(s) | - |
skipIfSuccessful | boolean | Skip if already succeeded today | false |
restartWaitSec | integer | Wait seconds before restart | 0 |
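For example, a short sketch pairing a daily schedule with the skip and restart-wait fields (values are illustrative):
schedule: "0 2 * * *"      # Daily at 2 AM
skipIfSuccessful: true     # Skip if today's run already succeeded
restartWaitSec: 60         # Wait 60 seconds before a scheduled restart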
Schedule Formats
# Single schedule
schedule: "0 2 * * *"
# Multiple schedules
schedule:
- "0 9 * * MON-FRI" # 9 AM weekdays
- "0 14 * * SAT,SUN" # 2 PM weekends
# With timezone
schedule: "CRON_TZ=America/New_York 0 9 * * *"
# Start/stop schedules
schedule:
start:
- "0 8 * * MON-FRI" # Start at 8 AM
stop:
- "0 18 * * MON-FRI" # Stop at 6 PM
restart:
- "0 12 * * MON-FRI" # Restart at noonExecution Control Fields
| Field | Type | Description | Default |
|---|---|---|---|
maxActiveRuns | integer | Max concurrent workflow runs (-1 = unlimited) | 1 |
maxActiveSteps | integer | Max parallel steps | 1 |
timeoutSec | integer | Workflow timeout in seconds | 0 (no timeout) |
delaySec | integer | Initial delay before start (seconds) | 0 |
maxCleanUpTimeSec | integer | Max cleanup time (seconds) | 5 |
preconditions | array | Workflow-level preconditions | - |
runConfig | object | User interaction controls when starting DAG | - |
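A short illustration combining the execution control fields above (values are arbitrary):
maxActiveRuns: 1        # Only one run of this DAG at a time
maxActiveSteps: 5       # At most 5 steps run in parallel
timeoutSec: 3600        # Fail the run after one hour
delaySec: 10            # Wait 10 seconds before starting
maxCleanUpTimeSec: 30   # Allow 30 seconds for cleanup when stopping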
Data Fields
| Field | Type | Description | Default |
|---|---|---|---|
params | array | Default parameters | [] |
env | array | Environment variables | [] |
secrets | array | External secret references resolved at runtime and exposed as environment variables | [] |
dotenv | string/array | .env files to load | [".env"] |
workingDir | string | Working directory for the DAG. Sub-DAGs inherit parent's workingDir if not set. | Directory of DAG file (or inherited from parent for sub-DAGs) |
shell | string/array | Default shell program (and args) for all steps; accepts string ("bash -e") or array (["bash", "-e"]). Step-level shell overrides. | System shell with errexit on Unix when no step shell is set |
logDir | string | Custom log directory | System default |
logOutput | string | Log output mode: separate (stdout/stderr to separate files) or merged (both to single file) | separate |
histRetentionDays | integer | History retention days | 30 |
maxOutputSize | integer | Max output size per step (bytes) | 1048576 |
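For example, the data fields can be combined like this (paths and sizes are illustrative):
dotenv: .env.production
logDir: /var/log/dagu/etl
logOutput: merged            # Write stdout and stderr to a single log file
histRetentionDays: 90
maxOutputSize: 2097152       # Capture up to 2 MB of output per step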
Container Configuration
| Field | Type | Description | Default |
|---|---|---|---|
container | string/object | Container configuration. Can be a string (existing container name to exec into) or an object (container configuration with exec or image field). | - |
The container field supports two modes:
- Image mode: Create a new container from a Docker image (image field required)
- Exec mode: Execute in an existing running container (exec field or string form)
String Form (Exec Mode)
container: my-running-container # Exec into existing container with defaults
Object Form - Image Mode
container:
name: my-workflow # Optional: custom container name
image: python:3.11 # Required for image mode
pullPolicy: missing # always, missing, never
env:
- API_KEY=${API_KEY}
volumes:
- /data:/data:ro
workingDir: /app
platform: linux/amd64
user: "1000:1000"
ports:
- "8080:8080"
network: host
startup: keepalive # keepalive | entrypoint | command
command: ["sh", "-c", "my-daemon"] # when startup: command
waitFor: running # running | healthy
logPattern: "Ready to accept connections" # optional regex
restartPolicy: unless-stopped # optional: no|always|unless-stopped
keepContainer: false # Keep container after DAG run
shell: ["/bin/bash", "-c"] # Optional: shell wrapper for step commands (enables pipes, &&, etc.)Object Form - Exec Mode
container:
exec: my-running-container # Required for exec mode
user: root # Optional: override user
workingDir: /app # Optional: override working directory
env: # Optional: additional environment variables
- DEBUG=true
shell: ["/bin/sh", "-c"] # Optional: shell wrapper for step commandsField Availability by Mode
| Field | Exec Mode | Image Mode |
|---|---|---|
exec | Required | Not allowed |
image | Not allowed | Required |
user, workingDir, env, shell | Optional | Optional |
| All other fields | Not allowed | Optional |
Note: A DAG-level container is started once (image mode) or attached to (exec mode) and kept alive while the workflow runs; each step executes via docker exec inside that container. This means step commands do not pass through the image's ENTRYPOINT/CMD. If your image's entrypoint dispatches subcommands, invoke it explicitly in the step command (see Execution Model and Entrypoint Behavior). For exec mode, the container must already be running; Dagu waits up to 120 seconds for it. For image mode, readiness waiting (running/healthy and the optional logPattern) times out after 120 seconds with a clear error that includes the last known state.
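For example, if an image's ENTRYPOINT is a CLI dispatcher, call that binary explicitly in the step command. The image and binary names below (my-tool) are hypothetical:
container:
  image: ghcr.io/example/my-tool:latest   # hypothetical image whose ENTRYPOINT is "my-tool"
steps:
  - command: my-tool sync --all           # invoke the entrypoint binary explicitly; docker exec does not apply ENTRYPOINT/CMD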
Secrets
The secrets block defines environment variables whose values are fetched at runtime from external providers. Each item is an object with the following fields:
| Field | Type | Description |
|---|---|---|
name | string | Environment variable name exposed to steps (required) |
provider | string | Secret provider identifier (required). Built-in providers are env and file. |
key | string | Provider-specific key (required). For the env provider this is the source environment variable; for file it is a file path. |
options | object | Provider-specific options (optional). Values must be strings. |
Example:
secrets:
- name: DB_PASSWORD
provider: env
key: PROD_DB_PASSWORD # Read from process environment
- name: API_TOKEN
provider: file
key: ../secrets/api-token # Relative paths resolve using workingDir then DAG file directory
Secret values are injected after DAG-level variables and built-in runtime variables, meaning they take precedence over everything except step-level overrides. Resolved values never touch persistent storage and are automatically masked in logs and captured step output.
See Secrets for provider reference, masking behavior, and extension tips.
SSH Configuration
| Field | Type | Description | Default |
|---|---|---|---|
ssh | object | Default SSH configuration for all steps | - |
ssh:
user: deploy
host: production.example.com
port: "22" # Optional, defaults to "22"
key: ~/.ssh/id_rsa # Optional, defaults to standard keys
password: "${SSH_PASSWORD}" # Optional; prefer keys for security
strictHostKey: true # Optional, defaults to true for security
knownHostFile: ~/.ssh/known_hosts # Optional, defaults to ~/.ssh/known_hosts
shell: "/bin/bash -e" # Optional: shell (string or array) for remote executionWhen configured at the DAG level, all steps using SSH executor will inherit these settings:
# DAG-level SSH configuration
ssh:
user: deploy
host: app.example.com
key: ~/.ssh/deploy_key
shell: /bin/bash # Commands wrapped in shell
steps:
# These steps inherit the DAG-level SSH configuration
- command: systemctl status myapp
- command: systemctl restart myapp
# Step-level config overrides DAG-level
- type: ssh
config:
user: backup # Override user
host: db.example.com # Override host
key: ~/.ssh/backup_key # Override key
shell:
- /bin/sh # Override shell with explicit flags
- -e
command: mysqldump mydb > backup.sql
Important Notes:
- SSH and container fields are mutually exclusive at the DAG level
- Step-level SSH configuration completely overrides DAG-level configuration (no partial overrides)
- Password authentication is supported but not recommended; prefer key-based auth
- Default SSH keys are tried if no key is specified: ~/.ssh/id_rsa, ~/.ssh/id_ecdsa, ~/.ssh/id_ed25519, ~/.ssh/id_dsa
- ssh.shell at the DAG level accepts either a string (e.g., "/bin/bash -e") or array form (e.g., ["/bin/bash", "-e", "-o", "pipefail"]). Dagu tokenizes the value into the remote shell executable and arguments before wrapping commands.
- For step-level SSH executor configs (executor.config.shell), use string form (e.g., "/bin/bash -e"). Array syntax is not supported when decoding the YAML map into the executor config.
LLM Configuration
| Field | Type | Description | Default |
|---|---|---|---|
llm | object | Default LLM configuration for all chat steps | - |
llm:
provider: openai
model: gpt-4o
system: "You are a helpful assistant."
temperature: 0.7
steps:
- type: chat
messages:
- role: user
content: "What is 2+2?"When configured at the DAG level, all chat steps inherit the LLM configuration. Step-level llm: completely overrides DAG-level configuration (no field-level merging).
See Chat Executor for full documentation.
Redis Configuration
| Field | Type | Description | Default |
|---|---|---|---|
redis | object | Default Redis configuration for all redis steps | - |
redis:
host: localhost
port: 6379
password: ${REDIS_PASSWORD}
db: 0
steps:
- name: cache-lookup
type: redis
config:
command: GET
key: cache:user:${USER_ID}
output: CACHED_DATA
When configured at the DAG level, all redis steps inherit the connection settings. Step-level config values override DAG-level defaults (field-level merging).
Available fields:
- url - Redis URL (redis://user:pass@host:port/db)
- host - Redis host (alternative to URL)
- port - Redis port (default: 6379)
- password - Authentication password
- username - ACL username (Redis 6+)
- db - Database number (0-15)
- tls - Enable TLS connection
- mode - Connection mode: standalone, sentinel, cluster
- sentinelMaster - Sentinel master name
- sentinelAddrs - Sentinel addresses
- clusterAddrs - Cluster node addresses
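A sketch of a Sentinel connection using the fields above; the addresses are illustrative and the list form for sentinelAddrs is assumed:
redis:
  mode: sentinel
  sentinelMaster: mymaster
  sentinelAddrs:
    - sentinel-1:26379
    - sentinel-2:26379
  password: ${REDIS_PASSWORD}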
See Redis Executor for full documentation.
Working Directory and Volume Resolution
When using container volumes with relative paths, the paths are resolved relative to the DAG's workingDir:
# DAG with working directory and container volumes
workingDir: /app/project
container:
image: python:3.11
volumes:
- ./data:/data # Resolves to /app/project/data:/data
- .:/workspace # Resolves to /app/project:/workspace
- /abs/path:/other # Absolute paths are unchanged
steps:
- command: python process.py
Working Directory Inheritance:
- Steps inherit workingDir from the DAG if not explicitly set
- Step-level workingDir overrides DAG-level workingDir
- Both dir and workingDir set the working directory (use one or the other)
- Sub-DAGs (via call) inherit the parent's workingDir when executed locally, unless they define their own explicit workingDir
# Example of workingDir inheritance
workingDir: /project # DAG-level working directory
steps:
- command: pwd # Outputs: /project
- workingDir: /custom # Override DAG workingDir
command: pwd # Outputs: /custom
Sub-DAG Working Directory Inheritance:
When a parent DAG calls a child DAG using call:, the child inherits the parent's working directory for local execution:
# Parent DAG with workingDir
workingDir: /app/project
steps:
- call: child-workflow # Child inherits /app/project as workingDir
---
# Child DAG without explicit workingDir
name: child-workflow
steps:
- command: pwd # Outputs: /app/project (inherited from parent)
---
# Child DAG with explicit workingDir (overrides inheritance)
name: child-with-custom-wd
workingDir: /custom/path
steps:
- command: pwd # Outputs: /custom/path
Note: Working directory inheritance only applies to local execution. For distributed execution (using workerSelector), sub-DAGs use their own context on the worker node.
Queue Configuration
| Field | Type | Description | Default |
|---|---|---|---|
queue | string | Queue name | - |
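For example, to route runs of this DAG to a named queue (the queue name is illustrative):
queue: batch          # Enqueue runs of this DAG on the "batch" queue
maxActiveRuns: 1      # Only one run at a time; additional runs wait in the queue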
OpenTelemetry Configuration
| Field | Type | Description | Default |
|---|---|---|---|
otel | object | OpenTelemetry tracing configuration | - |
otel:
enabled: true
endpoint: "localhost:4317" # OTLP gRPC endpoint
headers:
Authorization: "Bearer ${OTEL_TOKEN}"
insecure: false
timeout: 30s
resource:
service.name: "dagu-${DAG_NAME}"
service.version: "1.0.0"
deployment.environment: "${ENVIRONMENT}"
See OpenTelemetry Tracing for detailed configuration.
Notification Fields
| Field | Type | Description | Default |
|---|---|---|---|
mailOn | object | Email notification triggers | - |
errorMail | object | Error email configuration | - |
infoMail | object | Success email configuration | - |
waitMail | object | Wait status email configuration | - |
smtp | object | SMTP server configuration | - |
mailOn:
success: true
failure: true
wait: true # Email when DAG is waiting for approval
errorMail:
from: alerts@example.com
to: oncall@example.com # Single recipient (string)
# Or multiple recipients (array):
# to:
# - oncall@example.com
# - manager@example.com
prefix: "[ALERT]"
attachLogs: true
infoMail:
from: notifications@example.com
to: team@example.com # Single recipient (string)
# Or multiple recipients (array):
# to:
# - team@example.com
# - stakeholders@example.com
prefix: "[INFO]"
attachLogs: false
waitMail:
from: dagu@example.com
to: approvers@example.com
prefix: "[WAITING]"
attachLogs: false
smtp:
host: smtp.gmail.com
port: "587"
username: notifications@example.com
password: ${SMTP_PASSWORD}
Handler Fields
| Field | Type | Description | Default |
|---|---|---|---|
handlerOn | object | Lifecycle event handlers | - |
handlerOn:
init:
command: echo "Setting up" # Runs before any steps
success:
command: echo "Workflow succeeded"
failure:
command: echo "Notifying failure"
abort:
command: echo "Cleaning up"
wait:
command: echo "Waiting for approval: ${DAG_WAITING_STEPS}"
exit:
command: echo "Always running"Note: Sub-DAGs (invoked via
call) do not inherithandlerOnfrom base configuration. Each sub-DAG must define its own handlers explicitly. See Lifecycle Handlers for details.
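For example, a child DAG invoked with call declares its own handler rather than relying on inherited ones (names are illustrative):
name: child-etl
steps:
  - command: ./run-etl.sh
handlerOn:
  failure:
    command: ./notify-child-failure.sh   # defined in the child itself; not inherited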
RunConfig
The runConfig field allows you to control user interactions when starting DAG runs:
| Field | Type | Description | Default |
|---|---|---|---|
disableParamEdit | boolean | Prevent parameter editing when starting DAG | false |
disableRunIdEdit | boolean | Prevent custom run ID input when starting DAG | false |
Example usage:
# Prevent users from modifying parameters at runtime
runConfig:
disableParamEdit: true
disableRunIdEdit: false
params:
- ENVIRONMENT: production # Users cannot change this
- VERSION: 1.0.0 # This is fixed
This is useful when:
- You want to enforce specific parameter values for production workflows
- You need consistent run IDs for tracking purposes
- You want to prevent accidental parameter changes
Step Fields
Each step in the steps array can have these fields:
Basic Fields
| Field | Type | Description | Default |
|---|---|---|---|
name | string | Step name (optional - auto-generated if not provided) | Auto-generated |
command | string/array | Command to execute. Can be a string (single command), array of strings (multiple commands executed sequentially), or multi-line string (runs as inline script). | - |
script | string | Inline script (alternative to command). Honors shebang when no shell is set. | - |
run (legacy) | string | Deprecated alias for call | - |
depends | string/array | Step dependencies | - |
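For example, a multi-line command string runs as an inline script, while script with a shebang picks its own interpreter (both sketches are illustrative):
steps:
  - name: inline-shell
    command: |
      set -e
      echo "step one"
      echo "step two"
  - name: inline-python
    script: |
      #!/usr/bin/env python3
      print("hello from python")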
Multiple Commands
The command field accepts an array of strings. Multiple commands share the same step configuration:
steps:
- name: build-and-test
command:
- npm install
- npm run build
- npm test
env:
- NODE_ENV: production
workingDir: /app
Instead of duplicating env, workingDir, retryPolicy, preconditions, container, etc. across multiple steps, combine commands into one step.
Commands run in order and stop on first failure. Retries restart from the first command.
Trade-off: You lose the ability to retry or resume from the middle of the command list.
Supported step types: shell, command, docker, container, ssh
Not supported: jq, http, archive, mail, github_action, dag (configuration rejected at parse time)
Step Definition Formats
Steps can be defined in multiple formats:
Standard Format
steps:
- command: echo "Hello"
Shorthand String Format
steps:
- echo "Hello" # Equivalent to: {command: echo "Hello"}
- ls -la # Equivalent to: {command: ls -la}
Execution Fields
| Field | Type | Description | Default |
|---|---|---|---|
dir | string | Working directory | Current directory |
workingDir | string | Working directory (alternative to dir, inherits from DAG) | DAG's workingDir |
shell | string/array | Shell program and args for this step; overrides DAG shell | DAG shell (system default when omitted) |
stdout | string | Redirect stdout to file | - |
stderr | string | Redirect stderr to file | - |
logOutput | string | Override DAG-level log output mode for this step | DAG's logOutput |
output | string | Capture output to variable | - |
env | array/object | Step-specific environment variables (overrides DAG-level) | - |
call | string | Name of a DAG to execute as a sub DAG-run | - |
params | string/object | Parameters passed to sub DAGs (run) or executor-specific inputs (e.g., GitHub Actions with: map) | - |
ℹ️ The legacy run field is still accepted for backward compatibility until v1.24.0, but it will be removed in a future release. Prefer call for new workflows.
shell accepts either a string (e.g., "bash -e") or an array (e.g., ["bash", "-e"]). DAG-level values expand environment variables when the workflow loads; step-level values are evaluated at runtime so you can reference parameters, secrets, or previous outputs.
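For example (flags are illustrative):
shell: bash -e                               # DAG-level default, string form
steps:
  - name: strict-pipeline
    shell: ["bash", "-e", "-o", "pipefail"]  # step-level array form overrides the DAG default
    command: cat data.txt | sort | uniq -c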
Parallel Execution
| Field | Type | Description | Default |
|---|---|---|---|
parallel | array | Items to process in parallel | - |
maxConcurrent | integer | Max parallel executions | No limit |
steps:
- call: file-processor
parallel:
items: [file1.csv, file2.csv, file3.csv]
maxConcurrent: 2
params: "FILE=${ITEM}"Conditional Execution
| Field | Type | Description | Default |
|---|---|---|---|
preconditions | array | Conditions to check before execution | - |
continueOn | object | Continue workflow on certain conditions | - |
ContinueOn Fields
| Field | Type | Description | Default |
|---|---|---|---|
failure | boolean | Continue execution when step fails | false |
skipped | boolean | Continue when step is skipped due to preconditions | false |
exitCode | array | List of exit codes that allow continuation | [] |
output | array | List of stdout patterns that allow continuation (supports regex with re: prefix) | [] |
markSuccess | boolean | Mark step as successful when continue conditions are met | false |
Precondition Fields
| Field | Type | Description | Default |
|---|---|---|---|
condition | string | Command or expression to evaluate | - |
expected | string | Expected value or regex pattern (prefix with re:) | - |
negate | boolean | Invert the condition logic (run when condition does NOT match) | false |
steps:
- command: echo "Deploying"
preconditions:
- condition: "${ENVIRONMENT}"
expected: "production"
- condition: "`git branch --show-current`"
expected: "main"
# With negate: run only when NOT in production
- command: echo "Running dev task"
preconditions:
- condition: "${ENVIRONMENT}"
expected: "production"
negate: true # Runs when condition does NOT match
- command: echo "Running optional task"
continueOn:
failure: true
skipped: true
exitCode: [0, 1, 2]
output: ["WARNING", "SKIP", "re:^INFO:.*"]
markSuccess: true
See the Continue On Reference for detailed documentation.
Error Handling
| Field | Type | Description | Default |
|---|---|---|---|
retryPolicy | object | Retry configuration | - |
repeatPolicy | object | Repeat configuration | - |
mailOnError | boolean | Send email on error | false |
signalOnStop | string | Signal to send on stop | SIGTERM |
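For example, a step that emails on failure and receives SIGINT instead of the default SIGTERM when the run is stopped:
steps:
  - command: ./long-running-job.sh
    mailOnError: true       # Send the error email when this step fails
    signalOnStop: SIGINT    # Signal sent to the process on stop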
Retry Policy Fields
| Field | Type | Description | Default |
|---|---|---|---|
limit | integer | Maximum retry attempts | - |
intervalSec | integer | Base interval between retries (seconds) | - |
backoff | any | Exponential backoff multiplier. true = 2.0, or specify custom number > 1.0 | - |
maxIntervalSec | integer | Maximum interval between retries (seconds) | - |
exitCode | array | Exit codes that trigger retry | All non-zero |
Exponential Backoff: When backoff is set, intervals increase exponentially using the formula: interval * (backoff ^ attemptCount)
Repeat Policy Fields
For iterating over a list of items, use parallel instead.
| Field | Type | Description | Default |
|---|---|---|---|
repeat | string | Repeat mode: "while" or "until" | - |
intervalSec | integer | Base interval between repetitions (seconds) | - |
backoff | any | Exponential backoff multiplier. true = 2.0, or specify custom number > 1.0 | - |
maxIntervalSec | integer | Maximum interval between repetitions (seconds) | - |
limit | integer | Maximum number of executions | - |
condition | string | Condition to evaluate | - |
expected | string | Expected value/pattern | - |
exitCode | array | Exit codes that trigger repeat | - |
Repeat Modes:
- while: Repeats while the condition is true or exit code matches
- until: Repeats until the condition is true or exit code matches
Exponential Backoff: When backoff is set, intervals increase exponentially using the formula: interval * (backoff ^ attemptCount)
steps:
- command: curl https://api.example.com
retryPolicy:
limit: 3
intervalSec: 30
exitCode: [1, 255] # Retry only on specific codes
- command: curl https://api.example.com
retryPolicy:
limit: 5
intervalSec: 2
backoff: true # Exponential backoff (2.0x multiplier)
maxIntervalSec: 60 # Cap at 60 seconds
exitCode: [429, 503] # Rate limit or unavailable
- command: check-process.sh
repeatPolicy:
repeat: while # Repeat WHILE process is running
exitCode: [0] # Exit code 0 means process found
intervalSec: 60
limit: 30
- command: echo "Checking status"
output: STATUS
repeatPolicy:
repeat: until # Repeat UNTIL status is ready
condition: "${STATUS}"
expected: "ready"
intervalSec: 5
backoff: 1.5 # Custom backoff multiplier
maxIntervalSec: 300 # Cap at 5 minutes
limit: 60
Step-Level Container
| Field | Type | Description | Default |
|---|---|---|---|
container | string/object | Container configuration for this step. Can be a string (exec mode) or object (exec or image mode). | - |
Use the container field to run a step in its own container:
steps:
# Image mode - create new container
- name: run-in-container
container:
image: python:3.11
volumes:
- /data:/data:ro
env:
- API_KEY=${API_KEY}
command: python process.py
# Exec mode - string form
- name: run-migration
container: my-app-container
command: php artisan migrate
# Exec mode - object form with overrides
- name: admin-task
container:
exec: my-app-container
user: root
workingDir: /app
command: chown -R app:app /data
TIP
The step-level container field uses the same format as DAG-level container configuration, supporting both image mode and exec mode.
WARNING
When using container, you cannot use executor or script fields on the same step.
Step Type Configuration
| Field | Type | Description | Default |
|---|---|---|---|
type | string | Step type (ssh, http, jq, mail, etc.) | shell |
config | object | Step type-specific configuration | - |
steps:
- type: archive
config:
source: assets.tar.gz
destination: ./assets
command: extract
Chat (LLM)
| Field | Type | Description | Default |
|---|---|---|---|
type | string | Set to chat for LLM-based steps | - |
llm | object | LLM configuration (provider, model, system prompt) | - |
messages | array | Conversation messages for chat steps | - |
steps:
- type: chat
llm:
provider: openai
model: gpt-4o
system: "You are a helpful assistant."
messages:
- role: user
content: "What is 2+2?"
output: ANSWER
LLM configuration fields:
- provider: LLM provider (openai, anthropic, gemini, openrouter, local)
- model: Model identifier (e.g., gpt-4o, claude-sonnet-4-20250514)
- system: Default system prompt (optional)
- temperature: Randomness control 0.0-2.0 (optional)
- maxTokens: Maximum tokens to generate (optional)
- topP: Nucleus sampling 0.0-1.0 (optional)
- baseURL: Custom API endpoint (optional)
- apiKey: API key override (optional)
- stream: Enable streaming output (default: true)
Message format:
- role: Message role (system, user, assistant)
- content: Message content (supports ${VAR} substitution)
See Chat Executor for full documentation.
HITL (Human in the Loop)
| Field | Type | Description | Default |
|---|---|---|---|
type | string | Set to hitl for approval gate steps | - |
config.prompt | string | Message displayed to the approver | - |
config.input | string[] | Parameter names to collect from approver | - |
config.required | string[] | Required parameters (subset of input) | - |
steps:
- command: ./deploy.sh staging
- type: hitl
config:
prompt: "Approve production?"
input: [APPROVED_BY]
required: [APPROVED_BY]
- command: ./deploy.sh production
HITL steps pause workflow execution until approval is granted via UI or API. Collected inputs become environment variables in subsequent steps.
See HITL for full documentation.
Distributed Execution
| Field | Type | Description | Default |
|---|---|---|---|
workerSelector | object | Worker label requirements for distributed execution | - |
When using distributed execution, specify workerSelector to route tasks to workers with matching labels:
steps:
- call: gpu-training
---
# Run on a worker with gpu
name: gpu-training
workerSelector:
gpu: "true"
memory: "64G"
steps:
- command: python train_model.py
Worker Selection Rules:
- All labels in workerSelector must match exactly on the worker
- Label values are case-sensitive strings
- Steps without workerSelector can run on any available worker
- If no workers match the selector, the task waits until a matching worker is available
See Distributed Execution for complete documentation.
Variable Substitution
Parameter References
params:
- USER: john
- DOMAIN: example.com
steps:
- command: echo "Hello ${USER} from ${DOMAIN}"
Environment Variables
env:
- API_URL: https://api.example.com
- API_KEY: ${SECRET_API_KEY} # From system env
steps:
- command: curl -H "X-API-Key: ${API_KEY}" ${API_URL}
Loading Environment from .env Files
The dotenv field allows loading environment variables from .env files:
# Load specific .env file
dotenv: .env.production
# Load multiple .env files (all files loaded, later override earlier)
dotenv:
- .env.defaults
- .env.local
# Disable all .env loading
dotenv: []
Loading behavior:
- If dotenv is not specified, Dagu loads .env by default
- When files are specified, .env is automatically prepended to the list (and deduplicated if already included)
- All files are loaded sequentially in order
- Variables from later files override variables from earlier files
- Missing files are silently skipped
- Files are loaded relative to the DAG's workingDir
- System environment variables take precedence over .env file variables
- .env files are loaded at DAG startup, before any steps execute
Example .env file:
# .env file
DATABASE_URL=postgres://localhost/mydb
API_KEY=secret123
DEBUG=true
# DAG using .env variables
workingDir: /app
dotenv: .env # Optional, this is the default
steps:
- command: psql ${DATABASE_URL}
- command: echo "Debug is ${DEBUG}"
Command Substitution
steps:
- command: echo "Today is `date +%Y-%m-%d`"
- command: deploy.sh
preconditions:
- condition: "`git branch --show-current`"
expected: "main"Output Variables
steps:
- command: cat VERSION
output: VERSION
- command: docker build -t app:${VERSION} .
JSON Path Access
steps:
- command: cat config.json
output: CONFIG
- command: echo "Port is ${CONFIG.server.port}"
Special Variables
These variables are automatically available:
| Variable | Description |
|---|---|
DAG_NAME | Current DAG name |
DAG_RUN_ID | Unique run identifier |
DAG_RUN_LOG_FILE | Path to workflow log |
DAG_RUN_STEP_NAME | Current step name |
DAG_RUN_STEP_STDOUT_FILE | Step stdout file path |
DAG_RUN_STEP_STDERR_FILE | Step stderr file path |
ITEM | Current item in parallel execution |
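For example, these variables can be referenced directly in step commands:
steps:
  - command: echo "Run ${DAG_RUN_ID} of ${DAG_NAME} is logging to ${DAG_RUN_LOG_FILE}"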
Complete Example
name: production-etl
description: Daily ETL pipeline for production data
tags: [production, etl, critical]
schedule: "0 2 * * *"
maxActiveRuns: 1
maxActiveSteps: 5
timeoutSec: 7200
histRetentionDays: 90
params:
- DATE: "`date +%Y-%m-%d`"
- ENVIRONMENT: production
env:
- DATA_DIR: /data/etl
- LOG_LEVEL: info
dotenv:
- /etc/dagu/production.env
# Default container for all steps
container:
image: python:3.11-slim
pullPolicy: missing
env:
- PYTHONUNBUFFERED=1
volumes:
- ./data:/data
- ./scripts:/scripts:ro
preconditions:
- condition: "`date +%u`"
expected: "re:[1-5]" # Weekdays only
type: graph
steps:
- command: ./scripts/validate.sh
- command: python extract.py --date=${DATE}
depends: validate-environment
output: RAW_DATA_PATH
retryPolicy:
limit: 3
intervalSec: 300
- call: transform-module
parallel:
items: [customers, orders, products]
maxConcurrent: 2
params: "TYPE=${ITEM} INPUT=${RAW_DATA_PATH}"
continueOn:
failure: false
# Use different container for this step
- name: load-data
container:
image: postgres:16
env:
- PGPASSWORD=${DB_PASSWORD}
command: psql -h ${DB_HOST} -U ${DB_USER} -f load.sql
- command: python validate_results.py --date=${DATE}
depends: load-data
mailOnError: true
handlerOn:
success:
command: |
echo "ETL completed successfully for ${DATE}"
./scripts/notify-success.sh
failure:
type: mail
config:
to: data-team@example.com
subject: "ETL Failed - ${DATE}"
body: "Check logs at ${DAG_RUN_LOG_FILE}"
attachLogs: true
exit:
command: ./scripts/cleanup.sh ${DATE}
mailOn:
failure: true
smtp:
host: smtp.company.com
port: "587"
username: etl-notifications@company.com