
YAML Specification

Overview

Dagu workflows are defined using YAML files. Each file represents a DAG (Directed Acyclic Graph) that describes your workflow steps and their relationships.

Basic Structure

yaml
# Workflow metadata
description: "What this workflow does"
tags: [production, etl]    # Optional: for organization

# Scheduling
schedule: "0 * * * *"      # Optional: cron expression

# Execution control
maxActiveRuns: 1           # Max concurrent runs
maxActiveSteps: 10         # Max parallel steps
timeoutSec: 3600           # Workflow timeout (seconds)

# Parameters
params:
  - KEY: default_value
  - ANOTHER_KEY: "${ENV_VAR}"

# Environment variables
env:
  - VAR_NAME: value
  - PATH: ${PATH}:/custom/path

# Workflow steps (type: graph requires explicit depends)
type: graph
steps:
  - name: step-name        # Optional
    command: echo "Hello"
    depends: previous-step # Optional

# Lifecycle handlers
handlerOn:
  success:
    command: notify-success.sh
  failure:
    command: cleanup-on-failure.sh

Root Fields

Metadata Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| name | string | Workflow name | Filename without extension |
| description | string | Human-readable description | - |
| tags | array | Tags for categorization | [] |
| group | string | Group name for organization | - |
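
For example (values are illustrative):

yaml
name: nightly-etl              # Defaults to the filename without extension
description: Nightly ETL pipeline
tags: [etl, nightly]
group: data-pipelines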

Scheduling Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| schedule | string/array | Cron expression(s) | - |
| skipIfSuccessful | boolean | Skip if already succeeded today | false |
| restartWaitSec | integer | Wait seconds before restart | 0 |
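
For example, the scheduling controls can be combined with a cron expression (values are illustrative):

yaml
schedule: "0 2 * * *"
skipIfSuccessful: true     # Skip the scheduled run if today's run already succeeded
restartWaitSec: 60         # Wait 60 seconds before restarting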

Schedule Formats

yaml
# Single schedule
schedule: "0 2 * * *"

# Multiple schedules
schedule:
  - "0 9 * * MON-FRI"   # 9 AM weekdays
  - "0 14 * * SAT,SUN"  # 2 PM weekends

# With timezone
schedule: "CRON_TZ=America/New_York 0 9 * * *"

# Start/stop schedules
schedule:
  start:
    - "0 8 * * MON-FRI"   # Start at 8 AM
  stop:
    - "0 18 * * MON-FRI"  # Stop at 6 PM
  restart:
    - "0 12 * * MON-FRI"  # Restart at noon

Execution Control Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| maxActiveRuns | integer | Max concurrent workflow runs (-1 = unlimited) | 1 |
| maxActiveSteps | integer | Max parallel steps | 1 |
| timeoutSec | integer | Workflow timeout in seconds | 0 (no timeout) |
| delaySec | integer | Initial delay before start (seconds) | 0 |
| maxCleanUpTimeSec | integer | Max cleanup time (seconds) | 5 |
| preconditions | array | Workflow-level preconditions | - |
| runConfig | object | User interaction controls when starting DAG | - |
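
A minimal sketch combining these fields (values are illustrative):

yaml
maxActiveRuns: 1           # One run at a time
maxActiveSteps: 5          # Up to 5 steps in parallel
timeoutSec: 3600           # Fail the run after one hour
delaySec: 10               # Wait 10 seconds before starting
maxCleanUpTimeSec: 30      # Allow 30 seconds for cleanup
preconditions:
  - condition: "`date +%u`"
    expected: "re:[1-5]"   # Weekdays only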

Data Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| params | array | Default parameters | [] |
| env | array | Environment variables | [] |
| secrets | array | External secret references resolved at runtime and exposed as environment variables | [] |
| dotenv | string/array | .env files to load | [".env"] |
| workingDir | string | Working directory for the DAG. Sub-DAGs inherit parent's workingDir if not set. | Directory of DAG file (or inherited from parent for sub-DAGs) |
| shell | string/array | Default shell program (and args) for all steps; accepts string ("bash -e") or array (["bash", "-e"]). Step-level shell overrides. | System shell with errexit on Unix when no step shell is set |
| logDir | string | Custom log directory | System default |
| logOutput | string | Log output mode: separate (stdout/stderr to separate files) or merged (both to single file) | separate |
| histRetentionDays | integer | History retention days | 30 |
| maxOutputSize | integer | Max output size per step (bytes) | 1048576 |
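
For example, the logging-related fields can be set together (paths and sizes are illustrative):

yaml
logDir: /var/log/dagu/my-workflow
logOutput: merged          # stdout and stderr in a single file
histRetentionDays: 90
maxOutputSize: 2097152     # 2 MiB per step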

Container Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| container | string/object | Container configuration. Can be a string (existing container name to exec into) or an object (container configuration with exec or image field). | - |

The container field supports two modes:

  • Image mode: Create a new container from a Docker image (image field required)
  • Exec mode: Execute in an existing running container (exec field or string form)

String Form (Exec Mode)

yaml
container: my-running-container  # Exec into existing container with defaults

Object Form - Image Mode

yaml
container:
  name: my-workflow        # Optional: custom container name
  image: python:3.11       # Required for image mode
  pullPolicy: missing      # always, missing, never
  env:
    - API_KEY=${API_KEY}
  volumes:
    - /data:/data:ro
  workingDir: /app
  platform: linux/amd64
  user: "1000:1000"
  ports:
    - "8080:8080"
  network: host
  startup: keepalive       # keepalive | entrypoint | command
  command: ["sh", "-c", "my-daemon"]   # when startup: command
  waitFor: running         # running | healthy
  logPattern: "Ready to accept connections"  # optional regex
  restartPolicy: unless-stopped              # optional: no|always|unless-stopped
  keepContainer: false     # Keep container after DAG run
  shell: ["/bin/bash", "-c"]  # Optional: shell wrapper for step commands (enables pipes, &&, etc.)

Object Form - Exec Mode

yaml
container:
  exec: my-running-container  # Required for exec mode
  user: root                  # Optional: override user
  workingDir: /app            # Optional: override working directory
  env:                        # Optional: additional environment variables
    - DEBUG=true
  shell: ["/bin/sh", "-c"]    # Optional: shell wrapper for step commands

Field Availability by Mode

| Field | Exec Mode | Image Mode |
|-------|-----------|------------|
| exec | Required | Not allowed |
| image | Not allowed | Required |
| user, workingDir, env, shell | Optional | Optional |
| All other fields | Not allowed | Optional |

Note: A DAG‑level container is started once (image mode) or attached to (exec mode) and kept alive while the workflow runs; each step executes via docker exec inside that container. This means step commands do not pass through the image's ENTRYPOINT/CMD. If your image's entrypoint dispatches subcommands, invoke it explicitly in the step command (see Execution Model and Entrypoint Behavior). For exec mode, the container must be running; Dagu waits up to 120 seconds. For image mode, readiness waiting (running/healthy and optional logPattern) times out after 120 seconds with a clear error including the last known state.

Secrets

The secrets block defines environment variables whose values are fetched at runtime from external providers. Each item is an object with the following fields:

| Field | Type | Description |
|-------|------|-------------|
| name | string | Environment variable name exposed to steps (required) |
| provider | string | Secret provider identifier (required). Built-in providers are env and file. |
| key | string | Provider-specific key (required). For the env provider this is the source environment variable; for file it is a file path. |
| options | object | Provider-specific options (optional). Values must be strings. |

Example:

yaml
secrets:
  - name: DB_PASSWORD
    provider: env
    key: PROD_DB_PASSWORD        # Read from process environment
  - name: API_TOKEN
    provider: file
    key: ../secrets/api-token    # Relative paths resolve using workingDir then DAG file directory

Secret values are injected after DAG-level variables and built-in runtime variables, meaning they take precedence over everything except step-level overrides. Resolved values never touch persistent storage and are automatically masked in logs and captured step output.

See Secrets for provider reference, masking behavior, and extension tips.

SSH Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| ssh | object | Default SSH configuration for all steps | - |
yaml
ssh:
  user: deploy
  host: production.example.com
  port: "22"           # Optional, defaults to "22"
  key: ~/.ssh/id_rsa   # Optional, defaults to standard keys
  password: "${SSH_PASSWORD}" # Optional; prefer keys for security
  strictHostKey: true  # Optional, defaults to true for security
  knownHostFile: ~/.ssh/known_hosts  # Optional, defaults to ~/.ssh/known_hosts
  shell: "/bin/bash -e"  # Optional: shell (string or array) for remote execution

When configured at the DAG level, all steps using SSH executor will inherit these settings:

yaml
# DAG-level SSH configuration
ssh:
  user: deploy
  host: app.example.com
  key: ~/.ssh/deploy_key
  shell: /bin/bash  # Commands wrapped in shell

steps:
  # These steps inherit the DAG-level SSH configuration
  - command: systemctl status myapp
  - command: systemctl restart myapp

  # Step-level config overrides DAG-level
  - type: ssh
    config:
      user: backup      # Override user
      host: db.example.com  # Override host
      key: ~/.ssh/backup_key  # Override key
      shell:
        - /bin/sh        # Override shell with explicit flags
        - -e
    command: mysqldump mydb > backup.sql

Important Notes:

  • SSH and container fields are mutually exclusive at the DAG level
  • Step-level SSH configuration completely overrides DAG-level configuration (no partial overrides)
  • Password authentication is supported but not recommended; prefer key-based auth
  • Default SSH keys are tried if no key is specified: ~/.ssh/id_rsa, ~/.ssh/id_ecdsa, ~/.ssh/id_ed25519, ~/.ssh/id_dsa
  • ssh.shell at the DAG level accepts either a string (e.g., "/bin/bash -e") or array form (e.g., ["/bin/bash","-e","-o","pipefail"]). Dagu tokenizes the value into the remote shell executable and arguments before wrapping commands.
  • For step-level SSH executor configs (executor.config.shell), use string form (e.g., "/bin/bash -e"). Array syntax is not supported when decoding the YAML map into the executor config.

LLM Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| llm | object | Default LLM configuration for all chat steps | - |
yaml
llm:
  provider: openai
  model: gpt-4o
  system: "You are a helpful assistant."
  temperature: 0.7

steps:
  - type: chat
    messages:
      - role: user
        content: "What is 2+2?"

When configured at the DAG level, all chat steps inherit the LLM configuration. Step-level llm: completely overrides DAG-level configuration (no field-level merging).

See Chat Executor for full documentation.

Redis Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| redis | object | Default Redis configuration for all redis steps | - |
yaml
redis:
  host: localhost
  port: 6379
  password: ${REDIS_PASSWORD}
  db: 0

steps:
  - name: cache-lookup
    type: redis
    config:
      command: GET
      key: cache:user:${USER_ID}
    output: CACHED_DATA

When configured at the DAG level, all redis steps inherit the connection settings. Step-level config values override DAG-level defaults (field-level merging).

Available fields:

  • url - Redis URL (redis://user:pass@host:port/db)
  • host - Redis host (alternative to URL)
  • port - Redis port (default: 6379)
  • password - Authentication password
  • username - ACL username (Redis 6+)
  • db - Database number (0-15)
  • tls - Enable TLS connection
  • mode - Connection mode: standalone, sentinel, cluster
  • sentinelMaster - Sentinel master name
  • sentinelAddrs - Sentinel addresses
  • clusterAddrs - Cluster node addresses
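
For example, a Sentinel connection might look like this (addresses and master name are illustrative):

yaml
redis:
  mode: sentinel
  sentinelMaster: mymaster
  sentinelAddrs:
    - sentinel-1:26379
    - sentinel-2:26379
  password: ${REDIS_PASSWORD}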

See Redis Executor for full documentation.

Working Directory and Volume Resolution

When using container volumes with relative paths, the paths are resolved relative to the DAG's workingDir:

yaml
# DAG with working directory and container volumes
workingDir: /app/project
container:
  image: python:3.11
  volumes:
    - ./data:/data        # Resolves to /app/project/data:/data
    - .:/workspace        # Resolves to /app/project:/workspace
    - /abs/path:/other   # Absolute paths are unchanged

steps:
  - command: python process.py

Working Directory Inheritance:

  • Steps inherit workingDir from the DAG if not explicitly set
  • Step-level workingDir overrides DAG-level workingDir
  • Both dir and workingDir set the working directory (use one or the other)
  • Sub-DAGs (via call) inherit the parent's workingDir when executed locally, unless they define their own explicit workingDir
yaml
# Example of workingDir inheritance
workingDir: /project          # DAG-level working directory

steps:
  - command: pwd                   # Outputs: /project
  - workingDir: /custom   # Override DAG workingDir
    command: pwd          # Outputs: /custom

Sub-DAG Working Directory Inheritance:

When a parent DAG calls a child DAG using call:, the child inherits the parent's working directory for local execution:

yaml
# Parent DAG with workingDir
workingDir: /app/project

steps:
  - call: child-workflow    # Child inherits /app/project as workingDir

---
# Child DAG without explicit workingDir
name: child-workflow
steps:
  - command: pwd                     # Outputs: /app/project (inherited from parent)

---
# Child DAG with explicit workingDir (overrides inheritance)
name: child-with-custom-wd
workingDir: /custom/path
steps:
  - command: pwd                     # Outputs: /custom/path

Note: Working directory inheritance only applies to local execution. For distributed execution (using workerSelector), sub-DAGs use their own context on the worker node.

Queue Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| queue | string | Queue name | - |
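
For example (queue name is illustrative):

yaml
queue: batch-jobs          # Runs of this DAG are enqueued on the "batch-jobs" queue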

OpenTelemetry Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| otel | object | OpenTelemetry tracing configuration | - |
yaml
otel:
  enabled: true
  endpoint: "localhost:4317"  # OTLP gRPC endpoint
  headers:
    Authorization: "Bearer ${OTEL_TOKEN}"
  insecure: false
  timeout: 30s
  resource:
    service.name: "dagu-${DAG_NAME}"
    service.version: "1.0.0"
    deployment.environment: "${ENVIRONMENT}"

See OpenTelemetry Tracing for detailed configuration.

Notification Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| mailOn | object | Email notification triggers | - |
| errorMail | object | Error email configuration | - |
| infoMail | object | Success email configuration | - |
| waitMail | object | Wait status email configuration | - |
| smtp | object | SMTP server configuration | - |
yaml
mailOn:
  success: true
  failure: true
  wait: true      # Email when DAG is waiting for approval

errorMail:
  from: alerts@example.com
  to: oncall@example.com  # Single recipient (string)
  # Or multiple recipients (array):
  # to:
  #   - oncall@example.com
  #   - manager@example.com
  prefix: "[ALERT]"
  attachLogs: true

infoMail:
  from: notifications@example.com
  to: team@example.com  # Single recipient (string)
  # Or multiple recipients (array):
  # to:
  #   - team@example.com
  #   - stakeholders@example.com
  prefix: "[INFO]"
  attachLogs: false

waitMail:
  from: dagu@example.com
  to: approvers@example.com
  prefix: "[WAITING]"
  attachLogs: false

smtp:
  host: smtp.gmail.com
  port: "587"
  username: notifications@example.com
  password: ${SMTP_PASSWORD}

Handler Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| handlerOn | object | Lifecycle event handlers | - |
yaml
handlerOn:
  init:
    command: echo "Setting up"      # Runs before any steps
  success:
    command: echo "Workflow succeeded"
  failure:
    command: echo "Notifying failure"
  abort:
    command: echo "Cleaning up"
  wait:
    command: echo "Waiting for approval: ${DAG_WAITING_STEPS}"
  exit:
    command: echo "Always running"

Note: Sub-DAGs (invoked via call) do not inherit handlerOn from base configuration. Each sub-DAG must define its own handlers explicitly. See Lifecycle Handlers for details.

RunConfig

The runConfig field allows you to control user interactions when starting DAG runs:

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| disableParamEdit | boolean | Prevent parameter editing when starting DAG | false |
| disableRunIdEdit | boolean | Prevent custom run ID input when starting DAG | false |

Example usage:

yaml
# Prevent users from modifying parameters at runtime
runConfig:
  disableParamEdit: true
  disableRunIdEdit: false

params:
  - ENVIRONMENT: production  # Users cannot change this
  - VERSION: 1.0.0           # This is fixed

This is useful when:

  • You want to enforce specific parameter values for production workflows
  • You need consistent run IDs for tracking purposes
  • You want to prevent accidental parameter changes

Step Fields

Each step in the steps array can have these fields:

Basic Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| name | string | Step name (optional - auto-generated if not provided) | Auto-generated |
| command | string/array | Command to execute. Can be a string (single command), array of strings (multiple commands executed sequentially), or multi-line string (runs as inline script). | - |
| script | string | Inline script (alternative to command). Honors shebang when no shell is set. | - |
| run (legacy) | string | Deprecated alias for call | - |
| depends | string/array | Step dependencies | - |

Multiple Commands

The command field accepts an array of strings. Multiple commands share the same step configuration:

yaml
steps:
  - name: build-and-test
    command:
      - npm install
      - npm run build
      - npm test
    env:
      - NODE_ENV: production
    workingDir: /app

Instead of duplicating env, workingDir, retryPolicy, preconditions, container, etc. across multiple steps, combine commands into one step.

Commands run in order and stop on first failure. Retries restart from the first command.

Trade-off: You lose the ability to retry or resume from the middle of the command list.

Supported step types: shell, command, docker, container, ssh

Not supported: jq, http, archive, mail, github_action, dag (configuration rejected at parse time)

Step Definition Formats

Steps can be defined in multiple formats:

Standard Format

yaml
steps:
  - command: echo "Hello"

Shorthand String Format

yaml
steps:
  - echo "Hello"     # Equivalent to: {command: echo "Hello"}
  - ls -la           # Equivalent to: {command: ls -la}

Execution Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| dir | string | Working directory | Current directory |
| workingDir | string | Working directory (alternative to dir, inherits from DAG) | DAG's workingDir |
| shell | string/array | Shell program and args for this step; overrides DAG shell | DAG shell (system default when omitted) |
| stdout | string | Redirect stdout to file | - |
| stderr | string | Redirect stderr to file | - |
| logOutput | string | Override DAG-level log output mode for this step | DAG's logOutput |
| output | string | Capture output to variable | - |
| env | array/object | Step-specific environment variables (overrides DAG-level) | - |
| call | string | Name of a DAG to execute as a sub DAG-run | - |
| params | string/object | Parameters passed to sub DAGs (run) or executor-specific inputs (e.g., GitHub Actions with: map) | - |

ℹ️ The legacy run field is still accepted for backward compatibility until v1.24.0, but it will be removed in a future release. Prefer call for new workflows.

shell accepts either a string (e.g., "bash -e") or an array (e.g., ["bash", "-e"]). DAG-level values expand environment variables when the workflow loads; step-level values are evaluated at runtime so you can reference parameters, secrets, or previous outputs.
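
For example, several of these fields can be combined (names and paths are illustrative):

yaml
steps:
  - command: python report.py
    workingDir: /app
    shell: ["bash", "-e"]
    stdout: /tmp/report.out        # Redirect stdout to a file
    stderr: /tmp/report.err        # Redirect stderr to a file
    env:
      - LOG_LEVEL: debug
  - command: cat /tmp/report.out
    output: REPORT_SUMMARY         # Capture stdout into a variable
  - call: publish-report           # Run another DAG as a sub DAG-run
    params: "SUMMARY=${REPORT_SUMMARY}"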

Parallel Execution

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| parallel | array | Items to process in parallel | - |
| maxConcurrent | integer | Max parallel executions | No limit |
yaml
steps:
  - call: file-processor
    parallel:
      items: [file1.csv, file2.csv, file3.csv]
      maxConcurrent: 2
    params: "FILE=${ITEM}"

Conditional Execution

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| preconditions | array | Conditions to check before execution | - |
| continueOn | object | Continue workflow on certain conditions | - |

ContinueOn Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| failure | boolean | Continue execution when step fails | false |
| skipped | boolean | Continue when step is skipped due to preconditions | false |
| exitCode | array | List of exit codes that allow continuation | [] |
| output | array | List of stdout patterns that allow continuation (supports regex with re: prefix) | [] |
| markSuccess | boolean | Mark step as successful when continue conditions are met | false |

Precondition Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| condition | string | Command or expression to evaluate | - |
| expected | string | Expected value or regex pattern (prefix with re:) | - |
| negate | boolean | Invert the condition logic (run when condition does NOT match) | false |
yaml
steps:
  - command: echo "Deploying"
    preconditions:
      - condition: "${ENVIRONMENT}"
        expected: "production"
      - condition: "`git branch --show-current`"
        expected: "main"

  # With negate: run only when NOT in production
  - command: echo "Running dev task"
    preconditions:
      - condition: "${ENVIRONMENT}"
        expected: "production"
        negate: true  # Runs when condition does NOT match

  - command: echo "Running optional task"
    continueOn:
      failure: true
      skipped: true
      exitCode: [0, 1, 2]
      output: ["WARNING", "SKIP", "re:^INFO:.*"]
      markSuccess: true

See the Continue On Reference for detailed documentation.

Error Handling

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| retryPolicy | object | Retry configuration | - |
| repeatPolicy | object | Repeat configuration | - |
| mailOnError | boolean | Send email on error | false |
| signalOnStop | string | Signal to send on stop | SIGTERM |
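
For example (illustrative):

yaml
steps:
  - command: ./long-task.sh
    mailOnError: true        # Send email if this step fails
    signalOnStop: SIGINT     # Send SIGINT instead of SIGTERM when the run is stopped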

Retry Policy Fields

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| limit | integer | Maximum retry attempts | - |
| intervalSec | integer | Base interval between retries (seconds) | - |
| backoff | any | Exponential backoff multiplier. true = 2.0, or specify custom number > 1.0 | - |
| maxIntervalSec | integer | Maximum interval between retries (seconds) | - |
| exitCode | array | Exit codes that trigger retry | All non-zero |

Exponential Backoff: When backoff is set, intervals increase exponentially using the formula:
interval * (backoff ^ attemptCount)
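
For example, with intervalSec: 2 and backoff: true (2.0), the waits between attempts grow as 2s, 4s, 8s, 16s, ..., capped by maxIntervalSec when it is set.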

Repeat Policy Fields

For iterating over a list of items, use parallel instead.

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| repeat | string | Repeat mode: "while" or "until" | - |
| intervalSec | integer | Base interval between repetitions (seconds) | - |
| backoff | any | Exponential backoff multiplier. true = 2.0, or specify custom number > 1.0 | - |
| maxIntervalSec | integer | Maximum interval between repetitions (seconds) | - |
| limit | integer | Maximum number of executions | - |
| condition | string | Condition to evaluate | - |
| expected | string | Expected value/pattern | - |
| exitCode | array | Exit codes that trigger repeat | - |

Repeat Modes:

  • while: Repeats while the condition is true or exit code matches
  • until: Repeats until the condition is true or exit code matches

Exponential Backoff: When backoff is set, intervals increase exponentially using the formula:
interval * (backoff ^ attemptCount)

yaml
steps:
  - command: curl https://api.example.com
    retryPolicy:
      limit: 3
      intervalSec: 30
      exitCode: [1, 255]  # Retry only on specific codes
      
  - command: curl https://api.example.com
    retryPolicy:
      limit: 5
      intervalSec: 2
      backoff: true        # Exponential backoff (2.0x multiplier)
      maxIntervalSec: 60   # Cap at 60 seconds
      exitCode: [429, 503] # Rate limit or unavailable
    
  - command: check-process.sh
    repeatPolicy:
      repeat: while        # Repeat WHILE process is running
      exitCode: [0]        # Exit code 0 means process found
      intervalSec: 60
      limit: 30
      
  - command: echo "Checking status"
    output: STATUS
    repeatPolicy:
      repeat: until        # Repeat UNTIL status is ready
      condition: "${STATUS}"
      expected: "ready"
      intervalSec: 5
      backoff: 1.5         # Custom backoff multiplier
      maxIntervalSec: 300  # Cap at 5 minutes
      limit: 60

Step-Level Container

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| container | string/object | Container configuration for this step. Can be a string (exec mode) or object (exec or image mode). | - |

Use the container field to run a step in its own container:

yaml
steps:
  # Image mode - create new container
  - name: run-in-container
    container:
      image: python:3.11
      volumes:
        - /data:/data:ro
      env:
        - API_KEY=${API_KEY}
    command: python process.py

  # Exec mode - string form
  - name: run-migration
    container: my-app-container
    command: php artisan migrate

  # Exec mode - object form with overrides
  - name: admin-task
    container:
      exec: my-app-container
      user: root
      workingDir: /app
    command: chown -R app:app /data

TIP

The step-level container field uses the same format as DAG-level container configuration, supporting both image mode and exec mode.

WARNING

When using container, you cannot use executor or script fields on the same step.

Step Type Configuration

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| type | string | Step type (ssh, http, jq, mail, etc.) | shell |
| config | object | Step type-specific configuration | - |
yaml
steps:
  - type: archive
    config:
      source: assets.tar.gz
      destination: ./assets
    command: extract

Chat (LLM)

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| type | string | Set to chat for LLM-based steps | - |
| llm | object | LLM configuration (provider, model, system prompt) | - |
| messages | array | Conversation messages for chat steps | - |
yaml
steps:
  - type: chat
    llm:
      provider: openai
      model: gpt-4o
      system: "You are a helpful assistant."
    messages:
      - role: user
        content: "What is 2+2?"
    output: ANSWER

LLM configuration fields:

  • provider: LLM provider (openai, anthropic, gemini, openrouter, local)
  • model: Model identifier (e.g., gpt-4o, claude-sonnet-4-20250514)
  • system: Default system prompt (optional)
  • temperature: Randomness control 0.0-2.0 (optional)
  • maxTokens: Maximum tokens to generate (optional)
  • topP: Nucleus sampling 0.0-1.0 (optional)
  • baseURL: Custom API endpoint (optional)
  • apiKey: API key override (optional)
  • stream: Enable streaming output (default: true)

Message format:

  • role: Message role (system, user, assistant)
  • content: Message content (supports ${VAR} substitution)

See Chat Executor for full documentation.

HITL (Human in the Loop)

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| type | string | Set to hitl for approval gate steps | - |
| config.prompt | string | Message displayed to the approver | - |
| config.input | string[] | Parameter names to collect from approver | - |
| config.required | string[] | Required parameters (subset of input) | - |
yaml
steps:
  - command: ./deploy.sh staging
  - type: hitl
    config:
      prompt: "Approve production?"
      input: [APPROVED_BY]
      required: [APPROVED_BY]
  - command: ./deploy.sh production

HITL steps pause workflow execution until approval is granted via UI or API. Collected inputs become environment variables in subsequent steps.

See HITL for full documentation.

Distributed Execution

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| workerSelector | object | Worker label requirements for distributed execution | - |

When using distributed execution, specify workerSelector to route tasks to workers with matching labels:

yaml
steps:
  - call: gpu-training
---
# Run on a worker with gpu
name: gpu-training
workerSelector:
  gpu: "true"
  memory: "64G"
steps:
  - command: python train_model.py

Worker Selection Rules:

  • All labels in workerSelector must match exactly on the worker
  • Label values are case-sensitive strings
  • Steps without workerSelector can run on any available worker
  • If no workers match the selector, the task waits until a matching worker is available

See Distributed Execution for complete documentation.

Variable Substitution

Parameter References

yaml
params:
  - USER: john
  - DOMAIN: example.com

steps:
  - command: echo "Hello ${USER} from ${DOMAIN}"

Environment Variables

yaml
env:
  - API_URL: https://api.example.com
  - API_KEY: ${SECRET_API_KEY}  # From system env

steps:
  - command: curl -H "X-API-Key: ${API_KEY}" ${API_URL}

Loading Environment from .env Files

The dotenv field allows loading environment variables from .env files:

yaml
# Load specific .env file
dotenv: .env.production

# Load multiple .env files (all files loaded, later override earlier)
dotenv:
  - .env.defaults
  - .env.local

# Disable all .env loading
dotenv: []

Loading behavior:

  • If dotenv is not specified, Dagu loads .env by default
  • When files are specified, .env is automatically prepended to the list (and deduplicated if already included)
  • All files are loaded sequentially in order
  • Variables from later files override variables from earlier files
  • Missing files are silently skipped
  • Files are loaded relative to the DAG's workingDir
  • System environment variables take precedence over .env file variables
  • .env files are loaded at DAG startup, before any steps execute

Example .env file:

bash
# .env file
DATABASE_URL=postgres://localhost/mydb
API_KEY=secret123
DEBUG=true
yaml
# DAG using .env variables
workingDir: /app
dotenv: .env          # Optional, this is the default

steps:
  - command: psql ${DATABASE_URL}
  - command: echo "Debug is ${DEBUG}"

Command Substitution

yaml
steps:
  - command: echo "Today is `date +%Y-%m-%d`"
    
  - command: deploy.sh
    preconditions:
      - condition: "`git branch --show-current`"
        expected: "main"

Output Variables

yaml
steps:
  - command: cat VERSION
    output: VERSION
  - command: docker build -t app:${VERSION} .

JSON Path Access

yaml
steps:
  - command: cat config.json
    output: CONFIG
  - command: echo "Port is ${CONFIG.server.port}"

Special Variables

These variables are automatically available:

| Variable | Description |
|----------|-------------|
| DAG_NAME | Current DAG name |
| DAG_RUN_ID | Unique run identifier |
| DAG_RUN_LOG_FILE | Path to workflow log |
| DAG_RUN_STEP_NAME | Current step name |
| DAG_RUN_STEP_STDOUT_FILE | Step stdout file path |
| DAG_RUN_STEP_STDERR_FILE | Step stderr file path |
| ITEM | Current item in parallel execution |
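
For example:

yaml
steps:
  - command: echo "Run ${DAG_RUN_ID} of ${DAG_NAME} is logged to ${DAG_RUN_LOG_FILE}"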

Complete Example

yaml
name: production-etl
description: Daily ETL pipeline for production data
tags: [production, etl, critical]
schedule: "0 2 * * *"

maxActiveRuns: 1
maxActiveSteps: 5
timeoutSec: 7200
histRetentionDays: 90

params:
  - DATE: "`date +%Y-%m-%d`"
  - ENVIRONMENT: production

env:
  - DATA_DIR: /data/etl
  - LOG_LEVEL: info
  
dotenv:
  - /etc/dagu/production.env

# Default container for all steps
container:
  image: python:3.11-slim
  pullPolicy: missing
  env:
    - PYTHONUNBUFFERED=1
  volumes:
    - ./data:/data
    - ./scripts:/scripts:ro

preconditions:
  - condition: "`date +%u`"
    expected: "re:[1-5]"  # Weekdays only

type: graph
steps:
  - name: validate-environment
    command: ./scripts/validate.sh

  - command: python extract.py --date=${DATE}
    depends: validate-environment
    output: RAW_DATA_PATH
    retryPolicy:
      limit: 3
      intervalSec: 300
    
  - call: transform-module
    parallel:
      items: [customers, orders, products]
      maxConcurrent: 2
    params: "TYPE=${ITEM} INPUT=${RAW_DATA_PATH}"
    continueOn:
      failure: false

  # Use different container for this step
  - name: load-data
    container:
      image: postgres:16
      env:
        - PGPASSWORD=${DB_PASSWORD}
    command: psql -h ${DB_HOST} -U ${DB_USER} -f load.sql
    
  - command: python validate_results.py --date=${DATE}
    depends: load-data
    mailOnError: true

handlerOn:
  success:
    command: |
      echo "ETL completed successfully for ${DATE}"
      ./scripts/notify-success.sh
  failure:
    type: mail
    config:
      to: data-team@example.com
      subject: "ETL Failed - ${DATE}"
      body: "Check logs at ${DAG_RUN_LOG_FILE}"
      attachLogs: true
  exit:
    command: ./scripts/cleanup.sh ${DATE}

mailOn:
  failure: true
  
smtp:
  host: smtp.company.com
  port: "587"
  username: etl-notifications@company.com
