Workflow Basics
Learn the fundamentals of writing Dagu workflows.
Your First Workflow
Create hello.yaml:
steps:
  - command: echo "Hello from Dagu!"
Run it:
dagu start hello.yaml
Workflow Structure
A complete workflow contains:
# Metadata
name: data-pipeline
description: Process daily data
tags: [etl, production]

# Configuration
schedule: "0 2 * * *"
params:
  - DATE: ${DATE:-today}

# Steps
steps:
  - name: process
    command: python process.py ${DATE}

# Handlers
handlerOn:
  failure:
    command: notify-error.sh
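To run this pipeline for a specific date, named params can be passed on the command line after -- (a sketch; see Parameters for the full syntax):
dagu start data-pipeline.yaml -- DATE=2024-01-15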
Steps
The basic unit of execution.
Step Names
Step names are optional. When omitted, Dagu automatically generates names based on the step type:
steps:
  - command: echo "First step"      # Auto-named: cmd_1
  - script: |                       # Auto-named: script_2
      echo "Multi-line"
      echo "Script"
  - name: explicit-name             # Explicit name
    command: echo "Third step"
  - type: http                      # Auto-named: http_4
    config:
      url: https://api.example.com
  - call: child-workflow            # Auto-named: dag_5
Auto-generated names follow the pattern {type}_{number}:
- cmd_N - Command steps
- script_N - Script steps
- http_N - HTTP steps
- dag_N - DAG steps
- container_N - Docker/container steps
- ssh_N - SSH steps
- mail_N - Mail steps
- jq_N - JQ steps
For parallel steps (see below), the pattern is parallel_{group}_{type}_{index}.
Shorthand Command Syntax
For simple commands, you can use an even more concise syntax:
steps:
  - command: echo "Hello World"
  - command: ls -la
  - command: python script.py
This is equivalent to:
type: graph
steps:
  - name: step 1
    command: echo "Hello World"
  - name: step 2
    command: ls -la
    depends: step 1
  - name: step 3
    command: python script.py
    depends: step 2
Multiple Commands
Multiple commands share the same step configuration:
steps:
  - name: build-and-test
    command:
      - npm install
      - npm run build
      - npm test
    env:
      - NODE_ENV: production
    workingDir: /app
    retryPolicy:
      limit: 3
Instead of duplicating env, workingDir, retryPolicy, preconditions, container, etc. across multiple steps, combine commands into one step.
Commands run in order and stop on first failure. Retries restart from the first command.
Trade-off: You lose the ability to retry or resume from the middle of the command list. If you need granular control over individual command retries, use separate steps.
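For instance, the build-and-test step above could be split into separate steps, each with its own retryPolicy (a minimal sketch):
steps:
  - name: install
    command: npm install
    workingDir: /app
    retryPolicy:
      limit: 3
  - name: build
    command: npm run build
    workingDir: /app
  - name: test
    command: npm test
    workingDir: /app
    retryPolicy:
      limit: 2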
Step types that support command lists: shell, command, docker, container, ssh
Not supported: jq, http, archive, mail, github_action, dag (these accept only a single command)
Multi-line Scripts
steps:
  - script: |
      #!/bin/bash
      set -e
      echo "Processing..."
      python analyze.py data.csv
      echo "Complete"
If you omit shell, Dagu uses the interpreter declared in the script's shebang (#!) when present.
Shell Selection
Set a default shell for every step at the DAG level, and override it per step when needed:
shell: ["/bin/bash", "-e", "-u"]    # Default shell + args for the whole workflow

steps:
  - name: bash-task
    command: echo "Runs with bash -e -u"

  - name: zsh-override
    shell: /bin/zsh                 # Step-level override
    command: echo "Uses zsh instead"
The shell value accepts either a string ("bash -e") or an array (["bash", "-e"]). Arrays avoid quoting issues when you need multiple flags.
When you omit a step-level shell, Dagu runs through the DAG shell (or system default) and automatically adds -e on Unix-like shells so scripts stop on first error. If you explicitly set shell on a step, include -e yourself if you want the same errexit behavior.
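For example (a sketch; the step name and file are hypothetical), keeping errexit behavior with an explicit step-level shell:
steps:
  - name: strict-step
    shell: ["/bin/bash", "-e"]      # add -e yourself when you override the shell
    script: |
      cp missing-file.txt /tmp/     # a failing command stops the step here
      echo "never reached"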
The shell can also be a non-shell interpreter entirely:
steps:
  - shell: python3
    script: |
      import pandas as pd
      df = pd.read_csv('data.csv')
      print(df.head())
Dependencies
Steps run sequentially by default. Use depends for parallel execution or to control order.
steps:
  - name: download
    command: wget data.csv
  - name: process
    command: python process.py
  - name: upload
    command: aws s3 cp output.csv s3://bucket/
Parallel Execution
You can run steps in parallel using explicit dependencies:
type: graph
steps:
  - name: setup
    command: echo "Setup"
  - name: task1
    command: echo "Task 1"
    depends: setup
  - name: task2
    command: echo "Task 2"
    depends: setup
  - name: finish
    command: echo "All tasks complete"
    depends: [task1, task2]
Working Directory
Set where commands execute:
steps:
  - name: in-project
    workingDir: /home/user/project
    command: python main.py
  - name: in-data
    workingDir: /data/input
    command: ls -la
Environment Variables
System Environment Security
Dagu filters environment variables passed to step processes for security:
- System variables work in DAG YAML for expansion: ${VAR}
- Only filtered variables are passed to the step execution environment
- Whitelisted: PATH, HOME, LANG, TZ, SHELL
- Allowed prefixes: DAGU_*, LC_*, DAG_*
To make other variables available in step processes, define them in env.
Global Environment
env:
  - API_KEY: secret123
  - ENV: production
  - AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}   # Explicit reference required

steps:
  - name: use-env
    command: echo "Running in $ENV"
Step-Level Environment
Steps can have their own environment variables that override DAG-level ones:
env:
  - ENV: production

steps:
  - name: dev-test
    command: echo "Running in $ENV"
    env:
      - ENV: development   # Overrides DAG-level
      - TEST_FLAG: true
    # Output: Running in development
Load from .env Files
dotenv:
  - .env
  - .env.production

steps:
  - name: use-dotenv
    command: echo $DATABASE_URL
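Each file is a plain list of KEY=VALUE lines; a hypothetical .env.production supplying the variable above might contain:
DATABASE_URL=postgres://db.example.com:5432/app
API_TIMEOUT=30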
Capturing Output
Store command output in variables:
steps:
  - name: get-version
    command: git rev-parse --short HEAD
    output: VERSION

  - name: build
    command: docker build -t app:${VERSION} .
Basic Error Handling
Continue on Failure
steps:
  - name: optional-step
    command: maybe-fails.sh
    continueOn:
      failure: true

  - name: always-runs
    command: cleanup.sh
Simple Retry
steps:
  - name: flaky-api
    command: curl https://unstable-api.com
    retryPolicy:
      limit: 3
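To pause between attempts, retryPolicy can also take a wait interval (a sketch; the intervalSec field is assumed here, see Error Handling):
steps:
  - name: flaky-api
    command: curl https://unstable-api.com
    retryPolicy:
      limit: 3
      intervalSec: 30   # assumed field: wait 30 seconds between retries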
Timeouts
Prevent steps from running forever:
steps:
  - name: long-task
    command: echo "Processing data"
    timeoutSec: 300   # 5 minutes
Step Descriptions
Document your steps:
steps:
  - name: etl-process
    description: |
      Extract data from API, transform to CSV,
      and load into data warehouse
    command: python etl.py
Tags and Organization
Group related workflows:
name: customer-report
tags:
  - reports
  - customer
  - daily
group: Analytics   # UI grouping
See Also
- Control Flow - Conditionals and loops
- Data & Variables - Pass data between steps
- Error Handling - Advanced error recovery
- Parameters - Make workflows configurable
