S3
Execute S3 operations including upload, download, list, and delete. Supports AWS S3 and S3-compatible services like MinIO, Google Cloud Storage (via S3 API), DigitalOcean Spaces, and Backblaze B2.
DAG-Level Configuration
Configure S3 settings at the DAG level to share connection details across all S3 steps:
```yaml
# DAG-level S3 configuration
s3:
  region: us-east-1
  endpoint: https://s3.amazonaws.com  # Optional for AWS, required for S3-compatible services
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
  bucket: my-bucket
  forcePathStyle: false  # Set to true for MinIO and most S3-compatible services

steps:
  # All S3 steps inherit the DAG-level configuration
  - name: upload-report
    type: s3
    config:
      key: reports/daily.csv
      source: /tmp/report.csv
    command: upload

  - name: download-config
    type: s3
    config:
      key: config/settings.json
      destination: /tmp/settings.json
    command: download
```

Step-Level Configuration
```yaml
steps:
  - name: upload-to-s3
    type: s3
    config:
      region: us-west-2
      accessKeyId: ${AWS_ACCESS_KEY_ID}
      secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
      bucket: my-bucket
      key: data/file.txt
      source: /local/path/file.txt
    command: upload
```

Operations
Upload
Upload a local file to S3:
```yaml
steps:
  - name: upload-file
    type: s3
    config:
      bucket: my-bucket
      key: uploads/document.pdf
      source: /tmp/document.pdf
      contentType: application/pdf  # Optional, auto-detected from extension
      storageClass: STANDARD_IA     # Optional
      metadata:                     # Optional custom metadata
        author: dagu
        version: "1.0"
      tags:                         # Optional object tags
        environment: production
    command: upload
```

Download
Download an S3 object to local filesystem:
```yaml
steps:
  - name: download-file
    type: s3
    config:
      bucket: my-bucket
      key: data/export.csv
      destination: /tmp/export.csv
    command: download
```

List
List objects in a bucket:
```yaml
steps:
  - name: list-objects
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2024/     # Optional prefix filter
      maxKeys: 100           # Optional, default 1000
      recursive: true        # Optional, default false
      outputFormat: json     # json (default) or jsonl for streaming
    command: list
    output: OBJECTS
```
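The `output` field captures the JSON result in a variable that downstream steps can consume. A minimal sketch, assuming `jq` is available on the host; `count-objects` is a hypothetical step name:

```yaml
steps:
  - name: list-objects
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2024/
    command: list
    output: OBJECTS

  # Hypothetical consumer step: parse the captured JSON with jq
  - name: count-objects
    command: echo '${OBJECTS}' | jq '.totalCount'
    depends:
      - list-objects
```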
Delete
Delete a single object or multiple objects by prefix:
```yaml
steps:
  # Delete a single object
  - name: delete-file
    type: s3
    config:
      bucket: my-bucket
      key: temp/old-file.txt
    command: delete

  # Delete by prefix (batch delete)
  - name: cleanup-logs
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2023/
      quiet: true  # Suppress output
    command: delete
```

Configuration Reference
DAG-Level Fields
| Field | Required | Default | Description |
|---|---|---|---|
| region | No | - | AWS region (e.g., us-east-1) |
| endpoint | No | AWS S3 | Custom endpoint URL for S3-compatible services |
| accessKeyId | No | - | AWS access key ID |
| secretAccessKey | No | - | AWS secret access key |
| sessionToken | No | - | AWS session token for temporary credentials |
| profile | No | - | AWS credentials profile name |
| bucket | No | - | Default bucket name |
| forcePathStyle | No | false | Use path-style addressing (required for most S3-compatible services) |
| disableSSL | No | false | Disable SSL (for local testing only) |
Step-Level Fields
All DAG-level fields can be overridden at the step level (see the override example after this table), plus:
| Field | Required | Default | Description |
|---|---|---|---|
| key | Varies | - | S3 object key (required for upload/download/delete) |
| source | Upload | - | Local file path to upload |
| destination | Download | - | Local file path for download |
| prefix | No | - | Key prefix for list/delete operations |
| contentType | No | Auto-detect | Content-Type for uploaded objects |
| storageClass | No | STANDARD | Storage class (STANDARD, STANDARD_IA, GLACIER, etc.) |
| metadata | No | - | Custom metadata key-value pairs |
| tags | No | - | Object tags as key-value pairs |
| maxKeys | No | 1000 | Max keys to return for list |
| recursive | No | false | List objects recursively |
| delimiter | No | - | Delimiter for grouping keys |
| outputFormat | No | json | Output format: json or jsonl |
| quiet | No | false | Suppress output for delete |
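For example, a step can target a different bucket, region, or storage class than the DAG-level defaults. A minimal sketch; the bucket, region, and step names are illustrative:

```yaml
s3:
  region: us-east-1
  bucket: default-bucket
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}

steps:
  # Inherits the DAG-level credentials, but overrides bucket and region
  - name: upload-to-archive
    type: s3
    config:
      bucket: archive-bucket  # Overrides the DAG-level bucket
      region: eu-west-1       # Overrides the DAG-level region
      key: archives/report.csv
      source: /tmp/report.csv
      storageClass: GLACIER
    command: upload
```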
S3-Compatible Services
MinIO
```yaml
s3:
  endpoint: http://localhost:9000
  accessKeyId: minioadmin
  secretAccessKey: minioadmin
  bucket: my-bucket
  forcePathStyle: true  # Required for MinIO
  disableSSL: true      # If using HTTP

steps:
  - name: upload-to-minio
    type: s3
    config:
      key: data/file.txt
      source: /tmp/file.txt
    command: upload
```

Google Cloud Storage
GCS supports the S3 API via HMAC keys:
```yaml
s3:
  endpoint: https://storage.googleapis.com
  accessKeyId: ${GCS_HMAC_ACCESS_KEY}
  secretAccessKey: ${GCS_HMAC_SECRET}
  bucket: my-gcs-bucket

steps:
  - name: upload-to-gcs
    type: s3
    config:
      key: data/file.txt
      source: /tmp/file.txt
    command: upload
```

To create HMAC keys for GCS:
- Go to Cloud Storage > Settings > Interoperability
- Create a key for your service account
DigitalOcean Spaces
```yaml
s3:
  endpoint: https://nyc3.digitaloceanspaces.com
  accessKeyId: ${DO_SPACES_KEY}
  secretAccessKey: ${DO_SPACES_SECRET}
  bucket: my-space
  forcePathStyle: false  # Spaces uses virtual-hosted style

steps:
  - name: upload-to-spaces
    type: s3
    config:
      key: uploads/file.txt
      source: /tmp/file.txt
    command: upload
```
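Backblaze B2
B2 is listed above as a supported S3-compatible service and follows the same pattern. A sketch; the endpoint region (`us-west-004`) is only an example, so use the S3 endpoint shown in your own B2 bucket settings:

```yaml
s3:
  endpoint: https://s3.us-west-004.backblazeb2.com  # Region is an example
  accessKeyId: ${B2_KEY_ID}
  secretAccessKey: ${B2_APPLICATION_KEY}
  bucket: my-b2-bucket

steps:
  - name: upload-to-b2
    type: s3
    config:
      key: backups/file.txt
      source: /tmp/file.txt
    command: upload
```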
Authentication
Static Credentials
```yaml
s3:
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
```
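Temporary credentials (for example, from an assumed role) also require the `sessionToken` field from the configuration reference. A sketch, assuming all three values are exported as environment variables before the DAG runs:

```yaml
s3:
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
  sessionToken: ${AWS_SESSION_TOKEN}
```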
AWS Profile
```yaml
s3:
  profile: production
```

IAM Role (EC2/ECS/Lambda)
When running on AWS infrastructure with an IAM role, credentials are automatically discovered:
```yaml
s3:
  region: us-east-1
  bucket: my-bucket
  # No credentials needed - uses instance role
```

Output Format
Upload Result
```json
{
  "operation": "upload",
  "success": true,
  "bucket": "my-bucket",
  "key": "data/file.txt",
  "uri": "s3://my-bucket/data/file.txt",
  "etag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  "size": 1024,
  "contentType": "text/plain",
  "storageClass": "STANDARD",
  "duration": "125ms"
}
```

Download Result
```json
{
  "operation": "download",
  "success": true,
  "bucket": "my-bucket",
  "key": "data/file.txt",
  "uri": "s3://my-bucket/data/file.txt",
  "destination": "/tmp/file.txt",
  "size": 1024,
  "contentType": "text/plain",
  "etag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  "duration": "89ms"
}
```

List Result
```json
{
  "operation": "list",
  "success": true,
  "bucket": "my-bucket",
  "prefix": "logs/",
  "objects": [
    {
      "key": "logs/app.log",
      "size": 2048,
      "lastModified": "2024-01-15T10:30:00Z",
      "etag": "\"abc123\"",
      "storageClass": "STANDARD"
    }
  ],
  "totalCount": 1,
  "duration": "50ms"
}
```

Delete Result
```json
{
  "operation": "delete",
  "success": true,
  "bucket": "my-bucket",
  "key": "temp/old-file.txt",
  "deletedCount": 1,
  "deletedKeys": ["temp/old-file.txt"],
  "duration": "45ms"
}
```
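As with the list example earlier, any of these results can be captured with `output:` and consumed by a later step. A sketch; `notify` is a hypothetical step name, and extracting the `uri` field assumes `jq` is installed on the host:

```yaml
steps:
  - name: upload-report
    type: s3
    config:
      bucket: my-bucket
      key: reports/daily.csv
      source: /tmp/report.csv
    command: upload
    output: UPLOAD_RESULT

  # Hypothetical follow-up step: pull the S3 URI out of the captured JSON
  - name: notify
    command: echo "Uploaded to $(echo '${UPLOAD_RESULT}' | jq -r '.uri')"
    depends:
      - upload-report
```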