
# S3

Execute S3 operations including upload, download, list, and delete. Supports AWS S3 and S3-compatible services like MinIO, Google Cloud Storage (via S3 API), DigitalOcean Spaces, and Backblaze B2.

## DAG-Level Configuration

Configure S3 settings at the DAG level to share connection details across all S3 steps:

```yaml
# DAG-level S3 configuration
s3:
  region: us-east-1
  endpoint: https://s3.amazonaws.com  # Optional for AWS, required for S3-compatible services
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
  bucket: my-bucket
  forcePathStyle: false  # Set true for MinIO and most S3-compatible services

steps:
  # All S3 steps inherit DAG-level configuration
  - name: upload-report
    type: s3
    config:
      key: reports/daily.csv
      source: /tmp/report.csv
    command: upload

  - name: download-config
    type: s3
    config:
      key: config/settings.json
      destination: /tmp/settings.json
    command: download
```
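Because every DAG-level field can also be set per step, one DAG can target multiple buckets while sharing credentials. A minimal sketch, assuming the same DAG-level `s3:` block as above; the `archive-bucket` name is hypothetical:

```yaml
steps:
  # Inherits region and credentials from the DAG-level s3 block,
  # but overrides the bucket for this step only.
  - name: upload-to-archive
    type: s3
    config:
      bucket: archive-bucket  # hypothetical; overrides the DAG-level bucket
      key: archive/daily.csv
      source: /tmp/report.csv
    command: upload
```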

## Step-Level Configuration

```yaml
steps:
  - name: upload-to-s3
    type: s3
    config:
      region: us-west-2
      accessKeyId: ${AWS_ACCESS_KEY_ID}
      secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
      bucket: my-bucket
      key: data/file.txt
      source: /local/path/file.txt
    command: upload
```

## Operations

### Upload

Upload a local file to S3:

```yaml
steps:
  - name: upload-file
    type: s3
    config:
      bucket: my-bucket
      key: uploads/document.pdf
      source: /tmp/document.pdf
      contentType: application/pdf  # Optional, auto-detected from extension
      storageClass: STANDARD_IA     # Optional
      metadata:                     # Optional custom metadata
        author: dagu
        version: "1.0"
      tags:                         # Optional object tags
        environment: production
    command: upload
```
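An upload step is typically chained after the step that produces the file. A sketch assuming Dagu's standard `command` steps and `depends` ordering; the report script path is hypothetical:

```yaml
steps:
  # Hypothetical script that writes the report to /tmp/report.csv
  - name: generate-report
    command: /opt/scripts/make-report.sh /tmp/report.csv

  # Runs only after generate-report succeeds
  - name: upload-report
    depends:
      - generate-report
    type: s3
    config:
      bucket: my-bucket
      key: reports/daily.csv
      source: /tmp/report.csv
    command: upload
```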

### Download

Download an S3 object to the local filesystem:

```yaml
steps:
  - name: download-file
    type: s3
    config:
      bucket: my-bucket
      key: data/export.csv
      destination: /tmp/export.csv
    command: download
```

### List

List objects in a bucket:

```yaml
steps:
  - name: list-objects
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2024/        # Optional prefix filter
      maxKeys: 100              # Optional, default 1000
      recursive: true           # Optional, default false
      outputFormat: json        # json (default) or jsonl for streaming
    command: list
    output: OBJECTS
```
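The `output` field captures the list result as a variable for later steps. A sketch assuming Dagu's `${VAR}` output-variable expansion in downstream commands:

```yaml
steps:
  - name: list-objects
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2024/
    command: list
    output: OBJECTS

  # Assumes Dagu expands ${OBJECTS} to the captured JSON result
  - name: print-objects
    depends:
      - list-objects
    command: echo "${OBJECTS}"
```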

### Delete

Delete a single object or multiple objects by prefix:

```yaml
steps:
  # Delete single object
  - name: delete-file
    type: s3
    config:
      bucket: my-bucket
      key: temp/old-file.txt
    command: delete

  # Delete by prefix (batch delete)
  - name: cleanup-logs
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2023/
      quiet: true  # Suppress output
    command: delete
```
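Prefix deletes pair naturally with a recurring cleanup DAG. A sketch assuming Dagu's top-level cron `schedule` field; the cron expression and prefix are illustrative:

```yaml
# Assumes Dagu's top-level cron schedule field
schedule: "0 3 * * *"  # daily at 03:00

steps:
  - name: purge-old-logs
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/2023/
      quiet: true
    command: delete
```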

## Configuration Reference

### DAG-Level Fields

| Field | Required | Default | Description |
|---|---|---|---|
| `region` | No | - | AWS region (e.g., `us-east-1`) |
| `endpoint` | No | AWS S3 | Custom endpoint URL for S3-compatible services |
| `accessKeyId` | No | - | AWS access key ID |
| `secretAccessKey` | No | - | AWS secret access key |
| `sessionToken` | No | - | AWS session token for temporary credentials |
| `profile` | No | - | AWS credentials profile name |
| `bucket` | No | - | Default bucket name |
| `forcePathStyle` | No | `false` | Use path-style addressing (required for most S3-compatible services) |
| `disableSSL` | No | `false` | Disable SSL (for local testing only) |

### Step-Level Fields

All DAG-level fields can be overridden at the step level, plus:

| Field | Required | Default | Description |
|---|---|---|---|
| `key` | Varies | - | S3 object key (required for upload and download; delete requires `key` or `prefix`) |
| `source` | Upload | - | Local file path to upload |
| `destination` | Download | - | Local file path for download |
| `prefix` | No | - | Key prefix for list and delete operations |
| `contentType` | No | Auto-detect | Content-Type for uploaded objects |
| `storageClass` | No | `STANDARD` | Storage class (`STANDARD`, `STANDARD_IA`, `GLACIER`, etc.) |
| `metadata` | No | - | Custom metadata key-value pairs |
| `tags` | No | - | Object tags as key-value pairs |
| `maxKeys` | No | 1000 | Maximum number of keys to return for list |
| `recursive` | No | `false` | List objects recursively |
| `delimiter` | No | - | Delimiter for grouping keys |
| `outputFormat` | No | `json` | Output format: `json` or `jsonl` |
| `quiet` | No | `false` | Suppress output for delete |
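The `delimiter` field groups keys the way a filesystem groups directories. A sketch that lists only the top-level "folders" under `logs/`:

```yaml
steps:
  # With delimiter "/", keys under logs/ are grouped at the next "/",
  # emulating a one-level directory listing instead of every object.
  - name: list-log-folders
    type: s3
    config:
      bucket: my-bucket
      prefix: logs/
      delimiter: "/"
    command: list
    output: LOG_FOLDERS
```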

## S3-Compatible Services

### MinIO

```yaml
s3:
  endpoint: http://localhost:9000
  accessKeyId: minioadmin
  secretAccessKey: minioadmin
  bucket: my-bucket
  forcePathStyle: true  # Required for MinIO
  disableSSL: true      # If using HTTP

steps:
  - name: upload-to-minio
    type: s3
    config:
      key: data/file.txt
      source: /tmp/file.txt
    command: upload
```
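For local testing, MinIO can run in a container. A minimal Docker Compose sketch; the `minioadmin` credentials match MinIO's defaults, and the service name and volume are arbitrary:

```yaml
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"   # S3 API, matching the localhost:9000 endpoint above
      - "9001:9001"   # Web console
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio-data:/data

volumes:
  minio-data:
```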

### Google Cloud Storage

GCS supports the S3 API via HMAC keys:

```yaml
s3:
  endpoint: https://storage.googleapis.com
  accessKeyId: ${GCS_HMAC_ACCESS_KEY}
  secretAccessKey: ${GCS_HMAC_SECRET}
  bucket: my-gcs-bucket

steps:
  - name: upload-to-gcs
    type: s3
    config:
      key: data/file.txt
      source: /tmp/file.txt
    command: upload
```

To create HMAC keys for GCS:

  1. Go to Cloud Storage > Settings > Interoperability
  2. Create a key for your service account

### DigitalOcean Spaces

```yaml
s3:
  endpoint: https://nyc3.digitaloceanspaces.com
  accessKeyId: ${DO_SPACES_KEY}
  secretAccessKey: ${DO_SPACES_SECRET}
  bucket: my-space
  forcePathStyle: false  # Spaces uses virtual-hosted style

steps:
  - name: upload-to-spaces
    type: s3
    config:
      key: uploads/file.txt
      source: /tmp/file.txt
    command: upload
```

## Authentication

### Static Credentials

```yaml
s3:
  accessKeyId: ${AWS_ACCESS_KEY_ID}
  secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
```

### AWS Profile

```yaml
s3:
  profile: production
```

### IAM Role (EC2/ECS/Lambda)

When running on AWS infrastructure with an IAM role, credentials are discovered automatically:

```yaml
s3:
  region: us-east-1
  bucket: my-bucket
# No credentials needed - uses the instance role
```

## Output Format

### Upload Result

```json
{
  "operation": "upload",
  "success": true,
  "bucket": "my-bucket",
  "key": "data/file.txt",
  "uri": "s3://my-bucket/data/file.txt",
  "etag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  "size": 1024,
  "contentType": "text/plain",
  "storageClass": "STANDARD",
  "duration": "125ms"
}
```
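Individual fields of a captured result can be useful downstream. A sketch assuming Dagu supports JSON sub-path access on output variables (`${VAR.field}`); if your Dagu version does not, parse the full `${VAR}` JSON with a tool such as `jq` instead:

```yaml
steps:
  - name: upload-file
    type: s3
    config:
      bucket: my-bucket
      key: data/file.txt
      source: /tmp/file.txt
    command: upload
    output: UPLOAD_RESULT

  # Assumes ${VAR.field} sub-path syntax on output variables
  - name: log-uri
    depends:
      - upload-file
    command: echo "uploaded to ${UPLOAD_RESULT.uri}"
```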

### Download Result

```json
{
  "operation": "download",
  "success": true,
  "bucket": "my-bucket",
  "key": "data/file.txt",
  "uri": "s3://my-bucket/data/file.txt",
  "destination": "/tmp/file.txt",
  "size": 1024,
  "contentType": "text/plain",
  "etag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
  "duration": "89ms"
}
```

### List Result

```json
{
  "operation": "list",
  "success": true,
  "bucket": "my-bucket",
  "prefix": "logs/",
  "objects": [
    {
      "key": "logs/app.log",
      "size": 2048,
      "lastModified": "2024-01-15T10:30:00Z",
      "etag": "\"abc123\"",
      "storageClass": "STANDARD"
    }
  ],
  "totalCount": 1,
  "duration": "50ms"
}
```

### Delete Result

```json
{
  "operation": "delete",
  "success": true,
  "bucket": "my-bucket",
  "key": "temp/old-file.txt",
  "deletedCount": 1,
  "deletedKeys": ["temp/old-file.txt"],
  "duration": "45ms"
}
```
