
Storage Backends

Storage holds everything the platform runs: flows, connections, environments, secrets, invocation logs, persisted state. Pick a backend at install; switching later is a backup-restore exercise, not a config flip.

Backend   When                                   Configured by
File      Single-host, on-prem, evaluations      Storage.StorageType: File + Storage.Directory
S3        Cloud-resident installs, AWS-native    Storage.StorageType: S3 + bucket / IAM role / access key
Minio     Object-storage semantics on-prem       Storage.StorageType: Minio + endpoint + credentials

S3 and Minio share the same operator-kind contract and are mostly interchangeable from the platform’s perspective.

{Storage.Directory}/
├── flows/{name}/definition.yaml # pipeline definitions
├── scripts/{name}/ # custom-operator workspaces
├── auth/ # identity & credentials
│ ├── users.yaml # local-provider users
│ ├── apikeys.yaml # API keys
│ ├── secrets-policy.yaml # path-based secret access rules
│ └── secrets.enc # encrypted; never git-tracked
├── policies/ # access-control rules
├── teams/{name}/ # organisational entities
├── config/ # operational deployment config
│ ├── connections/{name}/definition.yaml
│ ├── environments/{name}/definition.yaml
│ ├── cache-policies/{name}/definition.yaml
│ ├── var-sets/{name}/definition.yaml
│ └── mappings/{name}/definition.yaml # reusable lookup tables
├── knowledge/ # standalone & entity-pinned knowledge
│ ├── folios/{slug}/ # in-product knowledge folios
│ ├── skills/ # reusable AI skill bundles
│ ├── annotations/{type}/{id}.yaml # entity annotations (tags, descriptions)
│ └── context/global.md # installation-wide notes
├── activity/
│ ├── invocations/ # per-run logs and outputs
│ ├── persistence/ # persist: state across runs
│ ├── indexes/ # invocation indexes
│ ├── shares/ # share links
│ ├── tasks/ # background task records
│ └── introspection/ # captured tool-call events
└── archives/ # archived activity (zip, tar.gz, parquet)
  • Stateful, must back up: flows/, scripts/, auth/ (including auth/secrets.enc, which is never git-tracked and so exists nowhere else), policies/, teams/, config/, knowledge/, activity/, and the git-versioning path.
  • Stateful, regeneratable: the backup archives themselves (copy off-host on a schedule).
  • Ephemeral: Cache.Directory can live anywhere, including on a separate disk, and is safe to wipe between runs.
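
For the File backend, the "must back up" list above can be turned into a small archive script. The following is a minimal sketch, not a built-in platform tool: the function name `backup_storage` and the tarball format are assumptions, and the git-versioning path is left out because its location is not stated here, so add it to the list for your install.

```python
import tarfile
from pathlib import Path

# Top-level directories listed above as stateful and required in a backup.
# The git-versioning path is NOT included; its location depends on the install.
STATEFUL_DIRS = ["flows", "scripts", "auth", "policies",
                 "teams", "config", "knowledge", "activity"]

def backup_storage(storage_dir, archive_path):
    """Write a gzip tarball of the stateful directories.

    Returns the names of the directories that were actually archived.
    Write archive_path somewhere outside storage_dir, ideally off-host.
    """
    root = Path(storage_dir)
    included = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in STATEFUL_DIRS:
            path = root / name
            if path.is_dir():  # a fresh install may not have every directory yet
                tar.add(path, arcname=name)  # recursive; paths stored relative to root
                included.append(name)
    return included
```

Because auth/secrets.enc travels inside the auth/ directory, treat the resulting archive with the same care as the secrets file itself.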

Storage is divided into individually configurable buckets, one per entity type. The StorageBuckets section of appsettings.yaml overrides settings per bucket, so you can, for example, keep flow definitions on local disk and send invocation activity to S3, where retention scales more easily. Buckets are arranged in a nested tree, and each bucket inherits its settings (Path, GitTracked, Retention, Archive) from its ancestor groups unless it overrides them.
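
The inheritance rule can be illustrated with a short sketch. It models only the idea that settings flow from ancestor groups down to a bucket, with the nearest value winning. The tree shape, the `Children` key, the function name `resolve_bucket`, and the example values are all hypothetical and are not the platform's actual StorageBuckets schema.

```python
# The four inheritable settings named in the text above.
SETTING_KEYS = ("Path", "GitTracked", "Retention", "Archive")

def resolve_bucket(tree, bucket_path):
    """Return the effective settings for a bucket such as 'activity/invocations'.

    Walks from the root group down to the bucket, letting each level
    override whatever it inherited from its ancestors.
    """
    node = tree
    effective = {k: node[k] for k in SETTING_KEYS if k in node}
    for part in bucket_path.split("/"):
        node = node.get("Children", {})[part]  # KeyError if the bucket is unknown
        effective.update({k: node[k] for k in SETTING_KEYS if k in node})
    return effective

# Hypothetical tree: values are illustrative only.
example_tree = {
    "GitTracked": False,
    "Children": {
        "activity": {
            "Retention": "90d",
            "Children": {
                "invocations": {"Archive": "parquet"},
                "shares": {"Retention": "7d"},
            },
        },
    },
}

invocations = resolve_bucket(example_tree, "activity/invocations")
shares = resolve_bucket(example_tree, "activity/shares")
```

Here `invocations` picks up GitTracked from the root and Retention from the activity group while adding its own Archive value, and `shares` replaces the inherited Retention with its own.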

Entity names may contain / as a hierarchical separator — akeneo/product/sync is a valid flow name. This is consistent across every storage backend and is why the layering (provider → git → branch → overlay) composes cleanly. You don’t configure it; it’s just how names map to storage paths.
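
As an illustration of the mapping, here is a minimal sketch for flows on the File backend, following the flows/{name}/definition.yaml layout shown in the tree above. The function name `flow_definition_path` and the rejection of empty, "." and ".." segments are assumptions made for the example, not documented platform behaviour.

```python
from pathlib import PurePosixPath

def flow_definition_path(storage_dir, name):
    """Map a flow name to its definition file under the storage directory.

    Each '/' in the name becomes one directory level below flows/.
    """
    parts = name.split("/")
    # Assumed safety check: refuse segments that would be empty or escape the tree.
    if any(p in ("", ".", "..") for p in parts):
        raise ValueError(f"invalid entity name: {name!r}")
    return PurePosixPath(storage_dir, "flows", *parts, "definition.yaml")

print(flow_definition_path("/data", "akeneo/product/sync"))
# -> /data/flows/akeneo/product/sync/definition.yaml
```

On S3 or Minio the same segments would form an object key rather than a directory path, which is consistent with the name mapping being the same on every backend.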