Storage Backends
Storage holds everything the platform runs on: flows, connections, environments, secrets, invocation logs, and persisted state. Pick a backend at install time; switching later is a backup-restore exercise, not a config flip.
The three backends
| Backend | When | Configured by |
|---|---|---|
| File | Single-host, on-prem, evaluations | Storage.StorageType: File + Storage.Directory |
| S3 | Cloud-resident installs, AWS-native | Storage.StorageType: S3 + bucket / IAM role / access key |
| Minio | Object-storage semantics on-prem | Storage.StorageType: Minio + endpoint + credentials |
S3 and Minio share the same operator-kind contract and are mostly interchangeable from the platform’s perspective.
On-disk layout (File backend)
{Storage.Directory}/
├── flows/{name}/definition.yaml             # pipeline definitions
├── scripts/{name}/                          # custom-operator workspaces
├── auth/                                    # identity & credentials
│   ├── users.yaml                           # local-provider users
│   ├── apikeys.yaml                         # API keys
│   ├── secrets-policy.yaml                  # path-based secret access rules
│   └── secrets.enc                          # encrypted; never git-tracked
├── policies/                                # access-control rules
├── teams/{name}/                            # organisational entities
├── config/                                  # operational deployment config
│   ├── connections/{name}/definition.yaml
│   ├── environments/{name}/definition.yaml
│   ├── cache-policies/{name}/definition.yaml
│   ├── var-sets/{name}/definition.yaml
│   └── mappings/{name}/definition.yaml      # reusable lookup tables
├── knowledge/                               # standalone & entity-pinned knowledge
│   ├── folios/{slug}/                       # in-product knowledge folios
│   ├── skills/                              # reusable AI skill bundles
│   ├── annotations/{type}/{id}.yaml         # entity annotations (tags, descriptions)
│   └── context/global.md                    # installation-wide notes
├── activity/
│   ├── invocations/                         # per-run logs and outputs
│   ├── persistence/                         # persist: state across runs
│   ├── indexes/                             # invocation indexes
│   ├── shares/                              # share links
│   ├── tasks/                               # background task records
│   └── introspection/                       # captured tool-call events
└── archives/                                # archived activity (zip, tar.gz, parquet)
Three categories worth keeping straight
- Stateful, must back up: flows/, scripts/, auth/, policies/, teams/, config/, knowledge/, activity/, the git-versioning path, and auth/secrets.enc.
- Stateful, regeneratable: the backup archives themselves (copy off-host on a schedule).
- Ephemeral: Cache.Directory. It can live anywhere, a separate disk is fine, and it is safe to wipe between runs.
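As an illustration of the "must back up" category, here is a minimal sketch of a File-backend backup. It assumes the stateful directories sit directly under Storage.Directory as in the layout above. The function name `backup_stateful` and the tar.gz format are choices made for this example, not part of the platform. The git-versioning path is left out because its location is not given here; add it separately if it lives outside these directories.

```python
import tarfile
from pathlib import Path

# Top-level directories from the "must back up" list.
# auth/ already contains secrets.enc, so it is covered without a separate entry.
STATEFUL_DIRS = [
    "flows", "scripts", "auth", "policies",
    "teams", "config", "knowledge", "activity",
]


def backup_stateful(storage_dir: str, archive_path: str) -> list[str]:
    """Write a tar.gz of the stateful directories that exist.

    Returns the directory names that were included. archives/ and the
    cache directory are deliberately skipped.
    """
    root = Path(storage_dir)
    included = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in STATEFUL_DIRS:
            path = root / name
            if path.is_dir():
                # arcname keeps paths in the archive relative to Storage.Directory
                tar.add(path, arcname=name)
                included.append(name)
    return included
```

Write the archive somewhere outside the directories being backed up, then copy it off-host on whatever schedule you already use.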
Splitting buckets across backends
The platform has individually configurable storage buckets, one per entity type. StorageBuckets in appsettings.yaml is the per-bucket override, so you can keep flow definitions on local disk and invocation activity in S3 for retention scaling. Bucket settings (Path, GitTracked, Retention, Archive) inherit from ancestor groups in the nested tree.
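The inheritance rule can be pictured with a small sketch. This is not the platform's resolver, and the real StorageBuckets schema is not shown in this page: the `Children` key, the tree shape, and the sample values below are all hypothetical. Only the four setting names come from the text above.

```python
# The four inheritable settings named in the docs.
SETTING_KEYS = ("Path", "GitTracked", "Retention", "Archive")


def resolve_bucket(tree: dict, bucket: str) -> dict:
    """Return the effective settings for a bucket such as "activity/invocations".

    Walks from the root group down to the bucket; a value set closer to
    the bucket overrides the same key set on an ancestor.
    """
    chain = [tree]
    node = tree
    for part in bucket.split("/"):
        node = node.get("Children", {}).get(part)  # "Children" is a hypothetical key
        if node is None:
            raise KeyError(f"unknown bucket: {bucket}")
        chain.append(node)

    effective = {}
    for level in chain:  # root first, so nearer levels win
        for key in SETTING_KEYS:
            if key in level:
                effective[key] = level[key]
    return effective


# Hypothetical tree: values are made up for illustration.
example_tree = {
    "GitTracked": True,
    "Children": {
        "flows": {},
        "activity": {
            "GitTracked": False,
            "Retention": "90d",
            "Children": {"invocations": {"Archive": "parquet"}},
        },
    },
}

print(resolve_bucket(example_tree, "flows"))
# {'GitTracked': True}
print(resolve_bucket(example_tree, "activity/invocations"))
# {'GitTracked': False, 'Retention': '90d', 'Archive': 'parquet'}
```

In this picture a bucket only declares what differs from its ancestors; everything else is picked up on the way down the tree.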
Hierarchical entity names
Entity names may contain / as a hierarchical separator: akeneo/product/sync is a valid flow name. This is consistent across every storage backend and is why the layering (provider → git → branch → overlay) composes cleanly. You don’t configure it; it’s just how names map to storage paths.
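To make the name-to-path mapping concrete, here is a sketch for flows on the File backend, based on the flows/{name}/definition.yaml pattern in the layout above. The helper name `flow_definition_path`, the example storage directory, and the rejection of empty, "." and ".." segments are assumptions for this example; the platform's own validation rules are not described here.

```python
from pathlib import PurePosixPath


def flow_definition_path(storage_dir: str, name: str) -> PurePosixPath:
    """Map a (possibly hierarchical) flow name to its definition file path."""
    parts = name.split("/")
    # Assumed safety check: refuse segments that would escape or collapse the path.
    if any(part in ("", ".", "..") for part in parts):
        raise ValueError(f"invalid flow name: {name!r}")
    return PurePosixPath(storage_dir, "flows", *parts, "definition.yaml")


print(flow_definition_path("/var/lib/platform", "akeneo/product/sync"))
# /var/lib/platform/flows/akeneo/product/sync/definition.yaml
```

Each / in the name becomes one more directory level, so a flat name and a nested one are handled by the same rule.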