S3 object storage cluster at home - GarageHQ

AWS S3 (Simple Storage Service) is very useful - it provides object storage you can use for various things, like being the storage backend for software pieces that support it, a backup target for your data, or a store of static websites you can serve from the S3 bucket with just plain HTTP.

Sounds great, but there is a major flaw to this - AWS S3 is a cloud service, which can get pretty expensive in the pay-as-you-go model AWS provides.

There is a great alternative to the paid and proprietary S3 that is Free and Open Source and entirely self-hosted, while allowing for high availability of your storage by clustering multiple instances together - it’s GarageHQ.

The garage setup

I’ve recently deployed a cluster of two GarageHQ nodes across my two locations. The process went pretty smoothly, as can be seen by the fact that you can read this blog and any of my websites. They’re all hosted from the GarageHQ S3 storage cluster via HTTP, and then proxied by a Caddy reverse proxy.

Preparation

Get yourself at least one Linux box - a VM, or an LXC. For a cluster you obviously need at least two, preferably on different servers and in different geographical locations for HA and redundancy reasons. For a cluster setup, if you can, go for an odd number of nodes - this helps prevent split-brain scenarios.

I’ll be using just two Debian 13 LXC containers for this, to also show how to run it on less than 3 nodes. I’ve preconfigured and enrolled my LXCs as I always do beforehand, more on that here.

The GarageHQ hosts must have some way of seeing each other via the network for the cluster to form. My LXCs will see each other via my site-to-site IPsec tunnels; in a later step I’ll allow that through the firewall.

DNS records

Choose yourself a subdomain out of your own domain, as a way to reference the GarageHQ S3 cluster as a whole, or two subdomains if you wish to have separate ones for the S3 API and the HTTP.
AWS is using “bucketname.s3.region.amazonaws.com” for S3 api, and “bucketname.s3-website.region.amazonaws.com” for static websites. I used “bucketname.cdn.fibermouse.xyz” for both S3 api and web - as for me, the cluster will only serve a role of storing web-related buckets.
In addition, I have cdn1.fibermouse.xyz and cdn2.fibermouse.xyz referencing the separate servers, not to have to use the FQDN every time, I do that everywhere (check out www1 or www2.fibermouse.xyz :) )

Then, you need to set the appropriate AAAA or A records in your DNS zone for the subdomain(s) themselves, and also a wildcard record under each of them, so that virtual-host style bucket references work, as opposed to the legacy way of referencing bucket names after the slash in the link.

The DNS records required are then as follows: (using s3.example.com as, well, an example)

s3.example.com.     IN  AAAA    2001:db8:beef:cafe::420:69
*.s3.example.com.   IN  CNAME   s3.example.com.
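
To make the difference between the two reference styles concrete, here is a minimal sketch that builds both URLs for the same object. The bucket name my-bucket and the key index.html are made up for illustration, and s3.example.com is the same placeholder domain as above. Only the first style needs the wildcard record.

```shell
#!/bin/sh
# Build both URL styles for one object (hypothetical bucket and key names).

# vhost_url BUCKET ROOT_DOMAIN KEY: the bucket is part of the hostname,
# so DNS has to resolve BUCKET.ROOT_DOMAIN (hence the wildcard record).
vhost_url() {
    printf 'https://%s.%s/%s\n' "$1" "$2" "$3"
}

# path_url BUCKET ROOT_DOMAIN KEY: legacy style, the bucket comes after the slash.
path_url() {
    printf 'https://%s/%s/%s\n' "$2" "$1" "$3"
}

vhost_url my-bucket s3.example.com index.html
path_url  my-bucket s3.example.com index.html
```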

Storage mountpoint

GarageHQ keeps its S3 data in /var/lib/garage/data, thus I mounted this particular folder on my ZFS pool. Other Garage data like the indexes and db are stored in other directories in /var/lib/garage/ - those are recommended to be on SSD storage so they don't bottleneck the setup.

Installation

Binary

Get yourself a GarageHQ binary on all of the hosts you plan to cluster together. The binary is available at https://garagehq.deuxfleurs.fr/download/. Place it somewhere within your $PATH - I placed mine at /usr/local/bin/garage

SystemD service

It is recommended to run Garage as a non-root user if you wanna protect the host system. I won’t be doing that, as my LXCs are one purpose only.
If you wanna do it the proper way, see the official GarageHQ SystemD cookbook.

This is the /etc/systemd/system/garage.service service file I went with, trimming down on the non-privileged user lines:

[Unit]
Description=Garage Data Store
After=network-online.target
Wants=network-online.target

[Service]
Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1'
ExecStart=/usr/local/bin/garage server
LimitNOFILE=42000

[Install]
WantedBy=multi-user.target

Default config

On one of your machines you wanna cluster, run the command below to paste a default config and to generate required secrets:

cat > /etc/garage.toml <<EOF
metadata_dir = "/tmp/meta"
data_dir = "/tmp/data"
db_engine = "sqlite"

replication_factor = 1

rpc_bind_addr = "[::]:3901"
rpc_public_addr = "127.0.0.1:3901"
rpc_secret = "$(openssl rand -hex 32)"

[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
root_domain = ".s3.garage.localhost"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage.localhost"
index = "index.html"

[admin]
api_bind_addr = "[::]:3903"
admin_token = "$(openssl rand -base64 32)"
metrics_token = "$(openssl rand -base64 32)"
EOF

The secrets must match exactly in the configs of all of your nodes. You may copy the file over now from what the command generated at /etc/garage.toml, or do it later once you have changed the defaults to something more sane than this.
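
A mismatched rpc_secret is an easy mistake to make when copying configs around, so here is a small sketch for comparing two config files before starting anything. The function names are my own, and it assumes the secret sits on a single line in the rpc_secret = "..." form shown above.

```shell
#!/bin/sh
# rpc_secret_of FILE: print the quoted value of the rpc_secret line in a garage.toml
rpc_secret_of() {
    sed -n 's/^rpc_secret[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$1"
}

# same_rpc_secret FILE_A FILE_B: succeed only if both files carry
# the same, non-empty rpc_secret
same_rpc_secret() {
    a=$(rpc_secret_of "$1")
    b=$(rpc_secret_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example, with a copy of the other node's config fetched to /tmp first:
#   same_rpc_secret /etc/garage.toml /tmp/other-node.toml && echo "secrets match"
```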

Configuration

Adjusting the config file

Now, edit your config to fit your needs. If you know AWS S3 and how interactions with it work, either via the S3 protocol or via the web, this will be easier to understand. Let me explain it a bit anyway.

  • db_engine - change it to lmdb or leave it as sqlite if you specifically need to. It’s best to use LMDB for anything above a single node.
  • replication_factor - change it to however many GarageHQ instances you want to hold a copy of each piece of data. Eg. if you have 4 nodes, and you set it to 2, any given object will be stored on only 2 of them.
  • rpc_public_addr - rpc is the way garage nodes communicate with each other. Set this to your instance’s IP address followed by the default port of 3901.
  • s3_region - S3 protocol needs a region to calculate its stuff internally. Leave it as “garage”, invent something of your own, or follow the AWS way of naming regions. It’s not really relevant, though you’ll need to remember the name if you access the cluster via S3 api.
  • s3_api root_domain - the domain you set up in DNS earlier, and one that you wish to use for S3 api access to the buckets.
  • s3_web root_domain - same here, but for web HTTP access to the buckets

This is what my config looks like:

My /etc/garage.toml
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
db_engine = "lmdb"

replication_factor = 2
consistency_mode = "degraded"

rpc_bind_addr = "[::]:3901"
rpc_public_addr = "[2a01:115f:4015:28b0::3900:1]:3901"
rpc_secret = "AUTOGENERATED RPC SECRET"

[s3_api]
s3_region = "europe-central-a"
api_bind_addr = "[::]:3900"
root_domain = ".cdn.fibermouse.xyz"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".cdn.fibermouse.xyz"
index = "index.html"

[admin]
api_bind_addr = "[::]:3903"
admin_token = "AUTOGENERATED ADMIN SECRET"
metrics_token = "AUTOGENERATED METRICS SECRET"

There is one extra setting I added here - consistency_mode = "degraded"
This lets GarageHQ keep working as read-only while one of the nodes is down. The trade-off is weaker consistency (a read is no longer guaranteed to return the latest write), but it is the only way a two-node cluster can be used for high availability. Leave it on as well if you have more than two nodes, or any number of nodes, arranged in an even number of groups where each group is likely to fail for the same reason (eg. 5 nodes at “location-a” and 5 nodes at “location-b”).
To go back to the default behavior, remove the line completely.

Reminder: You need the exact same secrets and other settings on each node you wanna cluster together. Copy the config to each node and only adjust rpc_public_addr to be the IP of that specific node.
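
Since rpc_public_addr is the only line that differs between nodes, the per-node copy can be produced with a one-line substitution. This is a sketch only: the function name is mine, and it assumes the default RPC port 3901 and a single rpc_public_addr line in the file.

```shell
#!/bin/sh
# render_node_config TEMPLATE ADDRESS: print TEMPLATE with rpc_public_addr
# rewritten to ADDRESS:3901; every other line is passed through unchanged.
render_node_config() {
    sed "s|^rpc_public_addr[[:space:]]*=.*|rpc_public_addr = \"$2:3901\"|" "$1"
}

# Example: render the config for the second node and push it over ssh.
#   render_node_config /etc/garage.toml '[2a01:112f:4500:75b0::3900:1]' \
#       | ssh zco-web-cdn01 'cat > /etc/garage.toml'
```

IPv6 addresses need their square brackets, exactly as in the config above.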

Firewall

Make sure all of your nodes can see each other on port 3901/tcp - this is very specific to your network setup. In my case, I allowed the port and the other default ports between nodes on my Juniper SRX300 with the following commit:

Juniper config commit
[edit security address-book global]
     address kgm-git-app01 { ... }
+    address zco-web-prx01 2a01:112f:4500:75b0::443:1/128;
+    address kgm-web-cdn01 2a01:115f:4015:28b0::3900:1/128;
+    address zco-web-cdn01 2a01:112f:4500:75b0::3900:1/128;
[edit security policies from-zone DMZ to-zone TRUST]
      policy permit-sso { ... }
+     policy permit-web-to-zco-web-prx01 {
+         match {
+             source-address any;
+             destination-address zco-web-prx01;
+             application ports_web;
+         }
+         then {
+             permit;
+         }
+     }
+     policy permit-s3-to-zco-web-cdn01 {
+         match {
+             source-address any;
+             destination-address zco-web-cdn01;
+             application ports_s3;
+         }
+         then {
+             permit;
+         }
+     }
[edit applications]
    application ssh-alt { ... }
+   application s3-api {
+       protocol tcp;
+       destination-port 3900;
+   }
+   application s3-web {
+       protocol tcp;
+       destination-port 3902;
+   }
+   application s3-rpc {
+       protocol tcp;
+       destination-port 3901;
+   }
+   application http-alt {
+       protocol tcp;
+       destination-port 8080;
+   }
[edit applications]
    application-set ports_git { ... }
+   application-set ports_s3 {
+       application s3-api;
+       application s3-web;
+       application s3-rpc;
+       application http-alt;
+       application http;
+       application https;
+   }

And at the other location, the nftables Linux firewall:

NFTables config commit
diff --git a/nftables.conf b/nftables.conf
index 5e18c46..829e938 100755
--- a/nftables.conf
+++ b/nftables.conf
@@ -19,6 +19,7 @@ table inet filter {
        set ports_web { type inet_service; elements = { http, https } }                                         # Ports: Web
        set ports_mgmt { type inet_service; elements = { http, https, 4343, 8006 } }                            # Ports: Management
        set ports_ipa { type inet_service; elements = { http, https, ldap, ldaps, kerberos, kpasswd, domain } } # Ports: FreeIPA
+       set ports_s3 { type inet_service; elements = { 3900, 3901, 3902, http, https, 8080 } }                  # Ports: S3

        chain input {
                type filter hook input priority filter;
@@ -91,6 +92,12 @@ table inet filter {

                        iifname @DMZ oifname @TRUST ip6 daddr 2a01:112f:4500:7510::2222:1 tcp dport ssh accept          comment "DMZ to TRUST zco-git-app01 - ssh/tcp"
                        iifname @DMZ oifname @TRUST ip6 daddr 2a01:112f:4500:7510::2222:1 th dport @ports_web accept    comment "DMZ to TRUST zco-git-app01 - ports_web/any"
+
+                       iifname @DMZ oifname @TRUST ip6 daddr 2a01:115f:4015:28b0::443:1 th dport @ports_web accept     comment "DMZ to TRUST kgm-web-prx01 - ports_web/any"
+                       iifname @DMZ oifname @TRUST ip6 daddr 2a01:115f:4015:28b0::2222:1 th dport @ports_web accept    comment "DMZ to TRUST kgm-git-app01 - ports_web/any"
+                       iifname @DMZ oifname @TRUST ip6 daddr 2a01:115f:4015:28b0::8080:1 tcp dport 8080 accept         comment "DMZ to TRUST kgm-web-api01 - http-alt/any"
+
+                       iifname @DMZ oifname @TRUST ip6 daddr 2a01:115f:4015:28b0::3900:1 tcp dport @ports_s3 accept    comment "DMZ to TRUST kgm-web-cdn01 - ports_s3/tcp"

                # Zone IOT
                        iifname @IOT oifname @IOT accept                comment "Intrazone"

The commits also allowed web ports and the http alternative port 8080 from my reverse proxies for website access from one site to the other, so if one garage node fails, but both sites’ proxies operate, they can still continue.

And why https and 8080?

Secure the S3 traffic

Garage serves the S3 API via HTTP - without the S, on port 3900.
Let’s put a Caddy instance on each of the Garage nodes, giving S3 API traffic TLS encryption. After installing Caddy, this can be done very easily with the TLS provider of your choice, your internal CA (like in my case), or by default - Let's Encrypt (which requires public internet HTTP access to each server, and that’s somewhat bad).

This is my Caddy config. This config also proxies Garage’s web port 3902 behind 8080 in case I’d want some custom behavior or rewrites, and also making the port more standard.

{
        acme_ca https://kgm-ipa-dc01.inf.fibermouse.xyz/acme/directory
        on_demand_tls {
                interval 2m
                burst 10
        }
}

cdn.fibermouse.xyz, cdn1.fibermouse.xyz {
        reverse_proxy localhost:3900 {
                header_up Host {host}
                header_up X-Real-IP {remote}
        }
}

*.cdn.fibermouse.xyz, *.cdn1.fibermouse.xyz {
        tls {
                on_demand
        }
        reverse_proxy localhost:3900 {
                header_up Host {host}
                header_up X-Real-IP {remote}
        }
}

:8080 {
        reverse_proxy localhost:3902 {
                header_up Host {host}
                header_up X-Real-IP {remote}
        }
}

I’ve used my own FreeIPA CA, as can be seen at the top. By default, it refuses to grant wildcard certs via standard ACME. Luckily there is a tls on_demand option, which gets me the TLS cert the moment a specific bucket is requested via its hostname. The header_up statements make sure Garage still sees the hostname including the bucket name - important for this to work.

Now, also make sure the ports you’ll be using are passed through your host-based firewall on each node. As I recently switched from UFW to pure NFTables for host-based stuff, here’s my config for the nodes:

#!/usr/bin/nft -f

flush ruleset

define JUMPHOST = { "SECRET IP", "NOT TELLING YA" }

table inet filter {
        chain input {
                type filter hook input priority filter
                policy drop
                ct state established,related accept

                iifname lo accept                                       comment "Allow loopback"

                meta l4proto icmp accept                                comment "Allow ICMP"
                meta l4proto icmpv6 accept                              comment "Allow ICMPv6"

                ip6 saddr $JUMPHOST tcp dport ssh accept                comment "Allow ssh/tcp from JUMPHOST"

                th dport { http, https, 8080 } accept                   comment "Allow ports_web/any"
                tcp dport { 3900, 3901, 3902 } accept                   comment "Allow ports_s3/tcp"
        }
}

As I won’t be using the management (admin) API on port 3903, I’ve firewalled it away. I recommend you do the same for security, or rebind it to the loopback IP only via the garage.toml config.

Service start

You may now enable and start the systemd service on each of your nodes with:

systemctl enable --now garage.service

To see if all is good, see journalctl -xeu garage for logs and status, and type in garage status to display the state of the soon to be cluster node.

Joining the cluster

Finally, we can connect our nodes together. On one of your nodes (if you already have a cluster and you’re just adding another node to it, run this on a cluster node and not on the disconnected one) type in:

garage node id

This will show you the node ID, and the full command you need to type in on the other node for it to join the cluster. Run it on the other node.
Now, you should see both/all nodes in garage status.

You now need to assign a layout to the cluster; this sets the storage size allocated to each node. Type in:

garage layout assign -z $AVAILABILITY_ZONE -c $SIZE $NODE_ID

Where:

  • $AVAILABILITY_ZONE - assigns a zone tag to the node, which Garage uses to decide where to place replicas, together with the garage.toml replication_factor. I chose my sitecodes of KGM and ZCO as the value, as they’re geographically separate places, perfect for redundancy.
  • $SIZE - storage size you wish to allocate, eg “2T” for 2 TB, “64G” for 64 GB etc. These are decimal units, so 64G shows up as 59.6 GiB in garage status.
  • $NODE_ID - the node ID you wanna apply the layout to (can be the short node ID from garage status)

See the changes with:

garage layout show

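And if all is good, apply it, passing the new layout version number (garage layout show prints the exact command; on a fresh cluster the version is 1):

garage layout apply --version 1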

It’s now all set and ready for use!

# garage status

==== HEALTHY NODES ====
ID                Hostname       Address                             Tags  Zone  Capacity  DataAvail          Version
SECRET-ISH        kgm-web-cdn01  [2a01:115f:4015:28b0::3900:1]:3901  []    KGM   59.6 GiB  64.0 GiB (100.0%)  v2.3.0
SECRET-ISH        zco-web-cdn01  [2a01:112f:4500:75b0::3900:1]:3901  []    ZCO   59.6 GiB  64.0 GiB (100.0%)  v2.3.0
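
The Capacity column above reads 59.6 GiB, not 64. That is consistent with a size of 64G being taken as decimal gigabytes and displayed in binary GiB. I'm assuming here that each node was assigned -c 64G, the example size mentioned earlier. A quick sketch of the arithmetic, with a helper name of my own:

```shell
#!/bin/sh
# gb_to_gib N: convert N decimal gigabytes (10^9 bytes) to GiB (2^30 bytes),
# rounded to one decimal place.
gb_to_gib() {
    awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1000000000 / 1073741824 }'
}

gb_to_gib 64   # prints 59.6
```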

Usage

For the full tutorial, see the official guide at https://garagehq.deuxfleurs.fr/documentation/

Create a bucket

You can create a bucket with

garage bucket create $BUCKET_NAME

Where $BUCKET_NAME is, to your surprise, the name of the bucket.

Web access to it can be enabled with

garage bucket website --allow $BUCKET_NAME -e $ERROR_PAGE

Where $ERROR_PAGE specifies a 404 error page; usually this should be 404.html

To list buckets, type:

garage bucket list

Manage keys and access

To create an access key, type:

garage key create $NAME_OF_THE_KEY

Then, copy it where you have a use for it, eg. a .env for AWS-CLI or wherever else.

To grant it access to a bucket, use:

garage bucket allow --read --write $BUCKET_NAME --key $NAME_OF_THE_KEY

This gives the key R+W access to the bucket; use only --read for read-only, obviously.
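
The bucket and key steps above can be strung together into one helper. This is a sketch using only the commands already shown; the function name and the example names are hypothetical, and 404.html is assumed as the error page.

```shell
#!/bin/sh
# provision_site BUCKET KEY_NAME: create a web-enabled bucket plus a key
# with read/write access to it. Stops at the first command that fails.
provision_site() {
    bucket=$1
    key=$2
    garage bucket create "$bucket" &&
    garage bucket website --allow "$bucket" -e 404.html &&
    garage key create "$key" &&
    garage bucket allow --read --write "$bucket" --key "$key"
}

# Example (made-up names):
#   provision_site my-blog my-blog-deploy
```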

Operations with aws-cli

As GarageHQ is S3 compatible, you can use AWS-CLI to access your storage.

Generate yourself a key, grant it required access, and store the secrets as environment variables - for example for desktop use, you can create a ~/.awsrc file like me:

export AWS_ENDPOINT_URL='https://cdn.fibermouse.xyz'
export AWS_DEFAULT_REGION='europe-central-a'
export AWS_ACCESS_KEY_ID='secret'
export AWS_SECRET_ACCESS_KEY='secret'

Then, if you want to use AWS-CLI, do

source ~/.awsrc

Now you can use AWS-CLI to manage your files

Example
# source ~/.awsrc
# aws s3 ls s3://fibermouse-blog-qa/

                           PRE articles/
                           PRE assets/
                           PRE casts/
                           PRE categories/
                           PRE css/
                           PRE images/
                           PRE js/
                           PRE showcase/
                           PRE tags/
2026-05-06 05:06:24        924 404.html
2026-05-06 05:06:24       7296 android-chrome-192x192.png
2026-05-06 05:06:24      27677 android-chrome-512x512.png
2026-05-06 05:06:24       6521 apple-touch-icon.png
2026-05-06 05:06:24      48952 en.search-data.json
2026-05-06 05:06:24       6968 en.search.min.423a47566acadc2f8b92bbb4ca95334500545268040247c4966cf47d6d0eb1b3.js
2026-05-06 05:06:24        340 favicon-16x16.png
2026-05-06 05:06:24        753 favicon-32x32.png
2026-05-06 05:06:24       3018 favicon-dark.svg
2026-05-06 05:06:24      15406 favicon.ico
2026-05-06 05:06:24       3064 favicon.svg
2026-05-06 05:06:24      34726 index.html
2026-05-06 05:06:24        381 index.xml
2026-05-06 05:06:25        399 site.webmanifest
2026-05-06 05:06:25       1567 sitemap.xml
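
Getting a static site into a bucket like this one can be done with aws s3 sync. Below is a minimal sketch: the function name and the local directory public/ are assumptions of mine, and it expects ~/.awsrc to have been sourced already. Note that --delete removes objects from the bucket that no longer exist locally.

```shell
#!/bin/sh
# deploy_site LOCAL_DIR BUCKET: mirror LOCAL_DIR into the bucket,
# deleting remote objects that are gone from the local copy.
deploy_site() {
    aws s3 sync "$1" "s3://$2/" --delete
}

# Example (hypothetical local directory):
#   source ~/.awsrc
#   deploy_site public/ fibermouse-blog-qa
```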

Usecases

Caddy reverse proxy

I use my GarageHQ cluster to host this website and many others. My reverse proxy is Caddy, thus allowing traffic to my blog bucket looks as follows:

blog.fibermouse.xyz {
        reverse_proxy cdn1.fibermouse.xyz:8080 cdn2.fibermouse.xyz:8080 {
                lb_policy first
                header_up Host fibermouse-blog
        }
}

“fibermouse-blog” is the name of my bucket. Notice the port :8080, as via the per-node Caddy, I use :8080 for static websites. Due to how I set up my DNS, instead of using the generic “cdn.fibermouse.xyz:8080”, I use cdn1 and cdn2 with lb_policy first, ensuring Caddy first reaches out to the Garage node closest to itself.
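
To check what the reverse proxy will get from each node, the same request can be made by hand with curl, setting the Host header to the bucket name just like header_up does. A small sketch, with a function name of my own and assuming plain HTTP on :8080 as configured above:

```shell
#!/bin/sh
# site_status NODE BUCKET: print the HTTP status code a node returns
# for the bucket's website root when asked with Host set to BUCKET.
site_status() {
    curl -sS -o /dev/null -w '%{http_code}\n' -H "Host: $2" "http://$1:8080/"
}

# Example:
#   site_status cdn1.fibermouse.xyz fibermouse-blog
#   site_status cdn2.fibermouse.xyz fibermouse-blog
```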

Caddy S3 plugin - TLS certificate store

I have two Caddy reverse proxies across two of my locations, ensuring HA of my websites, and because of that I’ve run into problems obtaining TLS certs from Let's Encrypt - the DNS records for my websites point to both, so the HTTP-01 challenge could randomly fail, as the Let's Encrypt traffic would go to a different Caddy instance than the one requesting the cert.

This can be solved by having a shared pool of TLS certs for use by both Caddys.
Fortunately enough, there is a simple to use plugin, that can allow Caddy to store its certs in S3. It can be compiled into caddy using xcaddy, as it doesn’t get packaged in by default by the distros.

I’ve solved this by automating the process via a Forgejo workflow, compiling the Caddy binary with all the plugins I use. It’s available together with the compiled binary at https://git.fibermouse.xyz/repository/caddy-custom.

With it, in the Caddy config’s global block, it looks like this:

{
        storage s3 {
                host "cdn.fibermouse.xyz"
                bucket "caddy-tls"
                region "europe-central-a"
                access_key {$S3_ACCESS_KEY}
                secret_key {$S3_SECRET_KEY}
        }
}

The access key and the secret, for security purposes, are stored away as environment variables. These can be added to the systemd service using:

systemctl edit caddy

and including the keys like so:

[Service]
Environment="S3_ACCESS_KEY=youraccesskey"
Environment="S3_SECRET_KEY=yoursecretkey"