Setting up the Ceph cloud sync module

2021-09-06

Summary:

Since Mimic, Ceph's RGW can be synced to any S3-compatible cloud provider through the cloud sync module.


Requirements

  • Ceph cluster
  • 2 RGW daemons running
  • An S3 target

We'll use three endpoints:

  • http://192.168.112.5:80 (the existing default zone RGW)
  • http://192.168.112.6:80 (the new sync zone RGW)
  • http://192.168.105.5:80 (the S3 target)

Check our existing pools

All RGWs have a zone and a zonegroup; on a stock cluster both are named default.

You can check the pool names:

(docker-croit)@mycephcluster / $ ceph osd lspools
1 device_health_metrics
2 .rgw.root
3 default.rgw.control
4 default.rgw.log
5 default.rgw.meta
6 default.rgw.buckets.non-ec
7 default.rgw.buckets.index
8 default.rgw.buckets.data

Or inspect the zone or zonegroup directly:

(docker-croit)@mycephcluster / $ radosgw-admin zone get --rgw-zone=default
{
    "id": "303a00f5-f50d-43fd-afee-aa0503926952",
    "name": "default",
...
}


(docker-croit)@mycephcluster / $ radosgw-admin zonegroup get --rgw-zonegroup=default
{
    "id": "881cf806-f6d2-47a0-b7dc-d65ee87f8ef4",
    "name": "default",
    "api_name": "default",
    "is_master": "true",
...
    "zones": [
        {
            "id": "303a00f5-f50d-43fd-afee-aa0503926952",
            "name": "default",

Prepare pools

Our new zone will need some pools. We'll create them manually to make sure nothing (e.g. the "too many PGs per OSD" limit) blocks their creation.
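
Before creating them, you can optionally check how much PG headroom the cluster has. This is a sketch to be run on a cluster node; the second command assumes the pg_autoscaler module is available:

```shell
# Current per-OSD PG limit enforced by the monitors.
ceph config get mon mon_max_pg_per_osd

# Per-pool PG usage and autoscaler recommendations.
ceph osd pool autoscale-status
```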

(docker-croit)@mycephcluster / $ for pool in sync.rgw.meta sync.rgw.log sync.rgw.control sync.rgw.buckets.non-ec sync.rgw.buckets.index sync.rgw.buckets.data; do ceph osd pool create $pool 16 16 replicated; done
pool 'sync.rgw.meta' created
pool 'sync.rgw.log' created
pool 'sync.rgw.control' created
pool 'sync.rgw.buckets.non-ec' created
pool 'sync.rgw.buckets.index' created
pool 'sync.rgw.buckets.data' created

Create the new zone

Our new zone will be named sync:

(docker-croit)@mycephcluster / $ radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=sync --endpoints=http://192.168.112.6/ --tier-type=cloud
{
    "id": "7ead9532-0938-4698-9b4a-2d84d0d00869",
    "name": "sync",
    "domain_root": "sync.rgw.meta:root",
    "control_pool": "sync.rgw.control",
    "gc_pool": "sync.rgw.log:gc",
    "lc_pool": "sync.rgw.log:lc",
    "log_pool": "sync.rgw.log",
    "intent_log_pool": "sync.rgw.log:intent",
    "usage_log_pool": "sync.rgw.log:usage",
    "roles_pool": "sync.rgw.meta:roles",
    "reshard_pool": "sync.rgw.log:reshard",
    "user_keys_pool": "sync.rgw.meta:users.keys",
    "user_email_pool": "sync.rgw.meta:users.email",
    "user_swift_pool": "sync.rgw.meta:users.swift",
    "user_uid_pool": "sync.rgw.meta:users.uid",
    "otp_pool": "sync.rgw.otp",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "sync.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "sync.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "sync.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}
  • rgw-zonegroup: our new zone will be part of the default zonegroup.
  • endpoints: our new zone needs its own RGW, so it uses a new endpoint.
  • tier-type: we use the cloud tier type; see the documentation for further settings.
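
To double-check that the zone was created and attached to the zonegroup, you can list the zones. A sketch, to be run on the cluster; the filter assumes jq is installed, as used elsewhere in this post:

```shell
# Both "default" and "sync" should show up here.
radosgw-admin zone list

# Confirm the sync zone is attached to the default zonegroup.
radosgw-admin zonegroup get --rgw-zonegroup=default | jq '.zones[].name'
```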

Modify existing zone

We need to add the endpoint of our existing default zone.

(docker-croit)@mycephcluster / $ radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=default --endpoints=http://192.168.112.5:80
{
    "id": "303a00f5-f50d-43fd-afee-aa0503926952",
    "name": "default",
    "domain_root": "default.rgw.meta:root",
    "control_pool": "default.rgw.control",
    "gc_pool": "default.rgw.log:gc",
    "lc_pool": "default.rgw.log:lc",
    "log_pool": "default.rgw.log",
    "intent_log_pool": "default.rgw.log:intent",
    "usage_log_pool": "default.rgw.log:usage",
    "roles_pool": "default.rgw.meta:roles",
    "reshard_pool": "default.rgw.log:reshard",
    "user_keys_pool": "default.rgw.meta:users.keys",
    "user_email_pool": "default.rgw.meta:users.email",
    "user_swift_pool": "default.rgw.meta:users.swift",
    "user_uid_pool": "default.rgw.meta:users.uid",
    "otp_pool": "default.rgw.otp",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "default.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}

Create a system user

A system user will be used to sync data. If you use croit, this user has to be created through the CLI.

(docker-croit)@mycephcluster / $ radosgw-admin user create --uid=syncuser --display-name=syncuser --system
{
    "user_id": "syncuser",
    "display_name": "syncuser",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "syncuser",
            "access_key": "VGIF31FGOHZ0Q6MQRBQR",
            "secret_key": "1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "system": "true",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

Configure sync zones to use this system user

We'll change both zones to use our new system user.

(docker-croit)@mycephcluster / $ radosgw-admin user info --uid syncuser| jq '.keys'
[
  {
    "user": "syncuser",
    "access_key": "VGIF31FGOHZ0Q6MQRBQR",
    "secret_key": "1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr"
  }
]


(docker-croit)@mycephcluster / $ radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=default --access-key=VGIF31FGOHZ0Q6MQRBQR --secret=1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr
{
    "id": "303a00f5-f50d-43fd-afee-aa0503926952",
    "name": "default",
    "domain_root": "default.rgw.meta:root",
    "control_pool": "default.rgw.control",
    "gc_pool": "default.rgw.log:gc",
    "lc_pool": "default.rgw.log:lc",
    "log_pool": "default.rgw.log",
    "intent_log_pool": "default.rgw.log:intent",
    "usage_log_pool": "default.rgw.log:usage",
    "roles_pool": "default.rgw.meta:roles",
    "reshard_pool": "default.rgw.log:reshard",
    "user_keys_pool": "default.rgw.meta:users.keys",
    "user_email_pool": "default.rgw.meta:users.email",
    "user_swift_pool": "default.rgw.meta:users.swift",
    "user_uid_pool": "default.rgw.meta:users.uid",
    "otp_pool": "default.rgw.otp",
    "system_key": {
        "access_key": "VGIF31FGOHZ0Q6MQRBQR",
        "secret_key": "1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr"
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "default.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}


(docker-croit)@mycephcluster / $ radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=sync --access-key=VGIF31FGOHZ0Q6MQRBQR --secret=1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr
{
    "id": "7ead9532-0938-4698-9b4a-2d84d0d00869",
    "name": "sync",
    "domain_root": "sync.rgw.meta:root",
    "control_pool": "sync.rgw.control",
    "gc_pool": "sync.rgw.log:gc",
    "lc_pool": "sync.rgw.log:lc",
    "log_pool": "sync.rgw.log",
    "intent_log_pool": "sync.rgw.log:intent",
    "usage_log_pool": "sync.rgw.log:usage",
    "roles_pool": "sync.rgw.meta:roles",
    "reshard_pool": "sync.rgw.log:reshard",
    "user_keys_pool": "sync.rgw.meta:users.keys",
    "user_email_pool": "sync.rgw.meta:users.email",
    "user_swift_pool": "sync.rgw.meta:users.swift",
    "user_uid_pool": "sync.rgw.meta:users.uid",
    "otp_pool": "sync.rgw.otp",
    "system_key": {
        "access_key": "VGIF31FGOHZ0Q6MQRBQR",
        "secret_key": "1FwPZH0ICfV1e1zi8okXApJJJEB0XHfiOxe1mmTr"
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "sync.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "sync.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "sync.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "tier_config": {
        "connection": {
            "access_key": "JO4RQ1787A6OGI6XMFDW",
            "endpoint": "http://192.168.105.5:80",
            "secret": "Dx5kKGUUeR0DaSRYueBWhV6oDRvJ9oXH2gPcVJ6s"
        }
    },
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}

Ensure default zone is the master

(docker-croit)@mycephcluster / $ radosgw-admin zonegroup get
{
    "id": "881cf806-f6d2-47a0-b7dc-d65ee87f8ef4",
    "name": "default",
    "api_name": "default",
    "is_master": "true",
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "303a00f5-f50d-43fd-afee-aa0503926952",
    "zones": [
        {
            "id": "303a00f5-f50d-43fd-afee-aa0503926952",
            "name": "default",

If the default zone is not the master, you can force it by executing:
radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=default --master --default

Commit changes and verify configuration

(docker-croit)@mycephcluster / $ radosgw-admin period update --commit
{
    "id": "1861622f-b748-410d-b4a9-7338f4b6842b",
    "epoch": 3,
    "predecessor_uuid": "b6cd42db-6567-4a4b-9433-aee238da0c9d",
    "sync_status": [],
    "period_map": {
        "id": "1861622f-b748-410d-b4a9-7338f4b6842b",
        "zonegroups": [
            {
                "id": "881cf806-f6d2-47a0-b7dc-d65ee87f8ef4",
                "name": "default",
                "api_name": "default",
                "is_master": "true",
                "endpoints": [],
                "hostnames": [],
                "hostnames_s3website": [],
                "master_zone": "303a00f5-f50d-43fd-afee-aa0503926952",
                "zones": [
                    {
                        "id": "303a00f5-f50d-43fd-afee-aa0503926952",
                        "name": "default",
                        "endpoints": [
                            "http://192.168.112.5:80"
                        ],
                        "log_meta": "false",
                        "log_data": "true",
                        "bucket_index_max_shards": 11,
                        "read_only": "false",
                        "tier_type": "",
                        "sync_from_all": "true",
                        "sync_from": [],
                        "redirect_zone": ""
                    },
                    {
                        "id": "7ead9532-0938-4698-9b4a-2d84d0d00869",
                        "name": "sync",
                        "endpoints": [
                            "http://192.168.112.6/"
                        ],
                        "log_meta": "false",
                        "log_data": "true",
                        "bucket_index_max_shards": 11,
                        "read_only": "false",
                        "tier_type": "cloud",
                        "sync_from_all": "true",
                        "sync_from": [],
                        "redirect_zone": ""
                    }
                ],
                "placement_targets": [
                    {
                        "name": "default-placement",
                        "tags": [],
                        "storage_classes": [
                            "STANDARD"
                        ]
                    }
                ],
                "default_placement": "default-placement",
                "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109",
                "sync_policy": {
                    "groups": []
                }
            }
        ],
        "short_zone_ids": [
            {
                "key": "303a00f5-f50d-43fd-afee-aa0503926952",
                "val": 2796720163
            },
            {
                "key": "7ead9532-0938-4698-9b4a-2d84d0d00869",
                "val": 2175446857
            }
        ]
    },
    "master_zonegroup": "881cf806-f6d2-47a0-b7dc-d65ee87f8ef4",
    "master_zone": "303a00f5-f50d-43fd-afee-aa0503926952",
    "period_config": {
        "bucket_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "user_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        }
    },
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109",
    "realm_name": "default",
    "realm_epoch": 2
}

Configure the new zone

Our cloud sync module needs some configuration.

We'll define the endpoint and the S3 user credentials that will be used to sync data. Take care: if your key starts with a 0, you will be unable to configure it correctly. For example, the access key 05XXXXXXXX is stored mangled, without the leading 0:

(docker-croit)@mycephcluster / $ radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=sync --tier-config=connection.endpoint=http://192.168.105.5:80,connection.access_key=05XXXXXXXX,connection.secret=56NwS1p7krU0IMYaXXXXXXXXXXXXX
(docker-croit)@mycephcluster / $ radosgw-admin zone get --rgw-zone=sync | jq '.tier_config'
{
  "connection": {
    "access_key": 5,
    "endpoint": "http://192.168.105.5:80",
    "secret": 56NwS1p7krU0IMYaXXXXXXXXXXXXX
  }
}


(docker-croit)@mycephcluster / $ radosgw-admin zone modify --rgw-zonegroup=default --rgw-zone=sync --tier-config=connection.endpoint=http://192.168.105.5:80,connection.access_key=JO4RQ1787A6OGI6XMFDW,connection.secret=Dx5kKGUUeR0DaSRYueBWhV6oDRvJ9oXH2gPcVJ6s
{
    "id": "7ead9532-0938-4698-9b4a-2d84d0d00869",
    "name": "sync",
    "domain_root": "sync.rgw.meta:root",
    "control_pool": "sync.rgw.control",
    "gc_pool": "sync.rgw.log:gc",
    "lc_pool": "sync.rgw.log:lc",
    "log_pool": "sync.rgw.log",
    "intent_log_pool": "sync.rgw.log:intent",
    "usage_log_pool": "sync.rgw.log:usage",
    "roles_pool": "sync.rgw.meta:roles",
    "reshard_pool": "sync.rgw.log:reshard",
    "user_keys_pool": "sync.rgw.meta:users.keys",
    "user_email_pool": "sync.rgw.meta:users.email",
    "user_swift_pool": "sync.rgw.meta:users.swift",
    "user_uid_pool": "sync.rgw.meta:users.uid",
    "otp_pool": "sync.rgw.otp",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "sync.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "sync.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "sync.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "tier_config": {
        "connection": {
            "access_key": "JO4RQ1787A6OGI6XMFDW",
            "endpoint": "http://192.168.105.5:80",
            "secret": "Dx5kKGUUeR0DaSRYueBWhV6oDRvJ9oXH2gPcVJ6s"
        }
    },
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}

Check that the config has been properly applied.

(docker-croit)@mycephcluster / $ radosgw-admin zone get --rgw-zone=sync | jq '.tier_config'
{
  "connection": {
    "access_key": "JO4RQ1787A6OGI6XMFDW",
    "endpoint": "http://192.168.105.5:80",
    "secret": "Dx5kKGUUeR0DaSRYueBWhV6oDRvJ9oXH2gPcVJ6s"
  }
}

Commit changes

After changing the tier configuration, commit the period again with radosgw-admin period update --commit, then check the final zone configuration:

(docker-croit)@mycephcluster / $ radosgw-admin zone get --rgw-zone=sync
{
    "id": "7ead9532-0938-4698-9b4a-2d84d0d00869",
    "name": "sync",
    "domain_root": "sync.rgw.meta:root",
    "control_pool": "sync.rgw.control",
    "gc_pool": "sync.rgw.log:gc",
    "lc_pool": "sync.rgw.log:lc",
    "log_pool": "sync.rgw.log",
    "intent_log_pool": "sync.rgw.log:intent",
    "usage_log_pool": "sync.rgw.log:usage",
    "roles_pool": "sync.rgw.meta:roles",
    "reshard_pool": "sync.rgw.log:reshard",
    "user_keys_pool": "sync.rgw.meta:users.keys",
    "user_email_pool": "sync.rgw.meta:users.email",
    "user_swift_pool": "sync.rgw.meta:users.swift",
    "user_uid_pool": "sync.rgw.meta:users.uid",
    "otp_pool": "sync.rgw.otp",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "sync.rgw.buckets.index",
                "storage_classes": {
                    "STANDARD": {
                        "data_pool": "sync.rgw.buckets.data"
                    }
                },
                "data_extra_pool": "sync.rgw.buckets.non-ec",
                "index_type": 0
            }
        }
    ],
    "tier_config": {
        "connection": {
            "access_key": "JO4RQ1787A6OGI6XMFDW",
            "endpoint": "http://192.168.105.5:80",
            "secret": "Dx5kKGUUeR0DaSRYueBWhV6oDRvJ9oXH2gPcVJ6s"
        }
    },
    "realm_id": "46669d35-f7ed-4374-8247-2b8f41218109"
}

Configuring RGW

We need to modify each radosgw configuration so that each daemon manages the right zone: remove the existing rgw zone = default entry and add one section per daemon:

[client.rgw.$(hostname)]
host = $(hostname)
rgw zone = default

[client.rgw.$(hostname)]
host = $(hostname)
rgw zone = sync

On our infrastructure, we edit /etc/ceph/ceph.conf by adding:

[client.rgw.new-croit-host-C0DE01]
host = new-croit-host-C0DE01
rgw zone = default
[client.rgw.new-croit-host-C0DE02]
host = new-croit-host-C0DE02
rgw zone = sync

If you use croit, you can simply replace the ceph.conf template with this content:

[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
auth supported = cephx
mon client hunt interval = {{huntInterval}}
mon client hunt parallel = {{huntParallel}}
fsid = {{fsid}}
mon host = {{commaSeparatedList mons}}

{{~#if full}}
{{~#unless server.managementHost}}
crush location = host={{server.hostname}}
{{~/unless}}

{{~#if publicNets}}
public network = {{publicNets}}
{{~/if}}

{{~#if privateNets}}
cluster network = {{privateNets}}
{{~/if}}
log file = /dev/null
mon cluster log file = /dev/null
mon cluster log to syslog = true
log to syslog = true
err to syslog = true

{{~#replaceAll "rgw zone = default" ""~}}
{{~#options}}
{{key}} = {{value}}
{{~/options}}
{{~/replaceAll}}

[client.rgw.new-croit-host-C0DE01]
host = new-croit-host-C0DE01
rgw zone = default
[client.rgw.new-croit-host-C0DE02]
host = new-croit-host-C0DE02
rgw zone = sync
{{~/if}}

To apply the changes, you must restart the RGWs.

root@new-croit-host-C0DE01 ~ $ systemctl restart ceph-radosgw@rgw.new-croit-host-C0DE01.service
root@new-croit-host-C0DE02 ~ $ systemctl restart ceph-radosgw@rgw.new-croit-host-C0DE02.service
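
Once restarted, each RGW should answer on its endpoint. A quick smoke test with curl, sketched with the endpoints from this setup:

```shell
# An anonymous request returns an S3 XML document (a listing or an
# AccessDenied error), which is enough to prove the daemon is up.
curl -s http://192.168.112.5:80 | head -c 200; echo
curl -s http://192.168.112.6:80 | head -c 200; echo
```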

Test S3 sync

We'll use s3cmd for testing.

Generate users on both source and target, write an s3cmd configuration file for each, and check that API access works:
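
A minimal s3cmd configuration for the source side might look like this sketch; the endpoint is the default zone RGW from this setup, and the keys are placeholders for your test user, not values from this post:

```ini
# .s3cfg_source (sketch): point s3cmd at the default zone RGW.
[default]
access_key = <your-test-user-access-key>
secret_key = <your-test-user-secret-key>
host_base = 192.168.112.5:80
host_bucket = 192.168.112.5:80/%(bucket)s
use_https = False
```

The target side configuration is identical apart from the endpoint and the keys.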

~ s3cmd -c .s3cfg_source ls
~ s3cmd -c .s3cfg_target ls
~

If you get an API error, check your credentials and endpoints.

Create a bucket on source and add objects

We first create a bucket:

~ s3cmd -c .s3cfg_source mb s3://mystetbucket
Bucket 's3://mystetbucket/' created
~ s3cmd -c .s3cfg_source ls
2021-06-24 15:06  s3://mystetbucket

Now we add an object on the source:

~ s3cmd -c .s3cfg_source put /tmp/myobject s3://mystetbucket/synctest
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: '/tmp/myobject' -> 's3://mystetbucket/synctest'  [1 of 1]
 13 of 13   100% in    0s   325.90 B/s  done

And check if it's synced to the target:

~ s3cmd -c .s3cfg_target ls s3://
2021-06-24 15:30  s3://rgw-default-271b93c16a9565d8
~ s3cmd -c .s3cfg_target ls s3://rgw-default-271b93c16a9565d8
                          DIR  s3://rgw-default-271b93c16a9565d8/mystetbucket/
~ s3cmd -c .s3cfg_target ls s3://rgw-default-271b93c16a9565d8/mystetbucket/
2021-06-24 15:36           13  s3://rgw-default-271b93c16a9565d8/mystetbucket/synctest

Tips and tricks

At any time, you can increase the RGW log verbosity to make debugging easier:

root@new-croit-host-C0DE02 ~ $ ceph --admin-daemon /var/run/ceph/ceph-client.rgw.new-croit-host-C0DE02.96866.94534872347832.asok config set debug_rgw_sync 5
root@new-croit-host-C0DE02 ~ $ ceph --admin-daemon /var/run/ceph/ceph-client.rgw.new-croit-host-C0DE02.96866.94534872347832.asok config set debug_rgw 5

To check syncing status:

(docker-croit)@mycephcluster / $ radosgw-admin sync status --rgw-zone=sync
          realm 46669d35-f7ed-4374-8247-2b8f41218109 (default)
      zonegroup 881cf806-f6d2-47a0-b7dc-d65ee87f8ef4 (default)
           zone 7ead9532-0938-4698-9b4a-2d84d0d00869 (sync)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 303a00f5-f50d-43fd-afee-aa0503926952 (default)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source
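
When scripting around a migration, a small wait loop (a sketch; the zone name matches this setup) can block until the data sync has caught up:

```shell
# Poll every 30 seconds until the status output reports that data
# is caught up with the source zone.
until radosgw-admin sync status --rgw-zone=sync 2>/dev/null \
    | grep -q "data is caught up with source"; do
  echo "sync still in progress..."
  sleep 30
done
echo "sync zone caught up"
```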
