Software Release v2002

Network Monitoring and Quality of Life Improvements

This is our third release for Ceph Nautilus, packed with new features and user experience enhancements. This release is bigger than usual: our developers made a total of 431 changes for you.

Please read the upgrade notes at the end of this post carefully before upgrading.

New server image

There’s a new server image with the latest Ceph 14.2.7 release available now.

More information on the main dashboard

We now show an IOPS graph and details on the CephFS usage in the main dashboard.

Network Monitoring

Troubleshooting your network? You would probably start by running ping. Wouldn't it be great if that data was always available? We added a background ping process that continuously monitors the network latency between all servers in a selected network.

This feature is still in its infancy; expect more controls and data visualizations in the next release! Note that this feature requires an image with Ceph 14.2.6 or later.
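
Conceptually, such a background monitor samples each peer on a fixed interval and records the round-trip time. The sketch below is hypothetical and not croit's actual implementation: it measures TCP connect time instead of ICMP ping so it can run without raw-socket privileges, but the sampling idea is the same.

```python
# Hypothetical sketch of a latency probe (not croit's implementation).
# Measures TCP connect time rather than ICMP ping, so no root is needed.
import socket
import statistics
import time

def measure_latency_ms(host: str, port: int, samples: int = 5) -> float:
    """Return the median TCP connect latency to host:port in milliseconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=2.0):
            pass  # we only care about how long the handshake took
        results.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(results)

if __name__ == "__main__":
    # Spin up a throwaway local listener so the example is self-contained;
    # in a real cluster the target would be another node's address.
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen()
    port = server.getsockname()[1]
    print(f"median latency: {measure_latency_ms('127.0.0.1', port):.3f} ms")
    server.close()
```

A real monitor would run this in a loop per peer and keep a history of the results for graphing, which is what the built-in feature does for you.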

Give us feedback

Giving us feedback or requesting help is easier than ever. You can now report a suggestion or problem from within our web UI. The report includes a screenshot and the health state of your Ceph cluster so we can quickly identify the problem and help you.

Improved cluster import

Importing an existing cluster is now easier than ever. croit can now detect many Ceph monitor installations that are embedded in an OS installation on existing disks and automatically import these. We’ve also improved importing old ceph-disk OSDs from existing clusters: we automatically adjust permissions if necessary and fix up symbolic links pointing to journal partitions.

Simpler manual configuration backups

Backing up your croit and Ceph configuration was always easy: simply enable our fully encrypted cloud backup and you are good to go. This release brings a simple new API to download an unencrypted backup that can be restored by uploading it in the setup of a new deployment.

More checks for common suboptimal configurations

We’ve added more checks for common bad configurations that we’ve encountered in the wild. For example, croit now checks the configured cache memory usage against the memory that you actually have in your servers. Other checks make sure that you do not accidentally put metadata onto HDDs when SSDs are available and that the right disk types are used.
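The cache memory check boils down to simple arithmetic: the per-OSD cache target times the number of OSDs, plus headroom for the OS, must fit into installed RAM. A minimal sketch of that idea, assuming the real Ceph option osd_memory_target; the numbers and the 4 GiB reserve are made up for illustration:

```python
# Hedged sketch of a memory sanity check (illustrative, not croit's code).
# osd_memory_target is a real Ceph option; reserve and thresholds are
# assumptions chosen for this example.

def check_osd_memory(osd_memory_target: int, num_osds: int,
                     total_ram: int, reserve: int = 4 << 30) -> bool:
    """Return True if the configured OSD caches fit into RAM with headroom."""
    needed = osd_memory_target * num_osds + reserve  # leave room for the OS
    return needed <= total_ram

if __name__ == "__main__":
    # 12 OSDs at the 4 GiB default target on a 32 GiB server: too tight.
    ok = check_osd_memory(4 << 30, 12, 32 << 30)
    print("configuration ok" if ok else "warning: OSD cache may exceed RAM")
```

croit runs checks of this kind automatically against the hardware it detects in your servers.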

Enhanced management of Ceph keys

Permissions on Ceph keys can be rather complex, especially if RADOS namespaces are involved. We’ve revamped our UI to represent complex permissions better.

Custom MTU settings

Added support for networks with a custom MTU (jumbo frames). Note that this is mainly intended for clusters added to existing networks; we do not recommend jumbo frames for greenfield deployments.

CephFS client listings

Added a list of currently connected CephFS clients.

Ceph crash tracking

Recent crashes of Ceph daemons are now shown in the UI; they often indicate an imminent disk failure.

Smaller changes

  • Disk IO requirements of our management node have been reduced
  • Debugging a custom event script? You can now run it directly from the UI on a selected server and get the output
  • Added more event script examples: Monitoring via SNMP and Check_MK
  • Services now report their uptime, detailed state (e.g., active vs. standby MDS), and memory usage.
  • Support for Seagate MACH.2 disks
  • Health checks for rolling reboot have been improved: they were too strict in many scenarios
  • Health checks for rolling reboot of iSCSI services have been fixed; they no longer rely on a fixed timeout
  • Support for Luminous clusters has been improved (for imported clusters)
  • You can now directly upgrade from Luminous to Nautilus
  • Startup order of services after full power outages has been improved
  • S3 explorer got a few minor improvements related to objects
  • Using SSH from within the docker container is easier and safer than ever: we now automatically fill the known_hosts file with the host keys of the servers (our embedded SSH client always did this check)
  • All hosts are now added to /etc/hosts so you can always use hostnames to refer to your servers
  • iSCSI disks are now protected from accidental changes on the RBD level that would confuse the gateway
  • You can now add unmanaged IPMI interfaces that are using custom credentials
  • IPMI support for HP servers has been improved
  • Ceph configuration settings have been tuned for setups involving deletion of a large number of snapshots
  • Fixed a few minor issues when RGW is running in a multi-site configuration
  • IP selection dialogs now show the network type of an IP (e.g. Ceph frontend vs. backend)

Upgrade notes

As always: ensure that you have a working backup (like our encrypted cloud backup) before upgrading the container.

docker rm -f croit
docker run --net=host --restart=always --volumes-from croit-data --name croit -d croit/croit:2002

When upgrading from v1901 or earlier, or if you are not yet running Ceph Nautilus: please refer to the upgrade instructions in the release notes of our previous release.

API Changes

We made two incompatible changes to our API: the disk wipe API was changed and moved to /disks, and the Ceph key management API has been revamped for better handling of complex permissions. See our API docs for details. An OpenAPI specification is also available from your deployment at /api/swagger.json.
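
Because the spec is machine-readable, you can consume it programmatically, for example to diff available endpoints between releases. The sketch below parses a tiny inline stand-in document; the paths and summaries in it are invented placeholders, and against a live deployment you would fetch the JSON from /api/swagger.json instead:

```python
# Sketch: enumerate endpoints from an OpenAPI/Swagger document.
# SPEC is a made-up stand-in; fetch the real spec from your deployment.
import json

SPEC = json.loads("""
{
  "swagger": "2.0",
  "paths": {
    "/disks": {"post": {"summary": "Wipe a disk"}},
    "/ceph/keys": {"get": {"summary": "List Ceph keys"}}
  }
}
""")

def list_endpoints(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings for every operation in the spec."""
    return sorted(
        f"{method.upper()} {path}"
        for path, ops in spec.get("paths", {}).items()
        for method in ops
    )

if __name__ == "__main__":
    for line in list_endpoints(SPEC):
        print(line)
```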