Exploring Block Storage Options in the Ceph Ecosystem: RBD, NVMe-oF, and iSCSI
The Ceph ecosystem has gained popularity for its robust and scalable storage solutions. As an open-source storage platform, Ceph provides excellent block storage through protocols such as RBD, NVMe-oF, and iSCSI. We recently added support for NVMe-oF to the croit deployment and management solution for Ceph, and in this blog post we want to take the opportunity to look at these block storage options and highlight their features and use cases.
RBD (RADOS Block Device)
RBD is a reliable and flexible block storage solution within the Ceph ecosystem. It provides block storage by leveraging Ceph's distributed architecture, ensuring high availability and redundancy.
Key Features
- Scalability: RBD is highly scalable, allowing you to increase storage capacity without downtime.
- High Availability: With Ceph’s distributed nature, RBD provides high availability and resilience against hardware failures.
- Snapshots and Clones: RBD supports efficient snapshots and clones, which are useful for backup and testing environments (a short example follows this list).
- Integration: RBD integrates seamlessly with various virtualization platforms like QEMU/KVM, OpenStack and Kubernetes.
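For illustration, taking a protected snapshot of an image and creating a copy-on-write clone from it can look like this on the rbd command line (pool, image, and snapshot names are placeholders):
rbd snap create rbd-pool/vm-disk-01@golden                      # take a point-in-time snapshot
rbd snap protect rbd-pool/vm-disk-01@golden                     # protect the snapshot so it can be cloned
rbd clone rbd-pool/vm-disk-01@golden rbd-pool/vm-disk-01-clone  # create a copy-on-write clone from the snapshot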
Use Cases
- Databases: Suitable for database storage where high availability and data integrity are crucial.
- Virtual Machine Storage: Ideal for storing virtual machine images due to its scalability and performance.
- Native Integration: RBD is a native, easy-to-integrate storage backend for solutions such as Proxmox, OpenStack, Kubernetes, OpenNebula, and many others.
RBD can easily be used within the croit UI. It only requires the creation of a pool and one or more RBD images, as described in the croit documentation.
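For comparison, the equivalent steps on the Ceph command line look roughly like this (the pool name, image name, and size are placeholders):
ceph osd pool create rbd-pool                 # create a replicated pool for RBD
rbd pool init rbd-pool                        # initialize the pool for use with RBD
rbd create rbd-pool/vm-disk-01 --size 100G    # create a 100 GiB image
rbd map rbd-pool/vm-disk-01                   # map the image on a client using the kernel RBD driver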
iSCSI (Internet Small Computer Systems Interface)
iSCSI is a network protocol that allows the transmission of SCSI commands over IP networks. It offers a cost-effective way to provide block-level storage in a networked environment.
Key Features
- Compatibility: iSCSI works over existing IP networks, making it compatible with most network infrastructures.
- Easy Integration: Easily integrates with existing storage systems and can be used to extend storage to servers without direct-attached storage.
Use Cases
- Remote Storage: Enables remote storage access, making it suitable for distributed environments, especially where more modern options are not supported or available (e.g. NVMe/TCP is supported by VMware vSphere Hypervisor (ESXi) 7.0U3 or later [1]); a typical client-side login sequence is sketched below.
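For illustration, attaching a Linux client to an existing iSCSI target with open-iscsi typically looks like this (the portal address and IQN are placeholders):
iscsiadm -m discovery -t sendtargets -p 192.168.1.10                         # discover targets on the portal
iscsiadm -m node -T iqn.2003-01.com.example:target1 -p 192.168.1.10 --login  # log in to the target
lsblk                                                                        # the LUN shows up as a new block device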
The iSCSI gateway has been in maintenance mode since November 2022. This means that it is no longer in active development and will not be updated to add new features. [2]
NVMe-oF (Non-Volatile Memory Express over Fabrics)
NVMe-oF is an advanced protocol that extends the benefits of NVMe over a network fabric. It offers low-latency, high-performance storage solutions by leveraging NVMe’s efficient access methods over network fabrics like Ethernet, Fibre Channel, and InfiniBand.
Key Features
- Flexible Transports: NVMe-oF can use both TCP and RDMA (Remote Direct Memory Access) for data transmission (see the connection example after this list).
  - TCP: Utilizing TCP allows NVMe-oF to run over existing Ethernet networks, making it a cost-effective and easily deployable solution. This method provides a good balance between performance and network compatibility.
  - RDMA: Using RDMA-capable fabrics, such as RoCE (RDMA over Converged Ethernet) or InfiniBand, enables NVMe-oF to achieve maximum performance with extremely low latency and high throughput. RDMA reduces CPU overhead and improves I/O efficiency by allowing direct memory access from the storage device to the host.
- High Performance: Provides low-latency and high-throughput storage access.
- Multipathing: Enables highly available access to scalable storage across multiple paths in a network, extending beyond direct-attached storage limitations.
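As a rough sketch, connecting a Linux initiator to an NVMe/TCP target with nvme-cli looks like this (the target address and subsystem NQN are placeholders):
modprobe nvme-tcp                                                                 # load the NVMe/TCP transport module
nvme discover -t tcp -a 192.168.1.10 -s 4420                                      # discover subsystems offered by the target
nvme connect -t tcp -a 192.168.1.10 -s 4420 -n nqn.2016-06.io.example:subsystem1  # connect to a subsystem
nvme list                                                                         # namespaces appear as /dev/nvmeXnY block devices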
Use Cases
- High-Performance Computing (HPC): Ideal for environments requiring massive data throughput and low latency.
- Data Centers: Enhances the performance of virtualized environments and shared storage resources.
- AI and Machine Learning: Accelerates data access for AI model training and machine learning algorithms.
- Replacement for iSCSI: NVMe-oF offers the same features as iSCSI with higher performance and lower overheads.
NVMe-oF can easily be deployed with the croit management software. You can find instructions on how to deploy NVMe-oF here: Setting up NVMe-oF - croit
Why You Should Use NVMe-oF Instead of iSCSI
- Active Development: iSCSI is no longer being actively developed in Ceph. This means that while iSCSI can still be used, it may not receive the latest features, optimizations, or improvements that are being actively incorporated into NVMe-oF. Choosing NVMe-oF ensures that your storage infrastructure benefits from ongoing advancements and support within the Ceph ecosystem.
- Designed for Fast Storage: NVMe and NVMe-oF were specifically designed for fast storage, unlike iSCSI. NVMe-oF provides significantly better performance, making more efficient use of modern high-speed storage devices. This results in lower latency and higher throughput, which are critical for performance-intensive applications.
- Ecosystem Support: NVMe-oF is being rapidly adopted and supported by a wide range of hardware and software vendors. This broad industry support ensures better compatibility, future-proofing, more options for hardware and software integration, and a more vibrant ecosystem overall (e.g. NVMe/TCP is supported by VMware vSphere Hypervisor (ESXi) 7.0U3 or later [1], and Windows Server vNext will natively support NVMe-oF).
- Improved Data Services: NVMe-oF supports advanced data services such as end-to-end data protection, advanced error recovery, and efficient namespace management, which are not as robust in iSCSI implementations.
- Lower Latency Networking Options: NVMe-oF can utilize low-latency networking options like RoCE (RDMA over Converged Ethernet) and InfiniBand, which provide faster data transfer rates and reduced latency compared to traditional Ethernet networks used by iSCSI.
By considering these factors, it's clear that NVMe-oF offers several advantages over iSCSI, making it the preferred choice for modern, high-performance storage solutions.
Implementation Tips
Implementing RBD, iSCSI, and NVMe-oF in a Ceph environment can be streamlined with a few key best practices:
- Network Optimization: Ensure a robust network configuration to avoid bottlenecks and maximize performance (a short example follows the tips list):
- Use high-speed networks such as 25GbE or better.
- Use multiple active network interfaces for redundancy, either bonded or via multipathing.
- Adjust network settings, such as MTU size and queue depths, to optimize performance for high-throughput and low-latency applications.
- Use dedicated network interfaces for iSCSI and NVMe-oF traffic to isolate it from other network traffic and enhance reliability.
- Proper Pool Configuration: Create multiple pools with different replication levels and performance characteristics to match your workload requirements.
- Regular Monitoring: Use Ceph's monitoring tools to monitor cluster health and performance, making adjustments as needed.
- Regular Backups: Implement a robust backup strategy for Ceph RBDs to ensure data integrity and availability in case of failures.
- Use Multipathing: Enable multipath I/O for redundancy and load balancing in your iSCSI and NVMe-oF targets and clients. Multipathing for NVMe-oF is enabled by default in croit NVMe-oF targets.
- Scalability Planning: Plan for scalability by designing your NVMe-oF deployment with future growth in mind, ensuring that the infrastructure can handle increasing workloads.
- Leverage RDMA-Capable Fabrics: Utilize RDMA-capable network fabrics such as RoCE (RDMA over Converged Ethernet) for NVMe-oF to achieve maximum performance and low latency.
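As a small illustration of the network and multipathing tips above (the interface name is a placeholder):
ip link set dev ens1f0 mtu 9000                   # enable jumbo frames on the dedicated storage interface
ip link show ens1f0                               # the output includes the currently active MTU
cat /sys/module/nvme_core/parameters/multipath    # "Y" means native NVMe multipath is enabled on the client
nvme list-subsys                                  # show connected subsystems and the paths to each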
By following these tips, you can effectively implement and manage RBD, iSCSI, and NVMe-oF in your Ceph environment, ensuring optimal performance, reliability, and scalability.
Important Disclaimer: Hardware Requirements for NVMe-oF
Before implementing NVMe-oF in your storage infrastructure, it is important to ensure that the hardware used for an NVMe-oF gateway is recent enough to support the AVX2 CPU extension. Generally, most server CPUs from the last 10 years support this.
To check if your system's processor supports AVX2, you can use the following command in your terminal:
lscpu | grep avx2
In the output of this command, make sure that avx2 is listed among the CPU flags. If it is not, your hardware will not support the NVMe-oF gateway, and you might consider upgrading your hardware or consulting with a professional to explore your options.
Ensuring proper hardware compatibility is essential for the successful deployment and optimal performance of NVMe-oF in your storage environment.
Sources:
[1] IBM Storage Ceph - Configuring the NVMe-oF initiator for VMware ESXi
[2] Ceph Documentation - Ceph iSCSI Gateway