Virtualization is no longer a bleeding-edge technology; it has become the standard for deploying new infrastructure. To support many of the features and efficiencies that virtualization promises, shared storage must be used. Storage can be virtualized as well, and it offers some additional features
for optimization and availability. This paper is not intended to be a formal "Best Practices" document but rather an informal design guide to what I have found works best in general for deploying virtual infrastructure connected to shared storage using iSCSI.
What is iSCSI?
Shared storage can be connected using many different protocols and interfaces. I will not compare and contrast these options but rather will focus on using iSCSI over gigabit IP infrastructure. It is debatable whether iSCSI is the best-performing option today, but it is generally accepted as the most cost-efficient to deploy and maintain.
iSCSI is a block-level storage protocol that encapsulates SCSI commands (the same commands used in direct-attached storage) in IP packets. It runs on regular IP network gear and does not require anything special, although it is best deployed on gigabit infrastructure. It was initially developed in 1998 and standardized in 2004.
This makes it an excellent choice for smaller virtual deployments of under 100 VMs. I personally have deployed infrastructure with 50 VMs and plenty of room for growth. Of course, everything is relative and depends on the virtualized workload and the I/O per second (IOPS) the SAN can support. Again, this is not a sizing document, but it should provide some assurance that iSCSI is an extremely viable solution in such environments and should not be considered "bleeding edge" or "1.x" technology.
Bandwidth
The first thing we think about when considering iSCSI is bandwidth. Our experience in the server world makes us wonder: if a physical server has a gigabit interface to direct-attached storage, how can a virtual host support 20 VMs (a typical number for a single host) over a single iSCSI interface? The key is understanding how much I/O your servers really require. A typical figure of 30 IOPS per VM is good enough for budgeting. Multiplying 20 VMs by 30 IOPS gives 600 IOPS.
Now the trick is to turn that number into bandwidth. The formula is:
IOPS * TransferSizeInBytes = BytesPerSec
Now, assuming the TransferSizeInBytes (block size) is 8k, the above example yields 4,800,000 bytes per second, which is roughly 4.8 MB per second. In theory, a gigabit link supports 125 MBps. So, as you can see in this example, a single gigabit interface is more than sufficient. The moral of the story is that the network is usually not the bottleneck at this scale.
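Here is that math as a short Python sketch. The VM count, per-VM IOPS, and 8k transfer size are the same budgeting assumptions used above, so treat the output as a rough estimate rather than a measurement.

```python
# Rough sizing sketch: translate a per-VM IOPS budget into iSCSI bandwidth.
# The VM count, IOPS per VM, and transfer size are the budgeting assumptions
# from the text, not measurements from a real environment.

VMS_PER_HOST = 20                    # typical VM count for a single host
IOPS_PER_VM = 30                     # budgeting figure per VM
TRANSFER_SIZE_BYTES = 8_000          # "8k" block size, as in the example above
GIGABIT_BYTES_PER_SEC = 125_000_000  # theoretical maximum of a 1 Gb link

total_iops = VMS_PER_HOST * IOPS_PER_VM               # 600 IOPS
bytes_per_sec = total_iops * TRANSFER_SIZE_BYTES      # 4,800,000 bytes per second
utilization = bytes_per_sec / GIGABIT_BYTES_PER_SEC   # a few percent of one link

print(f"{total_iops} IOPS -> {bytes_per_sec / 1_000_000:.1f} MB/s "
      f"({utilization:.1%} of a single gigabit interface)")
```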
Something to keep in mind here is that your virtual environment will most likely have multiple hosts. Therefore, I like the idea of having aggregate bandwidth on the SAN side to support potential spikes in I/O. For example, four virtual hosts with 1 Gb iSCSI interfaces should connect to a SAN that has multiple gigabit interfaces bonded together to provide 2 Gb or more. This will depend on your SAN and how many interfaces it supports, but it is very common for SANs today to ship with at least two interfaces and the option of adding more.
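To make the aggregate idea concrete, here is a quick back-of-the-envelope check. The per-host peak figure and link counts below are purely illustrative assumptions; plug in your own numbers.

```python
# Back-of-the-envelope check: can the SAN's bonded links absorb simultaneous
# I/O spikes from several hosts? The host count, per-host peak, and bonded
# link count are illustrative assumptions, not recommendations for any SAN.

HOSTS = 4                      # virtual hosts, each with a 1 Gb iSCSI interface
PEAK_MB_PER_SEC_PER_HOST = 40  # assumed worst-case burst per host
SAN_BONDED_LINKS = 2           # gigabit interfaces aggregated on the SAN
GIGABIT_MB_PER_SEC = 125       # theoretical maximum per gigabit link

aggregate_peak = HOSTS * PEAK_MB_PER_SEC_PER_HOST      # 160 MB/s of bursty demand
san_capacity = SAN_BONDED_LINKS * GIGABIT_MB_PER_SEC   # 250 MB/s of bonded capacity

print(f"Peak host demand: {aggregate_peak} MB/s, SAN capacity: {san_capacity} MB/s -> "
      f"{'OK' if san_capacity >= aggregate_peak else 'add SAN interfaces'}")
```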
Keep in mind that your switches must be configured to support this aggregation as well. You can use either EtherChannel or LACP (also known as link aggregation) to set that up. It goes without saying that a good switch is recommended; personally, I think Cisco is the best choice, but there are other options that meet the requirements, such as HP ProCurve. I recommend a Layer 3 switch, but it is not required.
Network Optimizations
Although I mentioned above that the network is not usually the bottleneck, that is not to say we should not take steps to optimize it for storage. The next two items are generally considered best practices; however, it always surprises me how many deployments miss them.
VLANs
Storage traffic should be segmented onto a separate VLAN. This addresses two primary problems.
- Security – nobody has any business on the storage VLAN except the storage devices and administrators. There is no benefit to users and storage sharing the same VLAN.
- Broadcast traffic is limited to the devices on the storage VLAN. iSCSI is an Ethernet technology, and when an Ethernet device broadcasts, all devices on that segment must listen. Putting storage interfaces on a separate VLAN mitigates this overhead.
Jumbo Frames
Ethernet frames are 1500 bytes by default. TCP/IP involves both transmit and reply packets, and increasing the frame size reduces the number of packets that must be put on the wire for a given amount of data. These larger frames are called jumbo frames and are 9000 bytes. Storage traffic can particularly benefit from this. I have found that file transfer speeds increased by as much as 50% using jumbo frames, although I have not done specific testing to measure the effect on iSCSI traffic.
Of note here is that every point in the network path needs to support jumbo frames; otherwise, the frame size will fall back to the lowest common denominator. In a virtual infrastructure, this means configuring your SAN, switches, and virtual hosts to support jumbo frames.
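The sketch below shows why this matters: fewer, larger frames mean less per-frame protocol overhead for the same amount of data. The header sizes are approximations, so treat this as an illustration rather than a precise wire-level model.

```python
# Compare how many frames (and how much header overhead) a transfer needs at
# a standard 1500-byte MTU versus a 9000-byte jumbo MTU. Header sizes are
# approximate; this is an illustration, not a precise wire-level model.

TRANSFER_BYTES = 100 * 1_000_000   # hypothetical 100 MB transfer
TCP_IP_HEADERS = 40                # ~20 bytes IP + ~20 bytes TCP per frame
ETHERNET_OVERHEAD = 38             # header, FCS, preamble, and inter-frame gap

for mtu in (1500, 9000):
    payload = mtu - TCP_IP_HEADERS          # application data carried per frame
    frames = TRANSFER_BYTES / payload       # frames needed for the transfer
    overhead = frames * (TCP_IP_HEADERS + ETHERNET_OVERHEAD)
    print(f"MTU {mtu}: ~{frames:,.0f} frames, "
          f"~{overhead / 1_000_000:.1f} MB of protocol overhead")
```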
Also of note: in a VMware infrastructure, you will need to enable jumbo frame support at the CLI unless you are running the Enterprise Plus edition of vSphere, in which case you can configure it using the vSphere Distributed Switch.
Impact of iSCSI HBAs
iSCSI requires some CPU overhead on the virtual host side, and there are purpose-built host bus adapters (HBAs) that offload the iSCSI work onto the card. But the CPU overhead is generally very low at about 3%, and with today's processors this is not a problem. Additionally, VMware's iSCSI support in 4.x has been rewritten and improved, yielding better performance than 3.5 or earlier. I have not seen this be a problem, but there is nothing wrong with using an iSCSI HBA either.
SANs, Arrays & Disks
Everyone reading this is familiar with RAID arrays and the performance expectations of different RAID levels. iSCSI and SANs do not change these laws. In general, a SAN does not instantly make a four-disk RAID 5 array perform magically better than it did as DAS. In my testing with Iometer, I have found the results to be consistent with what would be expected from the underlying array configuration. Different storage vendors may debate that, but let's keep it an even playing field for the moment and assume the SAN serves up the LUN as-is.
Generally, you will create a "datastore" made up of many more disks, spreading the I/O across a larger array and therefore delivering better performance than you could with multiple smaller arrays. This is another benefit of using shared storage and a SAN.
Therefore, you will want to build an array from as many physical disks as possible and choose a disk type that delivers enough I/O to support your workload. A 15k SAS drive delivers about 160 IOPS, while a 7.2k SATA drive delivers about 80. You will choose your RAID level by balancing redundancy against performance. I usually use RAID 10 for SATA and RAID 10 or 50 for SAS, depending on budget and requirements. Overall, I prefer RAID 10 for its performance and its low impact on rebuild times when drives fail.
Check your SAN documentation to see how much flexibility you have in configuring your storage, as some SANs will only let you configure a single storage aggregate and RAID level. You can play with various array configurations using this online calculator to see how spindle count and RAID level affect IOPS.
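If you prefer to see the math directly, here is a rough Python version of what such a calculator does. The per-disk IOPS figures come from above; the RAID write penalties and the 70/30 read/write mix are common rule-of-thumb assumptions, not vendor-specific numbers.

```python
# Rough spindle/RAID sizing math. Per-disk IOPS figures are from the text
# above; write penalties are commonly cited rule-of-thumb values, and the
# 70/30 read/write mix is an assumption chosen purely for illustration.

DISK_IOPS = {"15k SAS": 160, "7.2k SATA": 80}
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 50": 4}

def usable_iops(disk_type, raid_level, spindles, read_ratio=0.70):
    """Estimate the front-end IOPS an array can sustain for a given read/write mix."""
    raw = DISK_IOPS[disk_type] * spindles
    write_ratio = 1 - read_ratio
    # Each front-end write costs WRITE_PENALTY back-end I/Os; each read costs one.
    return raw / (read_ratio + write_ratio * WRITE_PENALTY[raid_level])

print(f"8 x 15k SAS, RAID 10:  ~{usable_iops('15k SAS', 'RAID 10', 8):.0f} IOPS")
print(f"8 x 7.2k SATA, RAID 5: ~{usable_iops('7.2k SATA', 'RAID 5', 8):.0f} IOPS")
```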
Summary
Deploying iSCSI is deceptively easy. I have found that many people run into issues when they do not fully understand all the pieces that come into play. The following items should be added to your iSCSI deployment checklist; doing so will help you avoid many of the most common issues and get the most out of your deployment.
- Configure a storage VLAN.
- Enable jumbo frames on all switches carrying the storage VLAN, on the SANs, and on the VMware or other virtual hosts.
- Implement link aggregation on the SAN for better throughput and availability.
- Understand the limits of your SAN in both capacity and IOPS.