Friday, September 18, 2009
RAID vs SSD
ZFS (and other advanced filesystems) will now do partial reconstruction of a failed drive (that is, they don't have to bit-copy the entire drive, only the parts that are actually in use), which helps. But there are still problems. ZFS's pathological case results in rebuild times of 2-3 WEEKS for a 1TB drive in a RAID-Z (similar to RAID-5). It's all due to the horribly small throughput, low maximum IOPS, and high latency of the hard drive.
SSDs, on the other hand, are nowhere near the problem. They've got considerably more throughput than a hard drive and, more importantly, THOUSANDS of times better IOPS. Frankly, more than any other reason, I expect the enormous IOPS advantage of the SSD to sound the death knell of HDs in the next decade. By 2020, expect HDs to be gone from everything, even in places where HDs still have better GB/$. The rebuild rates and maintenance of HDs simply can't compete with flash.
Note: IOPS = I/O operations per second, or the number of read/write operations (regardless of size) that a disk can service. HDs top out around 350, consumer SSDs do just under 10,000, and high-end SSDs can do up to 100,000.
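To see where a weeks-long rebuild can come from, here's a back-of-envelope sketch of a resilver that is bound by random I/O rather than sequential throughput. The 8KB average record size and the 100 IOPS "seek-bound" figure are assumptions for illustration; the 350 and 10,000 IOPS figures come from the note above.

```python
# Back-of-envelope estimate: rebuild time when the resilver is bound by
# random IOPS rather than sequential throughput. All numbers are
# illustrative assumptions, not benchmarks.

def rebuild_hours(used_bytes, avg_record_bytes, effective_iops):
    """Hours to touch every used record at a given random-I/O rate."""
    records = used_bytes / avg_record_bytes
    return records / effective_iops / 3600

USED = 1e12          # ~1TB of used data on the failed drive
RECORD = 8 * 1024    # assumed average record size of 8KB

for name, iops in [("HD (~350 IOPS, best case)", 350),
                   ("HD (~100 IOPS, seek-bound)", 100),
                   ("consumer SSD (~10,000 IOPS)", 10_000)]:
    print(f"{name}: {rebuild_hours(USED, RECORD, iops):,.1f} hours")

# The HD best case comes out around 4 days; a seek-bound HD lands in the
# ~2 week range described above, while the SSD finishes in a few hours.
```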
Wednesday, July 02, 2008
Welcome to the Aperi Blog
The Aperi project announces the first release of the Storage Network Simulator. The simulator is a tool that enables you to simulate a storage area network (SAN) through software. You can create a SAN configuration, add devices to the SAN, create arbitrary connections between devices, and remove connections between devices. Using this tool to create a simulated SAN environment can help when you:
- Have limited or no access to hardware and software when developing and testing SRM applications
- Need to perform "what-if" analysis before you plan to extend or reconfigure your SAN
- Have "off-line" access to SAN devices without impacting the performance of the real network (such as the SNIA lab or any SAN in the world)
The SAN Simulator provides an increase in productivity and efficiency for Aperi development and testing by removing the dependence on device availability.
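For readers who haven't worked with a software SAN model before, here's a minimal sketch of the idea: a configuration is just devices plus connections you can add and remove, and "what-if" analysis means running queries against that model instead of real hardware. The class and method names below are hypothetical, not the Aperi simulator's actual API.

```python
# Minimal sketch of a software SAN model: devices plus arbitrary
# connections you can add and remove. The class and method names are
# hypothetical; this is not the Aperi simulator's API.

class SimulatedSAN:
    def __init__(self):
        self.devices = {}          # name -> device type
        self.links = set()         # unordered pairs of device names

    def add_device(self, name, kind):
        """Add a host, switch, or storage subsystem to the configuration."""
        self.devices[name] = kind

    def connect(self, a, b):
        self.links.add(frozenset((a, b)))

    def disconnect(self, a, b):
        self.links.discard(frozenset((a, b)))

    def neighbors(self, name):
        return [d for link in self.links if name in link
                for d in link if d != name]

# Build a tiny simulated fabric and "fail" a link without touching
# any real hardware.
san = SimulatedSAN()
san.add_device("host1", "host")
san.add_device("switch1", "fc-switch")
san.add_device("array1", "subsystem")
san.connect("host1", "switch1")
san.connect("switch1", "array1")
san.disconnect("switch1", "array1")   # what-if: lose a path
print(san.neighbors("switch1"))       # -> ['host1']
```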
Tuesday, July 01, 2008
Storage Management - Green IT not green
Note that the term “under storage” is substituted for “under management.” Truth be told, data management in distributed computing environments is extraordinarily lax. The best analogy for distributed storage is a huge and growing junk drawer. This point is underscored by data collated by Sun Microsystems after performing nearly 10,000 storage assessments at client facilities. Per Sun’s statistics, for every hard disk deployed by a company, roughly 30 percent of its capacity contains useful data accessed regularly as part of day-to-day operations. Another 40 percent must be retained for reasons of historical value, regulatory or legal compliance, or because it is intellectual property. Rarely referenced, this data belongs in an archive, preferably tape or optical, because those media consume far fewer kilowatt-hours than disk-based systems do. Simple arithmetic leaves roughly 30 percent of capacity for the junk in the drawer.
This point is never brought up in the articles you read in the trades. Instead, vendors posit a number of hardware and software value-add solutions as silver bullets for Green IT. Virtualization, de-duplication, compression, re-driving arrays with larger disk drives, leveraging MAID (massive arrays of idle disks, a portion of which spin down when not in use), and thin provisioning are just a few of the green panaceas being discussed. Most involve plugging additional hardware into the wall, which is hardly an intelligent way to reduce power consumption.
All of these techniques deliver tactical value at best: unmanaged data will continue to grow over time and eliminate whatever short-term power reductions the new technologies deliver. They are simply re-arranging deck chairs on the Titanic. Getting to green in IT ultimately and strategically comes down to managing data better. It costs a company virtually nothing to sort out its data junk drawer, to apply processes for classifying data so that it can be migrated over time into an archive, and to deploy storage resource management tools to spot wasted space, ownerless files, and junk data in its repositories.
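As a simplistic example of the kind of classification pass described above, the sketch below walks a directory tree and totals up capacity that hasn't been read recently and is therefore an archive candidate. The /data path and the 180-day threshold are arbitrary assumptions, not recommendations.

```python
# Minimal sketch of a data-classification pass: flag capacity that has
# not been read in a while as an archive candidate. Path and threshold
# are arbitrary assumptions.
import os
import time

ARCHIVE_AFTER_DAYS = 180

def classify(root):
    cutoff = time.time() - ARCHIVE_AFTER_DAYS * 86400
    active_bytes = archive_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # broken symlink, permission problem, etc.
            if st.st_atime < cutoff:
                archive_bytes += st.st_size
            else:
                active_bytes += st.st_size
    return active_bytes, archive_bytes

if __name__ == "__main__":
    active, stale = classify("/data")   # hypothetical mount point
    total = active + stale or 1
    print(f"archive candidates: {stale / total:.0%} of scanned capacity")
```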
Thursday, December 06, 2007
Storage - SAN, HBA, iSCSI, TOE, VM
- Centralized Storage and the Impact on VMware TCO
- 11 Reasons to Choose Qlogic iSCSI HBA’s over Software
- Configuring iSCSI in a VMware ESX Server 3 Environment
- SAN vs DASD - Cheap SAN gear
Tuesday, November 06, 2007
Best Server Storage Setup?
Secondly, pick some disks. Price out the various available drives and compare their $/GB rates. There will be a sweet spot where you get the best ratio, probably around the 400GB or 500GB size these days.
Even though the 750GB Seagates appear to provide less bang-for-buck than smaller drives (400GB, 300GB), the higher data storage density pays off in a big way. Cramming more data into a single box means amortizing the power/heat cost of the non-disk components better, and also allows better utilization of your floor space (which is going to become very important if you really are looking to scale this into the multi-petabyte range).
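To make the amortization argument concrete, here's a quick sketch comparing raw $/GB against effective $/GB once a fixed per-box cost is spread across the bays it fills. The drive prices and the overhead figure are made-up placeholders, not real quotes.

```python
# Compare drives on raw $/GB and on effective $/GB once the fixed cost
# of the box (chassis, PSU, board, rack space) is amortized across the
# bays it fills. Prices and the per-box overhead are made-up
# placeholders for illustration, not real quotes.

DRIVES_PER_BOX = 12
BOX_OVERHEAD = 1500.0          # assumed non-disk cost per box, in dollars

drives = {                     # capacity_gb: assumed street price
    300: 70.0,
    400: 90.0,
    500: 110.0,
    750: 220.0,
}

for gb, price in sorted(drives.items()):
    raw = price / gb
    per_box_gb = gb * DRIVES_PER_BOX
    effective = (price * DRIVES_PER_BOX + BOX_OVERHEAD) / per_box_gb
    print(f"{gb}GB: raw ${raw:.3f}/GB, with box overhead ${effective:.3f}/GB")

# With these placeholder numbers the raw $/GB sweet spot sits at 400-500GB,
# but the per-box overhead shrinks per GB as the drives get bigger, which
# is the density argument made above.
```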
Tuesday, July 24, 2007
Redhat GFS
The difference is how it tries to solve the problem. NFS works over IP and accesses files at the inode level. This requires the server system or device to be running RPC and the NFS protocol. Most network filesystems work in a similar way: you have servers, and clients accessing the servers via some protocol.
Now imagine a filesystem designed for servers that allows them to access the filesystem at a block level directly via a shared bus. Let's say a parallel SCSI bus (or any bus that allows more than one host, e.g. iSCSI, Fibre Channel, FireWire). Imagine how fast it would be to access a shared disk over Fibre Channel! The problem is that if two servers mount the filesystem at the same time it would normally corrupt the filesystem. People with SANs (Storage Area Networks) solve this problem by making mini virtual hard drives and setting ACLs on them so only one host can access a given virtual hard drive at a time. This can lead to wasted space.
GFS solves the SAN problem by using a Distributed Lock Manager (DLM). No single host is the server of the filesystem; writes and locks are coordinated via the DLM. Now multiple hosts *can* share a virtual hard drive or real block device and not corrupt the filesystem. If a host dies, no problem: there is no server for the filesystem!
Let's give an example. Say you have a FireWire drive enclosure. Now plug that FireWire hard drive into two computers. (This, by the way, may still require a patch to sbp so that Linux will tell the enclosure to allow both hosts to talk to it at the same time.) Now that the hard drive is talking to both computers, you could run GFS on it and access the data at the block level from both systems. Now start serving email via IMAP (load balanced), *both hot*, no standby. Now kill a box. IMAP still works. No remounting, no resynchronization.
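To make the lock-coordination idea concrete, here's a toy sketch of two "hosts" sharing one block device and taking a per-block lock before writing. This is only a conceptual model; GFS's real DLM is a distributed network service coordinating real cluster nodes, not an in-process lock table.

```python
# Toy illustration of DLM-style coordination: two "hosts" (threads here)
# share one block device and must take a lock on a block before writing
# it. Conceptual sketch only; not GFS's actual DLM protocol.
import threading

BLOCK_SIZE = 4096
device = bytearray(BLOCK_SIZE * 16)          # pretend shared block device

class ToyLockManager:
    """Hands out one exclusive lock per block number."""
    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def lock_for(self, block_no):
        with self._guard:
            return self._locks.setdefault(block_no, threading.Lock())

dlm = ToyLockManager()

def host_write(host_id, block_no, payload):
    lock = dlm.lock_for(block_no)
    with lock:                               # exclusive access to this block
        start = block_no * BLOCK_SIZE
        device[start:start + len(payload)] = payload
        print(f"host{host_id} wrote block {block_no}")

# Two hosts hammer the same block; the lock keeps each write whole.
threads = [threading.Thread(target=host_write, args=(i, 3, bytes([i]) * 64))
           for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```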
Pretty amazing if you ask me! This technology is pretty rare. IBM has GPFS. SGI has Clustered XFS. Both are pretty expensive. GFS? RedHat just re-GPL'd it! Microsoft? Ummm. I think they are just now getting logical volume management.
GFS also has nice features like journaling (kinda required for this sorta thing), ACLs, quotas, and online resizing.
Monday, May 14, 2007
Hitachi's Universal Storage Platform V is virtually huge
The USPV offers a performance boost over previous TagmaStore systems. In addition, the new hardware ships with thin provisioning software, a technology yet to be implemented by Hitachi's high-end rivals. The new hardware can handle 3.5 million input/output operations per second, a 40 per cent boost over its predecessor, launched in 2004. The USPV also offers a 4GB/sec Fibre Channel switch backplane for connections to disk drives and hosts. The array now supports 16 controller pairs for a total of 224 front-end Fibre Channel ports and 112 FICON or ESCON host ports. The device holds up to 1,152 drives.
While internal storage has stayed the same at 332TB, virtualized external storage gets a major boost, from the previous model's 32PB up to 247PB.
"This is a big box for big users," said John Webster, principal IT advisor at Illuminata. "It's clearly not for the faint of heart. You've really got to know what you're doing with a device like this."
Hitachi promises a major improvement in disk utilization with the array's use of thin provisioning. While the technology isn't new, the system is the first high-end device of its kind to use it.
Thin provisioning is a technology debuted by 3PAR where physical disk capacity is used only as needed for virtual volumes. It replaces the traditional method where large portions of storage capacity are allocated to applications but often remain unused.
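As a rough illustration of the concept (not Hitachi's or 3PAR's implementation), a thin volume promises a large virtual size but only grabs physical blocks from a shared pool the first time each virtual block is written:

```python
# Rough sketch of thin provisioning: the volume advertises a large
# virtual size but only consumes physical blocks on first write.
# Conceptual model only, not Hitachi's or 3PAR's implementation.

class ThinVolume:
    def __init__(self, virtual_blocks, pool):
        self.virtual_blocks = virtual_blocks   # size promised to the host
        self.pool = pool                       # shared free physical blocks
        self.mapping = {}                      # virtual block -> physical block

    def write(self, vblock, data):
        if vblock not in self.mapping:
            if not self.pool:
                raise RuntimeError("physical pool exhausted")
            self.mapping[vblock] = self.pool.pop()   # allocate on first write
        # ... write `data` to the mapped physical block ...

    def allocated(self):
        return len(self.mapping)

# A 1,000,000-block virtual volume backed by a 1,000-block pool is fine,
# as long as the host only ever touches a small fraction of it.
pool = list(range(1000))
vol = ThinVolume(virtual_blocks=1_000_000, pool=pool)
for vb in range(250):
    vol.write(vb, b"x")
print(f"promised {vol.virtual_blocks} blocks, physically allocated {vol.allocated()}")
```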
Wednesday, April 18, 2007
Smugmug - Amazon S3
But wait! It gets even better! Because of the stupid way the tax law operates in this country, I would actually have to pay taxes on the $423K I spent buying drives (yes, as if the money I spent were actually profit. Dumb.). So I’d have to pay an additional ~$135K in taxes. Technically, I’d get that back over the next 5 years, so I didn’t want to include it as “savings”, but as you can imagine, the cash flow implications are huge. In a very real sense, the actual cash I’ve conserved so far is about $474,000.
But wait! It gets even better! Amazon has been so reliable over the last 7 months (considerably more reliable than our own internal storage, which I consider to be quite reliable), that just last week we made S3 an even more fundamental part of our storage architecture. I’ll save the details for a future post, but the bottom line is that we’re actually going to start selling up to 90% of our hard drives on eBay or something. So costs I had previously assumed were sunk are actually about to be recouped. We should get many hundreds of thousands of dollars back in cash.
I expect our savings from Amazon S3 to be well over $1M in 2007, maybe as high as $2M.