The much-anticipated release of the new ZFS filesystem in Solaris 10 will revolutionize the way system administrators (and executives) think about and work with filesystems. Breaking free of the traditional volume or partition architecture, ZFS combines scalability and flexibility while providing a simple command interface. Coined by Sun as the “last word in filesystems,” ZFS is already being ported to several Linux distributions and Mac OS X. It is designed to have a shelf life of at least 30 years. We’ve been playing with ZFS for several months and have written some recipes about its basic administration. Here are ten reasons why you’ll want to reformat all of your systems and use ZFS.
1. So easy your mom could administer it
ZFS is administered by two commands, zpool and zfs. Most tasks require just a single command. And the commands are designed to make sense. For example, creating a filesystem and placing a quota on its size takes one short zfs command apiece.
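To give a feel for how short those command lines are, here is a rough sketch that assembles them in Python and prints them without running anything. The pool name tank, the filesystem tank/home, the device names and the 10G quota are all made-up values for illustration; the helper functions are hypothetical too.

```python
def zpool_create_mirror(pool, devices):
    """Build the command line that creates a mirrored pool."""
    return ["zpool", "create", pool, "mirror", *devices]

def zfs_create(filesystem):
    """Build the command line that creates a filesystem inside a pool."""
    return ["zfs", "create", filesystem]

def zfs_set_quota(filesystem, quota):
    """Build the command line that caps a filesystem's size."""
    return ["zfs", "set", f"quota={quota}", filesystem]

# Hypothetical names throughout; nothing here touches a real system.
commands = [
    zpool_create_mirror("tank", ["c0t0d0", "c1t0d0"]),
    zfs_create("tank/home"),
    zfs_set_quota("tank/home", "10G"),
]

for cmd in commands:
    print(" ".join(cmd))
```

Printing instead of executing keeps the sketch safe to try on a machine that has no spare disks to sacrifice.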
2. Honkin’ big filesystems
How big do filesystems need to be? In a world where 640KB is certainly not enough for computer memory, current filesystems have reached or are reaching the end of their usefulness. A 64-bit filesystem would meet today’s needs, but the estimated lifetime of a 64-bit filesystem is only about 10 years. Extending to 128 bits gives ZFS an expected lifetime of 30 years (UFS, for comparison, is about 20 years old). So how much data can you squeeze into a 128-bit filesystem? More than you will ever buy: a 64-bit limit already works out to 16 exabytes, or about 18 million terabytes, and 128 bits squares that number. How many files can you cram into a ZFS filesystem? 200 million million.
Could anyone use a filesystem that large? No, not really. The topic has roused discussions about boiling the oceans if a real-life storage unit that size were powered on. It may not be necessary to have 128 bits, but it doesn’t hurt and we won’t have to worry about running out of addressable space.
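The numbers are easier to judge with the arithmetic written out. This is only a back-of-the-envelope sketch of byte addressing: it shows that 16 exabytes (roughly 18 million terabytes) is what 64 bits gives you, and that 128 bits squares that figure.

```python
# Back-of-the-envelope address-space arithmetic.
TB = 10 ** 12    # bytes in one (decimal) terabyte
EIB = 2 ** 60    # bytes in one (binary) exabyte
ZIB = 2 ** 70    # bytes in one (binary) zettabyte

limit_64 = 2 ** 64     # bytes addressable with 64 bits
limit_128 = 2 ** 128   # bytes addressable with 128 bits

print(limit_64 // EIB)               # 16
print(limit_64 // TB)                # 18446744, i.e. about 18 million
print(limit_128 // ZIB)              # 2**58 zettabytes
print(limit_128 == limit_64 ** 2)    # True
```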
3. Filesystem, heal thyself
ZFS employs 256-bit checksums end-to-end to validate data stored under its protection. Most filesystems (and you know who you are) depend on the underlying hardware to detect corrupted data and then can only nag about it if they get such a message. Every block in a ZFS filesystem has a checksum associated with it. If ZFS detects a checksum mismatch on a raidz or mirrored filesystem, it will actively reconstruct the block from the available redundancy and go on about its job.
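The detect-and-repair idea can be sketched in a few lines. This is a toy model of a two-way mirror, not how ZFS is implemented: SHA-256 merely stands in for a 256-bit block checksum, and the MirroredBlock class is a hypothetical name invented for the example.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # SHA-256 is a stand-in for a 256-bit block checksum in this sketch.
    return hashlib.sha256(data).digest()

class MirroredBlock:
    """One logical block kept on two devices, with its checksum stored apart from the data."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                # Rewrite every copy that no longer matches the checksum.
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)
                return good
        raise IOError("all copies failed checksum verification")

block = MirroredBlock(b"important payroll data")
block.copies[0][0] ^= 0xFF   # simulate silent corruption on one disk
data = block.read()          # the bad copy is detected and repaired
print(data, block.copies[0] == block.copies[1])
```

The point of keeping the checksum separate from the data is that a disk returning the wrong bytes cannot also vouch for them.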
4. fsck off, fsck
fsck has been voted out of the house. We don’t need it anymore. Because ZFS data are always consistent on disk, don’t be afraid to yank out those power cords if you feel like it. Your ZFS filesystems will never require you to enter the superuser password for maintenance mode.
5. Compress to your heart’s content
I’ve always been a proponent of optional and appropriate compression in filesystems. Some data, such as server logs, are well suited to compression. Many people get ruffled over this topic, although I suspect that they were once burned by DoubleSpace munching up an important document. When used thoughtfully, ZFS compression can reduce disk I/O, which is a common bottleneck. ZFS compression can be turned on for individual filesystems or hierarchies with a single, very easy command.
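To see why logs are such good candidates, here is a small experiment on made-up, log-like data. Python's zlib is used purely as a convenient stand-in; it is not the algorithm ZFS itself applies, so treat the ratio as an illustration of the idea rather than a prediction.

```python
import zlib

# A repetitive, log-like payload (invented sample data).
log_lines = [
    f"2006-05-01 12:00:{i % 60:02d} host sshd[1234]: Accepted publickey for admin\n"
    for i in range(1000)
]
raw = "".join(log_lines).encode("ascii")
packed = zlib.compress(raw)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes (about {ratio:.0f}x smaller)")

# Compression is lossless: the original bytes come back intact.
restored = zlib.decompress(packed)
print(restored == raw)
```

Fewer bytes on disk means fewer bytes to read and write, which is where the I/O win comes from. On a real system the property is set with something like `zfs set compression=on tank/logs`, where tank/logs is a hypothetical filesystem name.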
6. Unconstrained architecture
UFS and other filesystems use a constrained model of fixed partitions or volumes, each filesystem having a set amount of available disk space. ZFS uses a pooled storage model. This is a significant departure from the traditional concept of filesystems. Many current production systems may have a single-digit number of filesystems, and adding or manipulating existing filesystems in such an environment is difficult.
In ZFS, filesystems draw their space from a shared pool and can be mounted in different places in the host filesystem.
7. Grow filesystems without a green thumb
If your pool becomes overcrowded, you can grow it. With one command. On a live production system. Enough said.
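The pooled model and the grow-it-live trick fit together, and a toy model makes the relationship concrete. The Pool class below is purely hypothetical bookkeeping, with sizes in gigabytes picked for illustration; it only shows that filesystems share one pile of free space and that adding a device helps all of them at once.

```python
class Pool:
    """Toy model of pooled storage: filesystems share one pile of free space."""

    def __init__(self, device_sizes_gb):
        self.capacity = sum(device_sizes_gb)
        self.used = {}  # filesystem name -> GB consumed

    def create_fs(self, name):
        self.used[name] = 0  # no size is fixed up front

    def free(self):
        return self.capacity - sum(self.used.values())

    def write(self, name, gb):
        if gb > self.free():
            raise OSError("pool is full")
        self.used[name] += gb

    def add_device(self, size_gb):
        # Growing the pool makes the new space visible to every filesystem.
        self.capacity += size_gb

pool = Pool([100, 100])
pool.create_fs("home")
pool.create_fs("logs")
pool.write("home", 150)   # more than any single 100 GB device could hold
pool.write("logs", 40)
print(pool.free())        # 10
pool.add_device(200)
pool.write("logs", 100)   # would have failed before the pool grew
print(pool.free())        # 110
```

On a real system the growth step is a command along the lines of `zpool add tank c2t0d0`, with the pool and device names here being made up.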
8. Dynamic striping
On by default, dynamic striping automatically includes all devices in a pool in writes simultaneously (the stripe width spans all the available media). This speeds up I/O on systems with multiple paths to storage by load balancing the I/O across all of the paths.
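As a rough picture of what spreading writes across every device means, the sketch below hands out blocks in simple round-robin order. That is a deliberately naive assumption: a real allocator would presumably also weigh things like how full each device is. The function and device names are invented for the example.

```python
from collections import Counter
from itertools import cycle

def stripe_writes(blocks, devices):
    """Assign each block to a device in round-robin order (naive stand-in)."""
    placement = {}
    for block, device in zip(blocks, cycle(devices)):
        placement[block] = device
    return placement

devices = ["disk0", "disk1", "disk2"]
placement = stripe_writes(range(9), devices)
first = Counter(placement.values())
print(first)   # each of the 3 devices receives 3 of the 9 blocks

# A device added later simply joins the rotation for new writes.
devices.append("disk3")
later = Counter(stripe_writes(range(9, 17), devices).values())
print(later)   # each of the 4 devices receives 2 of the 8 new blocks
```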
9. The term “raidz” sounds so l33t
The new RAID-Z redundant storage model replaces RAID-5 and improves upon it. RAID-Z does not suffer from the “write hole,” in which a stripe of data becomes corrupt because of a loss of power during the vulnerable period between writing the data and writing the parity. RAID-Z, like RAID-5, can survive the loss of one disk. A future release is planned to add the keyword raidz2, which can tolerate the loss of two disks. Perhaps the best feature is that creating a raidz pool is crazy simple.
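Surviving the loss of one disk comes down to single parity, the idea RAID-5 and RAID-Z have in common. The sketch below shows only that XOR trick on tiny made-up blocks; it does not model what makes RAID-Z different, such as how it avoids the write hole.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disks
parity = xor_blocks(data)            # parity block on a fourth disk

# The disk holding data[1] dies: rebuild its block from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)   # b'BBBB'
```

As for the crazy simple part, creating such a pool is a one-liner along the lines of `zpool create tank raidz c1t0d0 c2t0d0 c3t0d0`, where the pool and device names are hypothetical.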
10. Clones with no ethical issues
ZFS makes it simple to create snapshots and clones of filesystems. A snapshot can be restored to the original filesystem to return it to the previous state. Snapshots can also be written to other storage (disk, tape), transferred to another system, and converted back into a filesystem.
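Why snapshots and clones are so cheap is easiest to see with a toy copy-on-write model. Everything below is a hypothetical sketch (the Filesystem class, the paths and the snapshot name are invented), and it works on whole files in a dictionary rather than on disk blocks, but the principle of never overwriting old data in place is the same.

```python
class Filesystem:
    """Toy copy-on-write filesystem: existing data is never modified in place."""

    def __init__(self):
        self.live = {}        # path -> contents
        self.snapshots = {}   # snapshot name -> frozen view

    def write(self, path, contents):
        # Build a new mapping instead of mutating the old one, so any
        # snapshot still pointing at the old mapping is unaffected.
        self.live = {**self.live, path: contents}

    def snapshot(self, name):
        # Cheap: just remember the current mapping, no data is copied.
        self.snapshots[name] = self.live

    def rollback(self, name):
        self.live = self.snapshots[name]

    def clone(self, name):
        # A clone is a writable filesystem that starts from a snapshot.
        twin = Filesystem()
        twin.live = self.snapshots[name]
        return twin

fs = Filesystem()
fs.write("/etc/motd", "welcome")
fs.snapshot("monday")
fs.write("/etc/motd", "oops")
dev = fs.clone("monday")
dev.write("/etc/motd", "experimental")
fs.rollback("monday")
print(fs.live["/etc/motd"], dev.live["/etc/motd"])   # welcome experimental
```

The real thing is driven by commands such as `zfs snapshot tank/home@monday`, `zfs rollback tank/home@monday` and `zfs clone tank/home@monday tank/dev`, with `zfs send` and `zfs receive` handling the trip to other storage or another system. The dataset names are made up.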
More information
For more information, check out Sun’s official ZFS page and the detailed OpenSolaris community ZFS information. If you want to take ZFS out for a test drive, the latest version of Solaris Express has it built in and ready to go. Download it here.