What is ZFS? Reasons to use ZFS and its features

History of ZFS

Development of the Z File System (ZFS) began in 2001 under Matthew Ahrens and Jeff Bonwick. ZFS was designed as the next-generation file system for Sun Microsystems' OpenSolaris. In 2008, ZFS was ported to FreeBSD. In the same year, a project to port ZFS to Linux was launched. However, because ZFS is licensed under the Common Development and Distribution License (CDDL), which is incompatible with the GNU General Public License, it cannot be included in the Linux kernel. To work around this problem, most Linux distributions provide some way to install ZFS separately.
Shortly after Oracle acquired Sun Microsystems, OpenSolaris became closed source, which made subsequent development of ZFS also closed source. Many ZFS developers were very unhappy about this. Two-thirds of the ZFS core developers, including Ahrens and Bonwick, left Oracle as a result of the decision. They joined other companies and founded the OpenZFS project in September 2013. This project leads the open source development of ZFS.
Let's go back to the licensing issue mentioned above. Now that the OpenZFS project has been separated from Oracle, one might wonder why they don't use a GPL-compatible license so that it can be included in the Linux kernel. According to the OpenZFS official website, changing the license requires contacting all people who have contributed code to the current OpenZFS implementation (including the initial public ZFS code and OpenSolaris code) and obtaining their permission. This was almost impossible (as some contributors might have died or be hard to find), so they decided to keep the original license.

What is ZFS and what features does it have?

As mentioned earlier, ZFS is an advanced file system, so it has some interesting features. For example:

  • Storage Pool
  • Copy-on-write
  • Snapshot
  • Data integrity verification and automatic repair
  • RAID-Z
  • The maximum single file size is 16 EB (1 EB = 1024 PB)
  • Maximum storage of 256 quadrillion (256 × 10^15) ZB (1 ZB = 1024 EB)

Let’s take a closer look at some of these features.

How do I install ZFS?

If you want to use ZFS right away (out of the box), then you need to install FreeBSD or an operating system that uses the illumos kernel. illumos is a clone of the OpenSolaris kernel.
In fact, support for ZFS is the main reason some experienced Linux users choose BSD.
If you want to try ZFS on Linux, you can only use it for storage file systems, not for the root file system. As far as I know, no Linux distribution can install ZFS on the root directory and use it out of the box. If you're interested in trying ZFS on Linux, the ZFS on Linux project has a number of tutorials that show you how.

Storage Pool

Unlike most file systems, ZFS combines the features of a file system and a volume manager. This means that ZFS can create a file system that spans a series of hard disks, known as a pool. You can also increase the storage capacity of the pool by adding hard drives. ZFS can be partitioned and formatted

Ten reasons and features to use ZFS

1. No more fsck, scandisk

Whether you use Linux, UNIX, or Windows, you have probably had this experience: the system loses power unexpectedly or is shut down improperly, and after the restart the file system is found to be inconsistent. At that point fsck or scandisk is needed to repair it, which is very time-consuming and may not succeed in the end. Worse, a server that needs fsck has to be taken offline, and because servers tend to have large hard drives, the repair takes correspondingly long, which is almost unbearable for many server users.
With ZFS you can abandon tools like fsck entirely, because ZFS is a file system based on the COW (Copy on Write) mechanism. COW never overwrites existing data on the hard disk in place, which ensures that everything on the disk is always valid. The file system therefore cannot become inconsistent in this way, and a repair tool is simply not needed.
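The copy-on-write idea can be illustrated with a small sketch. This is a toy model, not how ZFS is implemented: the CowStore class and its fields are hypothetical names chosen for the example. It only shows that a write goes to a fresh block and that the old block survives until the new state is committed.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}       # block address -> bytes; existing entries are never modified
        self.next_addr = 0
        self.root = {}         # committed mapping: file name -> block address

    def write(self, name, data):
        # 1. Write the new data to a fresh block; the old block stays intact.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        # 2. Build a new mapping and switch to it in one step (the "commit").
        new_root = dict(self.root)
        new_root[name] = addr
        self.root = new_root

    def read(self, name):
        return self.blocks[self.root[name]]


store = CowStore()
store.write("a.txt", b"version 1")
old_root = store.root                      # the state a crash before the next commit would leave
store.write("a.txt", b"version 2")

print(store.read("a.txt"))                 # current committed state
print(store.blocks[old_root["a.txt"]])     # the old block is still untouched
```

Because the previous mapping still points at valid data, an interruption between step 1 and step 2 would leave the old, consistent state in place.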

2. Simple management

As a new file system, ZFS completely abandons the traditional File System + Volume Manager + Storage architecture. All storage devices are managed through a ZFS Pool. Once the storage devices have been added to the same ZFS Pool, you can easily manage and configure the file systems in that pool. You no longer need to remember a range of specialist concepts, commands such as newfs and metainit, or the usage of various Volume Managers. In ZFS, only two commands are needed to manage a 128-bit file system: zpool (for ZFS Pool management) and zfs (for ZFS file system management). For example, system data often grows so fast that the existing storage capacity runs out and hard disks must be added. With a traditional Volume Manager, we have to consider many factors in advance and calculate the parameters to configure for each application. With ZFS, system administrators are freed from these manual considerations and calculations, because the ZFS Pool adjusts automatically and adapts dynamically to demand. A single command adds new hard disks to the ZFS Pool:

zpool add zfs_pool mirror c4t0d0 c5t0d0

All file systems based on this dynamically adjusted ZFS Pool can use the new hard disks immediately, and ZFS automatically selects suitable parameters. ZFS also provides a graphical management interface.

3. No capacity limit

ZFS (Zettabyte File System), as its name suggests, can provide truly massive storage, and in practice it is almost impossible to run into capacity limits. Under the existing 64-bit kernel, it can hold single files of up to 16 Exabytes (2^64 bytes), use 2^64 storage devices, and create 2^64 file systems.
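The 16 EB figure follows directly from the 64-bit limit. The short check below uses binary units, as the article does elsewhere (1 EB = 1024 PB); the variable names are only for illustration.

```python
# Binary units, as used in the article (1 EB = 1024 PB, and so on down to bytes).
KB = 2**10
EB = KB**6                      # 1024^6 bytes = 2^60 bytes

max_file_size = 2**64           # bytes, the 64-bit limit quoted above

print(max_file_size // EB)      # largest file size expressed in EB
```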

4. Fully guarantee the accuracy and integrity of data

All ZFS data operations are based on transactions: ZFS treats a group of related operations as a single transaction, which means the operations either all succeed or all fail. In addition, as mentioned before, all ZFS operations are based on COW (Copy on Write), which ensures that the data on the device is always valid and never becomes inconsistent because of a system crash or unexpected power outage.
Another potential threat to data comes from hardware problems, such as faults in hard disks or RAID cards, or from driver bugs. Existing file systems often hit this problem and simply pass the erroneous data on to the upper-layer application. This problem is usually called Silent Data Corruption. In ZFS, all data, whether user data or the file system's own metadata, is protected by a 256-bit checksum. ZFS verifies this checksum before it hands data over, so corrupted data is detected rather than silently passed on.
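A minimal sketch of checksum verification on read is shown below. It assumes SHA-256, which produces a 256-bit digest, and it stores the checksum next to the data purely for simplicity; the function names write_block and read_block are hypothetical and do not correspond to any ZFS interface.

```python
import hashlib


def write_block(data: bytes) -> dict:
    """Store the data together with its SHA-256 checksum (256 bits)."""
    return {"data": data, "checksum": hashlib.sha256(data).digest()}


def read_block(block: dict) -> bytes:
    """Verify the checksum before returning the data; raise on a mismatch."""
    if hashlib.sha256(block["data"]).digest() != block["checksum"]:
        raise IOError("checksum mismatch: silent corruption detected")
    return block["data"]


block = write_block(b"important data")
assert read_block(block) == b"important data"

# Simulate a change introduced by faulty hardware after the write.
block["data"] = b"importent data"
try:
    read_block(block)
    detected = False
except IOError:
    detected = True

print("corruption detected:", detected)
```

Without the check in read_block, the altered bytes would be returned to the caller as if nothing had happened, which is exactly the silent failure described above.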

5. Provide excellent performance and scalability

Unlike the traditional File System + Volume Manager + Storage architecture, ZFS provides all functions directly on top of the storage devices. This allows it to offer several performance features of its own, described below.

Dynamic Striping vs. Static Striping

Because ZFS is based on COW and a global, dynamic ZFS Pool, every write goes to a new data block. ZFS dynamically selects the best device from the ZFS Pool and writes to it linearly within a transaction, making full use of the bandwidth of the existing devices. This feature is called Dynamic Striping. Its counterpart, Static Striping, is the method used by traditional file systems. Static Striping requires the administrator to calculate and configure the set of stripes correctly in advance, and to recalculate and reconfigure them by hand whenever a new device is added. Worse, a mistake in the manual calculation directly affects system performance. With Dynamic Striping, no human intervention is needed: ZFS adjusts automatically and chooses the best device and the fastest way to operate.
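The difference from a fixed stripe layout can be sketched as follows. This is an illustration only: the Pool class is hypothetical, and "most free space" is an assumed stand-in for whatever criteria ZFS really uses to pick the best device. The point is that a newly added device takes part in writes immediately, without anyone recalculating a layout.

```python
class Pool:
    """Toy pool that places each new block on the device with the most free space."""

    def __init__(self):
        self.devices = {}      # device name -> list of blocks stored on it
        self.capacity = {}     # device name -> capacity, counted in blocks

    def add_device(self, name, capacity):
        # A new device becomes eligible for writes as soon as it is added.
        self.devices[name] = []
        self.capacity[name] = capacity

    def free(self, name):
        return self.capacity[name] - len(self.devices[name])

    def write(self, block):
        # Assumed selection rule: pick the device with the most free blocks.
        target = max(self.devices, key=self.free)
        if self.free(target) == 0:
            raise IOError("pool is full")
        self.devices[target].append(block)
        return target


pool = Pool()
pool.add_device("disk0", 4)
pool.add_device("disk1", 4)
first = [pool.write(i) for i in range(4)]

pool.add_device("disk2", 4)                    # grow the pool while it is in use
later = [pool.write(i) for i in range(4, 8)]

print(first)
print(later)
```

In this sketch the writes made after disk2 is added go to it first, because it is the emptiest device, and then spread across all three.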

Support multiple block sizes

ZFS supports data blocks of various sizes, from 512 bytes to 1 MB. Unlike traditional file systems, which usually have fixed-size data blocks, ZFS can choose the best block size dynamically based on the size of each file.
Block size directly affects both the usable disk capacity and the read speed. With smaller data blocks, less space is lost to fragmentation when storing files and small files are read and written faster, but more metadata must be created and large files take longer to read and write. With larger data blocks, less metadata is used, which favors reading and writing large files, but more space is lost to fragmentation. ZFS bases its block-size selection algorithm on studies of real file usage and dynamically determines the best block size from the actual file size. As a result, ZFS can tune itself without intervention from the system administrator. ZFS also lets users set a custom block size for a single file or an entire file system.
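The trade-off between small and large blocks can be made concrete with a little arithmetic. The sketch below assumes a fixed block size per file and uses the number of blocks as a rough stand-in for the amount of metadata (one pointer per block); block_stats is a hypothetical helper, not the selection algorithm ZFS uses.

```python
def block_stats(file_size: int, block_size: int):
    """Return (blocks needed, bytes left unused in the last block)."""
    blocks = (file_size + block_size - 1) // block_size   # round up
    wasted = blocks * block_size - file_size
    return blocks, wasted


for file_size in (700, 10 * 1024 * 1024):        # a tiny file and a 10 MB file
    for block_size in (512, 128 * 1024):         # small blocks vs. large blocks
        blocks, wasted = block_stats(file_size, block_size)
        print(f"file={file_size:>9} B  block={block_size:>7} B  "
              f"blocks={blocks:>6}  wasted={wasted} B")
```

The tiny file wastes far more space in a large block, while the large file needs many more blocks, and therefore more bookkeeping, when the blocks are small.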

Intelligent Prefetch

Most operating systems can pre-read data, and ZFS builds a more intelligent prefetch function directly into the file system. It can recognize multiple read patterns and read data ahead of time, and it performs this recognition separately for each data stream being read, which is especially useful for streaming media providers.
In terms of scalability, most existing file systems are based on a restricted, static model, whereas ZFS adopts the dynamic concept of the ZFS Pool. Its metadata is also dynamic, and reads and writes run in parallel with priorities, so performance can keep growing linearly even with large data volumes and many devices.

6. Self-healing function

ZFS Mirror and RAID-Z

Traditional disk mirroring and RAID 4 and RAID 5 arrays suffer from the problem mentioned above: Silent Data Corruption. If a physical fault on a hard disk causes data errors, an existing mirror or RAID 4 or RAID 5 array silently passes the erroneous data to the upper-level application. If the error occurs in metadata, it causes the system to panic directly. There is an even more serious case: if a power outage occurs while a RAID 4 or RAID 5 array is calculating parity and writing the new data and the new parity value, data and parity no longer match, and the data stored in the array can become meaningless.
ZFS introduces ZFS Mirror and RAID-Z to address these problems. When reading data, ZFS automatically verifies it against the 256-bit checksum and so actively detects silent data corruption. It then obtains the correct data from the other disk of the mirror, or from the other disks in the RAID-Z array, returns it to the upper-level application, and automatically repairs the corrupted data on the original disk at the same time.
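The read-verify-repair cycle for a mirror can be sketched in a few lines. This is a toy model under stated assumptions: a two-way mirror, SHA-256 as the checksum, and checksums kept separately from the data. The Mirror class is hypothetical and says nothing about how ZFS lays data out on disk.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class Mirror:
    """Toy two-way mirror that verifies every read and repairs bad copies."""

    def __init__(self):
        self.disks = [{}, {}]       # two copies of every block
        self.checksums = {}         # block id -> expected checksum, kept apart from the data

    def write(self, block_id, data):
        for disk in self.disks:
            disk[block_id] = data
        self.checksums[block_id] = checksum(data)

    def read(self, block_id):
        expected = self.checksums[block_id]
        for disk in self.disks:
            data = disk[block_id]
            if checksum(data) == expected:
                # Found a good copy: rewrite any copy that fails verification.
                for other in self.disks:
                    if checksum(other[block_id]) != expected:
                        other[block_id] = data
                return data
        raise IOError("all copies are corrupt")


mirror = Mirror()
mirror.write(7, b"payload")
mirror.disks[0][7] = b"pay1oad"          # simulate silent corruption on the first disk

result = mirror.read(7)
print(result, mirror.disks[0][7] == mirror.disks[1][7])
```

The caller receives the good copy from the second disk, and the damaged copy on the first disk is rewritten during the same read.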

Fault Manager

Solaris 10 includes a ZFS diagnostic engine that works with the Solaris Fault Manager (another new feature of Solaris 10) to diagnose, analyze, and report ZFS Pool and storage device errors in real time. Users receive readable messages from the Fault Manager promptly. Although the diagnostic engine does not take action itself to repair or resolve the problem, the message suggests possible actions to the system administrator. A ZFS error message looks similar to the following, where REC-ACTION is the recommended action:

SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Fri Mar 10 11:09:06 MST 2006
PLATFORM: SUNW,Ultra-60, CSN: -, HOSTNAME: neo
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: b55ee13b-cd74-4dff-8aff-ad575c372ef8
DESC: A ZFS device failed. Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
AUTO-RESPONSE: No automated response will occur.
IMPACT: Fault tolerance of the pool maybe compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

7. Security

In terms of security, ZFS supports NFSv4-style ACLs (access control lists), which are similar to NT-style ACLs. In addition, for the 256-bit checksum mentioned above, users can choose from several checksum algorithms, including SHA-256, which helps ensure data security at the level of the physical storage unit.

8. Powerful features

As the "last file system", ZFS covers basic file system and volume management functions while providing many enterprise-grade features: Quota, Reservation, Compression, Snapshot, and Clone. It is also very fast. With this file system, you no longer need a separate Volume Manager.

9. Compatibility

ZFS is fully compatible with the POSIX specification, so upper-layer applications are unaffected. ZFS also provides an Emulated Volume module that lets ZFS storage be used as an ordinary block device. Conversely, ZFS can use a volume built with a Volume Manager as a storage device. This gives everyone the greatest freedom to benefit from ZFS features without modifying applications or existing file systems.

10. Open Source

ZFS was released by Sun Microsystems as an open source project within OpenSolaris and is completely free to use, which means that we can enjoy both the quality of a commercially developed product and the advantages of the open source model.
When ZFS was first released, only Solaris supported it: it was bundled with Solaris 10, starting with the Solaris 10 6/06 release (http://www.sun.com/software/solaris/get.jsp). The open source model has since encouraged wider adoption, with developers porting ZFS to other systems, including FreeBSD, Linux, and Mac OS.

Additional Notes

This article discusses the advantages of ZFS. Now, let me tell you a very real problem with ZFS. Using RAID-Z can be expensive because you need to purchase a large number of disks to increase storage space.
Have you used ZFS yet? What is your experience like?
