Build a file management system step by step with nginx+FastDFS

1. Introduction to FastDFS

FastDFS open source address: https://github.com/happyfish100

Reference: Design principles of distributed file system FastDFS

Reference: FastDFS distributed file system

Personal packaged FastDFS Java API: https://github.com/bojiangzhou/lyyzoo-fastdfs-java

1. Introduction

FastDFS is an open source, high-performance distributed file system (DFS). Its main functions are file storage, file synchronization, and file access, and it is designed for high capacity and load balancing. It mainly solves the problem of storing massive amounts of data and is particularly suitable for online services built around small and medium-sized files (recommended range: 4KB < file_size < 500MB).

The FastDFS system has three roles: Tracker Server, Storage Server, and Client.

Tracker Server: The tracking server is mainly responsible for scheduling and load balancing, and it manages all storage servers and groups. After startup, each storage server connects to the Tracker, reports information such as the group it belongs to, and keeps sending periodic heartbeats.

Storage Server: The storage server mainly provides capacity and backup services. Storage servers are organized into groups; each group can contain multiple storage servers, which back up each other's data.

Client: The client is the server that uploads and downloads data, that is, the server where our own project is deployed.

2. FastDFS storage strategy

In order to support large capacity, storage nodes (servers) are organized in volumes (or groups). The storage system consists of one or more volumes. The files between volumes are independent of each other. The sum of the file capacities of all volumes is the file capacity of the entire storage system. A volume can be composed of one or more storage servers. The files in the storage servers under a volume are the same. Multiple storage servers in the volume play the role of redundant backup and load balancing.

When adding a server to a volume, the system automatically synchronizes existing files. After synchronization is complete, the system automatically switches the newly added server online to provide services. When storage space is low or running out, volumes can be added dynamically. Simply add one or more servers and configure them as a new volume to expand the capacity of the storage system.
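To make the capacity rule concrete, here is a minimal Java sketch. It assumes that a volume can hold only as much as its smallest server, because every server in a volume stores the same files. The class name VolumeCapacity, its methods, and the sample sizes are made up for illustration and are not part of FastDFS.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: usable capacity of a cluster organized into volumes (groups). */
final class VolumeCapacity {

    /**
     * Servers in one volume mirror each other, so the volume is limited
     * by its smallest server (assumption, see the text above).
     */
    static long volumeCapacityGb(List<Long> serverCapacitiesGb) {
        if (serverCapacitiesGb.isEmpty()) {
            return 0L;
        }
        long min = Long.MAX_VALUE;
        for (long capacity : serverCapacitiesGb) {
            min = Math.min(min, capacity);
        }
        return min;
    }

    /** Volumes are independent, so the cluster capacity is the sum over all volumes. */
    static long clusterCapacityGb(Map<String, List<Long>> volumes) {
        long total = 0L;
        for (List<Long> servers : volumes.values()) {
            total += volumeCapacityGb(servers);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> volumes = new LinkedHashMap<>();
        volumes.put("group1", Arrays.asList(500L, 400L)); // two mirrored servers
        volumes.put("group2", Arrays.asList(1000L));      // a volume added later
        System.out.println(clusterCapacityGb(volumes));   // 400 + 1000 = 1400
    }
}
```

Adding a second server to group2 would add redundancy but no capacity; adding a group3 would add capacity.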

3. FastDFS upload process

FastDFS provides users with basic file access interfaces, such as upload, download, append, delete, etc., in the form of a client library.

Each Storage Server periodically sends its storage information to the Tracker Server. When the Tracker Server cluster contains more than one Tracker, all Trackers are peers, so the client can choose any of them when uploading.

When the Tracker receives an upload request from a client, it first selects a group that can store the file, then decides which storage server in that group to assign to the client. The client then sends the file write request to that storage server, which assigns a data storage directory to the file, assigns it a fileid, and finally generates a file name from the above information and stores the file.
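The two scheduling decisions can be sketched roughly as follows. This is only an illustration under my own assumptions: the group with the most free space is chosen, and servers inside the group are used in rotation. The real policy is set in the Tracker configuration, and the class and field names here (UploadScheduler, Group, freeMb) are hypothetical.

```java
import java.util.List;

/** Sketch of the two choices a tracker makes for an upload request. */
final class UploadScheduler {

    /** A group (volume) as the tracker might see it; fields are illustrative. */
    static final class Group {
        final String name;
        final long freeMb;
        final List<String> storageAddrs;

        Group(String name, long freeMb, List<String> storageAddrs) {
            this.name = name;
            this.freeMb = freeMb;
            this.storageAddrs = storageAddrs;
        }
    }

    private int nextServer = 0;

    /** Step 1: pick a group; here, the one with the most free space. */
    Group chooseGroup(List<Group> groups) {
        if (groups.isEmpty()) {
            throw new IllegalStateException("no group available");
        }
        Group best = groups.get(0);
        for (Group g : groups) {
            if (g.freeMb > best.freeMb) {
                best = g;
            }
        }
        return best;
    }

    /** Step 2: pick a storage server inside the group; here, simple rotation. */
    String chooseStorage(Group group) {
        int index = Math.floorMod(nextServer++, group.storageAddrs.size());
        return group.storageAddrs.get(index);
    }
}
```

The client would then send its write request to the address returned by chooseStorage.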

4. File synchronization of FastDFS

When writing a file, the client considers the file writing successful when it writes the file to a storage server in the group. After the storage server finishes writing the file, the background thread will synchronize the file to other storage servers in the same group.

After each storage writes a file, it will also write a binlog. The binlog does not contain file data, but only meta information such as the file name. This binlog is used for background synchronization. The storage will record the progress of synchronization to other storages in the group so that it can continue synchronization after restart; the progress is recorded in the form of timestamps, so it is best to ensure that the clocks of all servers in the cluster are synchronized.

Each storage server reports its synchronization progress to the tracker as part of its metadata. The tracker uses this progress as a reference when choosing a storage server to serve a read.
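As a rough illustration of how synchronization progress can guide reads, the Java sketch below treats a storage server as safe to read from if it is the server the file was originally written to, or if its reported sync timestamp is later than the file's creation time. This is a simplified assumption, not the tracker's actual rule set, and all names (ReadableStorages, Storage, syncedUntil) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: which storage servers in a group can already serve a given file. */
final class ReadableStorages {

    /** One storage server; syncedUntil is its reported sync progress in epoch seconds. */
    static final class Storage {
        final String addr;
        final long syncedUntil;

        Storage(String addr, long syncedUntil) {
            this.addr = addr;
            this.syncedUntil = syncedUntil;
        }
    }

    static List<String> eligible(List<Storage> group, String sourceAddr, long fileCreated) {
        List<String> result = new ArrayList<>();
        for (Storage s : group) {
            boolean isSource = s.addr.equals(sourceAddr);   // wrote the file itself
            boolean caughtUp = s.syncedUntil > fileCreated; // has synced past the file
            if (isSource || caughtUp) {
                result.add(s.addr);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Storage> group = Arrays.asList(
                new Storage("192.168.51.128", 0L),
                new Storage("192.168.51.129", 1508141500L));
        // The file was created at 1508141521 on .128; .129 has not caught up yet.
        System.out.println(eligible(group, "192.168.51.128", 1508141521L)); // prints [192.168.51.128]
    }
}
```

Because progress is compared as timestamps, clock drift between servers would make this check unreliable, which is why the clocks should be kept in sync.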

5. File downloading with FastDFS

After the client successfully uploads the file, it will get a file name generated by storage. Then the client can access the file based on this file name.

Just like uploading files, the client can choose any tracker server when downloading files. When the client sends a download request to a tracker, the request must include the file name. The tracker parses the group, size, creation time, and other information from the file name, and then selects a storage server to serve the read request.

2. Install FastDFS environment

0. Introduction

Operating environment: CentOS7 X64. All of the following operations are performed in a stand-alone (single-machine) environment.

I downloaded all the installation packages to /softpackages/ and unzipped them to the current directory.

The first thing to do is to modify the hosts and map the file server's IP address to the domain name (stand-alone TrackerServer environment). This is because you will need to configure the server address in many subsequent configurations. If the IP address changes, you only need to modify the hosts.

# vim /etc/hosts

Add the following line, this is my IP
192.168.51.128 file.ljzsg.com

If you want to access the virtual machine locally, add a line in C:\Windows\System32\drivers\etc\hosts

1. Download and install libfastcommon

libfastcommon is a common C function library extracted from FastDFS and FastDHT. It is the base environment FastDFS depends on, so install it first.

① Download libfastcommon

# wget https://github.com/happyfish100/libfastcommon/archive/V1.0.7.tar.gz

② Decompression

# tar -zxvf V1.0.7.tar.gz
# cd libfastcommon-1.0.7

③ Compile and install

# ./make.sh
# ./make.sh install

④ libfastcommon.so is installed to /usr/lib64/libfastcommon.so, but the lib directory set by the FastDFS main program is /usr/local/lib, so a soft link needs to be created.

# ln -s /usr/lib64/libfastcommon.so /usr/local/lib/libfastcommon.so
# ln -s /usr/lib64/libfastcommon.so /usr/lib/libfastcommon.so
# ln -s /usr/lib64/libfdfsclient.so /usr/local/lib/libfdfsclient.so
# ln -s /usr/lib64/libfdfsclient.so /usr/lib/libfdfsclient.so

2. Download and install FastDFS

① Download FastDFS

# wget https://github.com/happyfish100/fastdfs/archive/V5.05.tar.gz

② Decompression

# tar -zxvf V5.05.tar.gz
# cd fastdfs-5.05

③ Compile and install

# ./make.sh
# ./make.sh install

④ Corresponding files and directories after installation by default

A. Service script:

/etc/init.d/fdfs_storaged
/etc/init.d/fdfs_trackerd

B. Configuration files (these three are sample configuration files provided by the author):

/etc/fdfs/client.conf.sample
/etc/fdfs/storage.conf.sample
/etc/fdfs/tracker.conf.sample

C. The command tool is in the /usr/bin/ directory:

fdfs_appender_test
fdfs_appender_test1
fdfs_append_file
fdfs_crc32
fdfs_delete_file
fdfs_download_file
fdfs_file_info
fdfs_monitor
fdfs_storaged
fdfs_test
fdfs_test1
fdfs_trackerd
fdfs_upload_appender
fdfs_upload_file
stop.sh
restart.sh

⑤ The bin directory set by the FastDFS service script is /usr/local/bin, but the actual commands are installed under /usr/bin/.

Two ways:

》 First, modify the command path in the FastDFS service scripts, that is, change /usr/local/bin to /usr/bin in the two scripts /etc/init.d/fdfs_storaged and /etc/init.d/fdfs_trackerd.

# vim /etc/init.d/fdfs_trackerd
Use the search and replace command to change all occurrences: :%s+/usr/local/bin+/usr/bin
# vim /etc/init.d/fdfs_storaged
Use the search and replace command to change all occurrences: :%s+/usr/local/bin+/usr/bin

》 The second is to create a soft link from /usr/bin to /usr/local/bin. I use this method.

# ln -s /usr/bin/fdfs_trackerd /usr/local/bin
# ln -s /usr/bin/fdfs_storaged /usr/local/bin
# ln -s /usr/bin/stop.sh /usr/local/bin
# ln -s /usr/bin/restart.sh /usr/local/bin

3. Configure FastDFS Tracker

For detailed description of the configuration file, please refer to: Detailed description of FastDFS configuration file

① Go to /etc/fdfs, copy the FastDFS tracker sample configuration file tracker.conf.sample, and rename it to tracker.conf.

# cd /etc/fdfs
# cp tracker.conf.sample tracker.conf
# vim tracker.conf

② Edit tracker.conf. Check the settings listed below and change them where needed; everything else can be left at the default.

# Whether this configuration file is disabled; false means it is in effect
disabled=false
# Port that provides the service
port=22122
# Tracker data and log directory (the root directory must exist, subdirectories will be created automatically)
base_path=/ljzsg/fastdfs/tracker
# HTTP service port
http.server_port=80

③ Create the tracker basic data directory, that is, the directory corresponding to base_path

# mkdir -p /ljzsg/fastdfs/tracker

④ Open the tracking port in the firewall (default 22122)

# vim /etc/sysconfig/iptables
Add the following port line:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22122 -j ACCEPT
Restart the firewall:
# service iptables restart

⑤ Start Tracker

When started successfully for the first time, two directories, data and logs, will be created under /ljzsg/fastdfs/tracker/ (the configured base_path).

You can start it this way
# /etc/init.d/fdfs_trackerd start
You can also start it in this way, provided that a soft link is created above, and this method is used later
# service fdfs_trackerd start

Check whether the FastDFS Tracker has started. If port 22122 is being listened on, the Tracker service has been installed successfully.

# netstat -unltp|grep fdfs

Command to stop the Tracker:

# service fdfs_trackerd stop

⑥ Set Tracker to start at boot

# chkconfig fdfs_trackerd on

or:
# vim /etc/rc.d/rc.local
Add configuration:
/etc/init.d/fdfs_trackerd start

⑦ Tracker server directory and file structure

After the Tracker service is successfully started, two directories, data and logs, will be created under base_path. The directory structure is as follows:

${base_path}
 |__data
 | |__storage_groups.dat: storage group information
 | |__storage_servers.dat: storage server list
 |__logs
 | |__trackerd.log: tracker server log file

4. Configure FastDFS Storage

① Enter the /etc/fdfs directory, copy the FastDFS storage sample configuration file storage.conf.sample, and rename it to storage.conf

# cd /etc/fdfs
# cp storage.conf.sample storage.conf
# vim storage.conf

② Edit storage.conf

Check the settings listed below and change them where needed; everything else can be left at the default.

# Whether this configuration file is disabled; false means it is in effect
disabled=false

#Specify the group (volume) where this storage server is located
group_name=group1

# Storage server service port
port=23000

# Heartbeat interval, in seconds (this refers to actively sending heartbeats to the tracker server)
heart_beat_interval=30

# Storage data and log directory address (the root directory must exist, subdirectories will be automatically generated)
base_path=/ljzsg/fastdfs/storage

# The storage server supports multiple paths when storing files. The number of base paths for storing files is configured here, usually only one directory is configured.
store_path_count=1


# Configure store_path_count paths one by one, with index numbers based on 0.
# If store_path0 is not configured, it will be the same as the path corresponding to base_path.
store_path0=/ljzsg/fastdfs/file

# FastDFS uses two levels of directories when storing files. The number of directories per level is configured here.
# If this parameter is N (e.g. 256), the storage server will automatically create N * N subdirectories for storing files under store_path when it is first run.
subdir_count_per_path=256

# List of tracker_servers, will actively connect to tracker_server
# When there are multiple tracker servers, write a line for each tracker server
tracker_server=file.ljzsg.com:22122

# The time period in which synchronization is allowed (the default is all day). This is usually set to keep synchronization out of peak hours.
sync_start_time=00:00
sync_end_time=23:59

# Access port
http.server_port=80

③ Create a Storage basic data directory, corresponding to the base_path directory

# mkdir -p /ljzsg/fastdfs/storage

# This is the configured store_path0 path
# mkdir -p /ljzsg/fastdfs/file

④ Open the storage port in the firewall (default 23000)

# vim /etc/sysconfig/iptables

Add the following port line:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 23000 -j ACCEPT

Restart the firewall:
# service iptables restart

⑤ Start Storage

Make sure the Tracker is running before starting Storage. If the first startup is successful, two directories, data and logs, will be created under the /ljzsg/fastdfs/storage directory.

You can start it this way
# /etc/init.d/fdfs_storaged start

You can also start it this way; this is the method used from here on.
# service fdfs_storaged start

Check whether Storage has started. If port 23000 is being listened on, Storage has started successfully.

# netstat -unltp|grep fdfs

Command to stop Storage:

# service fdfs_storaged stop

Check whether Storage and Tracker are communicating:

# /usr/bin/fdfs_monitor /etc/fdfs/storage.conf

⑥ Set Storage to start at boot

# chkconfig fdfs_storaged on

or:
# vim /etc/rc.d/rc.local
Add configuration:
/etc/init.d/fdfs_storaged start

⑦ Storage Directory

Similar to Tracker, after Storage is successfully started, data and logs directories are created under base_path to record the information of Storage Server.

Under the store_path0 directory (in its data subdirectory), N*N subdirectories are created, where N is the subdir_count_per_path value configured above.
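For a feel of the resulting layout, here is a small Java sketch that computes how many leaf directories exist and what one of them is called. It assumes the directories are named with two uppercase hexadecimal digits per level (00 through FF for N = 256), which matches paths such as M00/00/00 seen in file IDs; the class StoreSubdirs is only an illustration.

```java
/** Sketch: the two-level directory layout under store_path0/data. */
final class StoreSubdirs {

    /** N first-level directories, each holding N second-level directories. */
    static long leafDirCount(int subdirCountPerPath) {
        return (long) subdirCountPerPath * subdirCountPerPath;
    }

    /** Relative path of one leaf directory; the two-digit hex naming is an assumption. */
    static String leafDir(int first, int second) {
        return String.format("data/%02X/%02X", first, second);
    }

    public static void main(String[] args) {
        System.out.println(leafDirCount(256)); // 256 * 256 = 65536
        System.out.println(leafDir(0, 255));   // data/00/FF
    }
}
```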

5. File upload test

① Modify the client configuration file on the Tracker server

# cd /etc/fdfs
# cp client.conf.sample client.conf
# vim client.conf

Just modify the following configuration and keep the others as default.

# Client data and log directory
base_path=/ljzsg/fastdfs/client

# Tracker server address and port
tracker_server=file.ljzsg.com:22122

② Upload test

Execute the following command in Linux to upload the namei.jpeg picture

# /usr/bin/fdfs_upload_file /etc/fdfs/client.conf namei.jpeg

After successful upload, the file ID number is returned: group1/M00/00/00/wKgz6lnduTeAMdrcAAEoRmXZPp870.jpeg

The returned file ID is composed of the group, storage directory, two-level subdirectories, fileid, and file suffix (specified by the client, mainly used to distinguish file types).
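The structure of the file ID can be shown with a small Java parser. This is my own sketch based only on the format described above, not code from the FastDFS client; the class FileId and its field names are hypothetical.

```java
/** Sketch: split a FastDFS file ID into the parts described in the text. */
final class FileId {
    final String group;     // e.g. group1
    final String storePath; // storage directory, e.g. M00
    final String subdirs;   // two-level subdirectories, e.g. 00/00
    final String name;      // generated name without the suffix
    final String extension; // file suffix, empty if there is none

    private FileId(String group, String storePath, String subdirs, String name, String extension) {
        this.group = group;
        this.storePath = storePath;
        this.subdirs = subdirs;
        this.name = name;
        this.extension = extension;
    }

    static FileId parse(String fileId) {
        String[] parts = fileId.split("/", 5);
        if (parts.length != 5) {
            throw new IllegalArgumentException("unexpected file ID: " + fileId);
        }
        String file = parts[4];
        int dot = file.lastIndexOf('.');
        String name = dot < 0 ? file : file.substring(0, dot);
        String extension = dot < 0 ? "" : file.substring(dot + 1);
        return new FileId(parts[0], parts[1], parts[2] + "/" + parts[3], name, extension);
    }

    /** The file ID without the group name, e.g. M00/00/00/xxx.jpeg. */
    String remoteFilename() {
        String suffix = extension.isEmpty() ? "" : "." + extension;
        return storePath + "/" + subdirs + "/" + name + suffix;
    }

    public static void main(String[] args) {
        FileId id = FileId.parse("group1/M00/00/00/wKgz6lnduTeAMdrcAAEoRmXZPp870.jpeg");
        System.out.println(id.group + " | " + id.storePath + " | " + id.subdirs
                + " | " + id.name + " | " + id.extension);
    }
}
```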

3. Install Nginx

The file was uploaded successfully above, but we cannot download it over HTTP yet. Therefore, Nginx is installed as a server to support HTTP access to files. The Nginx environment is also required later to install the Nginx module of FastDFS.

Nginx only needs to be installed on the server where StorageServer is located to access files. Since it is a single machine here, TrackerServer and StorageServer are on the same server.

1. Install the environment required for nginx

① gcc installation

# yum install gcc-c++

② PCRE pcre-devel installation

# yum install -y pcre pcre-devel

③ zlib installation

# yum install -y zlib zlib-devel

④ OpenSSL installation

# yum install -y openssl openssl-devel

2. Install Nginx

① Download nginx

# wget -c https://nginx.org/download/nginx-1.12.1.tar.gz

② Decompression

# tar -zxvf nginx-1.12.1.tar.gz
# cd nginx-1.12.1

③ Use default configuration

# ./configure

④ Compile and install

# make
# make install

⑤ Start nginx

# cd /usr/local/nginx/sbin/
# ./nginx

Other commands
# ./nginx -s stop
# ./nginx -s quit
# ./nginx -s reload

⑥ Set nginx to start at boot

# vim /etc/rc.local

Add a line:
/usr/local/nginx/sbin/nginx

# Set execution permissions
# chmod 755 /etc/rc.local

⑦ Check the version and module of nginx

# /usr/local/nginx/sbin/nginx -V

⑧ Open Nginx port in the firewall (default 80)

After adding, you can access it using port 80 on this machine.

# vim /etc/sysconfig/iptables

Add the following port line:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT

Restart the firewall:
# service iptables restart 

3. Access files

Simple test access file

① Modify nginx.conf

# vim /usr/local/nginx/conf/nginx.conf

Add the following line to map /group1/M00 to /ljzsg/fastdfs/file/data
location /group1/M00 {
 alias /ljzsg/fastdfs/file/data;
}

# Restart nginx
# /usr/local/nginx/sbin/nginx -s reload

② Open the previously uploaded image in the browser; it should load successfully.

http://file.ljzsg.com/group1/M00/00/00/wKgz6lnduTeAMdrcAAEoRmXZPp870.jpeg

4. Configure the FastDFS Nginx Module

1. Install and configure Nginx module

① fastdfs-nginx-module module description

FastDFS stores files on the Storage server through the Tracker server, but files need to be copied between the storage servers in the same group, which causes synchronization delays.

Assume that the Tracker directs an upload to the storage server 192.168.51.128, and the file ID is returned to the client after the upload succeeds.

FastDFS will then synchronize this file to 192.168.51.129, the other storage server in the same group. If the client uses the file ID to fetch the file from 192.168.51.129 before the copy has finished, it gets an error saying the file cannot be accessed.

The fastdfs-nginx-module can redirect the request to the source server to retrieve the file, avoiding access errors caused by replication delay.

② Download fastdfs-nginx-module and decompress it

# Why is there such a long string here? Because there are some version issues between the latest version of master and the current nginx.
# wget https://github.com/happyfish100/fastdfs-nginx-module/archive/5e5f3566bbfa57418b5506aaefbe107a42c9fcb1.zip

# unzip 5e5f3566bbfa57418b5506aaefbe107a42c9fcb1.zip

# Rename
# mv fastdfs-nginx-module-5e5f3566bbfa57418b5506aaefbe107a42c9fcb1 fastdfs-nginx-module-master

③ Configure Nginx

Adding modules to nginx

# Stop the nginx service first
# /usr/local/nginx/sbin/nginx -s stop

# Enter the decompressed nginx source directory
# cd /softpackages/nginx-1.12.1/

# Add the module
# ./configure --add-module=../fastdfs-nginx-module-master/src

# Recompile and install
# make && make install

④ View Nginx modules

# /usr/local/nginx/sbin/nginx -V

If the configure arguments in the output include --add-module=../fastdfs-nginx-module-master/src, the module has been added successfully.

⑤ Copy the configuration file in the fastdfs-nginx-module source code to the /etc/fdfs directory and modify

# cd /softpackages/fastdfs-nginx-module-master/src

# cp mod_fastdfs.conf /etc/fdfs/

Modify the following settings; leave the others at their defaults.

# Connection timeout
connect_timeout=10

# Tracker Server
tracker_server=file.ljzsg.com:22122

# StorageServer default port
storage_server_port=23000

# If the uri of the file ID contains /group**, set it to true
url_have_group_name = true

# The store_path0 path of the Storage configuration; it must be consistent with the one in storage.conf
store_path0=/ljzsg/fastdfs/file

⑥ Copy some FastDFS configuration files to the /etc/fdfs directory

# cd /softpackages/fastdfs-5.05/conf/

# cp anti-steal.jpg http.conf mime.types /etc/fdfs/

⑦ Configure nginx and modify nginx.conf

# vim /usr/local/nginx/conf/nginx.conf

Modify the following configuration; leave the rest at the defaults.

Add the fastdfs-nginx module to the server block that listens on port 80

location ~/group([0-9])/M00 {
 ngx_fastdfs_module;
}

Notice:

The listen 80 port value should correspond to http.server_port=80 in /etc/fdfs/storage.conf (changed to 80 earlier). If you change to another port, you need to unify it and open the port in the firewall.

For the location configuration: if there are multiple groups, use location ~/group([0-9])/M00. If the file URLs do not contain a group name, the group part can be left out.

⑧ Create a soft link under the /ljzsg/fastdfs/file file storage directory and link it to the directory where the data is actually stored. This step can be omitted.

# ln -s /ljzsg/fastdfs/file/data/ /ljzsg/fastdfs/file/data/M00

⑨ Start nginx

# /usr/local/nginx/sbin/nginx

The configuration is successful if nginx starts without errors and prints the startup message of the FastDFS module.

⑩ Visit the file URL in the browser's address bar.

If you can download the file, the installation is successful. Note that unlike the plain nginx path mapping used in section 3, the fastdfs-nginx-module is configured here, so requests can be redirected to the source server to retrieve files.

http://file.ljzsg.com/group1/M00/00/00/wKgz6lnduTeAMdrcAAEoRmXZPp870.jpeg

Final deployment structure diagram (stolen picture): You can build the environment according to the following structure.

5. Java Client

Now that the file system platform has been built, we need to write the client code to implement uploading and downloading in the system. Here is just a simple test code.

1. First, you need to build the FastDFS client Java development environment

① Use Maven for dependency management in the project. You can introduce the following dependencies in pom.xml:

<dependency>
 <groupId>net.oschina.zcx7878</groupId>
 <artifactId>fastdfs-client-java</artifactId>
 <version>1.27.0.0</version>
</dependency>

For other methods, refer to the official documentation: https://github.com/happyfish100/fastdfs-client-java

② Import configuration files

You can directly copy fastdfs-client.properties.sample or fdfs_client.conf.sample in the package to your project and remove .sample.

Here I copy the configuration in fastdfs-client.properties.sample directly into the project configuration file config.properties and modify tracker_servers. The client then only needs to load this configuration file.

2. Client API

Personal packaged FastDFS Java API is synchronized to github: https://github.com/bojiangzhou/lyyzoo-fastdfs-java.git

6. Permission Control

Previously, nginx was used to support http file access, but everyone can directly access this file server, so some permission control is done.

Permission control in FastDFS works by enabling token verification on the server side. The client generates the token from the file name, the current Unix timestamp, and the secret key. The file can then be accessed over HTTP by adding the token parameters to the address.

① Enable token verification on the server

Modify http.conf
# vim /etc/fdfs/http.conf

# Set to true to enable token verification
http.anti_steal.check_token=true

# Token expiration time, in seconds
http.anti_steal.token_ttl=1800

# Secret key; it should be consistent with fastdfs.http_secret_key in the client configuration file
http.anti_steal.secret_key=FASTDFS1234567890

# Page returned if the token check fails
http.anti_steal.token_check_fail=/ljzsg/fastdfs/page/403.html

Remember to restart the service.

② Configure the client

The client only needs to set the following two parameters, and the keys on both sides remain consistent.

# Token anti-hotlink function
fastdfs.http_anti_steal_token=true
# Key
fastdfs.http_secret_key=FASTDFS1234567890

③ The client generates a token

To access the file, you need to bring the generated token and the Unix timestamp, so the returned token is the concatenation of the token and the timestamp.

After that, you can access it by concatenating the token to the address: file.ljzsg.com/group1/M00/00/00/wKgzgFnkaXqAIfXyAAEoRmXZPp878.jpeg?token=078d370098b03e9020b82c829c205e1f&ts=1508141521

/**
  * Get the token to access the server and add it to the address
  *
  * @param filepath file path, e.g. group1/M00/00/00/wKgzgFnkTPyAIAUGAAEoRmXZPp876.jpeg
  * @param httpSecretKey key
  * @return return token, such as: token=078d370098b03e9020b82c829c205e1f&ts=1508141521
  */
 public static String getToken(String filepath, String httpSecretKey){
  // unix seconds
  int ts = (int) Instant.now().getEpochSecond();
  //token
  String token = "null";
  try {
   token = ProtoCommon.getToken(getFilename(filepath), ts, httpSecretKey);
  } catch (UnsupportedEncodingException e) {
   e.printStackTrace();
  } catch (NoSuchAlgorithmException e) {
   e.printStackTrace();
  } catch (MyException e) {
   e.printStackTrace();
  }

  StringBuilder sb = new StringBuilder();
  sb.append("token=").append(token);
  sb.append("&ts=").append(ts);

  return sb.toString();
 }

④ Notes

If the generated token verification fails, please perform the following two checks:
A. Confirm that the file ID passed to the token generation function (ProtoCommon.getToken) does not contain the group name. The file ID should be passed in the following format: M00/00/00/wKgzgFnkTPyAIAUGAAEoRmXZPp876.jpeg

B. Confirm that the clocks of the client and the server are basically consistent. They must not differ by much; a difference on the order of minutes is already too large.
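To show both checks in one place, here is a self-contained Java sketch that strips the group name and builds a signed URL using only the JDK. It assumes the token is the lowercase hex MD5 of the file name (without the group), the secret key, and the timestamp concatenated in that order, which is my understanding of what ProtoCommon.getToken does; verify it against the client library before relying on it. The class AntiStealToken and its methods are hypothetical names, and UTF-8 encoding is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch: build a token-protected URL for a FastDFS file. */
final class AntiStealToken {

    /** Check A: drop the leading "group1/" so that only M00/00/00/... remains. */
    static String stripGroup(String fileId) {
        int slash = fileId.indexOf('/');
        return slash < 0 ? fileId : fileId.substring(slash + 1);
    }

    /** Assumed scheme: hex MD5 over remoteFilename + secretKey + ts. */
    static String token(String remoteFilename, long ts, String secretKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                    (remoteFilename + secretKey + ts).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Check B: ts is the current Unix time in seconds and must be close to the server clock. */
    static String signedUrl(String host, String fileId, long ts, String secretKey) {
        return "http://" + host + "/" + fileId
                + "?token=" + token(stripGroup(fileId), ts, secretKey)
                + "&ts=" + ts;
    }

    public static void main(String[] args) {
        long ts = System.currentTimeMillis() / 1000L;
        System.out.println(signedUrl("file.ljzsg.com",
                "group1/M00/00/00/wKgzgFnkTPyAIAUGAAEoRmXZPp876.jpeg", ts, "FASTDFS1234567890"));
    }
}
```

Note that the URL keeps the full file ID including the group name; only the value fed into the token omits it.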

⑤ By comparison, if the files in your system are highly private, you can access them directly through the API provided by fastdfs-client, with no need to configure Nginx for HTTP access. The main purpose of configuring Nginx is fast access to server files (such as pictures). If permission verification is also required and the client has to generate a token for every request, that largely defeats the purpose.

The point is, I didn't find out how FastDFS can add token verification to some resources and open them partially. If anyone knows, please leave a message.

OK, the above is the process of using FastDFS to build a file system and upload and download in a single machine.
