Detailed explanation of using Nginx as a reverse proxy for MogileFS distributed storage, with an example

1. Introduction to Distributed Storage System

Information technology keeps advancing, but its convenience comes with growing problems: data volumes keep rising, the relationships between data items grow more complex, access concurrency increases, I/O requirements climb, and data types become more varied. Distributed storage systems address most of these problems.

A distributed storage system stores data across multiple independent devices. Traditional network storage uses a centralized storage server to hold all data, so that server becomes both the performance bottleneck and the single point on which reliability and security depend, and it cannot meet the needs of large-scale storage applications. A distributed storage system instead adopts a scalable architecture: multiple storage servers share the storage load, and location servers track where the data is stored. This improves reliability, availability, and access efficiency, and makes the system easy to expand.

Based on interface type, distributed storage systems can be divided into general distributed storage and dedicated distributed storage. General distributed storage has no file system interface and must be accessed through an API. Dedicated distributed storage, also called a distributed file system, generally provides a file system interface and can be mounted directly. General distributed storage includes MogileFS and FastDFS; dedicated distributed storage includes MooseFS.

2. MogileFS

MogileFS is an open source distributed file storage system, well suited to storing massive numbers of small files. It was developed by Danga Interactive, the company behind LiveJournal, whose team also produced well-known open source projects such as Memcached and Perlbal.

1. Mogilefs architecture diagram:

2. Components that make up MogileFS:

1. Trackers (mogilefsd): The core component of MogileFS. Its main jobs are replicating files between nodes (Replication), deleting files (Deletion), answering metadata queries (Query), monitoring health (Monitor), and recovering from storage failures (Reaper). A Tracker is often called a metadata server, but it does not store metadata itself; the metadata lives in a database such as MySQL. For reliability there are usually multiple Trackers. A Tracker can be seen as a bypass proxy that handles only metadata.

2. Database: Stores the MogileFS metadata, which the Trackers manage. Because every Tracker depends on it, a high-availability (HA) setup is usually recommended.

3. mogstored (storage node): Where the actual files are stored. Usually at least two copies of each file are kept.
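
To make the division of labour between the three components concrete, here is a minimal in-memory sketch in Python. It is a toy model and not MogileFS code: the class names ToyTracker and ToyStorageNode, the fetch helper, and the dictionary standing in for the MySQL metadata database are all invented for illustration.

```python
# Toy model of the read path: tracker -> metadata database -> storage node.
# Every name here is hypothetical; nothing below is a real MogileFS API.

class ToyStorageNode:
    """Stands in for a mogstored node: holds the actual file bytes."""

    def __init__(self, name):
        self.name = name
        self.files = {}              # key -> bytes

    def put(self, key, data):
        self.files[key] = data

    def get(self, key):
        return self.files[key]


class ToyTracker:
    """Stands in for mogilefsd: answers metadata queries only."""

    def __init__(self, metadata_db, nodes):
        self.db = metadata_db        # dict standing in for MySQL: key -> [node names]
        self.nodes = {n.name: n for n in nodes}

    def store(self, key, data, copies=2):
        # Write the file to `copies` nodes and record where it went.
        targets = list(self.nodes.values())[:copies]
        for node in targets:
            node.put(key, data)
        self.db[key] = [node.name for node in targets]

    def locate(self, key):
        # The tracker returns locations; it never serves file content itself.
        return list(self.db[key])


def fetch(tracker, nodes_by_name, key):
    """Client side: ask the tracker where the file is, then read from a node."""
    for name in tracker.locate(key):
        node = nodes_by_name[name]
        if key in node.files:
            return node.get(key)
    raise KeyError(key)


nodes = [ToyStorageNode("n1"), ToyStorageNode("n2"), ToyStorageNode("n3")]
tracker = ToyTracker(metadata_db={}, nodes=nodes)
tracker.store("/111.png", b"png-bytes")
print(tracker.locate("/111.png"))
print(fetch(tracker, {n.name: n for n in nodes}, "/111.png"))
```

The point of the sketch is that the tracker only ever hands out locations, while the file bytes travel directly between the client and a storage node.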

3. Example Demonstration Topology Diagram

In this example, Trackers and mogstored are installed together on three nodes, and MySQL is installed on one of them. In a production environment it is best to deploy MySQL separately with master-slave replication; Trackers and mogstored can also be deployed on separate nodes, depending on the actual environment. The goal here is to demonstrate MogileFS, not MySQL master-slave replication. If you want MogileFS to be mountable, you can use FUSE.

Note that the URLs of files stored in MogileFS are unusual. For example, a stored picture may have a URL in a format similar to 6060/0000/0000/0000/00000021.jpg, which is not user-friendly. Users expect an intuitive URL such as image.hello.com/21.jpg. For this reason, Nginx is usually placed in front of MogileFS as a reverse proxy.
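
As a rough illustration of why the internal paths look so unfriendly, the Python sketch below derives a storage path from a numeric file id. The layout it assumes, /dev<devid>/<d>/<ddd>/<ddd>/<10-digit id>.fid, is an assumption based on paths commonly printed by mogfileinfo and is not taken from this article; check it against the output on your own cluster. The function name fid_path is hypothetical.

```python
def fid_path(devid: int, fid: int) -> str:
    """Build a storage path for a file id.

    Assumed layout (not confirmed by this article):
    /dev<devid>/<d>/<ddd>/<ddd>/<10-digit fid>.fid
    """
    nfid = f"{fid:010d}"                     # zero-pad the file id to 10 digits
    return f"/dev{devid}/{nfid[0]}/{nfid[1:4]}/{nfid[4:7]}/{nfid}.fid"


print(fid_path(1, 21))   # /dev1/0/000/000/0000000021.fid
```

Because the path is derived from an internal id rather than from the name the user chose, a front end has to translate friendly URLs into lookups by key.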

4. System environment and installation

MogileFS is a relatively old but mature distributed storage system. Because there may be compatibility issues on CentOS 7, CentOS 6 is used for this demonstration.

Operating system: CentOS release 6.6

MogileFS: 2.46

Nginx: 1.10

MySQL: 5.1

IP Allocation:

n1:192.168.29.111, n2:192.168.29.112, n3:192.168.29.113, n4:192.168.29.114

The structure is shown in the figure above.

1. Install MySQL, mogilefsd, and mogstored on the n1 node, and configure n1 as Trackers and Storage Node

MySQL is installed directly using yum.

~]# yum install -y mysql mysql-server

Install the Trackers and Storage Node components of Mogilefs. When installing, be sure to install the Perl-related dependency packages, which are:

perl-Danga-Socket-1.61-1.el6.rf.noarch.rpm
perl-IO-stringy-2.110-1.2.el6.rfx.noarch.rpm
perl-Net-Netmask-1.9015-8.el6.noarch.rpm
Perlbal-1.78-1.el6.noarch.rpm
perl-Perlbal-1.78-1.el6.noarch.rpm
Perlbal-doc-1.78-1.el6.noarch.rpm
perl-IO-AIO-3.71-2.el6.x86_64.rpm

The above dependency packages must be installed before installing Mogilefs. Install the components:

yum install -y MogileFS-Server-mogstored-2.46-2.el6.noarch.rpm MogileFS-Server-mogilefsd-2.46-2.el6.noarch.rpm MogileFS-Server-2.46-2.el6.noarch.rpm

Configure MogileFS-Server-mogilefsd:

~]# vim /etc/mogilefs/mogilefsd.conf #Main configuration file for the MogileFS Trackers
# Enable daemon mode to work in background and use syslog
daemonize = 1 #Whether to run as a daemon process
# Where to store the pid of the daemon (must be the same in the init script)
pidfile = /var/run/mogilefsd/mogilefsd.pid #pid file path
# Database connection information
db_dsn = DBI:mysql:mogilefs:host=192.168.29.111 #Database address
db_user = moguser #Database user name
db_pass = 123456 #Database password
# IP:PORT to listen on for mogilefs client requests
listen = 0.0.0.0:7001 #Listening address and port
# Optional, if you don't define the port above.
conf_port = 7001 #Default port
# Number of query workers to start by default.
query_jobs = 10 #Number of query processes
# Number of delete workers to start by default.
delete_jobs = 1
# Number of replicate workers to start by default.
replicate_jobs = 5
# Number of reaper workers to start by default.
# (you don't usually need to increase this)
reaper_jobs = 1
# Number of fsck workers to start by default.
# (these can cause a lot of load when fsck'ing)
#fsck_jobs = 1
# Minimum amount of space to reserve in megabytes
# default: 100
# Consider setting this to be larger than the largest file you
# would normally be uploading.
#min_free_space = 200
# Number of seconds to wait for a storage node to respond.
# default: 2
# Keep this low, so busy storage nodes are quickly ignored.
#node_timeout = 2
# Number of seconds to wait to connect to a storage node.
# default: 2
# Keep this low so overloaded nodes get skipped.
#conn_timeout = 2
# Allow replication to use the secondary node get port,
# if you have apache or similar configured for GET's
#repl_use_get_port = 1
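
Because the same file is later copied to other nodes, it can help to check which settings are actually active once the comments are ignored. The Python sketch below does that, assuming the simple key = value format with # comments seen in the file above. It is an illustration only and not the parser that mogilefsd uses; the function name parse_conf is hypothetical.

```python
import re


def parse_conf(text: str) -> dict:
    """Parse `key = value` lines, ignoring blank lines and `#` comments.

    Caveat: a value that itself contains `#` would be cut short here.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()      # drop trailing or full-line comments
        match = re.match(r"^(\w+)\s*=\s*(.+)$", line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    return settings


sample = """
daemonize = 1            # run as a daemon
db_dsn = DBI:mysql:mogilefs:host=192.168.29.111
listen = 0.0.0.0:7001
#fsck_jobs = 1
"""
conf = parse_conf(sample)
print(conf["listen"])        # 0.0.0.0:7001
print("fsck_jobs" in conf)   # False
```

To check a real node, pass the contents of /etc/mogilefs/mogilefsd.conf to parse_conf and compare the resulting dictionaries across n1, n2, and n3.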

After modifying the configuration, log in to MySQL and create a user that can connect remotely, then use mogdbsetup to initialize the database:

mysql> GRANT ALL ON mogilefs.* TO 'moguser'@'192.168.29.%' IDENTIFIED BY '123456'; #Create user moguser, who has all permissions to manage the mogilefs database and allows users at 192.168.29.* to connect remotely.
mysql> FLUSH PRIVILEGES;
mysql> quit
~]# mogdbsetup --dbhost=127.0.0.1 --dbuser=moguser --dbpass=123456

After initialization is complete, the mogilefs database and its tables are visible in MySQL:

Start mogilefs and confirm that port 7001 is in listening state:

~]# service mogilefsd start
Starting mogilefsd [ OK ]
~]# ss -lnt

Note: You can also run the Trackers service on nodes n2 and n3 to eliminate the single point of failure and spread the I/O load.

3. Configure Storage Node on n1

The path of the Storage Node configuration file is /etc/mogilefs/mogstored.conf:

~]# vim /etc/mogilefs/mogstored.conf
maxconns = 10000 #Maximum number of concurrent connections
httplisten = 0.0.0.0:7500 #MogileFS transfers data over HTTP; this is the listening address and port
mgmtlisten = 0.0.0.0:7501 #Listening address and port for health monitoring
docroot = /mogliefs/mogdata #Data storage path. The group and owner of the directory must be mogilefs

Create a data storage directory and change the group and owner to mogilefs:

~]# mkdir -pv /mogliefs/mogdata
~]# chown -R mogilefs:mogilefs /mogliefs/

Start mogstored and check whether the process is started normally and whether the port is listening:

~]# service mogstored start
~]# ss -lnt #Listening ports are 7500 and 7501

4. Follow the steps of n1 to install Mogilefs on nodes n2 and n3, and copy the configuration file on n1 to n2 and n3.

~]# scp /etc/mogilefs/*.conf [email protected]:/etc/mogilefs/
~]# scp /etc/mogilefs/*.conf [email protected]:/etc/mogilefs/

Start the mogstored service on n2 and n3 and confirm that it is listening:

~]# service mogstored start
~]# ss -lnt #Listening ports are 7500 and 7501
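
Instead of logging in to every node to run ss, the ports can also be probed from one machine. The following Python sketch uses only the standard library; the helper names port_open and check_cluster are hypothetical, and the addresses in the usage note are the ones used in this article.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_cluster(hosts, ports, probe=port_open):
    """Return {(host, port): bool} for every host and port combination."""
    return {(h, p): probe(h, p) for h in hosts for p in ports}
```

For example, check_cluster(["192.168.29.111", "192.168.29.112", "192.168.29.113"], [7500, 7501]) should report True for all six entries once mogstored is running everywhere. A successful connection only shows that something is listening, not that the service is healthy.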

5. Use the mogadm command on n1 to integrate all nodes into a cluster.

Add a storage node and check:

~]# mogadm host add 192.168.29.111 --ip=192.168.29.111 --port=7500 --status=alive
~]# mogadm host add 192.168.29.112 --ip=192.168.29.112 --port=7500 --status=alive
~]# mogadm host add 192.168.29.113 --ip=192.168.29.113 --port=7500 --status=alive
~]# mogadm check

For the storage in the MogileFS cluster to be recognized as separate devices, each node needs a directory named dev<ID> (for example dev1) under the /mogliefs/mogdata directory created earlier. MogileFS places redundant copies on different devices, so each node should be identified as a different device.

Create dev1, dev2, and dev3 directories in the /mogliefs/mogdata/ directory on n1, n2, and n3, respectively, and add devices to Trackers:

~]# mogadm device add 192.168.29.111 1
~]# mogadm device add 192.168.29.112 2
~]# mogadm device add 192.168.29.113 3

6. Create Domain and Class

To simplify the management of file copies across nodes, MogileFS does not manage files individually on each device but groups them into classes. Operations such as replication and deletion are performed with the class as the smallest unit. Each class can hold many files, and its size is not fixed.

All files in MogileFS storage share one flat namespace, so two files cannot have the same name. Because this limits flexibility, MogileFS introduces the Domain (namespace). A Domain contains Classes, and the same file name can exist in different Domains.

~]# mogadm domain add imgs #Create a Domain named imgs
~]# mogadm domain add text #Create a Domain named text
~]# mogadm domain list #View Domain list
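
The effect of a Domain can be pictured as adding the domain name to the lookup key. The Python sketch below is a toy illustration of that idea only; the store dictionary and the put and get helpers are hypothetical and do not reflect how MogileFS stores its metadata.

```python
# Toy illustration of a Domain as a namespace: a file is identified by the
# pair (domain, key), so the same key can exist once per domain.
store = {}


def put(domain: str, key: str, data: bytes) -> None:
    store[(domain, key)] = data


def get(domain: str, key: str) -> bytes:
    return store[(domain, key)]


put("imgs", "/readme", b"image bytes")
put("text", "/readme", b"plain text")     # same key, different domain: no clash

print(get("imgs", "/readme"))   # b'image bytes'
print(get("text", "/readme"))   # b'plain text'
print(len(store))               # 2
```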

You can customize the properties of the Class. The format is: mogadm class add <domain> <class> [opts]

~]# mogadm class add imgs png --mindevcount=3 --hashtype=MD5 #Define a class named png in Domain imgs, keep 3 copies on different devices, and use MD5 for verification
~]# mogadm class add imgs jpg --mindevcount=3 --hashtype=MD5 #Define a class named jpg in Domain imgs, keep 3 copies on different devices, and use MD5 for verification
~]# mogadm domain list
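
To see what --mindevcount=3 implies for a three-node cluster, the following Python sketch picks target devices so that no two copies land on the same host. It is a simplified model written for this article, not the placement logic MogileFS actually runs; the function name pick_devices and the (devid, host) tuples are assumptions.

```python
def pick_devices(devices, mindevcount):
    """Pick `mindevcount` device ids, using at most one device per host.

    `devices` is a list of (devid, host) tuples. Raises ValueError when there
    are not enough distinct hosts to hold the requested number of copies.
    """
    chosen, used_hosts = [], set()
    for devid, host in devices:
        if len(chosen) == mindevcount:
            break
        if host not in used_hosts:
            chosen.append(devid)
            used_hosts.add(host)
    if len(chosen) < mindevcount:
        raise ValueError(
            f"need {mindevcount} distinct hosts, only {len(used_hosts)} available"
        )
    return chosen


devices = [(1, "192.168.29.111"), (2, "192.168.29.112"), (3, "192.168.29.113")]
print(pick_devices(devices, 3))   # [1, 2, 3]
```

With three copies on three nodes, every node holds every file, so in this model the copy count cannot be met again until a failed node returns or a new one is added.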

7. Use Mogilefs to do upload and download tests

MogileFS is accessed through its own API, and it ships with command-line utilities for managing stored data. For example, mogupload uploads data and mogfileinfo shows information about a stored file.

Example: Test uploading the file /test/123.png to the Mogilefs cluster (the file is prepared locally in advance):

~]# mogupload --trackers=192.168.29.111 --domain=imgs --class=png --key='/111.png' --file='/test/123.png' #Upload the 123.png file through Trackers with IP 192.168.29.111, save it to the space with Domain as imgs and Class as png, and rename it to 111.png
~]# mogfileinfo --trackers=192.168.29.111 --domain=imgs --class=png --key='/111.png' #Check the storage status of files with key 111.png in Domain imgs and Class png. 

At this point the MogileFS distributed storage cluster is built. For clients to use it directly, however, you would need to program against the API, which is troublesome. Fortunately, Nginx can act as a reverse proxy in front of it. The following steps demonstrate Nginx reverse proxying MogileFS.
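
To give a feel for what programming against the interface involves, here is a Python sketch that builds a tracker request and parses a reply. It assumes the tracker speaks a line-based text protocol of the form "command key=value&key=value" answered by "OK key=value&..." or "ERR ...", with a get_paths command that returns paths, path1, path2 and so on. That wire format is an assumption and should be verified against the MogileFS documentation; the helper names and the sample reply are made up, and no network connection is opened.

```python
from urllib.parse import urlencode, parse_qs


def build_request(command: str, **args) -> bytes:
    """Encode one request line: '<command> k=v&k=v' followed by CRLF."""
    return f"{command} {urlencode(args)}\r\n".encode("ascii")


def parse_response(line: bytes) -> dict:
    """Decode an 'OK k=v&...' reply into a dict; raise on an 'ERR ...' reply."""
    status, _, rest = line.decode("ascii").strip().partition(" ")
    if status != "OK":
        raise RuntimeError(f"tracker error: {rest}")
    return {k: v[0] for k, v in parse_qs(rest).items()}


def paths_from(reply: dict) -> list:
    """Extract path1..pathN from a get_paths reply (assumed field names)."""
    return [reply[f"path{i}"] for i in range(1, int(reply["paths"]) + 1)]


request = build_request("get_paths", domain="imgs", key="/111.png")
print(request)   # b'get_paths domain=imgs&key=%2F111.png\r\n'

# A made-up reply in the assumed format, not captured from a real tracker.
sample = (
    b"OK paths=2"
    b"&path1=http://192.168.29.112:7500/dev2/0/000/000/0000000005.fid"
    b"&path2=http://192.168.29.111:7500/dev1/0/000/000/0000000005.fid\r\n"
)
print(paths_from(parse_response(sample)))
```

A real client would send the request to port 7001 of a Tracker and then fetch one of the returned paths over HTTP, which is the work the Nginx module takes over.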

5. Nginx reverse proxy for MogileFS

1. Start the mogilefsd service on n2 and n3 so that all three nodes act as Trackers (make sure the configuration file is the same as on n1):

~]# service mogilefsd start

2. Compile and install Nginx on the n4 node

Install dependency packages:

~]# yum install gcc gcc-c++ perl pcre-devel openssl openssl-devel

Download the Nginx source package nginx-1.10.3.tar.gz and the nginx_mogilefs_module package nginx_mogilefs_module-1.0.4.tar.gz, and extract them:

~]# ls
nginx-1.10.3 nginx_mogilefs_module-1.0.4
nginx-1.10.3.tar.gz nginx_mogilefs_module-1.0.4.tar.gz
~]# cd nginx-1.10.3
./configure \
> --prefix=/usr \
> --sbin-path=/usr/sbin/nginx \
> --conf-path=/etc/nginx/nginx.conf \
> --error-log-path=/var/log/nginx/error.log \
> --http-log-path=/var/log/nginx/access.log \
> --pid-path=/var/run/nginx/nginx.pid \
> --lock-path=/var/lock/nginx.lock \
> --user=nginx \
> --group=nginx \
> --with-http_ssl_module \
> --with-http_flv_module \
> --with-http_stub_status_module \
> --with-http_gzip_static_module \
> --http-client-body-temp-path=/var/tmp/nginx/client/ \
> --http-proxy-temp-path=/var/tmp/nginx/proxy/ \
> --http-fastcgi-temp-path=/var/tmp/nginx/fcgi/ \
> --http-uwsgi-temp-path=/var/tmp/nginx/uwsgi \
> --http-scgi-temp-path=/var/tmp/nginx/scgi \
> --with-pcre \
> --with-debug \
> --add-module=../nginx_mogilefs_module-1.0.4/ #Be sure to add the path where the Mogilefs module is located, which is essential.
~]# make && make install

Add the nginx user and start nginx:

~]# useradd -s /sbin/nologin -M nginx
~]# /usr/sbin/nginx

3. Configure Nginx

Single Tracker Example:

location /imgs/ {
   mogilefs_tracker 192.168.29.111:7001; #Single Tracker example
   mogilefs_domain imgs; #Specify Domain
   mogilefs_class png jpg; #Specify Class

   mogilefs_pass { #Transfer-related configuration
    proxy_pass $mogilefs_path;
    proxy_hide_header Content-Type;
    proxy_buffering off;
   }
}

Multiple Trackers Example:

Add an upstream block listing the Trackers to the http section of the nginx configuration:

upstream mogsvr {
    server 192.168.29.111:7001;
    server 192.168.29.112:7001;
    server 192.168.29.113:7001;
}

Add the following to the server configuration section in the nginx configuration:

location /imgs/ {
   mogilefs_tracker mogsvr;
   mogilefs_domain imgs;
   mogilefs_class png jpg;

   mogilefs_pass {
    proxy_pass $mogilefs_path;
    proxy_hide_header Content-Type;
    proxy_buffering off;
   }
}
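
With no balancing method specified, an nginx upstream block distributes requests round-robin across its servers. The short Python sketch below shows the resulting order for the three Trackers; it only illustrates the rotation and leaves out what nginx does about weights and failed servers. The name next_tracker is hypothetical.

```python
from itertools import cycle

# The three Tracker addresses from the upstream block, visited in rotation.
trackers = cycle([
    "192.168.29.111:7001",
    "192.168.29.112:7001",
    "192.168.29.113:7001",
])


def next_tracker() -> str:
    """Return the next Tracker address in round-robin order."""
    return next(trackers)


picked = [next_tracker() for _ in range(4)]
print(picked)
```

The fourth request goes back to the first Tracker, so metadata queries are spread evenly across n1, n2, and n3.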

Restart nginx and access the previously uploaded images through nginx:

Summary:

When uploading files, the error MogileFS::Backend: couldn't connect to mogilefsd backend at /usr/local/share/perl/5.8.4/Client.pm line 282 was encountered. In this case the cause was that the mogilefsd service could not connect to MySQL; checking the connection between the two reveals the problem.
