Detailed explanation of how NGINX counts the website's PV, UV, and independent IP

Anyone who runs a website regularly needs to check its PV, UV, and other traffic figures. Note that if the site sits behind a CDN, the local nginx log only records the requests the CDN forwards to the origin server, so it no longer reflects real traffic. The following shows how to compute these statistics from the nginx access log.

Concepts:

  • UV (Unique Visitor): the number of distinct visitors to the site within one day (00:00-24:00). Each client computer is identified by its cookie, and the same cookie is counted only once per day.
  • PV (Page View): the number of page views or clicks. Every request a user makes to the site is recorded once, so repeated visits to the same page accumulate.
  • Independent IP: the number of distinct IP addresses within one day (00:00-24:00). The same IP address is counted only once. This is the figure that people doing website optimization care about most.

First, the environment: nginx 1.7 in front of Tomcat backends that run a dynamic, interactive application. User authentication is required; a purely static page sets no cookie, so the cookie value cannot be captured and $http_cookie is empty.

nginx log file configuration

http {
    include mime.types;
    default_type application/octet-stream;

    # "User_Cookie:" is a literal label in the log line; $guid is a variable
    # that is set in the server block below. To log the complete cookie
    # header instead, use $http_cookie in the log format.
    log_format main '$remote_addr - [$time_local] "$request" '
                    ' - $status "User_Cookie:$guid" ';

    sendfile on;
    keepalive_timeout 65;

    upstream backserver {
        ip_hash;
        server 1.1.2.2:8080;
        server 1.1.2.3:8080;
    }

    server {
        listen 80;
        server_name localhost;

        # To capture the entire cookie header instead, use:
        # if ($http_cookie ~* "(.*)$") {
        # Match only the CSID cookie; $1 holds the regular expression capture.
        if ($http_cookie ~* "CSID=([A-Z0-9]*)") {
            set $guid $1;
        }

        access_log logs/host.access.log main;

        location ~* ^(.*)$ {
            #limit_req zone=allips burst=1 nodelay;

            proxy_pass http://backserver;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header REMOTE-HOST $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            client_max_body_size 8m;
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
}

Note: $http_cookie holds the raw Cookie request header, that is, all cookies as name=value pairs separated by ";".
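To make the ";"-separated layout concrete, here is a minimal shell sketch that pulls one named cookie out of such a header string, roughly what the CSID regular expression does inside nginx. The function name cookie_value is hypothetical, and the sample header simply combines cookie values that appear elsewhere in this article.

```shell
# cookie_value NAME HEADER
# Prints the value of cookie NAME from a raw Cookie header string
# ("name1=value1; name2=value2"), or nothing if the cookie is absent.
# The function name is an illustrative assumption, not part of nginx.
cookie_value() {
    printf '%s\n' "$2" |
        tr ';' '\n' |          # one name=value pair per line
        sed 's/^ *//' |        # drop the space that follows each ";"
        awk -F= -v name="$1" '
            $1 == name { print substr($0, length(name) + 2); exit }'
}

cookie_value CSID 'city=beijing; CSID=7F00000122A5597C46607B1C0A7EC016'
```

Unlike the nginx regular expression, this sketch matches the cookie name exactly and case-sensitively, so a cookie such as XCSID would not be picked up by mistake.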

Log output format

192.168.40.2 - [02/Nov/2016:15:44:35 +0800] "GET /wcm/app/main/refresh.jsp?r=1478072325778 HTTP/1.1" - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
192.168.40.2 - [02/Nov/2016:15:44:35 +0800] "GET /webpic/W0201611/W020161102/W020161102566715167404.jpg HTTP/1.1" - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
119.255.31.109 - [02/Nov/2016:15:44:36 +0800] "GET /wcm/app/main/refresh.jsp?r=1478072510132 HTTP/1.1" - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
119.255.31.109 - [02/Nov/2016:15:44:36 +0800] "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1" - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800] "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1" - 200 "User_Cookie:7F00000123D3BF2345115EAAC21F71E0"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800] "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1" - 200 "User_Cookie:7F00000123EF73896DF98EDA9950944E"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800] "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1" - 200 "User_Cookie:7F00000123FE0F9C397E1A8F0C4F044B"
192.168.40.2 - [02/Nov/2016:15:44:37 +0800] "GET /wcm/app/main/refresh.jsp?r=1478072511427 HTTP/1.1" - 200 "User_Cookie:7F00000123A465B7EA1DE0AF0AE671B7"
119.255.31.109 - [02/Nov/2016:15:44:38 +0800] "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1" - 200 "User_Cookie:7F00000123D89B11302DF80AE773C900"
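The awk commands in the following sections rely on how awk splits each of these lines on whitespace. As a quick check, the sketch below numbers the fields of the first sample line; the helper name number_fields is hypothetical.

```shell
# number_fields [FILE...]
# Prints every whitespace-separated field of each input line together with
# its awk field number. Reads standard input when no file is given.
number_fields() {
    awk '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }' "$@"
}

line='192.168.40.2 - [02/Nov/2016:15:44:35 +0800] "GET /wcm/app/main/refresh.jsp?r=1478072325778 HTTP/1.1" - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"'
printf '%s\n' "$line" | number_fields
```

With this log format, $1 is the client IP, $6 is the request path, $9 is the status code, and $10 is the quoted User_Cookie value. If the log_format is changed, the field numbers used below have to be adjusted to match.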

PV Statistics

To count the visits to a single URL (index.shtml in this example):

[root@localhost logs]# grep index.shtml host.access.log | wc -l
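Instead of checking one URL at a time, the same log can be ranked by path. The following is a sketch under the assumption that the request path is field 6, as in the log format above; the function name top_pages is hypothetical, and the query string is stripped so that refresh.jsp?r=... requests are counted as one page.

```shell
# top_pages N [FILE...]
# Prints the N most requested paths as "count path", most requested first.
# Assumes the request path is awk field 6 (true for the log format above).
top_pages() {
    n=$1
    shift
    awk '{ sub(/\?.*/, "", $6); print $6 }' "$@" |   # path without query string
        sort | uniq -c |                              # count identical paths
        awk '{ print $1, $2 }' |                      # normalise uniq padding
        sort -k1,1nr -k2,2 |                          # highest count first
        head -n "$n"
}

# Usage: top_pages 10 host.access.log
```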

Total PV:

[root@localhost logs]# awk '{print $6}' host.access.log | wc -l
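The command above counts every logged request, including images such as the .jpg request in the sample log. If PV should only cover pages, one possible refinement is to skip static resources. This is a sketch: the function name page_views and the list of file extensions are assumptions that should be adapted to the site.

```shell
# page_views [FILE...]
# Counts requests whose path does not look like a static resource.
# The extension list is an illustrative assumption, not a fixed rule.
page_views() {
    awk '{
        path = $6
        sub(/\?.*/, "", path)          # ignore the query string
        if (tolower(path) !~ /\.(jpg|jpeg|png|gif|css|js|ico)$/) n++
    } END { print n + 0 }' "$@"        # n + 0 prints 0 for an empty log
}

# Usage: page_views host.access.log
```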

Independent IPs

[root@localhost logs]# awk '{print $1}' host.access.log | sort -r | uniq -c | wc -l
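By definition an independent IP is counted once per day, while the command above counts distinct IPs across the whole file. When a log file spans several days, a per-day breakdown can be sketched as follows. The function name ips_per_day is hypothetical, and the date is assumed to sit in field 3 in the form [02/Nov/2016:15:44:35.

```shell
# ips_per_day [FILE...]
# Prints "date count" with the number of distinct client IPs for each day.
# Assumes field 3 looks like "[02/Nov/2016:15:44:35" and field 1 is the IP.
ips_per_day() {
    awk '{
        day = substr($3, 2, 11)                 # "02/Nov/2016"
        if (!seen[day, $1]++) count[day]++      # first time this IP is seen that day
    } END { for (d in count) print d, count[d] }' "$@" |
        sort                                    # stable output, text order only
}

# Usage: ips_per_day host.access.log
```

The final sort only makes the output order stable; because the date starts with the day of the month, the rows are not in chronological order across months.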

UV Statistics

[root@localhost logs]# awk '{print $10}' host.access.log | sort -r | uniq -c | wc -l
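One caveat with the command above: requests that carry no matching cookie all share the same empty User_Cookie field and are therefore counted together as one extra visitor. The sketch below skips them. The function name unique_visitors is hypothetical, and treating both an empty value and "-" as "no cookie" is an assumption about how the empty variable ends up in the log.

```shell
# unique_visitors [FILE...]
# Counts distinct non-empty cookie values found in field 10
# (the quoted "User_Cookie:<value>" part of the log format above).
unique_visitors() {
    awk '{
        id = $10
        gsub(/"/, "", id)                   # strip the surrounding quotes
        sub(/^User_Cookie:/, "", id)        # keep only the cookie value
        if (id != "" && id != "-" && !seen[id]++) uv++
    } END { print uv + 0 }' "$@"
}

# Usage: unique_visitors host.access.log
```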

Cookie Test Page

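To set test cookies, you can use the following HTML page and edit it to add whichever cookies you need. Keep in mind that the nginx configuration above only captures a cookie named CSID, so either name the test cookie accordingly or adjust the regular expression: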

#index.html
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gbk">
<!-- For testing purposes, refresh the page every 10 seconds -->
<meta http-equiv="Refresh" content="10">
</head>
<body>
<h1>test.test.com domain test</h1>
The cookies for this domain are listed below<br>
<p>
<script>
document.cookie="guid=A1UD8E5512451111111111"; // set a cookie (appended to any existing cookies)
document.cookie="city=beijing"; // set another cookie (also appended)
document.write(document.cookie); // list the cookies that now exist
</p>
</body>
</html> 
