A few days ago, I received an email from Google informing me that I needed to migrate my Universal Analytics properties to Google Analytics 4. The basic problem is simple: GA4 doesn’t seem to be supported by the theme I use for my blog. However, many other considerations emerged as well.

Ethical Considerations

Google Analytics is a service that allows tracking of users who visit a website, collecting information such as the browser used, operating system, geographic location, etc. While this data is useful for gaining insight into who visits my site, I also realize that it’s not ethical to collect this data for Google without user consent (yes, I haven’t put up the banner yet, you can call the police). For this reason, I’ve decided to remove Google Analytics from my site.

GoAccess

Still, I wanted to keep track of how many people read my blog and the sites connected to it. I evaluated a few options and settled on GoAccess, an open-source tool that analyzes web server logs and generates statistics in real time. In my case, the logs in question are those of nginx.

This allows me to analyze and store access data from a log that is already generated on its own, without having to forward the data to third parties.
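
To make "analyzing the log" a bit more concrete, here is a minimal sketch of the kind of aggregation involved. It is not how GoAccess works internally; the three log lines are made up, and counting distinct IPs is only a rough stand-in for a real "unique visitors" metric.

```shell
#!/bin/sh
# Made-up sample lines in nginx's default "combined" log format.
sample_log='203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /posts/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
198.51.100.23 - - [10/Oct/2023:14:02:11 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"'

# Total requests: one log line per request.
hits=$(printf '%s\n' "$sample_log" | wc -l | tr -d ' ')

# Visitors, approximated here by distinct client IPs (field 1).
visitors=$(printf '%s\n' "$sample_log" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' ')

# Requests that ended in a 404 (the status code is field 9 in this format).
not_found=$(printf '%s\n' "$sample_log" | awk '$9 == 404 { n++ } END { print n + 0 }')

echo "hits=$hits visitors=$visitors not_found=$not_found"
```

Everything here comes from data nginx already writes to disk, which is the point: nothing has to be sent to a third party.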

How it works

In this setup, GoAccess runs as a service: it generates an HTML page and keeps it updated in real time by pushing new data over a WebSocket server that GoAccess itself runs. This guide covers the nginx configuration and the configuration of GoAccess as a systemd service.

Configuration

I use Debian 11 and installed the version packaged in its repository; for other options, see the official guide.

Installation

sudo apt install goaccess

First, edit the GoAccess configuration in /etc/goaccess/goaccess.conf and uncomment the lines for the date format, the time format, and the log format. My configuration is specific to my setup, so check it against yours; the important thing is that at least these lines are present:

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
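
As a rough illustration of how those specifiers line up with a log line, here is a sketch that pulls the same fields out of a made-up line in nginx's default combined format, using plain shell string operations. It only mimics the mapping; GoAccess does its own parsing.

```shell
#!/bin/sh
# Hypothetical sample line in nginx's default "combined" format.
line='203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"'

# %h: client address, everything before the first space.
host=${line%% *}

# %d and %t sit inside the brackets, separated by the first colon.
stamp=${line#*\[}        # drop everything up to and including "["
stamp=${stamp%%\]*}      # drop "]" and everything after it
date_part=${stamp%%:*}   # %d, matches date-format %d/%b/%Y
rest=${stamp#*:}
time_part=${rest%% *}    # %t, matches time-format %H:%M:%S

# "%r": the request line is the first double-quoted field.
request=${line#*\"}
request=${request%%\"*}

# %s and %b follow the closing quote of the request.
after=${line#*\" }
status=${after%% *}      # %s
after=${after#* }
bytes=${after%% *}       # %b

printf '%s\n' "host=$host" "date=$date_part" "time=$time_part" \
    "request=$request" "status=$status" "bytes=$bytes"
```

The two remaining quoted fields in the line correspond to "%R" (referrer) and "%u" (user agent).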

Nginx configuration

Getting straight to the point: we want an HTML page that shows, in real time, the statistics read from access.log. First, add WebSocket upgrade handling to nginx by editing /etc/nginx/nginx.conf and adding the following map inside the existing http block:

http {
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }
}
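
If the map syntax is unfamiliar, its logic can be read as the small function below. This is only an illustration of the lookup; the function name is made up and nothing here talks to nginx.

```shell
#!/bin/sh
# Mirrors the nginx map: an empty Upgrade header yields "close",
# any other value falls through to the default, "upgrade".
connection_upgrade() {
    if [ -z "$1" ]; then
        echo close
    else
        echo upgrade
    fi
}

connection_upgrade ""            # plain HTTP request
connection_upgrade "websocket"   # client asking to switch protocols
```

The result is what nginx later sends upstream as the Connection header, so ordinary requests are closed normally while WebSocket handshakes are passed through.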

Create the directory that will contain the HTML files generated by GoAccess:

sudo mkdir -p /var/www/html/goaccess

Then edit /etc/nginx/sites-available/mydomain. I serve the report at mydomain/analytic and the WebSocket at mydomain/ws, so I add:

upstream gwsocket {
     server 127.0.0.1:7890;
}

server {
    location /analytic/ {
        alias /var/www/html/goaccess;
        try_files $uri/report.html =404;

        location ~ ^/analytic/(.*)/(.*)\.html$ {
            alias /var/www/html/goaccess/goaccess_files/$1/$2.html;
        }
    }
    location /ws {
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_pass http://gwsocket;
        proxy_buffering off;
        proxy_read_timeout 7d;
    }
}

That completes the web part. Now test the configuration:

sudo nginx -t

If everything is okay, let’s restart Nginx:

sudo systemctl restart nginx

GoAccess configuration as a service

Now create the service that reads the log and generates the HTML files. Create the file /etc/systemd/system/goaccess.service:

[Unit]
Description=GoAccess

[Service]
Type=simple
ExecStart=/usr/bin/goaccess -f /var/log/nginx/access.log \
          --real-time-html --ws-url=wss://mysite:443/ws \
          -o /var/www/html/goaccess/report.html --port=7890 \
          --config-file=/etc/goaccess/goaccess.conf 
ExecStop=/bin/kill ${MAINPID}
PrivateTmp=false
RestartSec=1800
User=root
Group=root
Restart=always

[Install]
WantedBy=multi-user.target

NOTE: Some guides include the -g option, which gives me an error and which I could not find in the official documentation. There is also the --origin option, which restricts WebSocket access to a given origin, but I couldn’t get it to work, so I’ve left it out.

Now reload systemd so that it picks up the new unit file:

sudo systemctl daemon-reload

And we start it:

sudo systemctl start goaccess

If everything goes well, enable the service to start on boot:

sudo systemctl enable goaccess

For safety, check if the service is active:

sudo systemctl status goaccess
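
Beyond the service status, it can be worth confirming that GoAccess's WebSocket server is actually listening on port 7890. The helper below is a sketch with a made-up name: it reads `ss -ltn` output on stdin, and the sample text stands in for real output so the example runs anywhere.

```shell
#!/bin/sh
# Reads `ss -ltn` output on stdin; succeeds if a listener is bound to PORT.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Made-up sample of `ss -ltn` output.
sample='State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      128          0.0.0.0:7890      0.0.0.0:*
LISTEN 0      511          0.0.0.0:443       0.0.0.0:*'

if printf '%s\n' "$sample" | port_listening 7890; then
    echo "WebSocket port is listening"
else
    echo "nothing is listening on that port"
fi
# On the server itself: ss -ltn | port_listening 7890
```

If the port is missing, the page will still load but the statistics will not update, because the browser cannot open the WebSocket behind /ws.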

Done

That’s it. Now, by visiting mysite/analytic/, you should see the GoAccess page updating in real time. For reference, here’s the page with my configuration: https://halon.cc/analytic/.

Conclusion

I’m currently content with the setup; however, there are still some unresolved matters:

  • nginx rotates access.log on a fixed schedule. Extending the rotation period to a month is possible, but keeping one very large log for that long can cause problems of its own. I’m looking for a way to retain historical data without compromising the system’s stability. In theory, I should preserve some form of summarized data (which GoAccess generates), but I’m still working out how to do that.

  • It’s less than ideal to have this page publicly accessible on the web. I need a protection mechanism that doesn’t require a login, perhaps something token-based.
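
On the first point, one option that looks promising but that I have not tested is GoAccess's on-disk store (available from version 1.4): --persist saves the parsed data, --restore loads it again on the next run, and --db-path sets where it lives. The sketch below only assembles and prints such a command; the DB_DIR location is an assumption, and I have not verified how these flags behave together with --real-time-html.

```shell
#!/bin/sh
LOG=/var/log/nginx/access.log
DB_DIR=/var/lib/goaccess    # hypothetical directory for the on-disk store
REPORT=/var/www/html/goaccess/report.html

# Assemble the command as a string so it can be reviewed before it is run.
cmd="goaccess -f $LOG --restore --persist --db-path=$DB_DIR"
cmd="$cmd -o $REPORT --config-file=/etc/goaccess/goaccess.conf"

echo "$cmd"
```

The idea would be that each run restores the earlier totals before parsing the current log, so rotating access.log no longer discards history.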

If you have questions or concerns about any of the above, please ask below. I’ll incorporate improvements and corrections into the guide.

Warm regards!