Viewing web logs the old-fashioned way with Goaccess / Apr 2025
Like many other sites, my website is under a constant barrage of crawling from bots. I need to figure out which one is hosing me, but I am also resisting having third-party trackers of any form. I took a look at hosting a Plausible instance as OCaml does, but it's yet another service to run and maintain. Then Nick Ludlam pointed me to Goaccess, an old-fashioned server-side log analyser with built-in privacy that he's using on his site, and it's perfect for my needs too!
Setting up Goaccess is extremely simple. It's a single binary with no dependencies outside of ncurses, and just needs some server-side logs. I currently use Caddy to front HTTP/2 and HTTP/3 for my custom OCaml webserver, so I just had to configure it to output JSON-format logs:
anil.recoil.org {
    reverse_proxy http://localhost:8080
    encode zstd gzip
    log {
        format json
        output file /var/log/caddy/anil.recoil.org.log {
            roll_size 1gb
            roll_keep 100
        }
    }
}
The above causes Caddy to log lines in a JSON format like this:
{
  "level": "info", "ts": 1745414562.426229,
  "logger": "http.log.access.log0",
  "msg": "handled request",
  "request": {
    "remote_ip": "<snip>", "remote_port": "56839",
    "client_ip": "<snip>", "proto": "HTTP/3.0",
    "method": "GET", "host": "anil.recoil.org",
    "uri": "/assets/home.svg",
    "headers": {
      "Sec-Fetch-Dest": [ "image" ],
      "Sec-Fetch-Site": [ "same-origin" ],
      "Sec-Fetch-Mode": [ "no-cors" ],
      "Priority": [ "u=5, i" ],
      "Accept-Encoding": [ "gzip, deflate, br" ],
      "Accept": [
        "image/webp,image/avif,image/jxl,image/heic,image/heic-sequence,video/*;q=0.8,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5"
      ],
      "User-Agent": [
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3.1 Safari/605.1.15"
      ],
      "Referer": [ "https://anil.recoil.org/" ],
      "Accept-Language": [ "en-GB,en;q=0.9" ]
    },
    "tls": {
      "resumed": false, "version": 772, "cipher_suite": 4865,
      "proto": "h3", "server_name": "anil.recoil.org"
    }
  }, <...etc>
}
While this is a verbose logging format, it compresses very well and has lots of information that can be analysed without the need for any JavaScript. Once the logging is set up, just running goaccess <logfile> spins up a curses configuration screen from which I can select the Caddy log format.
After that, there is a simple interactive terminal dashboard that not only shows the usual analytics, but also fun things like operating system and time-of-access frequency patterns.
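For a quick answer to the "which bot is hosing me" question outside the dashboard, the same JSON lines can be tallied by hand. The following is only a minimal Python sketch, not part of Goaccess: the function name top_user_agents is made up for illustration, and it assumes one JSON object per line with the request.headers layout shown in the sample entry above.

```python
import json
import sys
from collections import Counter


def top_user_agents(lines, n=10):
    """Count requests per User-Agent across Caddy JSON access-log lines.

    Hypothetical helper: assumes each line is one JSON object shaped like
    the sample log entry, with headers stored as lists of values.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        headers = entry.get("request", {}).get("headers", {})
        # Each header maps to a list of values; use the first one,
        # or a placeholder when the client sent no User-Agent at all.
        agents = headers.get("User-Agent") or ["(none)"]
        counts[agents[0]] += 1
    return counts.most_common(n)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python top_agents.py /var/log/caddy/anil.recoil.org.log
    with open(sys.argv[1], encoding="utf-8") as log:
        for agent, hits in top_user_agents(log):
            print(f"{hits:8d}  {agent}")
```

This only scratches the surface of what the dashboard shows, but it is handy for spot checks or for feeding a blocklist script.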
Goaccess can also blank out IP addresses to preserve privacy in the output analytics, and spit out an HTML report, although I'm not using this particular functionality. While Plausible looks like the answer for bigger sites, this simple tool is good enough for me. The very first iteration of this site in 1998 used Analog (written by my former Xen/Docker colleague Stephen Turner), so it's nice to come full circle to this sort of tool again!