March 4, 2015
Deleting NGINX cache puts critical unlink errors in error log
The information in this post is based on FastCGI caching on NGINX 1.4.6 running on Ubuntu Server 14.04 x64. It may or may not be valid for other versions.
After migrating several sites from Apache to NGINX I have grown very fond of its built-in caching capabilities, which work extremely well under most circumstances without much meddling from me.
However, one thing I really can’t do without is the ability to clear the cache myself. The free community edition of NGINX only supports time-based cache expiry (i.e. you can set it up to check if something has changed after an hour, a day, etc.). But what if there is no reliable way of determining ahead of time when a certain resource will change? For example, I have no idea if it will be an hour, a day or a year before I come back and edit something in this post – and why only cache for an hour if caching for a day would have been fine?
This is where the ability to clear the cache manually (or by having your web application notify NGINX that something should be purged) is needed. The people behind NGINX are clearly aware of the need for this as the feature is supported in the paid version of their product – but while they are certainly entitled to set up their licensing any way they want, the price is a bit steep for me when this function is the only paid feature I really need.
Fortunately, it turns out you can just delete files from the cache directory yourself and NGINX will pick up on this and fetch a new copy from your back-end without a hitch. However, if you do this without tweaking your configuration you are likely to see a whole bunch of messages similar to this one in your error log after a while:
2015/03/04 17:35:24 [crit] 16665#0: unlink() "/path/to/nginx/cache/9/a0/53eb903773998c16dcc570e6daebda09" failed (2: No such file or directory)
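For reference, the manual deletion described above can be sketched as a small Python script. NGINX names each cache file after the MD5 hash of its cache key, and the levels parameter decides which trailing characters of that hash become subdirectory names; the path in the log line above is consistent with levels=1:2. The function names, the example cache key and the levels default are my own assumptions for illustration, so adjust them to match your fastcgi_cache_key and fastcgi_cache_path settings.

```python
import hashlib
import os


def cache_file_path(cache_dir, cache_key, levels=(1, 2)):
    """Return the path where NGINX would store the entry for cache_key.

    Assumes the file name is the MD5 hex digest of the cache key and that
    subdirectories are taken from the end of the digest, one per level.
    """
    digest = hashlib.md5(cache_key.encode("utf-8")).hexdigest()
    parts = []
    end = len(digest)
    for width in levels:
        # Each level consumes `width` characters, working backwards from the end.
        parts.append(digest[end - width:end])
        end -= width
    return os.path.join(cache_dir, *parts, digest)


def purge(cache_dir, cache_key, levels=(1, 2)):
    """Delete one cache entry; return True if a file was actually removed."""
    path = cache_file_path(cache_dir, cache_key, levels)
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Nothing cached for this key (or it has already been removed).
        return False


if __name__ == "__main__":
    # Hypothetical key, assuming a fastcgi_cache_key of
    # "$scheme$request_method$host$request_uri".
    key = "httpGETexample.com/some/page"
    print(cache_file_path("/path/to/nginx/cache", key))
```

The script has to run as a user with write access to the cache directory, and it only removes the file on disk; NGINX notices the missing file on its own, as described above.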
It appears that these errors occur when NGINX itself tries to delete cache entries after the time specified by the inactive parameter of the fastcgi_cache_path directive. The default for this is only 10 minutes, but you can set it to whatever value you want. I’ve set it to 7 days myself, which seems to work well as I haven’t seen this error at all after changing it.
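As a rough illustration, a fastcgi_cache_path line with the inactive period raised to 7 days might look like the fragment below. Only inactive=7d reflects my actual setting; the zone name, the zone size and levels=1:2 are placeholders I have assumed for the example (the levels value is merely consistent with the path in the log line above), and the directive belongs in the http context.

```nginx
# Hypothetical example: zone name and size are placeholders.
# inactive=7d replaces the 10 minute default, so NGINX waits 7 days
# without a request before it tries to remove an entry itself.
fastcgi_cache_path /path/to/nginx/cache levels=1:2 keys_zone=example_zone:10m inactive=7d;
```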
I find it really strange that it is considered a critical error that a cache entry cannot be deleted because it doesn’t exist. The fact that its severity classification is so high means that it’s impossible to get rid of just by ignoring log entries below a certain threshold. As soon as a new copy is fetched from the back-end the entry will exist again, so this should be a warning at most, in my opinion.
Now, if the cache entry couldn’t be deleted because of a permissions problem or some other failure, that would be a critical error, because it might make NGINX continue serving cached content long after its expiry time, but the clean-up process doesn’t seem to make this distinction.