If you have ever tailed your access logs (e.g. tail -f /var/log/nginx/access.log), you may have seen signs of bad bots, such as rapid requests for exploitable WordPress plugins or brute-force login attempts on your site. The first obvious defence is, of course, strong passwords and up-to-date WordPress files, from the base install to the plugins and themes that go with it. Knowing you’re up to date removes some of the anxiety of seeing the following fly by on your screen:
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:17 -0400] "GET / HTTP/1.1" 200 7807 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.24 Safari/535.1"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:23 -0400] "GET /wp-content/plugins/export-to-text/export-to-text.css HTTP/1.1" 404 206 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.803.0 Safari/535.1"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:27 -0400] "GET /wp-content/plugins/auto-attachments/a-a.css HTTP/1.1" 404 206 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Ubuntu/10.10 Chromium/10.0.648.0 Chrome/10.0.648.0 Safari/534.16"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/pica-photo-gallery/LICENSE.txt HTTP/1.1" 404 4387 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.133 Safari/534.16"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/livesig/jquery-ui-tabs.pack.js HTTP/1.1" 404 206 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.45 Safari/535.19"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/foxypress/Coupons.csv HTTP/1.1" 404 537 "-" "Mozilla/5.0 Slackware/13.37 (X11; U; Linux x86_64; en-US) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/wp-table/index.html HTTP/1.1" 404 537 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.66 Safari/535.11"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/cac-featured-content/cac-featured-content-helper.php HTTP/1.1" 404 537 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.133 Safari/534.16"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/wp-levoslideshow/functions.php HTTP/1.1" 404 537 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215 Safari/535.1"
91.200.12.70 davelozier.com - [06/Jun/2014:22:01:31 -0400] "GET /wp-content/plugins/lisl-last-image-slider/lisl.php HTTP/1.1" 404 537 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.634.0 Safari/534.16"
Bots are everywhere on the Internet and surprisingly, humans make up less than 40% of the traffic we’re seeing on our websites. Statista made the chart below based on data from Incapsula, a cloud application platform.
Bad bots vs. good ones is practically 50/50! Good bots will adhere to your robots.txt file directives and won’t normally hammer on your website. Bad bots, on the other hand, really don’t care. If they knock your Nginx + PHP-FPM website over by flooding it with requests, they’ll just come back later and try again. Our second defence is where the Nginx ngx_http_limit_req_module (HttpLimitReqModule, available since 0.7.21) comes into play. We can slow these bad bots down.
First we need to edit the Nginx configuration file (e.g. /etc/nginx/nginx.conf) and add the following line in the “http” block:
limit_req_zone $binary_remote_addr zone=onereqpersec:10m rate=1r/s;
Then add the following line to the “location” blocks for static files and PHP requests:
limit_req zone=onereqpersec burst=5;
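To show where the two directives sit relative to each other, here is a minimal sketch of a configuration fragment. Only the limit_req_zone and limit_req lines come from this post; the server_name, root, file-extension regex, and PHP-FPM socket path are placeholders I have assumed for illustration, so swap in whatever your site actually uses.

```nginx
# Fragment only: a full nginx.conf also needs an events {} block.
# server_name, root, the extension list, and the fastcgi_pass socket
# are assumed placeholders, not values from the original setup.
http {
    # 10 MB shared-memory zone keyed on the client IP,
    # allowing a sustained rate of 1 request per second per IP.
    limit_req_zone $binary_remote_addr zone=onereqpersec:10m rate=1r/s;

    server {
        listen 80;
        server_name example.com;
        root /var/www/example.com;

        # Static files
        location ~* \.(css|js|png|jpg|jpeg|gif|ico|txt)$ {
            limit_req zone=onereqpersec burst=5;
        }

        # PHP requests handed to PHP-FPM
        location ~ \.php$ {
            limit_req zone=onereqpersec burst=5;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass unix:/var/run/php-fpm.sock;
        }
    }
}
```

Because both locations reference the same zone, requests to either one count against the same per-IP allowance. After editing, running nginx -t checks the syntax before you reload.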
With these configuration settings, each client IP address is allowed one request per second. Excessive requests are delayed until their number exceeds the maximum burst size of five, at which point further requests are terminated with a 503 error (Service Temporarily Unavailable).
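If the delay or the 503 status does not suit your site, the same module has two optional knobs worth knowing about. The fragment below is a sketch of a variation, not the configuration used in this post: nodelay and limit_req_status (the latter needs Nginx 1.3.15 or newer) are additions I am assuming you might want, and the location regex is a placeholder.

```nginx
# Variation on the limit_req line; nodelay and limit_req_status are
# optional extras, not part of the original configuration.
location ~ \.php$ {
    # Serve requests within the burst of 5 immediately instead of
    # pacing them at 1 r/s; anything beyond the burst is still rejected.
    limit_req zone=onereqpersec burst=5 nodelay;

    # Reject with 429 (Too Many Requests) instead of the default 503.
    limit_req_status 429;
}
```

Without nodelay, a short flurry from a real visitor is slowed rather than refused, which is the gentler behaviour; with it, legitimate bursts feel faster but the bot also gets its first few responses sooner.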
If you have ever experienced your website seeming sluggish or unresponsive, there is a good chance it was because of a bad bot consuming all of its resources. If you’re running Nginx, you can cool their engines by rate limiting their requests!