With search engines like Baidu, Bing and Google requesting a lot of pages per second (our record is currently 80 per second from search engines alone), you might want to throttle some of them down.
If you are running your own dedicated server on Linux, help is simple: there is a command called iptables to control access to your server on the network level. It also has the option to limit the number of requests to your machine per second, per minute or even per hour.
iptables can do a lot more (see the 20 tips for system administrators), but what we want to do here is limit the requests from Baidu (and maybe Yandex) to take some load off your servers.
You can also address a corporation's whole network (Class C, i.e. a /24) with it, if you want to get that network out of your hair:
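# Baidu 220.127.116.11
# CLASS-C 180.76.5.*
# Accept new connections (SYN packets) to port 80 at 10 per minute, burst 25 ...
iptables -I INPUT 1 -s 180.76.5.0/24 -p tcp --dport 80 --syn -m limit --limit 10/min --limit-burst 25 -j ACCEPT
# ... and drop the new connections from that network that exceed the limit
iptables -I INPUT 2 -s 180.76.5.0/24 -p tcp --dport 80 --syn -j DROP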
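This limits what the rule accepts on port 80 (the standard web server port) to 10 per minute, after an initial burst allowance of 25. The burst recharges by one for every 6 seconds in which the limit is not reached. Keep in mind that the limit match counts packets and only decides what this ACCEPT rule matches: anything above the limit falls through to the next rule, so it is only blocked if a later rule or the chain policy drops it.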
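To get a feeling for how a limit of 10 per minute and a burst of 25 play together, here is a small sketch of a token bucket in plain shell. It is a simplified model, not iptables' actual implementation: the function name `simulate` and the assumption of exactly one incoming request per second are made up for illustration.

```sh
#!/bin/sh
# Simplified token-bucket model of "--limit 10/min --limit-burst 25".
# Assumption: exactly one request arrives every second.
# Usage: simulate RATE_PER_MIN BURST SECONDS  -> prints the number of accepted requests
simulate() {
  rate_per_min=$1
  burst=$2
  seconds=$3
  interval=$(( 60 / rate_per_min ))   # seconds until one token comes back
  tokens=$burst                       # the bucket starts full
  accepted=0
  t=0
  while [ "$t" -lt "$seconds" ]; do
    # One token is refilled every $interval seconds, never above the burst size
    if [ "$t" -gt 0 ] && [ $(( t % interval )) -eq 0 ] && [ "$tokens" -lt "$burst" ]; then
      tokens=$(( tokens + 1 ))
    fi
    # A request is accepted only while a token is left
    if [ "$tokens" -gt 0 ]; then
      tokens=$(( tokens - 1 ))
      accepted=$(( accepted + 1 ))
    fi
    t=$(( t + 1 ))
  done
  echo "$accepted"
}

# 60 requests in the first minute with the values used above
simulate 10 25 60
```

In this model the first 25 requests use up the burst, and after that only one request every 6 seconds gets through. A crawler that keeps hammering the server never lets the burst recover.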
Do not forget that Baidu (and other spiders) might come from more than one IP network. Each network needs its own rule with its own limit, so in combination 10 per minute per network might be just the right value for you.
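If you have to cover several networks, a small loop saves typing. The following is only a sketch: the function name `print_limit_rules` is made up, and the second network, 192.0.2.0/24, is a placeholder from the documentation range, not a real crawler network. Each limited ACCEPT rule is paired with a DROP rule, because packets above the limit otherwise just fall through to later rules, and `--syn` restricts both rules to new connection attempts. The script only prints the commands, so you can review them before running them as root.

```sh
#!/bin/sh
# Dry run: print (do not execute) one ACCEPT/DROP rule pair per network.
# Usage: print_limit_rules LIMIT NETWORK...
print_limit_rules() {
  limit=$1
  shift
  for net in "$@"; do
    # -A appends to the chain, so the ACCEPT rule is checked before its DROP rule
    echo "iptables -A INPUT -s $net -p tcp --dport 80 --syn -m limit --limit $limit --limit-burst 25 -j ACCEPT"
    echo "iptables -A INPUT -s $net -p tcp --dport 80 --syn -j DROP"
  done
}

# 180.76.5.0/24 is the Baidu network from above; 192.0.2.0/24 is a placeholder
print_limit_rules "10/min" "180.76.5.0/24" "192.0.2.0/24"
```

Because the rules are appended, they only take effect if no earlier rule in your INPUT chain already accepts traffic to port 80.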