Sunday, August 7, 2011

Turns out it was a Chinese bot.

As it turns out, FRED's recent downtime was caused by an ill-behaved crawler run by a Chinese search engine. When the issue first arose, one of the first things I did was look for excessive numbers of requests coming from single IPs, and this bot had been among the top 3 or 4 clients over the better part of that day. But it had made fewer requests than well-behaved crawlers such as Google, Bing, and Yahoo, which FRED has no problem serving, so I discounted it as a cause.

However, I had counted the Chinese crawler's requests retrospectively, in aggregate over the whole day. When I had a chance to watch the server processes escalate in real time, I saw that one IP address was making as many as 100 concurrent requests! It was the IP of that Chinese spider.
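
To illustrate why the daily totals were misleading, here is a rough Python sketch of the difference between counting requests per IP and measuring peak concurrency per IP. It is not the tooling I actually used; the `(ip, start, end)` record format, the function name `peak_concurrency`, and the sample addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def peak_concurrency(requests):
    """Return {ip: max number of requests in flight at once}.

    `requests` is an iterable of (ip, start, end) tuples with times in
    seconds. This record format is hypothetical, not FRED's log format.
    """
    events = defaultdict(list)
    for ip, start, end in requests:
        events[ip].append((start, 1))   # a request begins
        events[ip].append((end, -1))    # a request finishes
    peaks = {}
    for ip, evs in events.items():
        # Sorting puts (t, -1) before (t, 1), so a request ending at
        # time t is not counted as overlapping one starting at time t.
        evs.sort()
        current = peak = 0
        for _, delta in evs:
            current += delta
            peak = max(peak, current)
        peaks[ip] = peak
    return peaks

# Made-up sample: a polite crawler sends 6 requests one after another,
# while a bursty one sends only 4, but all at the same time.
sample = [("198.51.100.1", i * 10, i * 10 + 1) for i in range(6)]
sample += [("203.0.113.7", i, 30) for i in range(4)]

totals = Counter(ip for ip, _, _ in sample)
peaks = peak_concurrency(sample)
```

In this made-up sample the polite crawler has the higher total, yet it never has more than one request in flight, while the bursty one has all of its requests open at once. Ranking clients by total alone hides exactly that.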

Adding a line to FRED's firewall config fixed the problem by blocking that IP (their whole class B subnet, actually). So FRED's search rank in this Chinese search engine will suffer, but I'm ok with that. :^)
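
For anyone unsure what blocking a "class B subnet" covers: in today's CIDR notation it is roughly a /16, meaning every address that shares the first two octets. The Python sketch below shows that arithmetic with the standard `ipaddress` module. It is only an illustration, not FRED's firewall rule; the helper names `enclosing_slash16` and `is_blocked` and the address `203.0.113.7` (a reserved documentation address, not the crawler's real IP) are placeholders.

```python
import ipaddress

def enclosing_slash16(ip):
    """Return the /16 network that contains `ip`."""
    # strict=False lets us pass a host address and have the host bits
    # masked off, rather than raising ValueError.
    return ipaddress.ip_network(f"{ip}/16", strict=False)

def is_blocked(ip, blocked_networks):
    """True if `ip` falls inside any network in `blocked_networks`."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocked_networks)

# Placeholder address standing in for the offending crawler.
blocked = [enclosing_slash16("203.0.113.7")]

same_subnet = is_blocked("203.0.200.9", blocked)     # shares 203.0.x.x
other_client = is_blocked("198.51.100.1", blocked)   # unrelated range
```

A /16 holds 65,536 addresses, so a block this wide also turns away anything else in that range.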

-p
