Internet robots, spiders, web bots, crawlers. Bots are known by many names. But what are they exactly?
Bots are small software programs created to perform simple, repetitive tasks on the internet that would be too time-consuming or dull for us to do. Think of indexing your website for search engines, or monitoring your website's health. Unfortunately for us, bots can be bad as well.
Waving the Red Flag
Bots regularly come in the form of malware, and they are everywhere. Roughly half of all internet traffic consists of bots, and about a quarter of that traffic comes from bad bots. You can imagine the problems this can cause: bad bots can scrape your content, pollute your form data, and skew your analytics.
There are some tip-offs that give away whether your website has been visited by bad bots: a decreased average session length and an increased bounce rate are major red flags. Bad bots can do serious damage to your website. Luckily, there are some effective ways to block them.
Time to Fight Back
There is no single holy grail for blocking bots from your website. There are many different ways to block them, and each has its own benefits. Some ways to block bots are:
• CAPTCHAs
• Plug-ins
• .htaccess
CAPTCHAs are little tests on a website that distinguish robots from humans. They're commonly used on websites that require human input, such as review sites or questionnaires. Note that they don't block bots from your website, but from your forms. It's an efficient way to ensure the data from your forms is honest.
The classic CAPTCHA is one we all know: the wavy, irregular letters that we have to read and type in ourselves. Google later launched reCAPTCHA, where a simple click on the "I'm not a robot" checkbox does the job.
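As a sketch, embedding the reCAPTCHA v2 checkbox in a form looks something like this (YOUR_SITE_KEY and the form action are placeholders, and the submitted response still has to be verified server-side with your secret key):

```html
<!-- Load Google's reCAPTCHA API script -->
<script src="https://www.google.com/recaptcha/api.js" async defer></script>

<!-- A review form protected by the checkbox (action URL is illustrative) -->
<form action="/submit-review" method="post">
  <textarea name="review"></textarea>
  <!-- Renders the "I'm not a robot" checkbox -->
  <div class="g-recaptcha" data-sitekey="YOUR_SITE_KEY"></div>
  <button type="submit">Send</button>
</form>
```

On submit, the form includes a `g-recaptcha-response` token that your server must verify with Google before trusting the data.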
While these are effective at blocking bots from your forms, they can irritate actual human visitors who just want to move on to your website. Recently a different kind of CAPTCHA has emerged: companies now develop CAPTCHAs that present fun little tasks or puzzles.
Keep in mind, though, that bots are getting smarter and smarter, and have been known to outsmart CAPTCHAs. That's why applying more than one way to protect your website from bots is a smart move.
If you have a WordPress website, or some other CMS, plug-ins can be a very easy and time-saving way to block bots (especially for the not-so-tech-savvy amongst us). WordPress offers many different plug-ins to fight bad bots. One effective plug-in is the so-called Blackhole for Bad Bots.
The Blackhole for Bad Bots plug-in sets up a nice little booby trap for bad bots. It requires your website to have a robots.txt file. Robots.txt exists for search engine bots that want to index your website: in it, you can deny access to pages you do not want indexed, like a thank-you page. The plug-in adds a hidden link to your website, and you then deny access to that link in your robots.txt file. Well-behaved bots respect the rule; bad bots that ignore your robots.txt follow the hidden link, which sends them straight into a black hole.
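The robots.txt rule for the trap might look like this (the /blackhole/ and /thank-you/ paths are illustrative; use whatever trigger link the plug-in actually sets up):

```txt
# Applies to all crawlers
User-agent: *

# The hidden trap link: well-behaved bots stay away,
# bots that follow it anyway get blocked
Disallow: /blackhole/

# Pages you don't want indexed by search engines
Disallow: /thank-you/
```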
A way to block bad bots that requires a bit more skill is using your .htaccess file. By adding code to this file, you can block bots the moment they reach your website. Please note that a single typo in your code could bring your whole website down, so make sure you know what you're doing and always keep a backup of your .htaccess file.
Know your enemy! Start by identifying the bad bots. Identifying them is a complicated process that requires you to analyze your server's log files. There are also databases that offer lists of bad bots (note that these are never truly complete). Once they have been identified, bots can be blocked in various ways, such as by User-Agent, IP address, or referrer. Again, knowledge of .htaccess and HTML is necessary to do this. If you don't have the required skills, many online forums can give you a hand. Some examples of these forums are:
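A minimal .htaccess sketch of two of these approaches, written for Apache 2.4 (the bot names and the IP address are placeholders, not a real blocklist):

```apache
# Deny requests whose User-Agent matches known bad bots
# (BadBot and EvilScraper are placeholder names - build your own list)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
RewriteRule .* - [F,L]

# Deny a specific IP address
# (203.0.113.42 is from the reserved documentation range)
<RequireAll>
  Require all granted
  Require not ip 203.0.113.42
</RequireAll>
```

The [F] flag returns a 403 Forbidden response to matching requests. Test any change on a staging copy first, since one typo here can take the whole site offline.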