Do you know how to create a custom robots.txt file for WordPress? If you haven't heard of it before, this post is for you. Every blogger should know about it, and if you work in search engine optimization you should know it inside out. So, let's see what a custom robots.txt file is and how it works.
What is a Robots.txt File?
Basically, a robots.txt file is a simple text file containing instructions for search engine spiders. A search engine bot, or spider, crawls every accessible page of a site and indexes it in the search engine. But sometimes a webmaster wants to exclude a single file, or a whole series of files, from crawling. So what do you do then? How do you tell the spiders to include this but exclude that? For that, you create a robots.txt file. The instructions here are written for WordPress, but they work the same way on any platform.
How to Create Custom Robots.txt File
Creating the file is really easy because it is just a plain text file with the .txt extension. There are two ways to create a robots.txt file for WordPress. One is to create the file on your computer and then upload it to the root folder (public_html): right-click on your desktop, create a new text file, save it with the name robots.txt, and upload it to your public_html directory. The other is to create a file named robots.txt directly in the root (public_html) folder. Either way, the goal is to have the file sitting in public_html. But that alone is not enough: you have to add your own rules to it. Simply edit the file and write the directives that match your WordPress blog's structure. Let's see what those directives look like. Please note that the file's URL should be http://yourdomain.com/robots.txt. You can take a look at my blog's robots.txt file as an example.
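If you prefer to prepare the file on your computer first, a few lines of Python can write it for you before you upload it. This is just a sketch; the contents here are the allow-all default covered below, and you would replace them with your own rules.

```python
from pathlib import Path

# Write a minimal allow-all robots.txt in the current directory.
# Upload the resulting file to your public_html folder afterwards.
rules = "User-agent: *\nDisallow:\n"
Path("robots.txt").write_text(rules)

print(Path("robots.txt").read_text())
```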
Add Custom Codes for Robots.txt
Allow Index All Pages
A default robots.txt that allows all links looks like either of these:
User-agent: *
Disallow:

User-agent: *
Allow: /
Here the asterisk (*) selects all spider user agents, and a blank Disallow value means you are not excluding any pages. Alternatively, you can allow the whole site with Allow: /. Generally neither is required, because if you want search engines to crawl the full site you can simply leave robots.txt empty or delete it. This file's main purpose is to exclude pages.
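You can check rules like these with Python's standard-library parser, urllib.robotparser, without touching a live site. A minimal sketch; example.com is just a placeholder domain:

```python
import urllib.robotparser

# Feed the allow-all rules to the parser directly as lines of text.
parser = urllib.robotparser.RobotFileParser()
parser.parse(["User-agent: *", "Disallow:"])

# An empty Disallow excludes nothing, so every path is crawlable.
print(parser.can_fetch("Googlebot", "http://example.com/any-page/"))  # True
```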
Disallow All Pages
User-agent: *
Disallow: /
This code disallows the whole WordPress site for all user agents.
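The same stdlib parser confirms that Disallow: / blocks everything (again with example.com as a placeholder):

```python
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /"])

# Every path starts with "/", so nothing is crawlable for any agent.
print(parser.can_fetch("Bingbot", "http://example.com/"))          # False
print(parser.can_fetch("Bingbot", "http://example.com/any/page"))  # False
```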
Disallow A Folder for All User-agent
User-agent: *
Disallow: /folder/
This code tells all user agents to stay out of the given directory.
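A quick check with urllib.robotparser shows the rule only affects that directory; /folder/ and example.com are placeholders for your own paths:

```python
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /folder/"])

# Paths inside /folder/ are blocked; everything else stays crawlable.
print(parser.can_fetch("Googlebot", "http://example.com/folder/page.html"))  # False
print(parser.can_fetch("Googlebot", "http://example.com/other/page.html"))   # True
```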
Disallow A Folder for Custom User-Agent
User-agent: Googlebot-Mobile
Disallow: /folder/
This code restricts only the named user agent from the directory; all other spiders remain unaffected.
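You can verify the per-agent behaviour the same way; /folder/ and example.com are placeholders:

```python
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.parse(["User-agent: Googlebot-Mobile", "Disallow: /folder/"])

# Only the named agent is restricted; every other agent falls through
# to the implicit allow-all default.
print(parser.can_fetch("Googlebot-Mobile", "http://example.com/folder/x"))  # False
print(parser.can_fetch("Bingbot", "http://example.com/folder/x"))           # True
```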
Disallow a Folder But Allow One File
User-agent: *
Disallow: /folder/
Allow: /folder/file.html
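This pattern blocks a folder while exempting one file inside it. One caveat when testing with Python's stdlib parser: urllib.robotparser applies the first matching rule, while Google uses the most specific one, so in this sketch the Allow line is listed first to get the same effect (/folder/ and file.html are placeholders):

```python
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.parse([
    "User-agent: *",
    "Allow: /folder/file.html",   # listed first: this parser is order-sensitive
    "Disallow: /folder/",
])

# The single file is exempt; the rest of the folder stays blocked.
print(parser.can_fetch("Googlebot", "http://example.com/folder/file.html"))   # True
print(parser.can_fetch("Googlebot", "http://example.com/folder/other.html"))  # False
```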
Disallow a Series of Same Type Links
Sometimes you need to disallow pages that are generated automatically by user requests. For example, a search on your WordPress blog generates dynamic URLs. To exclude those, use this type of code:
User-agent: *
Disallow: /search/*
This excludes every page under that folder.
Disallow Pages with Matching Extensions
If you want to stop crawlers from crawling pages with a specific extension, use the following entry:
User-agent: *
Disallow: /*.html$
N.B.: In the entries above, you have seen two special characters: the asterisk (*) and the dollar sign ($). The asterisk is a wildcard that matches the rest of a URL. For example, Disallow: /folder/* matches any URL that starts with /folder/, whatever comes after the slash. The dollar sign marks the end of a URL: use it when a rule should match only URLs that end at that exact point.
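Note that Python's urllib.robotparser follows the original prefix-only robots.txt convention and does not understand * or $, so you cannot test these wildcard rules with it. As a rough sketch of how Google-style matching works, here is a hand-rolled (and deliberately simplified) translation of such a pattern into a regular expression:

```python
import re

def google_rule_to_regex(pattern: str) -> "re.Pattern":
    """Translate a Google-style robots.txt path pattern to a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    A simplified sketch, not a complete robots.txt implementation.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then restore '*' as a wildcard.
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(body + ("$" if anchored else ""))

search_rule = google_rule_to_regex("/search/*")
html_rule = google_rule_to_regex("/*.html$")

print(bool(search_rule.match("/search/some-query")))  # True
print(bool(html_rule.match("/page.html")))            # True
print(bool(html_rule.match("/page.html?x=1")))        # False (does not end in .html)
```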
Want More Entries for Your Robots.txt File?
In this post, I have tried to explain the basic structure of robots.txt. These are the fundamental entries you can use on WordPress, Blogger, or any other platform. But that is not all: sometimes you may need to exclude trickier pages. Don't worry, if you haven't managed it yourself, I can help. Leave a comment with your custom rule request and I will try to help. By now, I hope you have learned a bit about creating a custom robots.txt file for WordPress to control how search engine spiders crawl your site.