Whether you’re struggling to get your content ranked in Google or simply looking for an easy win in your WordPress SEO efforts, learning how to optimize your robots.txt file could be the solution.
This basic yet all-important file can make a big difference to the way search engine crawlers prioritize and index pages from your website in their search results.
In this guide, you’ll discover:
- How to locate and edit your robots.txt file in WordPress
- A complete step-by-step guide to WordPress robots.txt optimization
- Expert tips on WordPress robots.txt configuration for different use cases.
Ready to optimize?
Then let’s get into it.
What Exactly is Robots.txt In WordPress?
Robots.txt is a simple text file stored on your WordPress site that tells search engine crawler bots which content they’re welcome to crawl and which they should leave alone.
Where To Find Robots.txt File In WordPress?
You’ll typically find your robots.txt file in the root directory of your WordPress installation. Unless you’ve specifically changed it, this should be the public_html folder.
You can check to see if it’s there just by adding /robots.txt to the end of your website’s domain in your browser’s address bar.
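If you’d rather work that address out in a script, here’s a minimal Python sketch. The robots_url helper and the example.com address are our own placeholders, not anything WordPress provides.

```python
from urllib.parse import urljoin


def robots_url(site_url: str) -> str:
    """Return the robots.txt address for the site that serves site_url."""
    # The leading slash makes urljoin resolve against the site root,
    # so any page path in site_url is dropped.
    return urljoin(site_url, "/robots.txt")


# example.com is a placeholder domain
print(robots_url("https://example.com/blog/my-post/"))
# https://example.com/robots.txt
```

Bots always look for the file at the root of the domain, which is why the helper ignores whatever page path you pass in.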
What’s Inside The Robots.txt File?
Depending on how they’re configured, one robots.txt file may look very different from another.
For example, look at the screenshot above of a robots.txt file from a WordPress development agency. Now, compare it to the screenshot below, which shows Amazon.com’s robots.txt file.
Big difference, right?
So it’s fair to say that the exact contents of any given robots.txt file depend on the specific nature of the website and its owner’s content indexing preferences.
Even so, if you look at just a handful of the examples on this page, you’ll start to notice two key elements:
User agents and rules.
1. User agent
The user agent is simply the name of the bot that a specific set of robots.txt rules applies to.
In most cases, you’ll see an asterisk (*) next to this directive, as in:
User-agent: *
This means that the rules apply to all bots (user agents) that visit your site.
If you wanted to target a specific bot, you would simply replace that asterisk with the name of that bot.
For example, if you want to stop OpenAI crawling WordPress to train ChatGPT, you would begin by replacing * with the name of the company’s web crawler, GPTBot, like so:
User-agent: GPTBot
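To see how a user-agent line scopes the rules beneath it, here’s a small sketch using Python’s standard-library robots.txt parser. The can_crawl helper and the example.com URLs are our own illustrations, and the Disallow: / rule is added only so the group has something to apply.

```python
from urllib.robotparser import RobotFileParser

# A group that targets one named bot and blocks it from the whole site
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Report whether the given bot may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(can_crawl(ROBOTS_TXT, "GPTBot", "https://example.com/blog/"))     # False
print(can_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/blog/"))  # True
```

Because the group names GPTBot only, every other bot falls outside it and is free to crawl.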
2. Rules
Robots.txt rules are the specific instructions given to bots about which content to crawl and which to ignore.
The ones you’ll see most often are:
A. Disallow
Disallow is perhaps the most common rule you’ll see in your file. It tells bots to ignore a certain file, directory, or even an entire site.
You’ll usually see WordPress sites blocking all bots from the wp-admin directory with the following code:
User-agent: *
Disallow: /wp-admin/
B. Allow
The only time you’ll need the Allow rule is if you’ve blocked bots from accessing a certain directory yet still want them to access a particular page within that directory.
For example, while disallowing access to wp-admin is a good idea for the sake of your WordPress security and performance, actually allowing access to the admin-ajax.php file can be useful.
This file allows functions like live search, real-time content updates, and even form submissions (such as in comments) to run in the background. So allowing bots to crawl it helps them load and render pages that rely on these functions the same way a visitor’s browser would.
To allow bots to crawl admin-ajax without crawling the rest of your wp-admin folder, you would use the following:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
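Why does the Allow line win here? Google’s documentation and the robots.txt standard (RFC 9309) describe a "most specific rule wins" approach, where the longest matching path takes precedence. The sketch below is our own simplified take on that idea. The is_allowed function is hypothetical, it handles plain path prefixes only (no wildcards), and other crawlers may resolve conflicts differently.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence to (directive, path_prefix) rules."""
    best_len = -1
    allowed = True  # a path that matches no rule is crawlable
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longer (more specific) prefix wins; on a tie, Allow wins
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

print(is_allowed(RULES, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(RULES, "/wp-admin/options.php"))     # False
print(is_allowed(RULES, "/blog/hello-world/"))        # True
```

We wrote the matching by hand because Python’s built-in urllib.robotparser applies the first matching rule in file order, which would give a different answer for this pair of rules.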
C. Crawl Delay
Less common, though still useful, is the Crawl Delay rule, which tells bots how many seconds they should wait between each request to crawl your site.
This can be particularly useful to minimize the impact of bot crawlers on your site’s performance.
To use it, you would add the following line:
Crawl-delay: 10
This sets the crawl delay to 10 seconds between requests.
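For a sense of how a well-behaved bot might read that value, here’s a short sketch with Python’s standard-library parser. ExampleBot is a made-up bot name, and the User-agent: * line is there because the delay has to belong to a group.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: *
Crawl-delay: 10
""".splitlines())

# ExampleBot is hypothetical; it falls under the * group
delay = parser.crawl_delay("ExampleBot")
print(delay)  # 10
```

A polite crawler would then pause for that many seconds between requests. Whether a given bot actually does so is up to the bot.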
D. Sitemap
Finally, robots.txt often tells bots where to find your WordPress sitemap so that they can crawl and index your content more efficiently.
To add a sitemap to robots.txt, use the rule like so:
Sitemap:
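As an illustration of how a crawler picks up that line, the sketch below parses a sample file with Python’s standard library (site_maps() needs Python 3.8 or later). The example.com sitemap address is a placeholder of ours, so swap in your own full sitemap URL, including the https:// part.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("""\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
""".splitlines())

# Sitemap lines apply to every bot, whichever group they sit near
sitemaps = parser.site_maps()
print(sitemaps)  # ['https://example.com/sitemap.xml']
```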
But Why Robots.txt In WordPress? What Good Does It Do?
One thing you should know is that there is no rule, technical, legal, or otherwise, which states that a bot has to obey what’s written in your robots.txt file.
If an unethical company tells their bot to ignore this file or any of the rules in it, there’s not much you can do to stop it. Meanwhile, some companies, even the ones you want crawling your site, may respect the majority of rules while ignoring others.
Google, for example, ignores the Crawl Delay directive, instead using its own crawl rate methods to prevent its bots from overwhelming your site.
So with that being said, why go to all this trouble?
The answer is that robots.txt can make a big difference to your search rankings by informing search crawlers which content to prioritize.
Sure, Google may not give much of a hoot about your Crawl Delay rule, but they otherwise respect your wishes and crawl your site accordingly.
Another thing you need to know is that each time the Google crawl bot comes knocking on your website’s door, it can only crawl a certain number of pages before it has to leave again.
The exact number of pages is determined by your site’s crawl budget, which is the term used to describe how many pages a bot can and will crawl on your site in a given time frame. Each site’s crawl budget is different and based on factors such as how much crawling your server can handle and how important Google perceives your site to be.
Now, imagine you’re a new site with a relatively small crawl budget.
Without clear direction, the Google crawler has no way of knowing what’s important to index and what isn’t. So, it just starts systematically going through each page until it’s used up your crawl budget.
What happens when it spends that budget on non-public content that serves no purpose for your users, and runs out before it can get to those important pages you were counting on for conversion?
Those pages don’t get crawled, don’t get indexed, and don’t convert, rendering them a waste of time and resources.
So, why use robots.txt in WordPress? Because you want your priority pages ranking in Google and other search engines.
Creating and Editing a robots.txt File in WordPress
How you optimize robots.txt depends on whether you want the speed and convenience of a user-friendly WordPress SEO plugin or the hands-on control of editing it manually.
Not sure about the right route for you? Let’s take a look at both to help you decide:
Creating and Editing robots.txt Through the Yoast SEO Plugin
Although other SEO plugins like All-in-One SEO and RankMath also offer robots.txt optimization, Yoast is pretty much the standard bearer in WordPress SEO and the most popular tool of its type on the market, so that’s the example we’ll be using today.
To begin:
Log in to WordPress and go to Yoast SEO – Tools, then click on File Editor.
If you don’t have a file in place yet, your first step to optimization is to create one. Do this by clicking the Create robots.txt file button.
If you do have one, you’ll see a text editor where you can make and save changes to the file.
For example, if yours looks like the one above, the first thing you’ll want to do is add rules that allow access to the admin-ajax.php file while disallowing the rest of the wp-admin folder.
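If you recall, we do this by using the following rules. The /wp/ prefix shown here applies only when WordPress is installed in a /wp/ subdirectory; if yours sits at the site root, use /wp-admin/ as in the earlier example: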
Disallow: /wp/wp-admin/
Allow: /wp/wp-admin/admin-ajax.php
Add those rules to the file as in the illustration above, then hit Save Changes to robots.txt.
Creating and Editing robots.txt Manually
Prefer not to use a plugin? No problem, you can always download robots.txt to your device and manually edit it there.
You have two options to download your file:
A. Use an FTP Client
If you typically use an FTP tool such as Filezilla to manage things like theme and plugin uploads, you might find this the quickest option.
Simply:
- Connect to the server hosting your WordPress site
- Download the robots.txt file to your device
- Edit it
- Reupload it to your site.
Or:
B. Edit Via Your Hosting Account
If you don’t typically use FTP, you’ll probably have to log into your hosting account anyway to get the FTP details for your website.
So, while you’re there, it might be just as easy to download robots.txt from your hosting company’s file management tools.
(If you do still want to use FTP but don’t know where to start, try starting with our WordPress FTP Access Guide).
Where you find your file manager tools will vary depending on which hosting company you use.
In this example from Hostinger, we first go to Websites and click on the Dashboard button next to the site whose file we want to edit.
Then, click File Manager.
Next, open up the public_html folder.
From there, find your robots.txt file and click the download icon to save it to your device.
Whichever option you use, all you need to do now is open that file on your device, make and save your changes, and then reupload to your server via the same method (FTP or File Manager).
Pro Tip – A Few Best Practices From Us
Now that you know how to find and edit robots.txt in WordPress, there’s a lot you can do using the same basic user-agent designations and allow/disallow rules.
Before you go diving into your optimization efforts, here are a few tips and recommended best practices from our own WordPress SEO experts.
DO: Add a Sitemap
Sitemaps provide clear guidance to crawlers about the structure of your website, which can help them to rank your content effectively and accurately.
Although those crawlers will find your sitemap eventually, they’ll get there quicker if they visit your robots file first and are directed straight to it.
DON’T: Block Important Pages
The most effective way to optimize a robots.txt file is to ensure that your allow and disallow rules are configured correctly.
For example, if your code looks like this:
User-agent: *
Disallow: /
Then that means you’re blocking all bots from every page on your website, including the important and otherwise well-optimized ones like your homepage, product pages, and blog posts.
Here, the forward-slash ( / ) symbol refers to the root directory of your site and everything in it. So, if you’ve been wondering why nothing on your site is being indexed and you see that in your robots.txt file, delete it immediately.
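One way to catch this before it costs you traffic is to run your must-rank pages past the file. Here’s a rough sketch using Python’s standard-library parser. The blocked_paths helper and the sample paths are our own inventions, and this parser is fairly basic (it doesn’t understand * or $ wildcards inside paths), so treat it as a quick sanity check, not the final word.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the paths that these rules stop the given bot from crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
ADMIN_ONLY = "User-agent: *\nDisallow: /wp-admin/\n"

# Hypothetical pages you would want to see in search results
IMPORTANT = ["/", "/shop/", "/blog/hello-world/"]

print(blocked_paths(BLOCK_EVERYTHING, IMPORTANT))
# ['/', '/shop/', '/blog/hello-world/']
print(blocked_paths(ADMIN_ONLY, IMPORTANT))
# []
```

An empty list means none of your priority pages are caught by a Disallow rule.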
DO: Create a Backup
Save a backup copy of your robots.txt file before you make any changes to it.
That way, you can always revert to a good working version if your changes cause more problems than they solve.
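If you edit the file on your own device, a backup can be as simple as a timestamped copy saved next to the original. The sketch below is one way to do it in Python. The backup_robots name and the .bak naming pattern are our own choices, not a WordPress convention.

```python
import shutil
import time
from pathlib import Path


def backup_robots(path):
    """Copy robots.txt to a timestamped .bak file in the same folder."""
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dest)  # copy2 also keeps the file's timestamps
    return dest


# Usage, assuming the downloaded file sits in the current folder:
# backup_robots("robots.txt")
```

Run it before each round of edits and you’ll always have the previous version to reupload.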
DON’T: Overcomplicate Your Rules
Unless you’re running a mammoth, enterprise-level website, there’s no need to turn robots.txt into a complicated and unwieldy document.
Do so, and you’ll not only make the file much harder to manage and edit, but you’ll also run the risk that comes with every extra line of code: more potential for error.
Wherever possible, keep it simple.
For example, if you want to allow most content in most directories to be indexed while keeping just a handful of individual URLs out of search results, there’s no need to clutter your file by creating a Disallow rule for each one.
Instead, just use the noindex tag on those individual pages, and leave them crawlable so that bots can actually see the tag.
Example of a WordPress Robots.txt: Possible Optimizations
Below, you’ll find a number of code examples that you can use to optimize robots.txt for different scenarios:
1. Block Access to wp-admin Directory but Allow Access to admin-ajax.php
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
2. Block ChatGPT from Crawling Your WordPress Site
User-agent: GPTBot
Disallow: /
3. Prevent Crawlers From Accessing Non-Public Development and Staging Environments
User-agent: *
Disallow: /dev/
Disallow: /staging/
4. Block Access to Particular File Types Such as Members-Only PDF Reports
User-agent: *
Disallow: /*.pdf$
5. Block Crawlers From Duplicate Content Such as Print Versions
User-agent: *
Disallow: /print/
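The PDF rule in example 4 leans on two special characters that Google and Bing document support for: * stands for any run of characters, and a trailing $ means "the URL must end here". To show how a crawler might interpret them, here’s a simplified Python sketch. The rule_matches function is our own illustration, and not every bot supports these characters.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern (with * and trailing $) against a path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn each escaped * into ".*"
    regex = re.escape(pattern).replace(r"\*", ".*")
    if anchored:
        # A trailing $ means the whole path must match, start to end
        return re.fullmatch(regex, path) is not None
    # Otherwise the pattern only needs to match the start of the path
    return re.match(regex, path) is not None


print(rule_matches("/*.pdf$", "/reports/q1.pdf"))             # True
print(rule_matches("/*.pdf$", "/reports/q1.pdf?download=1"))  # False
print(rule_matches("/print/", "/print/page-1/"))              # True
```

Note how the $ stops the PDF rule from matching a URL with extra characters after .pdf, while the /print/ rule with no $ catches everything inside that folder.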
Testing and Troubleshooting Your New robots.txt File
With all your changes taken care of and your fully optimized robots.txt file back in place on your hosting server, there’s only one thing left to do:
Check that it actually works.
The easiest way to do that is with the robots.txt report in Google Search Console.
Assuming you’ve already taken the sensible step of connecting your site to Search Console to manage your indexing and get insights into your organic traffic, here’s what you need to do:
Log into your account, go to Settings – robots.txt, and click Open Report.
Google will then scan your file and, if there are any problems, you’ll find them listed under the Issues column.
Here, there are two things to look out for:
- Errors – Critical problems which should be addressed right away
- Warnings – Non-critical areas for improvement which are less of a priority but which could still improve the way Google crawls your site.
If you don’t have any warnings or errors to contend with, your robots.txt file report will look like the one above. In that case, congratulations!
You’ve successfully optimized your robots.txt file and can go about your day.
If you do have problems to attend to, Google Search Console provides you with useful suggestions and links to resources that can help you fix them.
Guide Search Engines To Prioritize Essential Pages With WordPress Robots.txt
Editing Robots.txt may seem like expert-level stuff, but as you’ve learned in this guide, it’s really quite a straightforward step that can reap big rewards in terms of improved SEO.
To wrap things up, let’s recap the key points that are most important to take away with you.
- Robots.txt helps to get priority pages ranked in search engines – This simple text file tells bots like the Google search crawler which pages are important to index and which don’t need to be indexed at all.
- You can create, edit, and optimize your file using an SEO plugin – Yoast SEO, RankMath, and All-in-One SEO (AIOSEO) each have robots.txt features that let you optimize your file from within the WordPress dashboard. Alternatively, you can save the file to your device and edit it manually.
- Allow and Disallow are the main rules to remember – Disallow tells bots not to crawl certain parts of your site, such as those you don’t want the public finding in search results, while Allow tells bots that it’s OK to crawl that page or directory. Check these rules carefully to ensure you’re not accidentally preventing major search engines like Google from indexing your content.
To learn more about how to boost your site’s organic traffic, check out these 20 WordPress SEO tips.
Frequently Asked Questions
I can’t find my robots.txt file. What should I do?
Robots.txt is usually found in your website’s root folder. If you don’t see it there, it might be that your hosting provider makes it a hidden file by default. Consult your provider to see if this is the case.
Alternatively, you may not have such a file, in which case you can build one from scratch using the instructions on this page.
Is robots.txt useful for WordPress Security?
Although robots.txt can help to ensure that bots from ethical companies steer clear of sensitive pages like your login screen, it’s designed for SEO, not security. Malicious bots designed for hacking will simply ignore the file and do what they were designed to do anyway.
So, be sure to use other WordPress security measures like firewall protection and two-factor authentication to keep your site safe.
Is a robots.txt file necessary for every site I manage?
That depends. The goal of a robots.txt file is to help with search engine rankings. So, if your site’s success depends on organic search traffic, then yes, it’s a good idea to have one. However, if the site is for internal or client use only and shouldn’t be indexed at all, then no, robots.txt isn’t necessary.