I love building web sites. And I especially love building web sites for other people. So when someone asks me to build them a website - either a new one or a re-design of an existing one - I jump at the chance. The best part is when the site goes from the staging server to production and is 100% live. But sometimes there can be issues with search engines when this happens.
I always try to create a robots.txt file in my staging environment to tell search engine crawlers not to index its content. We don't want the staging site indexed because it usually lives on an unrelated domain name, and that will just confuse visitors. But search engines don't always respect a robots.txt file, or maybe I didn't put it in place when development started and a search engine crawled the site before I could stop it.
Here are some things you should do when moving from a staging environment to production to make sure that visitors get your fancy, brand new site.
Create A Robots.txt File And Disallow Indexing
Every website should have a robots.txt file because it lets you tell search engine crawlers which parts of your website they may crawl. You probably already did this but, if you haven't, do it now.
It's as easy as creating a file named robots.txt in the root folder of your website, for example through the file manager in your hosting cPanel. For instance, a WordPress site should have a robots.txt which restricts search engine crawlers from crawling the /wp-admin and /wp-includes folders. Search engines only need to crawl your website's content, which is in /wp-content.
The contents of this file are important. Once the site has moved from staging to production, you do not want search engine crawlers to crawl the staging site any more. So, in the staging site's robots.txt, you want to disallow everything. That would look like this.
User-agent: *
Disallow: /
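If you'd like to double-check disallow-everything rules before uploading them, here is a minimal Python sketch using the standard library's urllib.robotparser. The function name is_blocked and the staging.example.com address are my own placeholders, and Python's parser is only an approximation of how any particular search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# Disallow-everything rules for the staging site: one directive per line.
STAGING_RULES = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical staging URL, used purely for illustration.
    print(is_blocked(STAGING_RULES, "https://staging.example.com/about-us"))
```

Python's parser does not understand wildcards, so a plain "Disallow: /" is the safest way to say "everything" to the widest range of crawlers.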
Apply A Redirect From Staging To Production
If you're like me, you work in a local development environment and deploy to a hosted environment for customers to review your progress. This is great because you can work "off-line" and deploy when you are ready.
If you have not already looked into Local by Flywheel, I strongly recommend it, especially if you do any kind of PHP or client-side development. It's not exclusively for WordPress development either.
Further to this, you may have a staging environment, which is where changes are reviewed and approved before being deployed to production. This is a great strategy, but the staging site can get indexed and start appearing in search engine results.
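When development is complete in staging, or you are not actively working on it while waiting to do further work, you could apply a redirect on the entire site. You can do this in your .htaccess file. A 301 tells browsers and crawlers that the move is permanent; use 302 instead if you expect to bring the staging site back.
Redirect 301 / https://www.productionsite.com.au/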
Essentially, browsers will arrive at the staging site and the redirect will forward them to the production site. This works perfectly because the staging and production sites' permalinks are exactly the same.
That is, every page on the staging site exists at the same path on the production site. So the redirect tells the browser, "Hey, replace the domain you arrived at with this one and go check that out".
- Browser requests - /about-us at https://www.StagingSite.com
- Server responds - nope - go to https://www.ProductionSite.com and append /about-us
- Browser says - "cool, thanks" and goes to https://www.ProductionSite.com/about-us
- Visitor smiles at the awesome content
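To make the "swap the domain, keep the path" idea concrete, here is a small Python sketch of the mapping a site-wide redirect performs. It only imitates the behaviour described above and is not how the web server implements it; the function name production_url is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def production_url(staging_url: str,
                   production_base: str = "https://www.productionsite.com.au/") -> str:
    """Swap the staging domain for the production one, keeping path and query."""
    request = urlsplit(staging_url)
    base = urlsplit(production_base)
    # Join the two paths with exactly one slash between them.
    path = base.path.rstrip("/") + "/" + request.path.lstrip("/")
    return urlunsplit((base.scheme, base.netloc, path, request.query, ""))


if __name__ == "__main__":
    print(production_url("https://www.stagingsite.com/about-us"))
```

Running a handful of your real staging URLs through something like this is a quick way to spot pages that won't have a matching address on production.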
Apply A Permanent Redirect (well.. kind of)
I found that a better way to combat this is to make the staging site return a 410 error for all requests. Google suggests this because it indicates that the resource is not just not found, it's actually gone.
You can do this in your .htaccess file. Because a 410 is not a real redirect, the directive takes no destination URL.
Redirect 410 /
Just like a 404, this response tells the browser and search engine crawler that the resource cannot be found. But furthermore, it tells them that this is a permanent change, so search engines can drop the staging pages from their results. It does not forward visitors anywhere, but that is fine because visitors should be visiting the production site from now on.
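If you want to confirm what the staging site is actually answering with, a rough Python sketch like the one below can help. The fetch_status and explain names are my own, the wording of each explanation is my reading of the status codes rather than anything official, and the host you pass in is up to you.

```python
import http.client
from http import HTTPStatus


def fetch_status(host: str, path: str = "/") -> int:
    """Send a HEAD request and return the raw status code.

    http.client never follows redirects, so a 301 or 410 shows up as-is.
    """
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def explain(status: int) -> str:
    """Describe, in plain words, what a status code signals to a crawler."""
    code = HTTPStatus(status)
    if code == HTTPStatus.GONE:
        meaning = "removed on purpose, don't expect it back"
    elif code == HTTPStatus.NOT_FOUND:
        meaning = "not found, with no hint whether that is temporary or permanent"
    elif 300 <= code < 400:
        meaning = "moved, follow the Location header"
    else:
        meaning = "nothing special for this check"
    return f"{code.value} {code.phrase}: {meaning}"
```

For example, explain(fetch_status("www.stagingsite.com", "/about-us")) would describe the response for that page, assuming the host is reachable over HTTPS.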
Til next time ...