Sooner or later, many webmasters face the need to remove pages from a search engine's index: pages that got there by mistake, are no longer relevant, are duplicates, or contain confidential customer information (the reasons vary). A striking example is the widely publicized incident in which Yandex indexed SMS messages of MegaFon users, as well as cases with online stores where customers' personal information and order details could be found through search, and similar incidents involving banks, transportation companies, and so on…
In this article we will not dwell on the causes of the mishaps described above, but instead look at how to remove unwanted pages from Yandex or Google. It is assumed that the pages belong to your site; otherwise you need to contact the owner of the resource in question with an appropriate request.
5 ways to remove pages from Google and Yandex search results
1. 404 error
One simple way to remove a page from search is to delete it from your site, making sure that when the old address is requested, the server returns a 404 error, which tells the robot that the page no longer exists:
HTTP/1.1 404 Not Found
After that, you have to wait for the robot to revisit the page. Sometimes this takes a considerable amount of time, depending on how the page got into the index.
If the page must continue to exist on the site while being removed from search, this method is not suitable; use one of the methods described below.
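As a minimal sketch, the server-side decision can be as simple as checking whether the requested path still exists. The page list and function name below are hypothetical, purely for illustration:

```python
# Hypothetical sketch: deciding when a server should answer 404.
# The page set and function name are illustrative, not a real API.
EXISTING_PAGES = {"/", "/about.html", "/contacts.html"}

def status_for(path: str) -> int:
    """Return 200 for pages that still exist and 404 for removed ones."""
    return 200 if path in EXISTING_PAGES else 404

print(status_for("/my_emails.html"))  # removed page -> prints 404
```

Any real framework will have its own routing mechanism, but the principle is the same: a removed address must answer with the 404 status code, not a redirect or an empty 200 page.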
2. The robots.txt file
A very popular method of blocking entire sections or individual pages from indexing is the robots.txt file in the site root. There are many guides on configuring this file correctly; here are just a few examples.
Close the admin panel section from getting indexed by search engines:
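A typical rule for this (assuming the admin panel lives under a /admin/ path, which is illustrative here) looks like:

```text
User-agent: *
Disallow: /admin/ # close the admin section
```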
Close a certain page from indexing:
Disallow: /my_emails.html # close page my_emails.html
Disallow: /search.php?q=* # close search pages
With robots.txt you will also have to wait for re-indexing before the robot drops a page or an entire section from the index. Note that some pages may remain in the index if they got there through external links.
This method is inconvenient when you need to remove scattered pages from different sections and cannot write a common pattern for the Disallow directive in robots.txt.
3. Meta tag robots
This is an alternative to the previous method: the rule is specified directly in the HTML code of the page, inside the <head> section.
<meta name="robots" content="noindex,nofollow" />
The convenience of the meta tag is that it can be added (via your CMS) to every page whose presence in the search index is undesirable, while keeping the robots.txt file simple and clear. The only drawback is that it can be difficult to implement on a dynamic site that uses a single header.tpl template, unless you have the necessary skills.
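For illustration, a shared header template could take a flag and emit the meta tag only where needed. This is a hypothetical sketch, not tied to any particular CMS or template engine:

```python
# Hypothetical sketch of a shared page-header template that adds the
# robots meta tag only for pages flagged as noindex.
def render_head(title: str, noindex: bool = False) -> str:
    tags = [f"<title>{title}</title>"]
    if noindex:
        tags.append('<meta name="robots" content="noindex,nofollow" />')
    return "<head>" + "".join(tags) + "</head>"

print(render_head("My emails", noindex=True))
```

The key design point is that the decision stays per-page (a flag on the page object), while the template itself remains shared.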
4. X-Robots-Tag headers
This method is used by non-Russian search engines, including Google, as an alternative to the previous one. Yandex has published no official information about support for this HTTP header, but it may add it in the near future.
Its use is very similar to the robots meta tag, except that the rule is sent in the HTTP headers, which are not visible in the page code:
X-Robots-Tag: noindex, nofollow
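As an illustration, server code could attach this header conditionally. The function below is a hypothetical sketch, not tied to any web framework:

```python
# Hypothetical sketch: build HTTP response headers, adding X-Robots-Tag
# only for pages that should stay out of the index.
def response_headers(noindex: bool) -> list[tuple[str, str]]:
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if noindex:
        headers.append(("X-Robots-Tag", "noindex, nofollow"))
    return headers
```

Because the rule travels in the response headers rather than the HTML, it also works for non-HTML resources such as PDF files and images.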
In some, often unethical, cases its use is very convenient (for example, hiding a link-listing page when exchanging links).
5. Manual removal from the webmasters panel
Finally, the last and fastest way to remove pages from the index is to delete them manually via the search engines' webmaster panels.
The only condition for manual removal is that the pages must already be blocked from robots by one of the previous methods (robots.txt, the meta tag, or a 404 error). Google has been observed to process removal requests within a few hours, while with Yandex you will have to wait for the next index update. Use this method when you urgently need to remove a small number of pages from search.
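Before submitting a removal request, you can verify that a page really is blocked. Python's standard urllib.robotparser module can check a URL against robots.txt rules; the rules and URL below are illustrative:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt rules; in practice, fetch your site's real file.
rules = [
    "User-agent: *",
    "Disallow: /my_emails.html",
]
parser = RobotFileParser()
parser.parse(rules)

# False means the page is blocked, so a removal request should succeed.
allowed = parser.can_fetch("*", "https://example.com/my_emails.html")
print(allowed)  # prints False
```

A quick check like this avoids the common rejection reason: submitting a removal request for a page that the robot is still allowed to crawl.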