A few weeks ago, I wrote about how my site was indexed through no obvious means.
Since then I have:
o Set up some tests to see if Google would index URLs through unconventional methods;
o Had a look at my log files to see if I could find the answer there.
I ran three tests using six URLs, each sitting outside my WordPress site and with no Google Analytics or other tracking code:
o Visiting URLs in Chrome: I visited three of the URLs in Chrome to see if this would result in any being indexed. After almost a week, none of the three has been indexed, and the log files show no visit from Googlebot.
o Emailing to Gmail: I emailed one URL to Gmail and clicked on it. Same result after 2-3 days (nothing).
o Searching for the URL: I searched for one URL in Google to see if this information would somehow be used. Nope.
The remaining URL was simply left untouched as a control.
Obviously this tests something slightly different from the original scenario (a new URL being indexed, rather than a new domain), so no conclusion can be drawn about new domains. I would really have liked to test some new domains, and may do so if I decide to part with some money!
How My Site WAS Indexed
When looking at the logs to see if Googlebot had visited any of the URLs, I decided to have a dig back to when it first visited the site. It was here that I found my answer:
Googlebot first entered the site only half an hour after the ‘Hello World’ post was published by WordPress, i.e. only half an hour after the site was set up.
The first file it requested was not robots.txt, as is standard when Googlebot enters a site, but an XML sitemap. This sitemap was generated by Yoast’s WordPress SEO plugin, which I must have installed as soon as I set up the site.
A bit of text in Yoast’s guide says that the plugin automatically submits the XML sitemap (which is often generated by default when installing the plugin) to both Bing and Google.
This seems to be exactly what happened! Mystery solved.
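For anyone wanting to repeat the log check, here is a minimal Python sketch of how the first Googlebot requests could be pulled out of an access log. It assumes the server writes the common "combined" log format; the function name googlebot_requests and the sample log lines (including the IP addresses, dates and sitemap path) are made up for illustration and are not taken from my actual logs.

```python
import re
from datetime import datetime

# Assumed layout (combined log format):
# host ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_requests(lines):
    """Return (timestamp, path) pairs for requests whose user agent
    mentions Googlebot, sorted oldest first. Hypothetical helper."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            hits.append((when, match.group("path")))
    return sorted(hits)


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Mar/2013:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [01/Mar/2013:10:31:12 +0000] "GET /sitemap_index.xml HTTP/1.1" 200 812 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [01/Mar/2013:10:31:15 +0000] "GET /robots.txt HTTP/1.1" 200 67 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = googlebot_requests(SAMPLE_LOG)
first_time, first_path = hits[0]
print(first_time.isoformat(), first_path)
```

With a real log, the same function can be fed an open file object instead of the sample list. Bear in mind that a user-agent string can be spoofed, so a match on "Googlebot" alone is only a hint; a reverse DNS lookup on the requesting IP is a more reliable check.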
A few things that can be drawn from the experience:
o Google does not seem to find new URLs through the unconventional methods I tested (yet), but it is not fair to draw the same conclusion about new domains.
o Log files are a vital source of information for any webmaster!