Google Sitemap 03 Jun, 2005
Initially, I thought “wow, they’re running Graphviz on their static copy of your site,” but of course that was too good to be true. Instead, they’re letting you ping them with an XML file listing the URLs you’d like them to index.
Meh.
Then I started reading the FAQ, and found out that they don’t restrict you to
just their custom XML format (nicely licensed under CC-SA), but also accept
a plain text list of URLs or RSS 2.0/Atom 0.3. Yes, you heard correctly, it
supports RSS, which makes it a lot easier on me. The only problem is
that I keep my RSS feed under a mod_rewrite rule that makes it appear to be
at /rss/2.0, which means it’s only authoritative (to Google) for the /rss
directory, which contains… precisely nothing.
It won’t hurt MT, because it keeps things at /index.rdf, but it will hurt WP,
which (with mod_rewrite) uses /feeds/rss or something like that.
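To make the scoping problem concrete, here is a rough Python sketch of the rule as I understand it: a sitemap only counts for URLs at or below the directory it is served from. The function names and the example.com URLs are made up for illustration, and Google’s actual check may well differ.

```python
from urllib.parse import urlsplit


def sitemap_scope(sitemap_url):
    """Return (scheme, host, directory) that a sitemap at this URL could cover.

    Assumption: the scope is everything up to the last "/" in the path.
    """
    parts = urlsplit(sitemap_url)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return parts.scheme, parts.netloc, directory


def in_scope(sitemap_url, page_url):
    """True if page_url sits on the same host, at or below the sitemap's directory."""
    scheme, host, directory = sitemap_scope(sitemap_url)
    page = urlsplit(page_url)
    same_site = (page.scheme, page.netloc) == (scheme, host)
    return same_site and (page.path or "/").startswith(directory)


# Hypothetical post URL on a hypothetical site.
post = "http://example.com/2005/06/google-sitemap"

print(in_scope("http://example.com/index.rdf", post))  # MT-style feed at the root
print(in_scope("http://example.com/rss/2.0", post))    # my rewritten feed URL
print(in_scope("http://example.com/feeds/rss", post))  # WP-style feed URL
```

Under that reading, only the feed at the site root covers actual posts; the other two are authoritative for /rss/ and /feeds/ respectively.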
The supplied Python scripts are cool. They’re BSD-licensed although the XML spec is CC-SA, which I found interesting. Couldn’t the spec be put under the BSD license too, for simplicity?
The test script is very complete, but it’s also very chatty: because there’s
no assert_false, it calls assert_true and then counts the warnings. Found this
in the test script, which reminds me of autoplay="yes please":
{'pattern' : '*', 'type' : 'wildcard', 'action' : 'look pretty'},
The shebang is set to python-2.2, which is all well and good except for those of us who only have python-2.3. The FAQ suggests that you have “knowledge of how to upload files and run scripts” (I’m not sure of the exact wording, since the sitemaps pages are returning 502s now), which I think isn’t strict enough, especially for Windows people who will need to install Python, set up paths, etc.
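One way around the hard-coded interpreter name would be a generic shebang plus an explicit version check. This is only a sketch of the idea, not what the supplied scripts do; the helper name and the (2, 2) minimum are my assumptions, the latter taken from the version named in the original shebang.

```python
#!/usr/bin/env python
# Hypothetical script header: "env" picks whichever python is first on PATH
# instead of insisting on an interpreter literally named python-2.2.
import sys

MINIMUM = (2, 2)  # assumed minimum version


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum():
        sys.exit("This script needs Python %d.%d or newer" % MINIMUM)
```

That doesn’t help Windows users, who still have to install Python and run the script by hand.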
Further Reading
- http://forums.searchenginewatch.com/showthread.php?t=6058
- https://www.google.com/webmasters/sitemaps/docs/en/protocol.html
- Code to Use With WP
P.S. Google: make it easier for people to get in touch and report typos in
your documentation. The FAQ mentions ?q= but the parameter is actually
something different.