A robots.txt file doesn't physically prevent anything from happening; it just politely asks robots not to crawl or index certain pages, and whether they comply is entirely up to them. It's like putting up a "keep out" sign on an unfenced piece of land -- people know they're not supposed to trespass, but there's no physical barrier preventing them.
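To make the "sign, not fence" point concrete, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the "ExampleBot" user agent and the example.com URLs are all made up for illustration. The check runs entirely on the client side: a well-behaved robot asks the question and respects the answer, while a badly behaved one simply never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is purely advisory: the robot itself chooses to run this check.
    Nothing on the server side enforces the result.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/index.html",
                "https://example.com/private/data.html"):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

If you actually need to keep robots out of /private/, you need real access control on the server (authentication, for instance), not just a line in robots.txt.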
HTTP is a protocol, not a markup language -- I assume you meant HTML. Whether it's XML or HTML makes no difference: anything that is human-readable is also machine-readable, unless you hide it behind something that needs human-level pattern-matching skills, like a CAPTCHA image.
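As a rough illustration of how little effort it takes a machine to read HTML, here is a sketch using Python's standard html.parser. The sample markup and the TextExtractor / extract_text names are invented for this example; a real scraper would need more care with scripts, styles and whitespace.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects every non-empty run of text found between tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(html: str) -> List[str]:
    """Return the visible text fragments of an HTML string, in order."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.chunks


# Hypothetical page content, invented for this example.
SAMPLE_HTML = (
    "<html><body><h1>Contact</h1>"
    "<p>Email: <b>someone@example.com</b></p></body></html>"
)

if __name__ == "__main__":
    print(extract_text(SAMPLE_HTML))
```

Swapping the markup for XML changes nothing of substance: the text is still sitting there in plain characters for any parser to pick up.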