How can you forbid Google from offering the "View as HTML" link for PDFs? That is the question asked at Webmaster World. On most occasions, Google's search engine result pages show a "View as HTML" link just below the title of a PDF document, but sometimes Google does not show the link. Some of the best suggestions were:
- "I think it may not be displayed when Google actually has a hard time converting the PDF document to HTML, not because the webmaster instructed them to. To really disable direct access to it, block the PDF doc in your robots.txt file."
- If you want to publish documents on the web, put them in a web format.
- How about securing it with "No Content Copying or Extraction" when you create the PDF?
- Incorporating a media="print" css file will allow you to control the layout of the printed output down to the last point – even if your screen css uses pixel values.
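The robots.txt suggestion can be checked before it goes live. Below is a minimal sketch using Python's standard urllib.robotparser. The /pdfs/ directory, the example.com URLs, and the is_blocked helper are assumptions made for illustration and do not come from the thread. Note that this blocks Googlebot from crawling the PDFs at all, which is what the suggestion describes, rather than hiding only the "View as HTML" link.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes every PDF on the site lives under /pdfs/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /pdfs/
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Illustrative URLs only; substitute real paths from your own site.
pdf_blocked = is_blocked(ROBOTS_TXT, "Googlebot", "https://example.com/pdfs/report.pdf")
page_blocked = is_blocked(ROBOTS_TXT, "Googlebot", "https://example.com/index.html")
print(pdf_blocked, page_blocked)
```

A directory prefix is used here because the standard-library parser matches rules by path prefix and does not interpret wildcard patterns, so keeping PDFs in one folder makes the rule easy to verify.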
One funny response was, "The problem is that every 'update' of Acrobat is to ensure Adobe protect their copyright – they have little interest in making it quicker, slicker or more user friendly – they just want to make it hard for people to milk their cash cow! Which means you can bet that Adobe are trying to find ways to do exactly what the OP asks; just a matter of time. Meanwhile, the only hope for the rest of us is that Google buys Adobe and makes it free – before M$ buys Adobe and succeeds in making it 100% restricted!"
More on Webmaster World.