Google and Yahoo's Flash indexing is revealing... too much?
Adobe's announcement that Google and Yahoo! will be indexing Flash content at a much deeper level was met with all sorts of reactions last week, ranging from praise from the Flash and Flex communities to utter shock and horror from some HTML fundamentalists expressing fear that the end was nigh.
Well, it appears that Google has already started using this new indexing system, and some Flash developers are not happy about how much it is revealing about their applications.
oh no, SWF indexing seems to do just as I feared -- already noticed Google was picking up my test Flash SEO swf but its now exposing URL's
… it will move through the states of your application, get data from the server when your application normally would, and it will capture all of the text and data that you’ve got inside of your Flash-based application.
Peter goes on to state why this could be dangerous:
The concern I have here is that URL requests to the backend will get indexed, those URLs getting exposed in search queries or spider bots hitting those URLs could cause issues. Its not like in HTML content where the search engines can ignore form submit URLs, there is no such context in a HTTPService or URLRequest.
Do you remember the damage Google Web Accelerator caused when it started deleting data by following badly-coded links in web applications? The problem there was web developers using GET requests for unsafe operations, meaning ones with side effects such as deleting data. It remains to be seen how zealously this indexer follows data calls from Flash and Flex applications and what, if any, side effects this will have.
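To make the GET problem concrete, here is a minimal sketch of the defensive pattern on the server side: reads answer to GET, while the destructive route refuses anything but POST. The `/notes/<id>` routes, the `handle` function and the in-memory `store` are all hypothetical, invented for illustration; they are not from Peter's post or from any particular framework.

```python
# Hypothetical routes and in-memory store, for illustration only.
SAFE_METHODS = {"GET", "HEAD"}


def handle(method: str, path: str, store: dict) -> int:
    """Return an HTTP status code for a request against a tiny note store."""
    parts = path.strip("/").split("/")
    # Expected shapes: /notes/<id> (read) and /notes/<id>/delete (destructive).
    if len(parts) < 2 or parts[0] != "notes" or parts[1] not in store:
        return 404
    if len(parts) == 2:
        # Reading a note is safe, so GET and HEAD are fine here.
        return 200 if method in SAFE_METHODS else 405
    if parts[2:] == ["delete"]:
        if method != "POST":
            # A crawler following this URL with GET changes nothing.
            return 405
        del store[parts[1]]
        return 200
    return 404
```

With this in place, a spider that discovers `/notes/1/delete` and requests it with GET gets a 405 and the note survives; only a deliberate POST removes it.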
Peter also mentions that the SWF indexing is not opt-in, but that's a fact of life with search engines in general and not something that is unique to Flash content. I would expect this new indexer to follow the instructions in your robots.txt file.
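If you want to check what a given set of robots.txt rules would tell a well-behaved crawler, Python's standard `urllib.robotparser` module can evaluate them locally. This is only a sketch: the `/flash/` and `/services/` paths and the `example.com` URLs are made up, and it says nothing about whether the Flash indexer consults robots.txt for data calls made from inside a SWF.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers away from the SWFs and the backend services.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/
Disallow: /services/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether a compliant crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under these rules, `is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/flash/app.swf")` is false while the site's ordinary HTML pages stay crawlable. Remember that robots.txt is a request, not access control: it keeps polite crawlers out and does nothing against anyone else.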
Unfortunately, Flash developers have also been laboring under a false sense of safety: the belief that because Flash bytecode is compiled, they can put sensitive information inside their SWF files. I've been warning about the dangers of doing this for years now, but this latest development should hopefully help to educate Flash and Flex developers that anything they put in a SWF should be considered public information. Previously, the only security afforded by the bytecode representation of SWF content was security by obscurity -- which we know is not security at all.
To quote Kristof, who posted a comment on Peter's post:
Argh! Google has actually started indexing my Flash Files and is revealing all the URL’s of the pictures in Flash. But also the url’s of the MP3’s I placed in Flash. I was hoping Flash would conceal it - because now, anyone can download our music without paying for it.
I don't know if Google is actually exposing URLs received from data calls or whether the URLs were hard-coded into the SWFs but, if the latter is true, then that information was always available to anyone with a decent SWF decompiler. It was just a little harder to get to.
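You don't even need a full decompiler to see how thin that obscurity is. The sketch below scans a SWF-like blob for anything that looks like a URL. It relies on one fact about the format, that a file whose signature is `CWS` carries a zlib-compressed body after its 8-byte header; everything else is an assumption for illustration. In particular, `fake_compressed_swf` builds a stand-in blob rather than a valid SWF, the MP3 URL is made up, and a naive pattern like this can over-match on a real file.

```python
import re
import zlib

# Naive pattern: "http://" or "https://" followed by printable ASCII.
URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def extract_urls(swf_bytes: bytes) -> list[str]:
    """Return URL-like strings found in a SWF-style byte blob."""
    signature, body = swf_bytes[:3], swf_bytes[8:]
    if signature == b"CWS":
        # Compressed SWFs store everything after the 8-byte header as zlib data.
        body = zlib.decompress(body)
    return [match.decode("ascii") for match in URL_PATTERN.findall(body)]


def fake_compressed_swf(payload: bytes) -> bytes:
    """Build a stand-in blob (NOT a valid SWF) with a CWS-style header."""
    # Signature, a version byte, then the uncompressed length as 4 bytes.
    header = b"CWS" + bytes([9]) + (8 + len(payload)).to_bytes(4, "little")
    return header + zlib.compress(payload)
```

Feeding it a blob whose payload contains a hard-coded `http://example.com/music/track01.mp3` hands that URL straight back, compressed or not. If a dozen lines of script can do this, a purpose-built decompiler or a search engine's indexer certainly can.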
I'd love to hear your thoughts. How is the new SWF indexing feature of Google and Yahoo! impacting your SWF applications? Leave a comment and let me know.