- 5 per cent is hacking tools searching for an unpatched or new vulnerability in a website.
- 5 per cent is scrapers.
- 2 per cent is automated comment spammers.
- 19 per cent is from "spies" collecting competitive intelligence.
- 20 per cent is from search engines - which is non-human traffic but benign.
- 49 per cent is from people browsing the Internet.
Marketing has become that much more sophisticated and technologically inclined, so these types of pings and pokes are clearly going to increase over the short (and long) term. Once more website owners become aware of this intrusion, the government will step in and legislate. Nobody wants government intervention here, but this is another prime case of technology and new media companies stepping over the line by the sheer act of overdoing it.
The third-party problem. It's one thing for websites to track their usage and allow non-human crawlers from search engines to index their pages in order to rank higher. But -- if you look at the list above -- you'll note that scrapers, automated comment spammers, and spies are all third parties trying to leverage the website for their own marketing initiatives. Together (5 + 2 + 19) they make up 26 per cent of all traffic. This allowance of third parties to infiltrate and leverage website traffic is only a small fraction of the issue.
What about the other third parties that the website has partnered with and granted access to the website and its users? It's hard to imagine what that combined piece of website traffic may look like. We have to remember that most consumers simply don't understand the terms and conditions of a website and have little knowledge or understanding of all the tracking that is happening. The number must be nothing short of astounding.
It's time for fair play. If we, as the New Media collective, do not start governing ourselves, you can rest assured that public outcry will increase and the government will step in. What information are we keeping, what information are we tracking, and do we need it all? Understandably, it will be next to impossible to stop the malicious spies and infiltrators that are leveraging this information for spam (and knowing that this clocks in at over 25 per cent of all website traffic, it should come as a rude awakening for publishers), but the crawling and sniffing that we can control should be looked at with a discerning eye.
The use of robots to crawl the Internet is nothing new. The use of robots to crawl the Internet to grab as much information as possible in a malicious way is nothing new. The ability of website owners to get smarter and ensure that they are protecting their consumers (from both the robots and third-party deals) is nothing new, either, but the numbers are getting out of control and they're only going to increase.
It's time to act. What are we going to do about it?
Mitch Joel is president of Twist Image -- an award-winning digital marketing agency. His first book, Six Pixels of Separation, named after his highly successful blog and podcast of the same name, is a business and marketing bestseller. His next book, CTRL ALT DEL, comes out in Spring 2013.