I woke up early this morning to the sound of banging and scratching coming from my home office. After quickly descending the stairs, I found a total stranger sitting at my desk copying my latest batch of articles. Apparently he'd walked in through the front door, which I'd left unlocked. When he saw me, he said, “Hi, Mike. Love your work. I’m going to publish it too,” and he kept on copying.
Obviously, this didn't really happen. If it had, any reasonable person would have dialed 911, because at least one crime was taking place, and few people would argue with that. But what if, instead of stealing from my home office, the stranger had gone to my website, copied the information directly from it and posted it on his own site, a practice known as Internet scraping? The response would most likely be quite different: far fewer people would see anything wrong with that, or at least they wouldn’t see it as being that wrong.
The Internet makes it very easy for anyone to copy information from anyone else. And because it’s so easy and because there are billions of pages of information available, few people see the harm in it. But it's harmful nonetheless when it’s done without permission.
The copying ranges from borrowing a few lines to duplicating practically all of a site's content. For instance, someone might like how a certain company has drafted its “About Us” page, copy the text, tweak it and paste it on his or her own website. At the other end of the spectrum are companies that have built entire businesses around other companies' content: they use scraping technology to take information from one website, then reformat it and display it on their own websites. Either kind of copying can cause legal problems, though not in every case, which is what makes it tricky.
Part of the confusion has to do with how copyright laws are written and how judges have ruled in cases involving alleged Internet copyright infringement. As a content owner, you might assume that if the information is displayed on your website and there is a copyright notice at the bottom of every page, you're legally protected. That isn’t always the case.
For you to enforce your rights to certain content, copyright law requires you to be the owner of the content's copyright, which isn’t the same thing as owning the website. Customer-generated content, such as reviews, videos or images, doesn’t automatically belong to you just because it's on your website. If you don’t have the copyright to it, you may not have the right to prevent third parties from taking and using it.
In addition to copyright laws, content owners and scrapers need to be mindful of the tort of “trespass to chattels.” This is a civil (not criminal) wrong in which the owner of a piece of property is denied its rightful use because of the actions of a third party. Lawyers and the courts have applied the concept in cases of extreme Web scraping. For instance, if a content scraper targets a website and continually bombards its server with download requests, it could bog down the server and make the site unusable for legitimate customers and the website's owners. Because the server is private property, companies that have been the target of large-scale scraping operations have successfully sued the perpetrators for damages under this theory.
Protecting Your Content And Yourself
The best way to protect yourself legally is to have a clear “Terms and Conditions” page on your website aimed at people uploading content to your website and those who plan to copy content from your site. Make it clear that you will be granted certain rights to the content that's uploaded by others to your site. Also be explicit about what content you own and what your company policy is regarding copying and scraping of your content. Require users to “click to agree” on the terms and conditions before participating. Simply having a link to a terms and conditions page could still leave wiggle room for the abusive copier to say they weren't aware of your policy.
If you don’t currently have an adequate terms and conditions page, you should go to a leading website and copy theirs. (Just kidding!) Instead, talk to an intellectual property lawyer who specifically has expertise regarding Web content to get the help you need to develop your own terms and conditions.
On the flip side, if you plan to appropriate content from a third-party website, be sure that you have thoroughly reviewed the terms and conditions set forth by the website owners to ensure you aren’t violating their rights. I would suggest, however, that if your business requires you to take information from someone else without their consent, you may need to rethink your business model.
Whether you're trying to protect your content or yourself, be sure to put enough protections in place to keep yourself and your business out of court.