What can I do if others steal content from my blog?

Judging by their questions on Quora and forums, many bloggers worry about losing their content. The threat seems distant until it happens to you, and when it does, it can be ruinous. Hence this post.

Blogs and other websites are inherently prone to content theft

Once your content is on the web and is worth stealing (good articles, photos, infographics, quotes, podcasts and videos), there will always be people wanting to steal it and pass it off as their own. Sometimes the motive is just to brag about their 'talent', but mostly it is to make money. Sometimes the lift is innocent, but in most cases it is deliberate. Sometimes it is a short paragraph lifted with some form of attribution; more often it is outright copying, with the thief pocketing the credit and the money.

The irony is that the copyright in whatever original work you create rests with you, whether you announce it or not; yet, beyond a point, you cannot stop your content from being copied once it is on the web. Even the biggest film and music producers, with huge resources, have not been able to stop piracy of their creations.

That does not mean you should let everybody re-use your content, and at your cost at that. There are ways to prevent this and to take action when it hurts.

Is your blog/ website's content really stolen?

Not all copying and re-use of content is stealing. There are many occasions when content can be legally and legitimately re-used, sometimes even without attribution or permission. These are called 'fair use' practices. For example, a person reviewing your book is likely to reproduce a short paragraph or a memorable quote from it in the review. Someone discussing your product on a review site may want to quote exactly the claims you make about it. Similarly, showing a thumbnail of your artwork or book cover, reproducing a stanza from your poem, or copying your floor plan to illustrate how good or bad the flat you are selling is: such instances are generally considered fair use of your content by others.

The term 'fair use' is not easy to define, so even courts rule on copyright infringement on a case-by-case basis. It is therefore always better to err on the side of caution in such matters.

For a more detailed discussion on fair use, visit this Wiki page, and for a legal discussion, this US government resource on fair use.

DMCA. You will hear this expression a lot when discussing copyright matters. The Digital Millennium Copyright Act (DMCA) is a US law, widely regarded as the gold standard for copyright on the web, that lays down what is and is not allowed when sharing digital content. If you wish, you can read more about it on this Wiki page.

How much of your content is being stolen and by whom?

You can use a number of tools, some free and some paid, to find out who is scraping your content and where they are using it.

If you search for "plagiarism check" or "check copy paste content online" on Google, you will find a large number of paid tools that look for copy-pasted content for you, but we are in no position to recommend any particular one.

Copyscape is one such tool. If you go to its site and paste the URL of your blog post, it instantly shows about ten places where your content appears on the web. (The paid version shows many more results and offers more options.)

Why not Google your blog? Paste a sentence from your blog post into the search bar and see how many places it turns up on the net. Then repeat the search with a few more sentences from different parts of the article. This takes time but is more effective than many other tools!
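
If you check many posts this way, a few lines of Python can prepare the searches for you. The sketch below is only an illustration: the function names are made up for this example, the sentence splitting is deliberately naive, and it assumes the common convention that wrapping a phrase in double quotes asks the search engine for an exact match. It only builds search links that you open in your browser; it does not fetch any results.

```python
import re
from urllib.parse import quote_plus


def pick_sentences(text, count=3, min_words=8):
    """Pick up to `count` reasonably long sentences spread across the text."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Very short sentences match too many unrelated pages, so skip them.
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    if count < 2 or len(candidates) <= count:
        return candidates[:max(count, 0)]
    # Evenly spaced picks cover the start, middle and end of the article.
    step = (len(candidates) - 1) / (count - 1)
    return [candidates[round(i * step)] for i in range(count)]


def search_url(sentence):
    """Build a Google search link for an exact-phrase match of the sentence."""
    # Remove inner double quotes so they do not break the quoted phrase.
    phrase = sentence.replace('"', "")
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    post = (
        "Once your content is on the web there will always be people wanting to copy it. "
        "Some of them will give you credit and link back to your original article. "
        "Most of them will simply paste your words and present them as their own work."
    )
    for sentence in pick_sentences(post, count=2):
        print(search_url(sentence))
```

Picking sentences from different parts of the post matters because many scrapers copy only the opening paragraphs.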

When you check this way, you will spot thieves, but you will also get results that appear because your content is being legitimately shared through automated feed apps and widgets. We Googled a long paragraph from one of our recent posts and found it on over a thousand websites and blogs. Interestingly, many entries in the search results showed a publication date earlier than the date we published the post! The earlier date arises because some blog aggregators keep the same page (whose creation date is what search engines pick up) while the links on it keep updating.

The next step is to make a table of all suspicious scrapers and keep checking them over a period of time and for different posts.
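
A spreadsheet is enough for such a table, but if you prefer a script, the idea can be sketched as below. Everything here is an assumption made for illustration: the column names, the function names and the rule that a site seen copying two or more different posts deserves a closer look.

```python
import csv
import io
from collections import defaultdict
from datetime import date
from urllib.parse import urlparse

# Hypothetical columns for the table of suspicious scrapers.
FIELDS = ["checked_on", "scraper_url", "original_post", "notes"]


def add_sighting(table, scraper_url, original_post, notes="", checked_on=None):
    """Record one page where a copy of one of your posts was found."""
    table.append({
        "checked_on": checked_on or date.today().isoformat(),
        "scraper_url": scraper_url,
        "original_post": original_post,
        "notes": notes,
    })


def repeat_offenders(table, min_posts=2):
    """Return the sites that copied at least `min_posts` different posts."""
    posts_by_site = defaultdict(set)
    for row in table:
        site = urlparse(row["scraper_url"]).netloc
        posts_by_site[site].add(row["original_post"])
    return sorted(s for s, posts in posts_by_site.items() if len(posts) >= min_posts)


def to_csv(table):
    """Render the table as CSV text that any spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


if __name__ == "__main__":
    table = []
    add_sighting(table, "https://copycat.example/a", "my-first-post", "full copy, no credit")
    add_sighting(table, "https://copycat.example/b", "my-second-post", "full copy, no credit")
    add_sighting(table, "https://reader.example/feed", "my-first-post", "excerpt with link")
    print(repeat_offenders(table))
    print(to_csv(table))
```

Grouping by site rather than by page helps separate a one-off quote from a scraper that lifts post after post.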

Is a fight with content thieves worth it?

If the majority of cases of re-use relate to genuine use (or even slight abuse) of your content, forget it. Be alarmed if your content is landing on suspicious sites, if you are being wrongly quoted, or if the re-use is harming you in some way: by diverting your traffic, by ranking higher on search engines than your original work, by selling the duplicate work to buyers, or by reducing your search traffic because you are being penalized for duplicate content.

If you do find some bad cases of content theft, should you fight back? Assess beforehand whether the effort will be worth it. Unless you expect to gain much by getting the content removed from their site, getting them penalized, winning a legal suit or forcing them to pay for its use, leave it at that.

Take on the content thief if the stakes are high

If you are convinced that plagiarism is hurting you and you must take corrective action, you can proceed as follows:

1. Keep proof ready. Take full screenshots of the pages with stolen content. Note down the URLs of all such pages on these sites. Look for the site owner's contact details on contact pages and the like. If you cannot find any, use a Whois lookup site to find out who owns the blog.

If you have put a copyright notice on the webpages (discussed in the section on preventive actions below), it helps in identifying the theft.

2. Contact them. It is always better to start with a friendly caution. Describe what you have noticed and hint that it may have happened by mistake but is not fair. Ask them to take the content down, and offer the option that if they want to share your content legally, they may do so with proper attribution/ credit. If the scraper's intent seems to be to hurt your interests, be direct and frame the communication as a 'cease and desist' warning.

3. If the response is not up to the mark, firmly tell them to take the steps within a fixed time and warn them of the steps you might take. Make this your 'take down' notice. You can even send it through an attorney.

4. If they do not listen, or shout back, you can complain about them to search engines and to their web host. This often works well. Google/ Blogger/ AdSense, WordPress, LinkedIn, Bing and Yahoo have forms on their sites for reporting content theft.

Here, one word of caution. In your zeal to punish the [perceived] guilty, you should not go overboard. There have been cases when acting upon complaints, the ISPs or hosts have removed content or blocked sites beyond what was intended, leading to complications. So, go to this level only if required, and state facts as they are - without exaggeration of theft.

5. Finally, if it is required (and that should happen only in extreme cases, when your loss is too big), send the thief a legal notice and follow it up with legal action.

Preventive actions against content theft from blogs, websites and other webspaces

Since it is tough to protect your content against copyright violations and misuse on the web, why not take some easy steps that act at least as a scarecrow in the wheat field? For example,

i. Put a copyright notice on the blog/ website. You can use a Creative Commons license and a DMCA.com badge. Both are free; the latter has paid options too. DMCA.com even helps you send 'take down' notices to the guilty.

ii. Do not put your digital art or high-resolution photographs directly on the website/ blog. Watermark the high-res image or give only a thumbnail.

iii. Instead of posting videos directly on the blog, put them on YouTube and give a link on the blog. YouTube is strict about copyright infringement by its users, so if a thief re-uses your videos on a YouTube channel against your wishes, the thief is likely to be penalized.

iv. Set up a Google alert on some of your high-quality content, so that you are notified whenever it is shared on the web, and then check whether it is theft.

v. Allow only a short RSS feed, so that automated curating/ aggregating sites too get only a small part of the blog post and not the whole. If you want to give the full post in the RSS feed, add a line with a link to your original post. You can do this easily on Blogger through Settings>Other>Site Feed. WordPress bloggers can use a plugin available on its site.
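
If you generate your feed yourself rather than through Blogger or WordPress, the short-feed idea in point v can be sketched as follows. This is not how either platform does it internally; the function name, the 60-word limit and the link text are all assumptions chosen for this example.

```python
import html


def feed_summary(post_text, post_url, max_words=60):
    """Build a short HTML feed entry: an excerpt plus a link to the original post."""
    words = post_text.split()
    excerpt = " ".join(words[:max_words])
    if len(words) > max_words:
        # Mark the cut so readers know there is more on the original page.
        excerpt += " ..."
    # Escape the text and the URL so they are safe inside HTML.
    link = f'<a href="{html.escape(post_url, quote=True)}">Read the original post</a>'
    return f"<p>{html.escape(excerpt)}</p><p>{link}</p>"


if __name__ == "__main__":
    print(feed_summary(
        "A long blog post about protecting your content from scrapers and thieves",
        "https://example.com/my-post",
        max_words=6,
    ))
```

Because the link travels with every excerpt, an aggregator that republishes the feed automatically also republishes the pointer back to your original.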

For your part, you, the blogger, should not infringe others' rights. You may also like to see our earlier post on attribution and related matters.