You don’t want to think about it.
(But you will after reading this post.)
What exactly is in our drinking water?
Think about it.
All our drinking water comes from rivers.
All our sewers lead to rivers.
No wonder they say that by the time a Louisiana resident takes a sip of his/her water, it’s gone through the bodies of 5 people.
Circle of life?
Now that I’ve painted a very vivid picture in your mind, let me get to the point: how do we make the best of a foul situation?
Yes, content sCRAPing stinks.
However, it’s an inevitable by-product of running a blog.
So how do we turn stink into drink?
If you want to prevent your blog from being scraped, this post is NOT for you (just skip to the end of the post; I’ve got something for you there).
If you want to take advantage of the crappy reality, read on.
What the Scrape Am I Talking About?
If you haven’t had the “pleasure” of being introduced to the concept of “content scraping”, you must’ve been a blogger for less than five minutes.
Here’s a brief definition from Wikipedia:
A scraper site is a spam website that copies all of its content from other websites using web scraping.
By most site owners’ definition, the word “copies” can be easily and justifiably replaced by the word “steals”.
Is Content Scraping Really Inevitable?
As of right now – yes.
Scrapers seem to always stay one step ahead of any attempts to nip the crap out of them.
Google seems to be powerless to do anything about them.
And the better the content you produce and the more popular you get, the more likely your site is to be scraped.
While all of that is true, I wouldn’t lose sleep over it; s*** happens, right?
As Chris Coyier says:
Instead, you could spend that time doing something enjoyable, productive, and ultimately more valuable for the long-term success of your site.
Here’s another point of view:
“There are some people who really hate scrapers and try to crack down on them and try to get every single one deleted or kicked off their web host,” says Matt Cutts in the video below.
“I tend to be the sort of person who doesn’t really worry about it, because the vast, vast, vast majority of the time, it’s going to be you that comes up at the top, not the scraper. If the guy is scraping and scrapes the content that has a link to you, he’s linking to you, so worst case, it won’t hurt, but in some weird cases, it might actually help a little bit.”
Scraped Content: the Defense
There are a few things we can and should do to give scrapers a harder time stealing our content without linking to us.
1. Deep link within your content
Deep linking refers to interlinking your blog posts.
It can also refer to externally linking from other websites to your posts rather than your home page, but that’s not what I am talking about here.
Deep linking within your own content is very simple, entirely under your control, great for your readers and your bounce rate, and if your post gets scraped, at least you might score some links out of it.
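If you are curious whether a scraped copy actually kept your deep links, here is a rough Python sketch. It is my own illustration, not a tool from any of the posts mentioned here. It assumes you have already saved the scraper's page as HTML, and `example.com` is a placeholder for your own domain.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html, my_domain):
    """Return the links in `html` whose host is `my_domain` or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""  # relative links have no host
        if host == my_domain or host.endswith("." + my_domain):
            found.append(href)
    return found
```

An empty list means the scraper stripped your links; a non-empty one means your deep linking paid off.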
2. Ping your posts
Simply put, a ping is a “this site has new content” notification that invites search engine bots to visit your blog.
By default, WordPress pings one pinging service called Ping-o-matic; that service will in turn ping others.
You can always add additional pinging services to be notified when new content is published.
Simply go to Settings > Writing in the admin panel.
By the way, you can find the list of services I ping in this post: What is a ping?
Ideally, since your posts are pinged the second they are published, that creates “proof” that you were the first one to publish the content.
It becomes important in cases where Google tries to figure out which content to rank and which to discard based on the evidence of authorship.
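For the technically curious: as far as I understand it, a ping is just a tiny XML-RPC request. The sketch below builds (but does not send) the body of a standard `weblogUpdates.ping` call using Python's built-in `xmlrpc.client`. The blog name and URL are placeholders, and WordPress does all of this for you behind the scenes, so treat it as a peek under the hood rather than something you need to run.

```python
import xmlrpc.client


def build_ping_payload(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


# Placeholder values; swap in your own blog title and address.
payload = build_ping_payload("My Example Blog", "http://www.example.com/")
print(payload)
```

The request carries only two things: the name of your blog and its URL. The pinging service takes it from there.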
3. Implement rel=”author” Markup
As of today, this is the most reliable way to establish the authorship of your content.
Reason being: in order to correctly implement the markup, you need to have editing privileges for both the Google profile and the blog that are being linked to each other.
Scrapers can’t establish that, only you can.
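Once the markup is in place, a quick sanity check does not hurt. Here is a minimal Python sketch that looks through a page's HTML for links carrying `rel="author"`. It only confirms the attribute is present; it cannot tell you whether Google accepts it, and the profile URL in the usage note is a made-up placeholder.

```python
from html.parser import HTMLParser


class RelAuthorFinder(HTMLParser):
    """Collects href values of <a> and <link> tags whose rel includes 'author'."""

    def __init__(self):
        super().__init__()
        self.author_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "author" in rel_values and attr_map.get("href"):
            self.author_links.append(attr_map["href"])


def find_author_links(html):
    """Return every href on the page that is marked with rel="author"."""
    finder = RelAuthorFinder()
    finder.feed(html)
    return finder.author_links
```

For example, `find_author_links('<a rel="author" href="https://profile.example/me">Me</a>')` returns a list containing that one placeholder URL, while a page without the markup returns an empty list.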
To learn more about how to add the rel=”author” tag, check out the following posts:
4. Use Excerpts in RSS Feed
A lot of scraping is done through RSS feeds and not your actual blog.
Logically, if your RSS feed publishes excerpts of your posts only, that’s all the scrapers will be able to lay their bots on.
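Not sure what your feed is actually handing out? Here is a small Python sketch that counts the words each feed item carries. It assumes a standard RSS 2.0 feed where full text, if present, sits in a `content:encoded` element; your feed may be structured differently, so treat the numbers as a rough guide.

```python
import re
import xml.etree.ElementTree as ET

# Namespaced tag used by many feeds for the full post body (an assumption about your feed).
CONTENT_TAG = "{http://purl.org/rss/1.0/modules/content/}encoded"


def feed_item_word_counts(rss_xml):
    """Return (title, word_count) for each <item>, counting its longest body field."""
    root = ET.fromstring(rss_xml)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        bodies = [
            item.findtext("description", default=""),
            item.findtext(CONTENT_TAG, default=""),
        ]
        longest = max(bodies, key=len)
        text = re.sub(r"<[^>]+>", " ", longest)  # crude tag stripping, good enough for counting
        results.append((title, len(text.split())))
    return results
```

If every item comes back with a few dozen words, scrapers are only getting excerpts. If the counts match your full post lengths, they are getting everything.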
By the way, if you are using FeedBurner for your RSS feed, here’s an easy way to see if your feed is being scraped:
In your FeedBurner account, go to Analyze > Uncommon Uses.
You’ll see something like this:
See all those peaks?
That most likely means that my feed was scraped quite a bit at those times.
What good does this info do? Not much really. Just fun to know.
I bet you are already typing in your FeedBurner URL to see if anyone is scraping your feed…
It’s OK. Go do it. I am not going anywhere…
There are different opinions as to whether the benefit of publishing full posts for your readers outweighs opening yourself up to scrapers.
Your blog, you decide.
Scraped Content: the Offense
Aha, now we are getting to the meat of the post.
I don’t know about you, but I have the kind of personality that likes to take advantage of crappy situations, and scraped content is no different.
The idea for this post was actually born when I read a brilliant post by Jon Cooper from PointBlankSEO.com: Introducing Scrape Rate – A New Link Metric.
While I strongly suggest you read the post for yourself, here’s the gist:
Scrape rate is a guest blogger’s best friend.
For those who guest blog, the links you get in the post only go so far. Having the content scraped, although no longer on original content, gives you more link equity.
Imagine if you wrote a guest post on an average blog that was scraped 100 different times. Now compare that link power to a guest post written on a more authoritative blog that only gets scraped once or twice.
While the original content on the second option yields more quality & trust, you can’t beat the quantity of links the first option provides.
Argue all you want, but in terms of link building, having your content scraped (as long as the links are intact) like that trumps the quality of the original source in most cases.
My first reaction after I read it was “I need to figure out more ways to get scraped!”
And Jon’s reply to me was:
Well, Jon, here’s what I came up with so far – and thanks for throwing me in with the “great bloggers”. lol
1. Smarter Guest Posting
This is just a follow-up on what Jon said in his post: if you are guest blogging anyway, might as well choose blogs that get scraped a lot more often, right?
Jon has his own quick method of determining the “scrape rate” of a blog (once again, read his post), but here’s a more advanced formula to calculate both the scrape rate and the “shareability rate” (yet another newly invented term), published by IPullRank.com:
2. Getting Mentioned by Other Bloggers
Guest posting is certainly a great way to increase web traffic and build quality one-way links.
However, it takes time.
Here’s a much quicker solution that yields the same results: being mentioned on blogs with a high scrape rate.
Let me give you an example.
I just published my monthly income report for January, in which I also gave a shout-out to the blogs that brought me the most referral traffic for that month.
Kristi Hines of Kikolani.com took the first spot.
She publishes a great series called “Fetching Friday”, in which she talks about the best reads of the week around the blogosphere.
She’s always very generous with linking to other blogs that caught her attention that week, and Traffic Generation Cafe happens to be one of the blogs Kristi reads and mentions.
However, for some inexplicable reason, WordPress absolutely refuses to send me a pingback to let me know when Kristi links to me.
The only way I find out that my blog is included on her Fetching Friday list is when I get pingbacks from the blogs that scrape her content!
So not only do the scrapers do me the favor of building links to my posts, they also let me know that I need to pay Kristi a visit to thank her for the mention.
Takeaway: write great content, make friends with power bloggers, and watch them mention your content and help you to build links.
3. Publishing YouTube Videos
YouTube is one of the most scraped sites on the internet.
Even some of my older videos get scraped all the time, not to mention my new ones.
Not only is turning articles into videos a great way to recycle your content, it’s also a great way to build links from YouTube itself and to take advantage of the scrapers.
A couple of pointers to maximize your YouTube “scrape link building”:
- Always include a link back to your blog in the description starting with http:// (duh).
- ALSO, include the link back to the same video in the description.
This way you’ll be building links to both your site AND your video, thus giving your video a chance to rank for your keywords as well.
Of course, the only way to add the video’s URL to its own description is to wait for the video to be published, copy the URL, go back to edit the video, and paste the URL into the description.
Just in case you were wondering…
4. Publish Full RSS Feed
Yes, I know – I said above that you need to publish a partial RSS feed. And you do, if all you want is to DEFEND your blog.
We are on the OFFENSIVE now, remember?
So here’s how it goes:
1. Publish a full RSS feed.
2. Install the RSS Footer plugin by Joost de Valk.
By default, the plugin will add links to your home page (with your blog title as anchor text) and your post (with post title as anchor text).
3. Take it one step further.
…and insert a couple of additional keyword-rich anchor texts to whatever you want to build links to.
The trick is to rotate the anchor text every so often to keep your link building pattern more natural.
According to Michael Gray from Wolf-Howl.com, this is the exact technique he used to rank his blog for the term “SEO blog”.
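To make the rotation idea concrete, here is a tiny Python sketch. It has nothing to do with how the RSS Footer plugin works internally (you set the footer text in the plugin's settings); it just shows one simple way to cycle through anchor-text variants so that consecutive posts do not all carry the same link text. The URLs and anchor texts are made-up examples.

```python
from html import escape


def footer_links(post_id, targets):
    """Pick one anchor text per target URL, rotating by post ID.

    `targets` maps a URL to a list of anchor-text variants (hypothetical examples).
    """
    links = []
    for url, anchors in targets.items():
        anchor = anchors[post_id % len(anchors)]  # cycle through the variants
        links.append('<a href="{}">{}</a>'.format(escape(url), escape(anchor)))
    return " | ".join(links)
```

With three variants per URL, posts 0, 1 and 2 each get a different anchor text, and post 3 starts the cycle over.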
As you can see, he’s ranking #3 for this very competitive term.
Of course, no one is saying that’s the only link building he’s done for the term, but it did help.
5. Insert Content Tracer
Sometimes your content just gets copied and pasted, not even scraped.
Most of the time, it’s not malicious at all – I do it all the time as well, when I want to quote someone.
The problem arises when those “copy-and-pasters” forget or neglect to give you credit.
Solution: Tynt Publisher Tools.
This simple and free tool will insert a piece of code on every page of your site, so that every time your text gets copied, Tynt will add a link back to the post at the end of the copied portion.
What you see in this excerpt is somewhat customizable.
I definitely suggest you keep the URL part in though for those sites that strip all HTML, like Facebook.
I learned about this nifty tool from Ann Smarty in her Track and Get Links From Those Who Copy Your Content post; thanks, Ann!
6. Hotlink Your Images
This is something I’ve never tried before, but it looks like a fun way to “stick it” to them; hat tip on this one goes to Gerald Weber.
“Hotlinking” refers to someone using the images/videos hosted on your site on their site.
Basically, you are paying for the bandwidth to host those images and the scrapers are freely using them.
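How does a server even know an image is being hotlinked? As far as I understand it, most setups look at the Referer header that browsers send along with the image request. Here is a minimal Python sketch of that check; the host names are placeholders, and real setups usually do this in the web server's configuration rather than in Python.

```python
from urllib.parse import urlparse


def is_hotlink(referer, my_hosts):
    """Return True when an image request appears to come from another site's page."""
    if not referer:
        return False  # direct visits and many feed readers send no Referer
    host = (urlparse(referer).hostname or "").lower()
    return host not in my_hosts
```

So `is_hotlink("http://scraper.test/stolen-post/", {"example.com", "www.example.com"})` is `True`, while a request coming from one of your own pages is not flagged. Once you can tell the two apart, you get to decide what the hotlinker sees.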
Here’s what you can do:
Here’s a great tutorial on what you can do to prevent hotlinking:
Well, I am out of ideas.
If you have any clever ways to take advantage of scrapers, I’d love to hear them in the comments.
If you don’t, I’d still love to hear from you – this is a contest after all. lol
Now onto the last part of the post…
For My Pessimistic Friends
I do realize that some bloggers want to have nothing to do with content scrapers other than eradicate as many of them as possible.
Personally, there are only two reasons I’d spend my time fighting them:
- They strip all the links from my content, giving me no credit whatsoever.
- The content they scraped from me ranks higher than my original post.
It pays to make friends….
As far as I know, scraped content has never ranked above my own yet.
I suppose there’s a first time for everything?
Here are some of the best blog posts I found to equip you with everything you need IF and WHEN you are ready to take the scrapers on:
How to Put the Kibosh on Content Scrapers and Thieves – FamousBloggers.net
The Definitive Guide to Blog Content Scraping & How to Stop It! – HyperArts.com
Content Scraping: Prevention, Repercussions, and…Benefits? – BlueGlass.com
What Do You Do When Someone Steals Your Content – Lorelle.Wordpress.com
And here’s one that shows to what length scrapers could go to get away with what they do – in this case, using Homoglyphs:
How Scrapers Take What They Need And Leave – TheBitBot.com
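In case homoglyphs are new to you: they are characters that look identical but are different under the hood, like a Cyrillic "а" standing in for a Latin "a". My understanding is that swapping them in makes copied text look the same to a reader while no longer matching the original letter for letter. Here is a small Python sketch that spots and reverses a handful of them. The table is a tiny sample I picked for illustration, nowhere near complete.

```python
# A tiny, incomplete sample of Cyrillic letters that look like Latin ones.
LOOKALIKES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
}


def count_lookalikes(text):
    """Count how many characters in `text` come from the look-alike table."""
    return sum(1 for ch in text if ch in LOOKALIKES)


def unmask(text):
    """Swap the look-alike characters above back to plain Latin letters."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)
```

If `count_lookalikes` returns anything above zero for a page that is supposedly plain English, something fishy is going on, and running `unmask` on it before comparing it with your own post makes the match obvious again.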
We all hate scrapers – no question about it.
What we are willing to do about them depends on the damage and the amount of time we are willing to invest in this quest.
You should now have all the resources you need to deal with the issue however you like.
Before you run off on me, do me a favor please:
I “borrowed” this image from Cori Padgett‘s blog – giving credit where credit is due!