You don’t want to think about it.
(But you will after reading this post.)
What exactly is in our drinking water?
Think about it.
All our drinking water comes from rivers.
All our sewers lead to rivers.
No wonder they say that by the time a Louisiana resident takes a sip of his/her water, it’s gone through the bodies of 5 people.
Circle of life?
Now that I’ve painted a very vivid picture in your mind, let me get to the point: how do we make the best of a foul situation?
Yes, content sCRAPing stinks.
However, it’s an inevitable by-product of running a blog.
So how do we turn stink into drink?
If you want to prevent your blog from being scraped, this post is NOT for you (just skip to the end of the post; I’ve got something for you there).
If you want to take advantage of the crappy reality, read on.
What the Scrape Am I Talking About?
If you haven’t had the “pleasure” of being introduced to the concept of “content scraping”, you must’ve been a blogger for less than five minutes.
Here’s a brief definition from Wikipedia:
A scraper site is a spam website that copies all of its content from other websites using web scraping.
By most site owners’ definition, the word “copies” can be easily and justifiably replaced by the word “steals”.
Is Content Scraping Really Inevitable?
As of right now – yes.
Scrapers seem to always stay one step ahead of any attempts to nip the crap out of them.
Google seems to be powerless to do anything about them.
And the better content you produce, the more popular you get, the more likely your site is to be scraped.
While all of that is true, I wouldn’t lose sleep over it; s*** happens, right?
As Chris Coyier says:
Instead, you could spend that time doing something enjoyable, productive, and ultimately more valuable for the long-term success of your site.
Here’s another point of view:
“There are some people who really hate scrapers and try to crack down on them and try to get every single one deleted or kicked off their web host,” says Matt Cutts in the video below.
“I tend to be the sort of person who doesn’t really worry about it, because the vast, vast, vast majority of the time, it’s going to be you that comes up at the top, not the scraper. If the guy is scraping and scrapes the content that has a link to you, he’s linking to you, so worst case, it won’t hurt, but in some weird cases, it might actually help a little bit.”
I agree.
Scraped Content: the Defense
There are a few things we can and should do to give scrapers a harder time stealing our content without linking to us.
1. Deep link within your content
Deep linking refers to interlinking your blog posts.
It can also refer to externally linking from other websites to your posts rather than your home page, but that’s not what I am talking about here.
Deep linking within your own content is very simple, entirely under your control, great for your readers and your bounce rate, and if your post gets scraped, at least you might score some links out of it.
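If you want to check whether your deep links survived in a scraped copy, here is a minimal Python sketch. It assumes you have already saved the scraper's HTML somewhere; the domain example-blog.com and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back_to(html, my_domain):
    """Return the links in `html` that point at `my_domain` or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        host = urlparse(href).netloc.lower()
        if host == my_domain or host.endswith("." + my_domain):
            found.append(href)
    return found


# Hypothetical scraped copy of a post that had two internal links in it.
scraped = (
    '<p>Read <a href="http://example-blog.com/what-is-a-ping/">What is a ping?</a> '
    'and <a href="http://www.example-blog.com/income-report/">my report</a>, '
    'or visit <a href="http://other-site.com/">someone else</a>.</p>'
)
surviving_links = links_back_to(scraped, "example-blog.com")
print(len(surviving_links), "links still point back to the original blog")
```

If the count comes back as zero, the scraper stripped your links and you got nothing out of the deal.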
2. Ping your posts
Simply put, a ping is a “this site has new content” notification that invites search engine bots to visit your blog.
By default, WordPress pings one pinging service called Ping-o-matic; that service will in turn ping others.
You can always add additional pinging services to be notified when new content is published.
Simply go to Settings > Writing in the admin panel.
By the way, you can find the list of services I ping in this post: What is a ping?
Ideally, since your posts are pinged the second they are published, that creates “proof” that you were the first one to publish the content.
It becomes important in cases where Google tries to figure out which content to rank and which to discard based on the evidence of authorship.
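For the curious, a ping is just a tiny XML-RPC call. The sketch below builds the request for the standard weblogUpdates.ping method using Python's built-in xmlrpc.client module. The blog name and URL are placeholders, and the Ping-o-matic endpoint shown in the comment is my assumption, so double-check it before relying on it. WordPress does all of this for you; this is only to show what goes over the wire.

```python
import xmlrpc.client


def build_ping_request(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


def send_ping(service_url, blog_name, blog_url):
    """Send the ping to one pinging service (needs network access)."""
    server = xmlrpc.client.ServerProxy(service_url)
    return server.weblogUpdates.ping(blog_name, blog_url)


# Placeholder blog details, for illustration only.
payload = build_ping_request("Example Blog", "http://example-blog.com/")
print(payload)

# To actually send it (endpoint is an assumption, verify it first):
# send_ping("http://rpc.pingomatic.com/", "Example Blog", "http://example-blog.com/")
```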
3. Implement rel="author" Markup
As of today, this is the most reliable way to establish the authorship of your content.
Reason being: in order to correctly implement the markup, you need to have editing privileges for both the Google profile and the blog that are being linked to each other.
Scrapers can’t establish that, only you can.
To learn more about how to add the rel="author" tag, check out the following posts:
and/or
4. Use Excerpts in RSS Feed
A lot of scraping is done through RSS feeds and not your actual blog.
Logically, if your RSS feed publishes excerpts of your posts only, that’s all the scrapers will be able to lay their bots on.
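To make the idea concrete, here is a rough Python sketch of what "excerpt only" means for a feed item. The 55-word cutoff, the function name, and the "Read the full post" wording are my own assumptions; in WordPress you would simply flip the setting instead of writing any code.

```python
import re


def make_excerpt(post_html, max_words=55, more_url=None):
    """Strip HTML tags and keep only the first `max_words` words of a post."""
    text = re.sub(r"<[^>]+>", " ", post_html)
    words = text.split()
    excerpt = " ".join(words[:max_words])
    if len(words) > max_words:
        excerpt += " [...]"
    if more_url:
        # The link back is the part you want scrapers to republish.
        excerpt += " Read the full post: " + more_url
    return excerpt


long_post = "<p>" + "word " * 100 + "</p>"
print(make_excerpt(long_post, max_words=10, more_url="http://example-blog.com/post/"))
```

A scraper pulling this feed gets ten words and a link back, not the whole post.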
By the way, if you are using FeedBurner for your RSS feed, here’s an easy way to see if your feed is being scraped:
In your FeedBurner account, go to Analyze > Uncommon Uses.
You’ll see something like this:

See all those peaks?
That most likely means that my feed was scraped quite a bit at those times.
What good does this info do? Not much really. Just fun to know.
I bet you are already typing in your FeedBurner URL to see if anyone is scraping your feed…
It’s OK. Go do it. I am not going anywhere…
There are different opinions as to whether the benefit of publishing full posts for your readers outweighs opening yourself up to scrapers.
Your blog, you decide.
Scraped Content: the Offense
Aha, now we are getting to the meat of the post.
I don’t know about you, but I have the kind of personality that likes to take advantage of crappy situations, and scraped content is no different.
The idea for this post was actually born when I read a brilliant post by Jon Cooper from PointBlankSEO.com, “Introducing Scrape Rate – A New Link Metric”.
While I strongly suggest you read the post for yourself, here’s the gist:
Scrape rate is a guest blogger’s best friend.
For those who guest blog, the links you get in the post only go so far. Having the content scraped, although no longer on original content, gives you more link equity.
Imagine if you wrote a guest post on an average blog that was scraped 100 different times. Now compare that link power to a guest post written on a more authoritative blog that only gets scraped once or twice.
While the original content on the second option yields more quality & trust, you can’t beat the quantity of links the first option provides.
Argue all you want, but in terms of link building, having your content scraped (as long as the links are intact) like that trumps the quality of the original source in most cases.
My first reaction after I read it was “I need to figure out more ways to get scraped!”
And Jon’s reply to me was:

Well, Jon, here’s what I came up with so far – and thanks for throwing me in with the “great bloggers”. lol
1. Smarter Guest Posting
This is just a follow-up on what Jon said in his post: if you are guest blogging anyway, might as well choose blogs that get scraped a lot more often, right?
Jon has his own quick method of determining the “scrape rate” of a blog (once again, read his post), but here’s a more advanced formula to calculate both the scrape rate and the “shareability rate” (yet another newly invented term), published by IPullRank.com:
2. Getting Mentioned by Other Bloggers
Guest posting is certainly a great way to increase web traffic and build quality one-way links.
However, it takes time.
Here’s a much quicker solution that yields the same results: being mentioned on blogs with high scrape rate.
Let me give you an example.
I just published my monthly income report for January, in which I also gave a shout-out to the blogs that brought me the most referral traffic for that month.
Kristi Hines of Kikolani.com took the first spot.
She publishes a great series called “Fetching Friday“, in which she talks about the best reads of the week around the blogosphere.
She’s always very generous with linking to other blogs that caught her attention that week, and Traffic Generation Cafe happens to be one of the blogs Kristi reads and mentions.
However, for some inexplicable reason, WordPress absolutely refuses to send me a pingback to let me know when Kristi links to me.
The only way I find out that my blog is included on her Fetching Friday list is when I get pingbacks from the blogs that scrape her content!
So not only do the scrapers do me a favor by building links to my posts, but they also let me know that I need to pay Kristi a visit to thank her for the mention.
Takeaway: write great content, make friends with power bloggers, and watch them mention your content and help you to build links.
3. Publishing YouTube Videos
YouTube is one of the most scraped sites on the internet.
Even some of my older videos get scraped all the time, not to mention my new ones.
Not only is turning articles into videos a great way to recycle your content, it’s also a great way to build links from YouTube itself, as well as take advantage of the scrapers.
A couple of pointers to maximize your YouTube “scrape link building“:
- Always include a link back to your blog in the description starting with http:// (duh).
- ALSO, include the link back to the same video in the description.
This way you’ll be building links to both your site AND your video, thus giving your video a chance to rank for your keywords as well.
Of course, the only way to add the video’s URL to its own description is to wait for the video to be published, then copy the URL, go back to edit the video, and paste the URL into the description.
Just in case you were wondering…
4. Publish Full RSS Feed
Yes, I know – I said above that you need to publish a partial RSS feed. And you do if all you want to do is to DEFEND your blog.
We are on the OFFENSIVE now, remember?
So here’s how it goes:
1. Publish a full RSS feed.
2. Install the RSS Footer plugin by Joost de Valk.
By default, the plugin will add links to your home page (with your blog title as anchor text) and your post (with post title as anchor text).
3. Take it one step further.
…and insert a couple of additional keyword-rich anchor texts to whatever you want to build links to.
The trick is to rotate the anchor text every so often to keep your link building pattern more natural.
According to Michael Gray from Wolf-Howl.com, this is the exact technique he used to rank his blog for the term “SEO blog“.

As you can see, he’s ranking #3 for this very competitive term.
Of course, no one is saying that’s the only link building he’s done for the term, but it did help.
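Here is a small Python sketch of the rotating footer idea. To be clear, this is not the RSS Footer plugin's code (the plugin is a WordPress plugin, and you would type your footer into its settings); the function name, anchor texts, and URLs below are all hypothetical. It only shows how rotating by post number gives consecutive posts different anchor text.

```python
from html import escape


def rss_footer(post_title, post_url, blog_title, blog_url, extra_links, post_number):
    """Build a feed footer linking to the post, the blog, and one rotating extra link."""
    # Rotate by post number so consecutive posts carry different anchor text.
    anchor_text, target_url = extra_links[post_number % len(extra_links)]

    def link(url, text):
        return '<a href="{}">{}</a>'.format(escape(url), escape(text))

    return "<p>{} was first published on {}. Related: {}.</p>".format(
        link(post_url, post_title),
        link(blog_url, blog_title),
        link(target_url, anchor_text),
    )


# Hypothetical anchor texts and targets, for illustration only.
EXTRA_LINKS = [
    ("increase web traffic", "http://example-blog.com/traffic/"),
    ("link building tips", "http://example-blog.com/links/"),
    ("what is a ping", "http://example-blog.com/what-is-a-ping/"),
]

first = rss_footer("Post A", "http://example-blog.com/a/", "Example Blog",
                   "http://example-blog.com/", EXTRA_LINKS, post_number=0)
second = rss_footer("Post B", "http://example-blog.com/b/", "Example Blog",
                    "http://example-blog.com/", EXTRA_LINKS, post_number=1)
print(first)
print(second)
```

Every scraper that republishes the full feed item republishes the footer with it, links and all.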
5. Insert Content Tracer
Sometimes your content just gets copied and pasted, not even scraped.
Most of the time, it’s not malicious at all – I do it all the time as well, when I want to quote someone.
The problem arises when those “copy-and-pasters” forget or neglect to give you credit.
Solution: Tynt Publisher Tools.
This simple and free tool will insert a piece of code on every page of your site, so that every time your text gets copied, Tynt will add a link back to the post at the end of the copied portion.
Like this:

What you see in this excerpt is somewhat customizable.
I definitely suggest you keep the URL part in though for those sites that strip all HTML, like Facebook.
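In case you wonder what the tool is doing under the hood, the logic boils down to something like the Python sketch below. Tynt itself runs as a script in the reader's browser, so this is only an illustration of the idea; the function name and the "Read more:" prefix are my assumptions, not Tynt's actual settings.

```python
def with_attribution(copied_text, post_url, prefix="Read more:"):
    """Append a plain-text link back to the post to whatever was copied."""
    # The URL is written out in full so it survives on sites that strip all HTML.
    return "{}\n\n{} {}".format(copied_text.rstrip(), prefix, post_url)


clip = with_attribution("Yes, content scraping stinks. ", "http://example-blog.com/post/")
print(clip)
```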
I learned about this nifty tool from Ann Smarty in her Track and Get Links From Those Who Copy Your Content post; thanks, Ann!
6. Hotlink Your Images
This is something I’ve never tried before, but it looks like a fun way to “stick it” to them; hat tip on this one goes to Gerald Weber.
“Hotlinking” refers to someone using the images/videos hosted on your site on their site.
Basically, you are paying for the bandwidth to host those images and the scrapers are freely using them.
Here’s what you can do:

Here’s a great tutorial on what you can do to prevent hotlinking:
SO YOU WANT TO STOP HOTLINKING AND BANDWIDTH THEFT
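One common way to handle hotlinking is to look at the Referer header on each image request and serve a different image when the request comes from someone else's page. On most blogs this is done in the web server configuration rather than in code, so treat the Python sketch below as an illustration of the decision only. The host names and the swap image path are made up.

```python
from urllib.parse import urlparse


def is_hotlink(referer, allowed_hosts):
    """True when an image request comes from a page on someone else's site."""
    if not referer:
        # No Referer at all: direct visits, feed readers, privacy tools. Let it through.
        return False
    host = urlparse(referer).hostname or ""
    return not any(host == h or host.endswith("." + h) for h in allowed_hosts)


def image_to_serve(requested_path, referer, allowed_hosts,
                   swap_path="/images/hotlinked.png"):
    """Pick the real image for your own pages and a swap image for hotlinkers."""
    if is_hotlink(referer, allowed_hosts):
        return swap_path
    return requested_path


MY_HOSTS = ["example-blog.com"]  # hypothetical domain
print(image_to_serve("/images/chart.png", "http://example-blog.com/post/", MY_HOSTS))
print(image_to_serve("/images/chart.png", "http://scraper-site.com/copy/", MY_HOSTS))
```

What goes into the swap image is up to you; a banner with your blog's URL on it would turn the scraper's page into an ad for you.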
Well, I am out of ideas.
If you have any clever ways to take advantage of scrapers, I’d love to hear them in the comments.
If you don’t, I’d still love to hear from you – this is a contest after all. lol
Now onto the last part of the post…
For My Pessimistic Friends
I do realize that some bloggers want to have nothing to do with content scrapers other than eradicate as many of them as possible.
Personally, there are only two reasons I’d spend my time fighting them:
- They strip all the links from my content, giving me no credit whatsoever.
- The content they scraped from me ranks higher than my original post.
The first instance happened to me twice, and both times it was brought to my attention by other bloggers, Mavis Nong and JamestheJust – thank you!
It pays to make friends….
As far as I know, scraped content has never ranked above my own yet.
I suppose there’s a first time for everything?
Here are some of the best blog posts I found to equip you with everything you need IF and WHEN you are ready to take the scrapers on:
How to Put the Kibosh on Content Scrapers and Thieves – FamousBloggers.net
The Definitive Guide to Blog Content Scraping & How to Stop It! – HyperArts.com
Content Scrapers – How to Find Out Who is Stealing Your Content & What to Do About It – KISSMetrics.com
How to Identify, Disable and Prevent Scrapers from Thieving Your Blogs’ Content – WPmu.org
Content Scraping: Prevention, Repercussions, and…Benefits? – BlueGlass.com
What Do You Do When Someone Steals Your Content – Lorelle.Wordpress.com
And here’s one that shows to what length scrapers could go to get away with what they do – in this case, using Homoglyphs:
How Scrapers Take What They Need And Leave – TheBitBot.com
Marketing Takeaway
We all hate scrapers – no question about it.
What we are willing to do about them depends on the damage and the amount of time we are willing to invest in this quest.
You now should have all the resources you need to deal with the issue however you like.
BUT…
Before you run off on me, do me a favor please:

I “borrowed” this image from Cori Padgett‘s blog – giving credit where credit is due!



Ana -
I love that you take a controversial viewpoint here – one of the reasons that I stalk you online…I mean professionally, not in a creeper “get a restraining order” way. Ew.
But the food for thought here is solid – my only issue is that my scrapers seemed to strip out all my links in every case, or rewrite the content (but as the original author it’s obvious I was scraped).
Autobloggers, on the other hand – using various automated means to scrape – these may all of a sudden be my best friends. Very interesting P.O.V.!
James Hussey aka JamestheJust recently posted..Increasing Conversions And Knowing Why People Buy Online
Did you just say that you are a professional stalker, James? I’d keep that piece of info to yourself… lol
Chasing the scrapers – who has the time for that? Might as well make the best of it.
Thanks for coming by!
Ana Hoffman recently posted..Keep Your Money, Honey? (Internet Marketing Tools 2011 Recap)
Another way to have scrapers help you is to actually link to your google profile page at the end of every post. This way, not only are you getting deep linking benefits but you are also boosting your profile page cred as well because the scrapers are scraping your links.
Personally, I have never worried about scrapers. Usually, they seem too desperate and I kind of feel sorry for those behind them to be real honest.
Good point, Leo – might as well build up our profiles that matter!
Ana Hoffman recently posted..Traffic Generation Cafe Monthly Income Report: January 2012
It seems rather counterintuitive after the mantra of “article marketing is dead” that’s been preached for the last X number of years but actually, the way I’ve most often been scraped is by putting articles up on these low rent article sites. If there is real value in being scraped (I’m still not that convinced) then finding a technique to post up to article directories which works for being scraped might actually be a great way to go??
…I don’t believe I just wrote that..
Stoked SEO recently posted..How fatherhood made me a better link builder
Can’t believe I didn’t think about it, but you are very right.
The entire point of article marketing is syndication, and syndication means links, if not traffic.
Again we are not talking about quality here, but quantity does help.
Ana Hoffman recently posted..How to REALLY Create a Popular Blog From Scratch
Hi there! Nice post. If you were to ask me, if I were to scrape content, something which I never did, nor plan to do (honestly), I’d take the full text, copy it, then paste it in a .txt, then copy it again, and then paste it in my own blog, adding a line at the end: Source… In that way, you’d get rid of any scripts, marks, html, and the lot.
I think that if you do not want to be copied, then disable the right-click of the mouse; that way, you will discourage most scrapers. Primitive, maybe. Annoying, most definitely, but it surely works.
Regards
Andrea recently posted..Las redes sociales, otra herramienta para mejorar el posicionamiento en Google
Keep in mind most scrapers do not click and then copy and paste. The majority of them scrape via RSS feed. Usually these types of thieves are far too lazy to do something that takes 3 steps (copy, paste in text file, copy and paste again) or I mean 4 things.
Of course I’m sure there are some that do just copy and paste, which is where things like Tynt would help, although I don’t think these people would trouble with taking the steps to make sure they remove all links; we are still dealing with lazy people.
That being said if someone did go to those lengths and I become aware of it then I would take the extra steps to pursue them. It’s one thing if some douche is using scraping software, although still unethical and a bad idea, it’s much different than when someone copies your work and takes extra steps to take away credit from the original source and then post it on their own blog and try to pass it off as their own.
Regarding the disable right click, while it might be fun if you leave a crafty message on the popup, it is still simple to get around so, it won’t stop anyone that is computer literate.
Good point about disabling the right-click, Andrea.
Too bad most scrapers do it automatically through the use of scraping software.
Ana Hoffman recently posted..35 Headache-Free Split Testing Resources to Increase Your Conversions and Sales
Hi Ana,
this is not only an interesting article on an important topic, but I will also add more than just my two cents on it.
I work for a media outlet that not only has issues with scrapers, but also officially distributes its content to over 400 newspapers; most of them have websites, many of which often outrank us.
As per scrapers – I don’t really bother anymore because we hold the stronghold tight – implementing all of the above mentioned techniques and more, our situation is really funny because what works against us is the actual success over time – the more we reach out to what we are supposed to do the more we get outranked, thus resulting in a very hard to maintain equilibrium, which I called “walking on a thin ice between abysses of fire”, yes it is hell for an in-house SEO LOL. Especially when you are being asked why the traffic goes down.
The media name is “Project Syndicate” in case you want to know, I have no issues to disclose it. You may grab a feed and post on your page
Just came back from Project Syndicate – thanks for the resource, Boris!
And I do know what you mean when you said that popularity goes hand-in-hand with bigger scraping problems, ESPECIALLY when you are outranked.
Ana Hoffman recently posted..35 Headache-Free Split Testing Resources to Increase Your Conversions and Sales
Great, comprehensive post about Content scraping Ana.
Content scrapers don’t bother me much, as long they cite the original link.
When they don’t, I write them an email (to correct or else…) and so far they correct.
Most of the time I’ll find their emails from the domain name registrar.
Boutros.
Boutros recently posted..How To Add Sortable Custom Columns in WordPress Dashboard
I am absolutely with you, Boutros – I only spend as much time as I need to to resolve the situation and not a second longer.
Hey Ana,
Thanks for letting me know about your post. After just having written about Tynt, you’ve also shared a few other sources here I was not aware of.
Really great post with so many tips for those people who worry about scraping. So far I’ve only known one blogger who got her entire site scraped and she did have it shut down. I haven’t found that any of my content has been scraped yet but I’m sure a few have snuck in.
These are some great tips so thank you so much for this detailed list. This is definitely a fabulous share so I’ll do my part.
~Adrienne
Adrienne recently posted..How To Get Others To Build Your Links
It’s kind of funny we ended up writing about the same tool, Adrienne, but it’s that easy and good!
Wow, her entire blog was scraped? That’s a bummer…
Ana,
I agree with Adrienne, this is a great post with tons of tips. Again thank you for sharing this in our discussion. I know this is a growing concern amongst many bloggers and yet there are tons of other bloggers who have no idea that their content could be scraped, or recycled.
What I like about what you shared is that if you can’t beat them, let them help you with your traffic! As we can’t block everyone all the time from stealing our content we can take some preventive measures as well as a few hidden attributes that allow us to gain some potential traffic.
Thanks for the hot tips and tools.
Ken Pickard
The Network Dad
Ken Pickard recently posted..5 Benefits Of Having An Accountability Partner
You are very welcome, Ken.
Another thing I like about this is the fact that you don’t need to actively spend your time implementing these techniques. It’s more of a “set it and forget it” kind of process.
Ana Hoffman recently posted..Throwing Spaghetti: How to Leverage the Traffic You Have To Get The Traffic You Want
At first I was really annoyed by scrapers, but I have come to a point that I no longer care whether someone does scrape my articles or not. While there is no proof, I believe that with certain “counter measures”, there is a way for search engines to find out where the original content came from. For example, one technique that most scrapers use to scrape articles is via RSS and automatically. They don’t even bother looking at it, so in this case, my using RSS footer helps. Another way of giving signals to search engines that you are the original source of content is indeed via the authorship markup and using rich snippets. Thanks for the mention btw… great article as always
DiTesco recently posted..Inbound Marketing And Headline Formulas (free ebooks)
I agree, Francisco – as long as we establish authorship (which we should do for many reasons), Google will most likely (even though not always) do the rest.
Thanks for coming by!
Ana Hoffman recently posted..How to Increase Website Conversion Rates with Derek Halpern
Awesome Stuff here Ana! For sure guna put this to use.
I remember the first time I got scraped, I was mad, but more than that I was excited because I was about to send the guy a DMCA! lol, course I’ve mellowed out a bit since then, not much I can do about it anyway. Might as well make lemonade!
Micah recently posted..Three Kinds of People Who Have No business Blogging (so stop supporting them)
Yes, I remember my first time as well…
Took so much of my time!
Now I too prefer lemonade. lol
Thanks for coming by, Micah!
Ana Hoffman recently posted..Majestic SEO Site Explorer: How a Reader Made Me Eat My Words
Ana that was a very enjoyable post to read, certainly you touch on the many points that seem to keep businesses and bloggers up at night
One other point is to measure what topics will be more likely to get scraped and sometimes I focus on them if I’m trying to get some content scraped. Also you can review what was the quality of scrapers that hit you last time so you can measure how the new post performed. Also if you are doing this turn off your auto publishing of pingbacks.
P.S. I also got an Internal Server Error (500) when I tried the link to your income report
David recently posted..Facebook Groups Changes Again
Good point, David – I did notice that some of the topics on my blog get scraped a lot more often (like anything related to making money online).
Must’ve been a temporary glitch; the site is back up and running.
Ana Hoffman recently posted..Conversion Optimization: How to Make More Money with Less Traffic?
I actually like getting the pingbacks because that is often how I can find someone scraping my stuff. I don’t allow them to have the trackback link of course.
Gerald Weber recently posted..Get the Facts About Social Media
Actually, trackbacks and pingbacks are a bit different, Gerald – the WP system uses pingbacks, so any blogs that are legitimately linking to you (even if it’s scraped content), will show up even if you disable trackbacks.
However, there are different plugins out there, like Digi Auto Links, that will send spam trackbacks even when they are not actually linking to you – those are the ones I am concerned about.
Good thing Andy Bailey added an option to disable all Digi Auto links in CL Premium.
Content scraping is certainly an issue which is still to be countered. I have previously used Tynt, which adds some characters at the end of the URL; that’s why I stopped using it.
Rahul Tilloo recently posted..11 On Page SEO Factors Which Really Counts
You have an option to disable adding those characters, Rahul.
Ana Hoffman recently posted..2535 Words on How to Turn sCRAPed Content into Link Building Goldmine
Hey Ana,
Until you brought this to my attention with this excellent post, I had no idea this even happened. Of course I knew that people may re-publish content, but not to this extent..
I feel that going on the offensive may be the best strategy, in my opinion anyway. Who knows though, I guess you just have to test it.
BTW, now I know why you were so worried about having the rel="author" working correctly on your blog, when we re-designed it. I need to make sure it is working correctly on mine now. And Jon Cooper is right, you are a great blogger.
Thanks for sharing this Ana.
Ian from IM Graphic Designs
Ian Belanger recently posted..IM Graphic Designs ReDesigns The Traffic Generation Cafe Blog
Thank you so much, Ian – nothing like a healthy dose of flattery to start my Saturday with.
And yes, I think we’ve got rel="author" covered now; thanks for your hard work on it!
Ana Hoffman recently posted..How to Increase Website Conversion Rates with Derek Halpern
Hi, great post. I am going to have to figure out the process. But the reality is, if they steal my writing they may get what they deserve. But I have started posting my author on every post she does with her ad. I also credit her on my about page for the website. I also include in-post links to most other posts in the same category, but if they are just taking the RSS this would not help. I also did the social hits on this post just to help get you more famous for all the right reasons.
Bruce Mackay recently posted..Survival Water
Sounds like you’ve got it covered as much as you can, Bruce.
Ana Hoffman recently posted..How to REALLY Create a Popular Blog From Scratch
Thanks so much for your valuable tips. It is really amazing to read and it has benefited me a lot.
Imran Elahi recently posted..How to Convert Twitter Followers into Your Blog Traffic
You are very welcome, Imran.
Ana, when I found my content being scraped, the worst part was that I had to read it, again.
I thought “Uncommon Uses” stood for “actually reading.” I wasn’t mad. Before. Still, when I noticed that my URL was posted (checking the “moderated” pingbacks) I thought, “Well, you have to kiss a lot of frogs, and frogs that give me linkbacks . . . at least someone asked me to dance.” When the name of the game is publicity, he who steals my purse, gets my URL spray-painted on his back. Cool, fun-to-read article as usual.
Astro Gremlin recently posted..Watershed Day for Blogs News Reviews
Every sentence is a punch line, Astro… Love it!
This is a great post Ana.
My only objection has to do with the value of links from scraper sites. Getting links back from duplicate content pages is pretty pointless as it doesn’t pass any value. Also, scraper sites got hit hard after the various Panda updates, therefore those domains send signals of spam and low trust.
True.
However, time and time again, we see that pure quantity of seemingly useless links still plays a big role in rankings.
Those kinds of links can’t hurt your site; otherwise, your competition would be able to send a bunch of low quality links and have you penalized.
And since they can’t hurt your site and you’ll get them no matter what, you might as well take advantage of them.
Some of them are on well-established domains and do pass link juice.
Oh, thanks for this post, Ana; it will prove to be like another lifeline for all those who are suffering from these kinds of things.
Besides that, I would only like to say this to everyone: if you implement rel="author" markup on your site/blog, everything will turn out right.
Awesome article Ana! Well-researched and great resources within.
I have been scraped many times in the past and used to really worry about it.
However, line of code by line of code and iteration by iteration, I am pretty sure that I have come up with a format that is pretty much self-rewarding in terms of off-site SEO when it comes to getting scraped.
Now, if I get scraped, I’m actually kinda flattered…it doesn’t bother me one bit.
Interesting point about scrape-rate. That is the first time I have heard that, but it kinda makes sense in a weird “social media” kind of way. LOL!
It was very smart and very appropriate to include that excerpt from Matt Cutts video and he described exactly what I have found to be true empirically with respect to this subject and that is this:
If the content is indexed on your site first, 9 times out of 10, you have absolutely nothing to worry about and this has indeed been my experience. As you pointed out, pinging is key to that.
Initial indexation is the real deal and in many cases an SEO anchor of sorts.
In terms of SEO benefits, personally, I believe that using technologies (some of which you pointed out) is the key to siphoning the potential SEO benefits…at the very least, it will help reaffirm to Google that you are the author and source, and at best, it may actually push you up in search results for some long tail keywords, which is actually what I am researching right now.
We actually coded our blog to double as an article spinner (I know it sounds crazy) just to test how far we can leverage scraper activity.
See an example here: http://thebitbot.com/benefits-spinnable-articles/
(Refresh the page a few times and watch the article and the HTML beneath it change. Scrapers love these because they get auto-unique content.)
So far the results have been inconclusive, but we are collecting data.
If we could figure out a way to coerce black hat SEOs to completely take over our off-site SEO for us, half of which they are doing already, then I would be just fine with that. LOL!
Looking back, this article has reminded me of several posts I have written encouraging autoblogging/scraping technologies and techniques detailing how to get content from my own site and RSS feed. How’s that for bold?
Either way, I am glad you wrote this article Ana. I know that I and many of my friends can easily and effortlessly get to any content on anyone’s site and instantly transform it into great readable content that is completely unidentifiable by Google or Copyscape.
Even though I don’t have malicious intent, there are tons of people like me who do and use their skills on a daily basis.
I have never seen any technology or any content blocker or trick that could stop anyone even remotely determined to steal or scrape.
The bottom line for me is: If you write great content, continue to do so. You can spend a little time fighting content pirates, but you will soon see the effects of the “law of negative returns”. This war is being more effectively fought by the coders than the lawyers.
However, if you outsource your content generation, BE VERY CAUTIOUS.
Great read Ana! I like reading your posts that are a bit more on the aggressive side like this.
Mark
P.S. I live in Louisiana, Baton Rouge specifically, and we figured out we could avoid drinking other people’s 5X pee by tapping a deep aquifer that is naturally filtered by sand and clay.
I actually have an extensive background in and have many expert colleagues very skilled in the detection of toxic and infectious agents…and one thing we will NEVER be doing is drinking Mississippi river water, you can bet on that…LOL!
Loved reading your comment, Mark; it’s like Part 2 to my post.
I know you do a lot of adventurous experiments on your blog and always love hearing about your concrete results, whenever you have any.
Now, that HTML code under your posts is for copy-and-pasters, right? When content is auto-scraped, it just scrapes your post, that’s it.
So what benefit would auto-spinning your own post (I can’t believe you actually thought of doing it!) bring in that case? Different scrapers get different versions of the same post?
PS I hear you on the water issue!
Ana, this is such a great post. You changed the way I think about content scraping and I’ve got a couple takeaways I need to implement! Great job and good luck in the contest!
Ben Jackson recently posted..1,000+ Easy Backlinks: FREE, Simple, & Effective
Glad to hear that, Ben; thanks for coming by!
I get sites that use or steal my screen shot images and my tutorials I write. They add them to their site, but use my bandwidth which is a little disappointing. I don’t have time to go hunting them down. I just happen to see it in my logs when I am checking. I was going to do the hotlink or leech protect thing, but keeping up with all that can be time-consuming too. I don’t want to block or prevent something by mistake either. I do like to vent about it on occasion, but don’t let it bother me too much.
Ray recently posted..Google Pagerank Update February 6th, 2012
Hotlinking sounds like a good solution for you, Ray, and kind of fun as well. lol
LMAO! Thanks for the link love Ana.
You know, I have the same issue… I’m rarely aware of links back to me until I get pinged from a scraper site. Or I get the ping months later, and totally missed the party. Drives me nuts; I wish WP would fix that particular glitch somehow. And my view on scraper sites is that, hey, it means your shite is worth stealing, so it’s almost a backhanded compliment. Lol. And pardon any typos, iPhones are a bear to comment on! Xo
Yes, what’s up with getting a pingback from someone who mentioned you on their blog 3-4 months later?
Glad you said it, Cori; I thought something was wrong with my blog…
Thanks for coming by!
Ana Hoffman recently posted..2535 Words on How to Turn sCRAPed Content into Link Building Goldmine
Ana, this is an excellent article to teach those of us who did not know their content is being stolen. I will work in many of these suggestions. I have been adding links back to my blog on my YouTube videos for 3 years and will add the link to the video also as suggested.
I have been using a great webhost since 2009 and I have unlimited video hosting there and add my link in the description there. I feel that hosting my videos with my web host is more professional and the viewer is not exposed to YouTube ads. When someone is on my site, I don’t want them distracted.
Danielle Parsons recently posted..Host Then Profits
Good point about hosting your own videos, Danielle – I agree, looks much more professional when it’s not a YT video.
Have a great week!
I see this scraping more with articles I post on article directories and on guest posts (there were about 16 of them for my ProBlogger post). The ones that just copy the article and include the link aren’t bad. But I did see some that stripped out the links and/or spun the content, which violate the terms of the article directories.
At first, I tried to get them to take it down or fix it. One guy did fix it, but most just ignored any attempts at contact. I reported a couple to their hosting provider and one took down the entire site while most asked me to fill out the copyright forms.
After a couple rounds, I figured that it really wasn’t worth the time and effort to find and challenge the scrapers. So now I just ignore it and keep on doing my own stuff.
Bill (LoneWolf) Nickerson recently posted..Integrity Marketing – Is This The Internet You Want?
I am so with you, Bill.
I remember how long it took me to chase down the first scraper who took out all the links…
We definitely have better things to do, don’t we?
Hi Ana,
Great tips here. I knew that in most cases when your content is scraped it’s still your site that benefits, though, and I have this confirmed here.
It is a very good thing to be able to take advantage of an otherwise negative situation. I learned earlier this week about Tynt from Adrienne Smith. This is a great tool that can protect you in some way from stolen content. I will check the other tools that you mentioned here.
Thanks
Sylviane Nuccio recently posted..Introducing SylvianeNuccio.com
Yes, Adrienne and I decided to mention the same tool this week – it’s that good! lol
Thanks for coming by, Sylviane!
Ana recently posted..CommentLuv Premium
Wow – my head is spinning after reading that! My thoughts are – all the more reason to focus on converting readers to subscribers with a great opt-in offer and building a loyal engaged following through other channels such as social media, so you’re not just relying on content alone, in case people do start to rip you off! (And Ana, you’re a great example of building a loyal tribe!)
I think this level of blog know-how would be well beyond most coaches, therapists and biz owners who just use their blog to showcase that they know their topic rather than to make money as a blogger per se.
thanks
Tanya Smith recently posted..Social Media is ‘Outing’ the Real You – Are You Ready for That?
First of all, thank you, Tanya.
I truly think that scraping becomes a problem when scrapers start stealing our traffic by ranking higher on Google.
However, as you said, if our traffic sources are diverse and we don’t rely on Google alone, we should be just fine.
Great post Ana, very interesting. I’m already using excerpts in my feed but now I’ll check that article on pings and Tynt. Thanks for your always interesting articles, wherever they are published.
Andrea Hypno recently posted..Self Hypnosis Techniques: How to Develop Healthy Habits
You are very welcome, Andrea!
Ana, another great post! I personally love scrapers, as to me the number of scrapers is an indicator of how widely my content resonated across the web. Also, I have observed that with time quite a few of those scraped posts get downgraded and in many instances entirely removed from the SERPs. I don’t spend time fighting them as, like you said, it is quite rare for scraped content to rank higher than one’s original post. Oh, and the idea of using YouTube is pretty interesting – will definitely give it a try.
Tanya McTavish recently posted..How to build up keyword phrases when your research tools fail
Thanks for coming by, Tanya!
Ana recently posted..Internet Marketing Tools
Hi Ana,
Great post. A while ago I found a site (using Copyscape) which had been scraping my entire blog’s content. At first I was ticked off, especially since they ‘linked’ to my site at the bottom, but the link actually just linked back to one of their own pages.
Then I started deep-linking within the content, so I still got some benefit.
Thanks.
I like the tip about ‘Tynt Publisher Tools’ – I’m definitely going to try to implement that.
Rory recently posted..Ideas to Make Extra Money
Glad to hear deep linking helped, Rory – at least you’ll get some benefit from it.
Ana recently posted..Aweber Review
Very informative post, Ana. Deep linking and the RSS footer plugin will definitely help Google find the origin of the content. If the origin is known, we don’t have to worry about duplicate content (duplicates not on your own site, of course).
However, dealing with copy-pasters is really a pain; yup, those who just copy the post in paragraphs and manually edit the links.
Kukzee recently posted..Recipe for Love
There’s not that much we can do about them and they know it, Kukzee!
Ana recently posted..Thesis Theme
Hi Ana,
What a thorough guide on how to deal with scraping! (I wouldn’t expect any less from you, of course!)
I have just started using Tynt and I am curious to see how that works. I haven’t even had it long enough yet to receive my first report so I am curious to see the results!
I’ve bookmarked this post so that I can come back later when I have more time to make some of the other changes that you’ve mentioned!
Stacy recently posted..Interview with Danny Iny from Firepole Marketing
I just started using it myself, Stacy, and so far so good, I like it.
Ana recently posted..Free SEO Report
I have used the no-hotlinking htaccess trick. I use an image of my logo hosted on a free photo site. This means that if anyone scrapes my posts and images my logo is shown all over the page. Not sure if it really helps much though!
Jon recently posted..How To Lose Weight
Hosting it on a free website, Jon – good idea, thank you.
Let’s hope your image is plastered all over some scraper’s site!
Ana recently posted..How to REALLY Create a Popular Blog From Scratch
I have just set a RSS footer insert too, but using Yoast’s SEO plugin for WordPress rather than Joost’s solution. I think the footer is a bit cheeky, but then any scraper deserves it!
Jon recently posted..How To Lose Weight
Since Yoast and Joost are the same person, I am not surprised that he made it available in his SEO plugin.
Ana recently posted..Majestic SEO Site Explorer: How a Reader Made Me Eat My Words
Thanks for bringing this to my attention, Ana. I don’t know yet how to edit my WP code, but I’ll try to add that really useful code from tynt to my blogs.
Anne recently posted..Turn a Room Into a Stress Free Sanctuary
You add it exactly how you would Google Analytics code, Anne.
I am sure you’ll figure it out.
Ana recently posted..Conversion Optimization: How to Make More Money with Less Traffic?
Ana,
Great point of view. Since there is no good way to get rid of scrapers entirely, your methods seem to at least make lemonade out of lemons. The best you can do in a bad situation.
There is little you can do about them, and fighting them seems a waste of time (you could make a career out of it, there are so many).
But I still have two concerns.
1. A lot of scrapers also strip out all links. The ones who link back are at least honest, but so many of them don’t.
2. The ones who don’t link back will show up as duplicate content. (I remember the Matt Cutts video as being from well before Google Panda, so the “don’t worry about it” vibe may be a bit old.)
…but really this just strengthens the need for Defense #3. If the author tag is shown as you, I think you minimize the risks there.
Anyhow, great article and very important stuff
Steve recently posted..Get My First Kindle Book on Amazon.com
Turning lemons into lemonade; that’s exactly what this post is about, Steve.
As far as your concerns go:
1. Not much we can do about it. The only reason to worry about it is if you find out they are ranking above you.
2. You are actually talking about syndicating content vs duplicating content.
Duplicate content issue actually relates to OUR sites vs someone else republishing our content.
Check out this post: http://www.trafficgenerationcafe.com/duplicate-content/
Ana Hoffman recently posted..Majestic SEO Site Explorer: How a Reader Made Me Eat My Words
Ana,
Incredible! Appreciate the stories, information and links to come back to later.
Thanks for a comprehensive lesson on what happens after you write original material for the Internet.
Ahhh… and I have indeed seen some of my Ezine Articles copied without giving me credit that have ranked above my original work. Once my blog gets going I suppose it’ll happen there too… and I’ll know where to go back to and see what’s next to implement from your suggestions.
blessings,
Cynthia
Cynthia Ann Leighton recently posted..How I Got 21 Days 6842 Dollars Online Revenue
Unfortunately, yes, it’s just a matter of time, Cynthia.
At least we can try to make the best of it!
Ana Hoffman recently posted..Blog Audit Friday: My Way or the Highway
These scrapers bug me. I’ve chased a few down through their various hosts and had pages taken down.
I even had one guy asking me (quite politely) whether it was ok to publish the article ‘inspired by’ my own. On closer inspection it was another scrape. He would have gotten away with it if he hadn’t asked.
Final straw was when another blogger copied my ABOUT ME PAGE! Him I just emailed and said “Look mate, if you can’t even get your own life story together how the @#$% do you expect to do any business?”
He took it down.
Anyway, I’m hoping plugging the gaps in my defences with a few of your suggestions here will mean I can rest a little easier. It’s not that my content is out there with secret links pointing back to my blog (have to agree with Matt Cutts’s conclusion). It’s the principle of the thing…
Thanks for putting this one together…
Your About Me page, Jym? That’s hilarious…
Ana Hoffman recently posted..Blog Audit Friday: My Way or the Highway
Great info as usual Ana.
Seems like dealing with the inevitable is better than trying to prevent the inevitable lol.
@larryphoto
Larry Lourcey recently posted..Are You A Guru?
Very well put, Larry – exactly.
Ana recently posted..So I Want to Get More Traffic to My Blog (Now What?)
Very insightful post, Ana, and it is hard to control these scrapers. I sent DMCA takedown notices at the start, but it seems like they keep coming.
Ahmad Wali recently posted..My Blogs Got Hacked! How To Clean and Increase Website Security?
And they will continue to come, Ahmad; no question about it.
Ana recently posted..Thesis Theme
That’s a totally brilliant post Ana – I am going off to tweet, G+ and LinkedIn it now for you. Not keen on the full RSS part of it …. but I’ll work on overcoming that … heck!
Liz recently posted..Google Author Markup – New WordPress Plugin – AuthorSure
Much appreciated, Liz!
Ana recently posted..So I Want to Get More Traffic to My Blog (Now What?)
Ana, I think this may be my favorite post by you so far! Much as I detest scrapers, fighting them is often futile. USING them, the way they use us, does have a certain…karmic appeal. Twisting their tactics to our advantage? Even better.
Holly Jahangiri recently posted..Good Blogger And A Good Blog: What Are The Signs?
Frankly speaking, I’m too scared of the scrapers. Sometimes they rank higher than the original, and it’s all Google’s fault.
Bishwajeet recently posted..Top 10 free Android apps for mobiles in 2012