Content duplicated from other sites - how to identify?

Amin Sabet

Well-known member
#1
Given that I have camera forums, I occasionally come across posts that are duplicates of posts from other sites.

For example, Jim A posts a nice long post about how he gets the most out of his new Nikon D810 camera. Great content. But then I Google his post and find that he also posted it verbatim in the DPReview forum. Great for Jim: he gets responses on both sites. Not good for me: my forum looks like it's scraping DPReview.

Ideally I'd like to identify and delete all such posts, but I haven't found a way to do the identification part. Has anyone else found a way?
 

James

Well-known member
#3
I'm not sure his concern is SEO. I think his concern is that he doesn't want to appear to be stealing content from other sites. I'm not sure there's an easy solution for this.
 

Solidus

Well-known member
#5
I'm not sure his concern is SEO. I think his concern is that he doesn't want to appear to be stealing content from other sites. I'm not sure there's an easy solution for this.
He mentions Google, though. Not only will both his sites be under the same Webmaster Tools account (should be), but as stated above, Google doesn't care.
Who's gonna think he is stealing content?
 

Amin Sabet

Well-known member
#6
Google does not penalize for "duplicate content"; it only rewards unique content. I don't think this will be an issue for you.
Has something changed?

From: https://support.google.com/webmasters/answer/66359?hl=en

"Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. "

It seems to me that if someone Googles "Nikon D810 autofocus" and Google's search results show them largely the same content on DPReview and my site, that is the kind of bad user experience Google is talking about, and my site being the smaller fish could be perceived as trying to game the system, even though that is not the case.

I see some forums which are getting hit by Panda and believe this issue to be relevant.

For example: https://productforums.google.com/forum/#!topic/webmasters/RbJPJqzQ6sk[301-325]

"We also delete all duplicate content, and strictly forbid any community members from posting articles that exist elsewhere on the Internet. All such content is instantly removed and the poster banned from the site. "
 

Anthony Parsons

Well-known member
#10
"Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. "
Google can and does identify user-generated content and cross-posting of the kind you have outlined. That is not deceptive; you are simply not getting anything from that content, because chances are Google already has that exact content listed at a competitor's website.

That said, if you got a few links into that page, even though it was posted after the copy on your competitor's site, you would actually obtain the ranking for it if Google sees those links as more authoritative. Hell, if your site is simply more authoritative, your user's post will stick in the rankings, even though it came after the fact.

Google's algorithm should not be thought of as some simple mathematical equation. It's far from it, and uses limited AI within it to help it along.

Your site is not punished just because the user posted the content on your page second or third hand, and that page is not discarded from the listings. It is merely not shown as the most appropriate match IF the other site has more authority. Authority, though, is not just at the domain level but also at the page level. A relatively new site could outrank some of the most authoritative domains if Google weighs a page on that site as more relevant.

If you want to check whether content posted by your users is original, then yes, you can do that. The service is called Copyscape, and you can subscribe for a fee to have your new content scanned and compared against other content online for originality / duplication purposes.
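
For a rough do-it-yourself check on a single suspect post, something like the following might work as a starting point. It is only a sketch and not how any particular service works. It assumes you have already copied the text of the other page (for example, one you found by Googling a sentence from the post). The function names, the 5-word window, and the 0.8 threshold are all made up for illustration. It measures what fraction of the post's overlapping 5-word runs also appear in the other page.

```python
import re


def shingles(text: str, k: int = 5) -> set:
    """Return the set of overlapping k-word runs ("shingles") in text."""
    # Lowercase and keep only word-like tokens so punctuation and
    # capitalization differences do not hide a copy.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(post: str, other_page: str, k: int = 5) -> float:
    """Fraction of the post's shingles that also occur in other_page."""
    post_shingles = shingles(post, k)
    if not post_shingles:
        return 0.0
    return len(post_shingles & shingles(other_page, k)) / len(post_shingles)


def looks_duplicated(post: str, other_page: str,
                     threshold: float = 0.8, k: int = 5) -> bool:
    """Flag the post if most of it appears in other_page.

    The threshold is an arbitrary assumption; tune it on real posts.
    """
    return containment(post, other_page, k) >= threshold
```

Containment is used rather than a symmetric similarity score because the other page will usually contain a lot of extra text (headers, replies, signatures) around the copied post. This only compares two texts you already have; finding the candidate pages in the first place still needs a search engine or a paid service.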