SEO Question - Merging forums to have big sections + using filters

AzzidReign

Well-known member
We are working on a new system to get rid of our massive forum index, following other sites that we have seen do well with just a few huge sections [NeoGAF, ResetEra]. There are other examples of this format doing really well, but I can't think of them off the top of my head. IMO, the problem with these formats is finding what you actually need. Sure, the search feature can work, but a lot of the time it's hard to find things with it. So we believe we've come up with a solution.

What we are planning on doing is following a similar idea but using advanced filters. If you look at my current index, we have hundreds of forums with hundreds of subforums, and it just keeps growing with each new popular game. To remedy this, we are going to break it down into the following:
[IMG]


These sections will make use of a game and console database (connected through an API service), giving each thread a unique ID for the game and consoles it is associated with. The same goes for platforms: if it's a PC discussion, the thread will have the PC ID attached to it. This will allow us to reduce the number of sections on the site, shrinking our forum index by probably 1/4 or better, while allowing users to keep quicklinks (or favorites) to the filtered URLs (set up in the profile and a few other ways) so they don't always have to go into a section, type the game, and filter to it. The way we see it, this system allows for an "infinite" number of "subforums" where we, as the admins, don't have to get involved to create each one. The game database will take care of that, and if a game becomes popular enough, we plan on showcasing it on our board index.

My current problem is possibly the SEO aspect of this. So I have some questions that I wanted to see if anyone could help me with:
  1. Since everything will be in one section and you can filter by a specific game title (which gives a unique URL as well), will search engines, especially Google, see this as a problem, almost like "duplicate content"? For example, the Games section will be a catch-all for all games, but when it is filtered, it will have a unique URL and only show the topics related to that game. Will these "filtered game URLs" be considered duplicate content in search engines' eyes?
  2. If yes, I'm assuming that the filtered pages need to point to the main category (i.e. Games) as their canonical? Or better yet, the reverse: have the main category point to the individual game URLs as its canonical (not sure if or how this option is possible)?
  3. I will be using 301s to point the old links at the new URL structure with the filter URL. Will canonicalizing be an issue with this?
  4. With the sitemaps, should I get our script to output each game ID that has more than x threads for that game? Or just use the default sitemap and not worry about the threads and how the sitemap categorizes them?
  5. Anything else I should be aware of?
I'm really excited to unleash this system into the wild, but after all the hours and money put into it, SEO popped into my head and I'm hoping that I'm not shooting myself in the foot with this setup.

Let me know if I need to give more details. I didn't want to post the entire setup of the system, as it would be even more long-winded than this already is. If anyone with expertise in this needs further information, I can do a screen share so you can see exactly how the system works, but I think I've put enough detail in this thread to explain the basics.
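
For question 3, a minimal sketch of the 301 mapping from retired forum URLs to the new filtered URLs might look like the following. Every path, the `game_id` parameter, and the names `REDIRECTS` and `resolve_redirect` are hypothetical, since the thread does not show the real URL structure.

```python
# Hypothetical mapping from retired forum paths to new filtered URLs.
# None of these paths or IDs come from the real site.
REDIRECTS = {
    "/forums/gta-v/": "/forums/games/?game_id=101",
    "/forums/minecraft/": "/forums/games/?game_id=202",
}


def resolve_redirect(old_path, redirects=REDIRECTS):
    """Return (status, location) for a retired forum path, or None if unknown."""
    target = redirects.get(old_path)
    if target is None:
        return None
    # 301 marks the move as permanent, so crawlers can update their index.
    return 301, target
```

If a filtered page ends up carrying a canonical tag that points somewhere else, it is probably cleaner to redirect straight to that canonical URL, so the redirect and the canonical agree.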
 

Alpha1

Well-known member
How do you plan to do the filtering? Prefixes? Tags? Something new?

You might find this thread of interest:
 

AzzidReign

Well-known member
A combination of things, but the main one is a modified version of custom thread fields that will be used for the games and consoles. We will have prefixes that can be filtered on, and possibly tags, but the main use will be for games.
 

Sim

Well-known member
My current problem is possibly the SEO aspect of this. So I have some questions that I wanted to see if anyone could help me with:
  1. Since everything will be in one section and you can filter by a specific game title (which gives a unique URL as well), will search engines, especially Google, see this as a problem, almost like "duplicate content"? For example, the Games section will be a catch-all for all games, but when it is filtered, it will have a unique URL and only show the topics related to that game. Will these "filtered game URLs" be considered duplicate content in search engines' eyes?
  2. If yes, I'm assuming that the filtered pages need to point to the main category (i.e. Games) as their canonical? Or better yet, the reverse: have the main category point to the individual game URLs as its canonical (not sure if or how this option is possible)?

Given that I'm about to go through a similar exercise with one of my sites, I thought I'd do a bit of quick research into this.

A number of sources I read confirmed that yes - on every filtered page, you should point back to the unfiltered page as the canonical version - to avoid duplicate content.

This is because the list of threads without any filters applied contains all the content that is also visible when any filters are applied (the filters only show you a subset of the full content) - thus it will be considered duplicate.

When using thread prefixes to filter forums - XenForo already adds a canonical tag pointing back to the unfiltered version.

I'm debating the same questions as you - because right now I have lots of separate forum topics with their own canonical landing page - but if I merge them and use something like thread prefixes to filter them, I lose that.

I guess it comes down to a question of how valuable those topic specific thread list pages are versus the actual content pages?

Are you perhaps better off having specific canonical landing pages for each of your topics (games?) and linking from there to the various thread list pages or other content you have for those topics? These would not necessarily need to be in the main navigation tree, so they wouldn't need to break the current forum structure.

On ZooChat we currently have one forum per country and then each zoo gets its own thread prefix - so threads can either be generic for a country or zoo specific. I'm wanting to build a separate content / landing page for each zoo which links to the threads for that zoo, but also to the media, resources, maps and any other content we have created relating to that zoo.

On one of my other sites I'm looking to consolidate the number of forum topics and was considering using thread prefixes - but I also have concerns about losing those thread list landing pages. Some of them do get significant search engine traffic!
 

AzzidReign

Well-known member
@Sim
As I'm reading more, it seems canonicalizing the URLs will effectively remove those pages from Google's index. That's one thing I do NOT want. We have a lot of traffic to our GTA V and GTA V Glitches sections, and we show up #1-3 for multiple keywords for these sections. It appears that if we use canonicals for the game filtered pages, we will be shooting ourselves in the foot. The question then becomes: since I include extra information on these filtered pages (a description of the game, the game image, possibly places to buy the game, who's streaming it, videos associated with the game, news for the game, etc.), will that make them unique enough in Google's eyes to not need a canonical? Also, if you go to the overall Games page, the thread listings will be different from the ones shown when filtered.

I did come across a similar question:

Canonical is not the correct way, the filtered page is not the same as the unfiltered page. It may contain some of the same products, but isn't actually the same, which is the actual purpose of the canonical tag.
-Posted by an Expert Google Employee

And another similar question:

Canonical doesn't really sound appropriate.

It's for when the same content is (often inadvertently!) exposed on multiple URLs.

Filtering changes the content, so doesn't fit.
-Posted by an "Expert" - non-Google Employee

If this is in fact correct, and not canonicalizing the filter URLs is seen as "OK" by Google, this is the best news for me :) Now the question would be how to handle the sitemap with this information. Can we modify it so that we can submit links to games that have more than "x" threads?

What I'm gathering is that if you aren't trying to deceptively manipulate things on your site, you should be OK. This whole change is to improve the visitor experience on the site, which is what Google is always saying you want to focus on. I'm hoping the above quotes are correct. I've posted a question to the community, so we will see if we need to make any changes. It looks like right now the dev has those pages set as canonical, which I'll be telling him to remove unless I get different answers from my Google community question thread.
 

Sim

Well-known member
I've done a bit more reading and it turns out this is a lot more subtle than many people seem to appreciate - I'm seeing some blanket advice out there which doesn't necessarily hold for all cases.

For starters - the canonical tag is NOT a directive (a 301 redirect IS a directive - web browsers / crawlers have to follow it) - it is a "suggestion" to Google to indicate which version of a page (or original source) we would prefer it to index to avoid duplicating content in the search index.

For filtered pages - it becomes complicated because there are many different use cases for filtering lists and some have different intent to others.

At the end of the day - the key is to consider it from an end user point of view: what would it look like to them if all these pages with largely duplicated content were returned in their Google search results? Too many results is not ideal from a usability perspective - so we really want to limit how many pages Google indexes where the content is largely similar.

However, if there is a filtered version which is important to users - then it is reasonable to make that a canonical page too.

I think the best example I've seen is from https://www.clickminded.com/canonical-url/

Let's say we run the Nike.com website and we have a section dedicated to men's shoes - we want to canonicalise this page:

HTML:
Page URL: http://nike.com/mens-shoes

<link rel="canonical" href="http://nike.com/mens-shoes" />

Now if we then filter that page to show only size 10 shoes, we might have a ?size=10 parameter to filter. But we don't need to show all shoe sizes in the search results, so best to point them back to the main page for this:

HTML:
Page URL: http://nike.com/mens-shoes?size=10

<link rel="canonical" href="http://nike.com/mens-shoes" />

This is suggesting to Google that the ?size=10 parameter isn't important enough to consider a separate page for it, and to direct people to the base mens-shoes page instead.

What if we allowed filtering by colour as well? That makes things worse, since we have just dramatically increased the possible combinations of pages for all sizes and colours, and we really don't want all of those appearing in the search results either.

HTML:
Page URL: http://nike.com/mens-shoes?size=10&colour=red

<link rel="canonical" href="http://nike.com/mens-shoes" />

However, there are exceptions where a certain filter may be important. The example given is a promotion on Nike men's Jordan shoes: it makes sense to have a page dedicated to showing just that brand of shoes, especially if we get a lot of search traffic looking for that specifically (search intent is important!).

So in this case, if we had a type parameter we may choose to canonicalise the parameterised version of the page to encourage Google to index the page:

HTML:
Page URL: http://nike.com/mens-shoes?type=jordans

<link rel="canonical" href="http://nike.com/mens-shoes?type=jordans" />

It makes sense to have this page in the index because people will be searching for that specifically and returning the generic page listing all mens shoes is not the desired outcome when they specifically want Jordans.

This is directly applicable to our situation.

If using a thread prefix (or other custom parameter) to filter threads - we may want to encourage Google to index these pages, while ignoring the other possible parameters which don't actually add value to searchers.

It's not a great example (because we wouldn't want to index it), but if we look at the URL: https://xenforo.com/community/forums/bugs/?prefix_id=21&last_days=365&order=post_date&direction=asc - there's a lot of information there in the parameters which is unnecessary.

So I wouldn't want to have each of those as separate pages in the index, but the prefix_id is potentially significant (at least on ZooChat, where that relates to a zoo!), so I would probably canonicalise the page to:

HTML:
Page URL: https://xenforo.com/community/forums/bugs/?prefix_id=21&last_days=365&order=post_date&direction=asc

<link rel="canonical" href="https://xenforo.com/community/forums/bugs/?prefix_id=21" />

... and also ensure that the page title and H1 contain meaningful information for that prefix.
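
A rough sketch of that rule in Python, assuming that `prefix_id` is the only parameter worth keeping: drop every other query parameter before emitting the canonical tag. The names `SIGNIFICANT_PARAMS`, `canonical_url` and `canonical_link_tag` are made up for illustration and are not part of XenForo.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters change what the page is "about".
SIGNIFICANT_PARAMS = ("prefix_id",)


def canonical_url(url, significant=SIGNIFICANT_PARAMS):
    """Return url with every query parameter outside `significant` removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in significant]
    # Sort so the same set of filters always yields the same canonical URL.
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page served at url."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}" />'.format(href)
```

Passing the bugs forum URL above through `canonical_url` leaves only `?prefix_id=21`, and a URL with no significant parameters collapses to the bare forum URL. Adding another parameter name to `SIGNIFICANT_PARAMS` would let it survive as well.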

So I think @AzzidReign - from my understanding, you are generally on the right track. Focus on user outcomes and in general you should be okay.
 

AzzidReign

Well-known member
Exactly. It sounds like, with the way we are doing things (me with separate games as the filter, you with separate zoos as the filter), we wouldn't want those pages to canonicalize to the "catch-all" forum, because the end user would use Google to search for those specific strings.

The examples you gave above about shoe size don't necessarily relate to our issues, since most people aren't going to Google and typing "size 10 shoe". However, I'm sure they are typing a specific zoo or a specific game into Google, which would make our pages relevant.

For us, since we are setting things up with prefixes to show what the thread is about (i.e. Discussion, Tutorial, News), we will canonicalize the prefix-filtered pages to the game filtered page to help Google out a bit more.
 

Dan Blather

Member
I'm trying out changing the displayed thread length from 20 to 50 posts per page, so each page is more content-rich. In some subforums, I'm merging scattered smaller threads about the same topic into larger threads, again to increase page content. (We see every thread as potentially active, and don't care about bumping old threads.) I think it's better to have a smaller number of larger threads that search engines might look at more favorably than a larger number of small threads that boost the thread count but stay un-indexed. My forum has no advertising, so I'm more concerned about maximizing visitors and users than page views.
 

AzzidReign

Well-known member
In terms of the sitemap, something I'm thinking about (and waiting to hear back from my dev on) is whether we can add the game URLs to the sitemap so search engine bots can easily discover them. The thought is to use a rule such as "game has >50 threads" to decide whether a game is included in the sitemap, so that we aren't telling search engines about games that we don't have many threads for. That number can change and will hopefully be a back-end option in this new system.
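
A minimal sketch of that threshold rule, assuming the thread counts per game ID are already available as a dict and that filtered URLs use a `game_id` query parameter. The parameter name, the base URL, and `build_game_sitemap` are all hypothetical. It uses only the standard sitemap protocol elements (`urlset`, `url`, `loc`).

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_game_sitemap(thread_counts, base_url, min_threads=50):
    """Return sitemap XML listing only games with more than min_threads threads.

    thread_counts maps a game ID to the number of threads tagged with that game.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for game_id, count in sorted(thread_counts.items()):
        if count > min_threads:
            url = ET.SubElement(urlset, "url")
            # Hypothetical URL shape: section URL plus a game_id filter.
            ET.SubElement(url, "loc").text = "{}?game_id={}".format(base_url, game_id)
    return ET.tostring(urlset, encoding="unicode")
```

Calling `build_game_sitemap({101: 340, 202: 12}, "https://example.com/forums/games/")` would list only game 101. Keeping `min_threads` as an argument matches the idea of making the cutoff a back-end option.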

Our canonical plans will be:
Filter by game: GTA V
Takes you to the Games section with the GTA V ID in the URL, so only GTA V threads are shown, across every console and every prefix. This is the URL that the other, non-game filters will canonical to.
Filter by console in GTA V: Xbox One
Adds the Xbox One parameter to the URL, so now you see only the GTA V threads that relate to Xbox One. This page will canonical to the GTA V URL above.
Filter by prefix in GTA V + Xbox One: Tutorial
Adds the prefix parameter to the URL, so now you see only the GTA V threads that relate to both Xbox One and Tutorials. This page will also canonical to the GTA V URL above. If the user picks GTA V and a prefix without picking a console, that will canonical to the main GTA V URL as well.

I think this game plan will work, and from my reading it seems we won't be hit with duplicate content. I can only hope; time will tell.
 