Implemented Markup improvements for Google

rrlevering

Google
Hi, my name is Ryan Levering and I currently handle structured data ingestion at Google (this guy). I've been doing some fairly spontaneous spot checks (not based on any specific problem) of some of the major forum software markups on the web just to see whether the markup is being generated in an ideal way for our systems to ingest. You can stick a URL in http://validator.schema.org to get an idea of what your markup for a given URL looks like. I have a couple of high-level suggestions to take or leave as you see fit:
  1. Include more than the OP with "http://schema.org/Comment" nodes through a "http://schema.org/comment" property. We're trying to normalize the forum markup space to use DiscussionForumPosting for the OP and attach Comment typed markup for the replies in a flat list (or threaded if you are a threaded forum). Without that, it makes it harder to segment the rest of the page appropriately in our index.
  2. Co-typing WebPage and DiscussionForumPosting like you do is going to confuse our ingestion a bit. If you squint it's not that inaccurate, but it would be clearer to either have WebPage (separate node) -> mainEntity -> DiscussionForumPosting or DiscussionForumPosting -> mainEntityOfPage -> WebPage (separate node). A co-typed self-cycle often needs special-case detection.
  3. Include profile URLs in your author -> Person nodes. Raw names are not nearly as useful for disambiguation.
There are a couple of other smaller things, but those are the changes that would improve the markup the most. Note also that you don't need to use JSON-LD if you are worried about duplicating content or page size. Microdata is fine for text/content-heavy schema (though it can be harder to author/inject).
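As a rough illustration of the three suggestions above in JSON-LD, here is a minimal sketch. All URLs, names, dates, and text are placeholders, and the choice of properties beyond the ones named in the list is an assumption, not a required set:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "DiscussionForumPosting",
  "@id": "https://example.com/threads/example-thread.1/",
  "url": "https://example.com/threads/example-thread.1/",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/threads/example-thread.1/#webpage"
  },
  "headline": "Example thread title",
  "text": "Text of the opening post.",
  "datePublished": "2023-03-01T10:00:00+00:00",
  "author": {
    "@type": "Person",
    "name": "alice",
    "url": "https://example.com/members/alice.1/"
  },
  "comment": [
    {
      "@type": "Comment",
      "url": "https://example.com/threads/example-thread.1/post-2",
      "text": "Text of the first reply.",
      "datePublished": "2023-03-01T10:05:00+00:00",
      "author": {
        "@type": "Person",
        "name": "bob",
        "url": "https://example.com/members/bob.2/"
      }
    }
  ]
}
</script>
```

The WebPage is a separate node reached through mainEntityOfPage instead of a second type on the posting, the replies hang off "comment" as a flat list, and each author carries a profile URL alongside the name.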
 
This suggestion has been implemented. Votes are no longer accepted.
[Screenshot: 1701120963497.png]
This is the main issue: it seems that neither mainEntity (as used by XenForo 2.2.13) nor parentItem works to correctly nest the Microdata Comment under the JSON-LD DiscussionForumPosting.

@rrlevering

Can you confirm that
Code:
<meta itemprop="parentItem" itemscope itemtype="https://schema.org/DiscussionForumPosting" itemid="https://xenforo.com/community/threads/markup-improvements-for-google.213088/" />
should work to correctly nest a post (Comment) from this thread under its DiscussionForumPosting?

Or could you give a complete short working HTML example of a Microdata Comment correctly nested under a JSON-LD DiscussionForumPosting?
 
Not sure if it's the same issue being parsed differently, but after making the relevant changes the validator actually seems to indicate the comment nodes are valid, but it instead complains that the linked DiscussionForumPosting does not have the required attributes (which are present in the JSON-LD). Maybe the validator is not taking syntax merging into account yet...?
 
Maybe got it working?

Actually yeah, it seems the parentItem isn't being parsed correctly :( As @Kirby said, an example would be lovely.
 
As I mentioned in my last post, just change Comment -> mainEntity to Comment -> parentItem and that structural problem will go away. With that change, this exact forum page here will parse fully valid. Syntax merging should work fine currently.

The other issues (only image comments and deleted authors) don't show up as much in our web diffs, but we'll work on fixing them next round of updates. These are likely not leading to problems in our ingestion; it's just a reporting issue.
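A minimal sketch of the Comment -> parentItem structure described above, assuming the thread's JSON-LD node and the Microdata reference share the same identifier ("@id" in JSON-LD, itemid in Microdata). The URLs, names, and text are placeholders, the parentItem element mirrors the meta tag quoted earlier in this thread, and this sketch does not by itself confirm how any particular validator merges the two syntaxes:

```html
<!-- Thread (opening post) described once in JSON-LD -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "DiscussionForumPosting",
  "@id": "https://example.com/threads/example-thread.1/",
  "url": "https://example.com/threads/example-thread.1/",
  "headline": "Example thread title",
  "text": "Text of the opening post.",
  "datePublished": "2023-03-01T10:00:00+00:00",
  "author": {
    "@type": "Person",
    "name": "alice",
    "url": "https://example.com/members/alice.1/"
  }
}
</script>

<!-- One reply in Microdata, pointing back at the thread via parentItem -->
<article itemscope itemtype="https://schema.org/Comment"
         itemid="https://example.com/threads/example-thread.1/post-2">
  <meta itemprop="parentItem" itemscope
        itemtype="https://schema.org/DiscussionForumPosting"
        itemid="https://example.com/threads/example-thread.1/" />
  <span itemprop="author" itemscope itemtype="https://schema.org/Person">
    <a itemprop="url" href="https://example.com/members/bob.2/">
      <span itemprop="name">bob</span>
    </a>
  </span>
  <time itemprop="datePublished" datetime="2023-03-01T10:05:00+00:00">Mar 1, 2023</time>
  <div itemprop="text">Text of the first reply.</div>
</article>
```

The point of the matching itemid is that the parentItem reference carries no properties of its own; the headline, author, and dates for the thread are expected to come from the JSON-LD node with the same identifier.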
 
Thank you very much for the reply.

Unfortunately it seems like I must be doing something wrong, as this doesn't seem to work correctly for me:

If I apply this change, the test tool detects X (the number of posts on the page) unknown Comment items, each with a parentItem DiscussionForumPosting, plus one known DiscussionForumPosting without any Comment. Shouldn't there be just one DiscussionForumPosting with X Comment items?

For this thread page the tool currently shows 9 DiscussionForumPosting
[Screenshots: 1701145911208.png, 1701146073744.png]

This doesn't look right to me. The repeated DiscussionForumPosting type also seems strange?

I therefore still think that a fully working example would be awesome :)
 
For those who want to fix the missing "url" attribute, add
Code:
"url" => $threadLink,
to the DiscussionForumPosting schema, under the "@id" line, in the file src/XF/ThreadType/AbstractHandler.php
Does that go directly under the "@id" line, like this?

"@id" => $threadLink,
"url" => $threadLink, // goes here?
"headline" => \XF::app()->stringFormatter()->snippetString($thread->title, 110),
 
Is anyone able to get the profile page schema working? I think the member_view template should have access to everything. My only question is how to convert the join date to a proper timestamp.


JavaScript:
<script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "ProfilePage",
      "dateCreated": "2019-12-23T12:34:00-05:00",
      "dateModified": "2019-12-26T14:53:00-05:00",
      "mainEntity": {
        "@type": "Person",
        "name": "Angelo Huff",
        "alternateName": "ahuff23",
        "identifier": "123475623",
        "interactionStatistic": [{
          "@type": "InteractionCounter",
          "interactionType": "https://schema.org/FollowAction",
          "userInteractionCount": 1
        },{
          "@type": "InteractionCounter",
          "interactionType": "https://schema.org/LikeAction",
          "userInteractionCount": 5
        }],
        "agentInteractionStatistic": {
          "@type": "InteractionCounter",
          "interactionType": "https://schema.org/WriteAction",
          "userInteractionCount": 2346
        },
        "description": "Defender of Truth",
        "image": "https://example.com/avatars/ahuff23.jpg",
        "sameAs": [
          "https://www.example.com/real-angelo",
          "https://example.com/profile/therealangelohuff"
        ]
      }
    }
    </script>
 
I assume you are doing it with PHP... It would be:

PHP:
"dateCreated" => gmdate('c', $user->register_date)

If you want to do it from a template, you could probably do something like (haven't tested it):

Code:
{{ date_from_format('c', $user.register_date) }}

Did I mention I didn't test it? :)
 
My point is Reddit or Quora can implement best practices immediately rather than us having to wait 6 months for the next release. These new schemas are part of a coming change in the Google results to understand and surface more first hand experience content like forums.
Thanks for clarifying. :-) I thought you meant the profile schema pushed up their rankings.
 
Just an FYI, we've implemented this in 2.3 and 2.2.14. It's available on member profiles here.
Should issues with the implementation be reported as a new bug?

If not:
  • FollowAction seems to be missing entirely
  • interactionStatistic LikeAction counter is wrong: it gives the reaction score the user has received, but it should be the total number of likes the user has received (which IMHO doesn't really exist in XenForo)
  • agentInteractionStatistic LikeAction is missing
  • dateModified is missing
  • image should have three variants (aspect ratios 1x1, 16x9, 4x3; minimum 50K pixels)
  • sameAs is missing (if the user profile has social media identities and those can be seen by a guest)