Link Checker for XenForo 2.x by AddonsLab

Link Checker for XenForo 2.x by AddonsLab [Paid] 3.8.0

Thanks for the swift answer. I've been scanning my big board with various SEO tools and I've found that XFMG, AMS, UBS and RMS attract a lot of dead links over time, much more than the forum does. That makes sense, because articles and blogs often contain many off-site links and images.

I understand. As I remember, some time ago you also requested a link checker for Resource Manager, which we created and released - https://xenforo.com/community/resources/link-checker-for-resource-manager-by-addonslab.6115/ - and there have been no sales of this add-on, not even from your side. So we are not much interested in putting so much effort into supporting additional content checkers if no one is going to use them.

If you are interested in these features feel free to open a ticket at https://customers.addonslab.com/submitticket.php and we will provide the terms to develop them for you.

Thank you!
 
This is odd. I decided not to buy the add-on (yet) and uninstalled it completely last week (i.e. removed all files as well as uninstalling).

But today I got this error:

Code:
Server error log

    Exception: Could not get runner for job AddonsLab\LinkChecker:DeadLink (unique: dead_link_check_4864). Skipping. src/XF/Job/Manager.php:244

    Generated by: Unknown account Feb 11, 2021 at 2:43 PM

Stack trace

#0 src/XF/Job/Manager.php(200): XF\Job\Manager->runJobInternal(Array, 7.98343)
#1 src/XF/Job/Manager.php(84): XF\Job\Manager->runJobEntry(Array, 7.98343)
#2 job.php(43): XF\Job\Manager->runQueue(false, 8)
#3 {main}

Request state

array(4) {
  ["url"] => string(8) "/job.php"
  ["referrer"] => string(56) "https://cafesaxophone.com/whats-new/posts/1625118/page-5"
  ["_GET"] => array(0) {
  }
  ["_POST"] => array(0) {
  }
}
 
The add-on uses background jobs to check link status. We will fix the uninstaller so that it deletes these background jobs, but in the meantime please delete the rows from the xf_job table using phpMyAdmin. If you are not able to do so, we will release a fix as soon as possible; you can then install the add-on and uninstall it again, which will delete the rows.

Thank you!
 
OK, but I uninstalled this weeks ago, why would it still be using background jobs?


phpMyAdmin shows 9555 rows. Is it safe to delete all of those?


 
Hello!

The tasks were queued while the add-on was still active. XenForo runs background jobs regardless of whether the add-on that scheduled them is still active.

Please use the following query to remove the rows:

SQL:
DELETE FROM xf_job
WHERE unique_key LIKE 'dead_link_%';

This will delete only the rows added by our product.
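
For anyone nervous about running a delete against a live table, a dry run that counts the matching rows first can help. One detail worth knowing: in a SQL LIKE pattern an underscore matches any single character, so 'dead_link_%' is slightly broader than a literal prefix. The sketch below escapes the underscores. It is only an illustration: it uses an in-memory SQLite table as a stand-in for xf_job, only the unique_key column name comes from this thread, and the sample rows and the function name are made up.

```python
import sqlite3

# Underscores are escaped so they match literally instead of "any one character".
PATTERN = r"dead\_link\_%"
WHERE = "unique_key LIKE ? ESCAPE '\\'"


def count_and_delete(conn, dry_run=True):
    """Count rows whose unique_key starts with 'dead_link_'; delete them unless dry_run."""
    count = conn.execute(
        f"SELECT COUNT(*) FROM xf_job WHERE {WHERE}", (PATTERN,)
    ).fetchone()[0]
    if not dry_run:
        conn.execute(f"DELETE FROM xf_job WHERE {WHERE}", (PATTERN,))
        conn.commit()
    return count


# Stand-in table with made-up rows (assumption: the real table has more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xf_job (unique_key TEXT)")
conn.executemany(
    "INSERT INTO xf_job VALUES (?)",
    [
        ("dead_link_check_4864",),
        ("dead_link_check_4865",),
        ("deadXlinkXother",),  # would match the unescaped pattern
        ("some_other_job",),
    ],
)
```

Calling count_and_delete(conn) only reports how many rows would go; passing dry_run=False performs the delete. MySQL also uses the backslash as its default LIKE escape character, but the exact quoting is worth checking there before relying on it.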

Thank you!
 
I understand. As I remember, some time ago you also requested a link checker for Resource Manager, which we created and released - https://xenforo.com/community/resources/link-checker-for-resource-manager-by-addonslab.6115/ - and there have been no sales of this add-on, not even from your side. So we are not much interested in putting so much effort into supporting additional content checkers if no one is going to use them.

If you are interested in these features feel free to open a ticket at https://customers.addonslab.com/submitticket.php and we will provide the terms to develop them for you.

Thank you!
I think that is fair enough, as we cannot expect you to build add-ons on request and then get no sales from them.
There has been a lot of hesitance about buying the XF1 version due to the nature of this functionality (batch changing content), the sparse take-up of the XF1 product and the number of issues that were reported back then. It seems that the XF2 version has a much larger take-up and is therefore a more mature product.

Do bear in mind that not only the idea of link checking for add-ons but also the idea for this main add-on was mine. I hope that this has resulted in satisfactory sales for you.

I bought both products just now.

Thank you for creating it.
 
My suggestions:
  1. I have noticed that the link checker does not distinguish between deleted posts and normal posts. In the case of normal posts, we want to fix or delete our dead links in order to improve user experience and SEO. In the case of deleted content we want to keep dead links, because they are often evidence of abuse. User experience and SEO are not factors with deleted posts, so there is no need to check hundreds of thousands of links in deleted posts.
    Please consider offering an option to ignore deleted posts.
  2. Please also consider an option to check internal links that are restricted to guests. We have hundreds of thousands of links which are only accessible to registered members.
  3. One thing that would be mighty handy to have is a list of the top dead-link URLs, i.e. display the 40 most common URL patterns that have dead links. For example, if you have a lot of internal dead links because a directory no longer exists, then display the URL of the directory. This would make it easy to identify the main issues that we need to fix.
  4. Since Google has forced pretty much all sites to change from http to https, this change accounts for a very large percentage of the dead links found, probably around 40%. The links work fine, but no longer on http. As it is, we would automatically delete hundreds of thousands of valid links because the add-on lists these as invalid. Please add a function to select http links with a specific status and check for the https version.
 
5. Currently when removing dead links it only seems possible to either keep or remove the link anchor text. But this botches up a lot of posts, as many posts have part of the written text hot-linked and removing that means removing words or phrases from the text.
If there is no anchor text or if the link anchor is just the URL again, then it makes sense to remove it completely. However, when the link anchor is normal text, it makes sense to keep the anchor text.
Please add an option to delete if there is no anchor or if the anchor has URL syntax, while keeping the text if the anchor is other text.
6. When there are a large number of dead links for a status, it would be helpful to be able to review and process those in batches.
7. It would also be really helpful if there was a function to go to the URL of the dead link.
8. For those of us who run cookieless domains it would be nice to be able to define those domains as internal domains.
 
Hello @Alpha1

Thank you for your notes.

In general, we are not planning large enhancements for the product, and we are adding only features we consider essential or very important for its functionality. Any such add-on can be enhanced to have 10 times more features than it has, but each such enhancement is very time-consuming. The product was initially developed for admins to "find dead links", and it has grown into a product with many times more features than just "finding the dead links", so I hope the feature set in general is satisfactory.

Here are some additional notes regarding some of your comments:

I have noticed that the link checker does not distinguish between deleted posts and normal posts. In the case of normal posts, we want to fix or delete our dead links in order to improve user experience and SEO. In the case of deleted content we want to keep dead links, because they are often evidence of abuse. User experience and SEO are not factors with deleted posts, so there is no need to check hundreds of thousands of links in deleted posts.

Link checker is not meant to validate your content. It validates links based on the HTTP response they give, no matter whether the link is on your board or on someone else's board. None of its validation features depend on the internals of your database, only on the HTTP status received from the link.

It could be enhanced so that its link index is used to find content on your board based on its status in the database, but those are not features we are planning to add to this add-on.

Please also consider an option to check internal links that are restricted to guests. We have hundreds of thousands of links which are only accessible to registered members.
All links are validated as "guest", simply because that's how HTTP works, that's how search engines crawl your website. Links accessible to registered users will have appropriate HTTP status for search engines, and that's the status you see in the product.

One thing that would be mighty handy to have is a list of the top dead-link URLs, i.e. display the 40 most common URL patterns that have dead links. For example, if you have a lot of internal dead links because a directory no longer exists, then display the URL of the directory. This would make it easy to identify the main issues that we need to fix.
I am afraid the feature is a bit hard to implement, especially based on a "pattern". You can use SQL queries to find such patterns, as all links are indexed in the database along with their status.
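
As a rough illustration of what grouping by "pattern" could look like outside the add-on, the sketch below counts dead URLs by host plus the first path segment, which approximates "the directory that no longer exists". It assumes the dead URLs have already been exported from the link index as plain strings; the function name and the depth and limit parameters are invented for this example and are not part of the product.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_dead_prefixes(urls, depth=1, limit=40):
    """Return the most common 'host/first-path-segments' prefixes among dead URLs."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        if not parts.netloc:
            continue  # skip strings that are not absolute URLs
        segments = [s for s in parts.path.split("/") if s]
        prefix = "/".join([parts.netloc.lower()] + segments[:depth])
        counts[prefix] += 1
    return counts.most_common(limit)
```

Raising depth groups by deeper directories; with depth=0 the result is simply a per-domain count.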

Since Google has forced pretty much all sites to change from http to https, this change accounts for a very large percentage of the dead links found, probably around 40%. The links work fine, but no longer on http. As it is, we would automatically delete hundreds of thousands of valid links because the add-on lists these as invalid. Please add a function to select http links with a specific status and check for the https version.
I am not quite sure what you mean by Google forcing https and somehow making http links invalid. Sites should still return the correct response code for http links, e.g. both links in the screenshot are valid:

1614977299931.webp

Anyway, if you think you have links, which are shown as invalid just because they use http://, you can use the batch replacement feature to turn them to https://
1614977398880.webp

You can of course filter the list by the status, to make sure you work only with http:// URLs which are currently invalid.
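
For readers who would rather test the https variant before replacing anything, the rewrite itself is simple. The sketch below is a hypothetical helper, not something the add-on exposes; it only builds the https:// form of an http:// URL so that it can be re-checked with whatever tool you prefer.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return the https:// variant of an http:// URL; leave any other URL unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    netloc = parts.netloc
    # An explicit default http port makes no sense on https, so drop it.
    if netloc.endswith(":80"):
        netloc = netloc[: -len(":80")]
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```

A URL that is dead on http but alive under to_https(url) is a candidate for the batch replacement described above rather than for deletion.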

5. Currently when removing dead links it only seems possible to either keep or remove the link anchor text. But this botches up a lot of posts, as many posts have part of the written text hot-linked and removing that means removing words or phrases from the text. If there is no anchor text or if the link anchor is just the URL again, then it makes sense to remove it completely. However, when the link anchor is normal text, it makes sense to keep the anchor text. Please add an option to delete if there is no anchor or if the anchor has URL syntax, while keeping the text if the anchor is other text.
The product works almost as described, except that if the anchor text is a URL, it would keep the URL:

1614977924048.webp

We have now updated the product to handle that case as well: for the last row in the screenshot above, the link will also be deleted completely, because its anchor text is a URL.
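
To make the rule concrete, here is a minimal sketch of the behaviour described above, written against plain [URL] BB code. It is not the add-on's implementation; the regular expressions and the function name are assumptions for illustration, and real BB code parsing has more edge cases than this handles.

```python
import re

# Matches [URL]text[/URL] and [URL=href]text[/URL] (simplified, no nested tags).
URL_TAG = re.compile(
    r"\[URL(?:=(?P<href>[^\]]*))?\](?P<text>.*?)\[/URL\]",
    re.IGNORECASE | re.DOTALL,
)
LOOKS_LIKE_URL = re.compile(r"^\s*(?:https?://|www\.)\S+\s*$", re.IGNORECASE)


def strip_dead_link(message, dead_url):
    """Remove links to dead_url: keep normal anchor text, drop anchors that are URLs."""

    def replace(match):
        text = match.group("text")
        href = (match.group("href") or text).strip().strip("'\"")
        if href != dead_url:
            return match.group(0)  # a different link, leave it untouched
        if LOOKS_LIKE_URL.match(text):
            return ""  # the anchor is itself a URL, so nothing is worth keeping
        return text  # the anchor is ordinary text, keep the words in the post

    return URL_TAG.sub(replace, message)
```

With this rule a sentence such as "see [URL=...]this guide[/URL]" keeps the words "this guide", while a bare pasted link disappears entirely.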

6. When there are a large number of dead links for a status, it would be helpful to be able to review and process those in batches.

You can set the number of items you see in the admin panel preview using the option "Preview Limit". In the preview, use the checkbox on the top-left corner to select all links or just some links to process:

1614978866846.webp

7. It would also be really helpful if there was a function to go to the URL of the dead link.
We have added the link in the preview:
1614980113910.webp
8. For those of us who run cookieless domains it would be nice to be able to define those domains as internal domains.
The internal domain check is done with a basic text match in MySQL, so adding the ability to specify multiple such domains is error-prone. I also don't see why exactly cookieless domains should be considered internal links, as they simply are not, and the fact that they are cookieless does not in any way affect the HTTP status of the links pointing to them. As I understand it, this is not a crucial feature to implement for now, so we would prefer not to modify this aspect of the product and risk causing further issues related to it.

We will release the new version within some hours.

Thank you!
 
AddonsLab updated Link Checker for XenForo 2.x by AddonsLab with a new update entry:

Direct link to visit the link from the preview

In this version some minor enhancements have been implemented:
1. Replacing a link with its text deletes the link completely if the text itself is a link
2. A link is added in the Preview to visit the link directly

The new version is available for all licensed customers at

Thank you!

 
Thank you for your elaborate response! It's quite helpful. I see now that the XF2 version has quite valuable enhancements.
I am afraid the feature is a bit hard to implement, especially based on a "pattern". You can use SQL queries to find such patterns, as all links are indexed in the database along with their status.
Too bad, because it would save many hours of work. Do you have examples of queries to find domains that have a lot of entries with a certain status?

It's a bit daunting when the scan results in 100k+ URLs with a specific 4xx or 5xx status. Knowing which domains to address first would really help.
 
Indexing done on production board. Now the check for dead links...
Link statuses are being checked. 18227 out of 3766403 are completed (0.5%)

That is probably going to take weeks 😂
 
@AddonsLab Could you add an extra load check? It looks like the current check only looks at the local server.

Currently my web server has a low load, so the dead link check keeps running... but my MySQL server has a load of 10, which causes Elasticsearch to not respond most of the time.

I guess I need to lower the default of 10 items check.
 
Some questions:
  1. Is there a way to discard deleted threads?
  2. Is there a way to apply NOT filtering, e.g. NOT posts located in certain nodes?
Many thanks in advance

PS: I have a licensed add-on.
 
You can use the following query to find all broken URLs ordered by frequency of their usage:

SQL:
SELECT CAST(tag_url.tag_url AS CHAR) AS tag_url, COUNT(tag.tag_id) AS total_count
FROM `xf_allm_tag_url` AS tag_url
INNER JOIN xf_allm_tag AS tag ON tag.tag_url_id = tag_url.tag_url_id
WHERE tag_url.last_status_code >= 400 AND tag_url.last_status_code < 600
GROUP BY tag_url.tag_url
ORDER BY total_count DESC;

A similar query can be run to list the domains in order of frequency of having broken URLs:
SQL:
SELECT tag_url.domain, COUNT(tag.tag_id) AS total_count
FROM `xf_allm_tag_url` AS tag_url
INNER JOIN xf_allm_tag AS tag ON tag.tag_url_id = tag_url.tag_url_id
WHERE tag_url.last_status_code >= 400 AND tag_url.last_status_code < 600
GROUP BY tag_url.domain
ORDER BY total_count DESC;

Both queries consider links invalid if their status code is 4xx or 5xx. You can widen the definition of an invalid URL by using this condition instead:

SQL:
WHERE tag_url.status!='valid'

so it will consider all non-valid URLs (e.g. also the ones that have invalid domains and their status code could not be checked at all).

@AddonsLab Could you add an extra load check? It looks like the current check only looks at the local server.

Currently my web server has a low load, so the dead link check keeps running... but my MySQL server has a load of 10, which causes Elasticsearch to not respond most of the time.

I guess I need to lower the default of 10 items check.

Unfortunately, there is no way to check the load of any server other than the local one. The MySQL server is accessible only over TCP for making MySQL requests. A more advanced solution specific to your server setup could be used, e.g. a cron job on the MySQL server could periodically send its server load to the HTTP server (via an HTTP request to a custom endpoint), and we could store this value and use it later when we do server load detection. Feel free to contact us at https://customers.addonslab.com/submitticket.php if you need any assistance with creating such a solution.
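
The reporting half of that idea might look roughly like the sketch below, run from cron on the MySQL server. Everything specific in it is an assumption: the endpoint URL is a placeholder, no such endpoint ships with the add-on, and the JSON field names are invented. It also relies on os.getloadavg(), which is only available on Unix-like systems.

```python
import json
import os
import socket
import urllib.request

# Placeholder URL: a custom endpoint on the web server would have to be created.
ENDPOINT = "https://forum.example/load-report"


def build_report(load_1min, hostname):
    """Encode the 1-minute load average as a small JSON payload."""
    return json.dumps({"host": hostname, "load_1min": round(load_1min, 2)}).encode("utf-8")


def send_report(endpoint=ENDPOINT, timeout=5):
    """POST the current load to the web server and return the HTTP status code."""
    body = build_report(os.getloadavg()[0], socket.gethostname())
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A cron entry on the database server would call send_report() every minute or so; the receiving side would then need to store the value and compare it with the load limit before starting a batch of link checks.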

Some questions:
  1. Is there a way to discard deleted threads?
  2. Is there a way to apply NOT filtering, e.g. NOT posts located in certain nodes?
Many thanks in advance

PS: I have a licensed add-on.

As I understand it, you want to ignore the links from deleted threads. Do you need to exclude them from the indexing process, from the dead link check, or from batch updates? There is no such option at the moment, but we can implement it; we just need to know what exactly your aim is.

There is no general way to reverse the conditions, and this would be more time-consuming to implement. We might be able to work on it later.

Thank you!
 
"As I understand, you mean to ignore the links from deleted threads. Do you need to ignore them from indexation process, or from the process of checking the dead links or applying a batch update? There is no such option now anyway, but we can implement it, we just need to know what is exactly your aim." -> For the batch update. We have deleted threads, so this would save us from wasting time updating content in deleted threads that is not visible to ordinary visitors anyway :P
 