Fixed Image optimization CLI commands break existing content

Kirby

Well-known member
Affected version
2.3.0 Beta 3
Steps to Reproduce
  1. Make sure Image optimization is turned off
  2. Upload an Image attachment and insert it as thumbnail
  3. Copy the tumbnail URL
  4. Insert the thumbnail URL into in a new post via [img]
  5. Note down the thumbnail UR, verify that the file exists in the filesystem and note down its path
  6. Turn on Image optimization
  7. Run php cmd.php xf-rebuild:attachment-optimization
  8. Turn off any caching (Browser, Cloudflare, etc) or set to Bypass
  9. Reload the page containing the post created in step 4)
Expected Result
The post is displayed as before but the image is now WebP

Actual Result
The image is no longer being displayed / shows as a broken image (depending on the browser)
Checking the filesystem shows that the file (as noted in step 5) does not exist any longer

Of course users should not copy thumbnails into new posts via [img] - but they do that and it seems unaccaptable to me that those embedded thumbails will break when optimizing existing attachments.

On one of our forums for example this would affect several thousand posts.
 
Last edited:
My current comment on this is I don't think there's much we can (or maybe should) do here.

I know that it can happen, but the steps involved in making it happen are not the expected steps. I also understand users are users and they do whatever they want to make things work and that doesn't always align with our expectations.

Fixing this would involve a number of things that would either cause other problems or just not make sense.

Some thoughts in no particular order.

  • One potential fix is not changing the file_hash of the attachment data so that references to thumbnails do not change
  • file_hash is something that could be used by developers and I know it has been used in the past for things like deduplication systems
  • Another potential fix is keeping the original thumbnails alongside the new thumbnails - this is a waste of storage space and in cases where storage space is already tight, this could cause significant issues
  • We could fix this with some sort of file_hash map (or storing the old file_hash alongside new one) but redirects would likely have to be implemented manually

The most straightforward thing to do if you think that this affects you is to make a copy of the thumbnails before the conversion, and copy them back over after the conversion.

If anyone has any other thoughts on potential solutions happy to hear them but right now I'm leaning towards [Won't fix] unfortunately.
 
If anyone has any other thoughts on potential solutions happy to hear them but right now I'm leaning towards [Won't fix] unfortunately.
As "Won't fix" will not work for us I will have to find a solution anyway, though I'd prefer if this was handled by XenForo :)

The root cause is that the path part of thumbnail URL changes when the source file (hash) changes - this does not happen for avatars; avatar URLs stay valid no matter if the content is changed or not.

So I'd try to make thumbnails behave like avatars:
Instead of using file_hash as part of the thumb filename it could just be a query string, eg. data/attachments/110/110353.jpg?9d5214aacacf12591d9d381d6fb40754 instead of data/attachments/110/110353-9d5214aacacf12591d9d381d6fb40754.jpg

To keep the filenames of existing thumbnails, add thumbnail_path to XF:AttachmentData and store the existing path without placeholder %HASH%; keep empty / null for new uploads and modify AttachmentData::getThumbnailUrl() and AttachmentData::_getAbstractedThumbnailPath() accordingly.

Unlesss i am missing smth, this should not affect deduplication based on file_hash and does not require redirects while allowing to easily change file format / contents (AVIF?) in the future without breaking existing URLs.

So existing thumbnails would just keep their old URL forever (no matter if their content changes or not) and new uploads will have the hash as cache buster query string.

One thing that might break could be Add-ons that manually construct thumbnail URLs without using AttachmentData::getThumbnailUrl(), but Add-ons shouldn't do that anyway.
 
Last edited:
I think there's probably two different but related solutions here. One we can do now, potentially, but one we won't.

Obviously changing the URL format entirely and where they are stored on the file system doesn't actually help you because you will still be left with references to the old URL format in posts.

But you are correct that storing the actual path to the thumbnail i.e. 110/110353-9d5214aacacf12591d9d381d6fb40754.jpg and that never changing even if the file_hash does could be workable.
 
can you 301 the old image to the new image?
If they are delivered locally this would be possible theoretically (but not that easy).
If they are delivered externally (S3, R2, ec.) it would get a lot more complicated (maybe almost impossible).
 
Obviously changing the URL format entirely and where they are stored on the file system doesn't actually help you because you will still be left with references to the old URL format in posts..
Yeah, that was just another idea to avoid such issues for future uploads by eliminating file_hash as part of the path.

It's not necessary to do that to keep the URL consistent and also wouldn't work if file_hash stays the same but thumbnail content changes, see https://xenforo.com/community/threa...mbnail-size-and-rebuilding-thumbnails.220799/
 
We’ve resolved this in RC5.

Attachment data now has a file_key field. During upgrade this is just populated with a copy of file_hash, but going forwards for new attachments it is just a randomly generated string.

The file_key is designed to be persistent and never change, so even in situations where the file_hash changes, existing URLs and files on the file system will not actually change.

file_hash still maintains a role as a cache buster.

Here's a funny meme to check how it looks:

1718891993924.webp
 
Back
Top Bottom