Using DigitalOcean Spaces or Amazon S3 for file storage in XF 2.1+

Wasabi - it's a third party and you use it at your own risk
Hello Jim, not sure I got what you mean by this... :unsure: I consider AWS and DO to be third-party services as well... aren't they?

AmazonS3FullAccess - if you aren't storing any other data in the account, then you are probably OK. If you get breached, they can take all your S3 data and wipe everything
Could you please also clarify this one? :) Do you just mean that "FullAccess" is not safe? Isn't it just restricted to the bucket itself? Why would they wipe everything?

Anyway, according to this page in the AWS user guide, I assume the Replicate action shouldn't be needed by XenForo, if I'm not mistaken... :unsure:

Would appreciate a hint from Chris 😬 before going ahead and testing it.

Almost forgot! Here's the full list of all S3 actions available in Wasabi:

[screenshot: list of Wasabi S3 actions]
 
Hello Jim, not sure I got what you mean by this... :unsure: I consider AWS and DO to be third-party services as well... aren't they?
Yes - but I've never used Wasabi, so I can't say anything about it.

Could you please also clarify this one? :) Do you just mean that "FullAccess" is not safe? Isn't it just restricted to the bucket itself? Why would they wipe everything?
The managed AmazonS3FullAccess policy allows the user to do anything with S3 buckets and the objects inside them. Is it safe? That depends on your context. If all you have in your AWS (or presumably Wasabi) account is an S3 bucket that only holds XenForo data, and - while it would be a bit of an issue - you could cope with losing all that data (or having it exposed) in the unlikely event that something bad happens, then such a policy falls within your risk tolerance. But if you have other buckets with more sensitive information, such as personally identifiable information, then I would think twice about using the AmazonS3FullAccess managed policy for the associated user or role.
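If you'd rather not grant blanket S3 access at all, a tighter identity policy scoped to the forum bucket alone is usually enough. This is only a rough sketch with a placeholder bucket name; depending on your setup you may need to adjust the list of actions:
Code:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::[your bucket]",
        "arn:aws:s3:::[your bucket]/*"
      ]
    }
  ]
}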

Anyway, according to this page in the AWS user guide, I assume the Replicate action shouldn't be needed by XenForo, if I'm not mistaken... :unsure:
Definitely not needed and should never have been in the set of permissions.
 
Watch your costs with Wasabi. They are more for storage than they are for serving images on a forum. The egress charges will get you even though they say there aren't any... read the fine print.

read #8 here: https://wasabi.com/paygo-pricing-faq/

So, if you store 50 MB of images but transfer 51 MB of images, you're in trouble.
 
the egress charges will get you even though they say there aren't any...
Alright, now I'm confused 😅
Amazon S3, DigitalOcean, Microsoft Azure, etc. all charge for egress. So I wonder: how are all the users here who set up this remote XF data (from this thread) managing the data transfer cost between XenForo and their users?


According to that #8 (thanks for pointing me there, by the way), we should make sure our XenForo's monthly data transfer doesn't exceed our storage volume (1 TB of disk space in my case) to avoid issues, right?
👇
  • If your monthly egress data transfer is less than or equal to your active storage volume, then your storage use case is a good fit for Wasabi’s free egress policy
  • If your monthly egress data transfer is greater than your active storage volume, then your storage use case is not a good fit for Wasabi’s free egress policy

For example, if you store 100 TB with Wasabi and download (egress) 100 TB or less within a monthly billing cycle, then your storage use case is a good fit for our policy. If your monthly downloads exceed 100 TB, then your use case is not a good fit.

At this point, how do I calculate the amount of data being transferred from a live XF installation? :unsure:


If your use case exceeds the guidelines of our free egress policy on a regular basis, we reserve the right to limit or suspend your service.

However, other than that, there shouldn't be any extra charges or hidden costs. They should just limit the service or cancel it...
But I hope they at least get in touch with you before shutting down your account!


Really... I'm quite confused now... looking for help, guys!
 
So I wonder: how are all the users here who set up this remote XF data (from this thread) managing the data transfer cost between XenForo and their users?
If it is a significant amount of data, then a CDN such as Cloudflare is the sensible approach. But that won't handle internal data. And given that Wasabi also reserves the right to suspend operations on unverified applications if it doesn't like the way you use the API, I would highlight that as a risk.
 
There's no way for us to know how much data you store and how much you transfer.

If you host one 1 KB image and it's accessed twice, technically you're in violation.
If you host 10,000 100 KB images and serve each one exactly once, or less, you're OK.
If you host 1,000,000 100 KB images and only the newest 20% get served, at a rate that keeps total egress below 1,000,000 x 100 KB (your stored volume), you're OK. It's hard to know exactly how many images will be accessed versus how many are stored.

Long story short, Wasabi is for storage, not for real-time apps, and it has no place in my architecture.

For storage, I use S3 Glacier Deep Archive for stuff like family photos, and it's as cheap as Wasabi. So what if it takes a while to retrieve?

My website uses S3 with CloudFront to get to the edge, with Cloudflare pointing at the CloudFront URL. That may be stupid. I'm not 100% sure if CloudFront is redundant, but either way I like it because it abstracts my bucket names from the URL.

I'm also using the ThemeHouse image optimizer (but I think they pulled it... there are others).
Also, I've set the cache header with the add-on Image Attachment Cache Control by TickTackk, along with a Cloudflare page rule to cache:
/attachments/*
Browser Cache TTL: a month, Cache Level: Cache Everything, Edge Cache TTL: a month

This violates some privacy, but I don't really have private forums that use attachments, and for the 2 or 3 attachments that do get posted there, the odds of the URL being shared or discovered are slim to none.

I spend about $2 a month between S3 and CloudFront.
I have 140,000 objects taking up about 14.6 GB

So the average size is just a bit over 100 KB, thanks to the image optimizer utilities.

Traffic will dictate price as well, and my site has slowed down quite a bit in recent years. But popular images/threads will be cached at Cloudflare, and the storage charge would still be the big one for me.

Still, I could 10x my traffic like in the good old days (2005) and spend all of 20 bucks on S3.

The trade-off is that by offloading all the data, backups are much faster/lighter/cheaper and the server itself doesn't need as much disk space. The DNS CNAME points out to the edge, which keeps the traffic off the server and filtered through Cloudflare.

Ultimately, I'm running a 2-core/4 GB VPS for a 1.5-million-post forum with 80k members.
On Apache/cPanel! lol. On nginx/Centminmod I could probably get away with 1 core/2 GB; it's on my cost-reduction roadmap.
 
My website uses S3 with CloudFront to get to the edge, with Cloudflare pointing at the CloudFront URL. That may be stupid. I'm not 100% sure if CloudFront is redundant
Not stupid - by going via CloudFront you can use an OAI (Origin Access Identity) to keep items in the bucket private and not run it in 'website' mode. Is CloudFront redundant? Technically yes. It may keep only a single copy of your file (which, incidentally, it stores unencrypted on disk), but if that copy is lost, it just grabs another from the source.
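For reference, the bucket policy statement that grants the OAI read access typically looks something like this - a sketch only, with placeholders for the bucket name and the OAI ID:
Code:
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity [your OAI ID]"
  },
  "Action": "s3:GetObject",
  "Resource": "arn:aws:s3:::[your bucket]/*"
}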
 
Most of my attachments are memes and pics of members' projects, nothing secret by any means. But yes, my bucket is closed.

The way I was thinking about it is that I can leverage Cloudflare's cache to reduce S3 costs, which I think is working. Cloudflare says over 50% of my hits are cached.
 
Is there a way to implement this without the /internal_data/ files on S3 being set to public?

I get that XF still uses permissions for its own URLs, but in theory you'd be able to download attachments from private forums directly from DigitalOcean/Amazon if you can determine the S3 URL.
 
Is there a way to implement this without the /internal_data/ files on S3 being set to public?

I get that XF still uses permissions for its own URLs, but in theory you'd be able to download attachments from private forums directly from DigitalOcean/Amazon if you can determine the S3 URL.
Couple of ways -
1. Have separate buckets for external and internal data and block all public access on the internal bucket (see the sketch after the policy below) - internal and external data locations are defined separately already in config.php.
2. Use the same bucket, but use a prefix, such as 'internal_data', in your definition for internal data within config.php. Then add the following to your bucket policy:
Code:
{
  "Effect": "Deny",
  "Action": "s3:GetObject",
  "Resource": "arn:aws:s3:::[your bucket]/internal_data/*",
  "Principal": "*",
  "Condition": {
    "StringNotEquals": {
      "aws:PrincipalArn": "arn:aws:iam::[your account number]:user/[your user]"
    }
  }
}
"your user" being the user who you are using for forum permissions
"your bucket" being the bucket used (am assuming your are using "internal_data" as a prefix)
"your account number" being your 12 digit AWS account id
 
Cloudflare R2 is in public beta today. I am going to try using it, having dumped B2 a few months ago. Would love to know if anyone else moves to R2; please post your experiences, possible issues one might face, and how to solve them. Cheers.

A New Hope for Object Storage: R2 enters open beta

I have completely forgotten how I got S3 and then B2 working on my board so it is going to be a fresh start for me.

Update: looks like it is not that straightforward. R2 buckets are private by nature and might require some additional work to get working with S3-based apps. I'll wait for someone who has a better idea of how to get it working with XenForo first.

Get started guide · Cloudflare R2 docs
 
This part is attractive:

R2’s forever-free tier includes:
  • 10 GB-months of stored data
  • 1,000,000 Class A operations, per month
  • 10,000,000 Class B operations, per month
A shoo-in for my smaller sites, but it could still be very affordable for our larger sites.

R2 charges depend on the total volume of data stored and the type of operation performed on the data:
  • Storage is priced at $0.015 / GB, per month.
  • Class A operations (including writes and lists) cost $4.50 / million.
  • Class B operations cost $0.36 / million.
Class A operations tend to mutate state, such as creating a bucket, listing objects in a bucket, or writing an object. Class B operations tend to read existing state, for example reading an object from a bucket. You can find more information on pricing and a full list of operation types in the docs.
With a free tier, it's a good way to experiment and get it working without having to put out any cash.
 
Yeah, the free tier is very similar to B2. Right now I cannot even get rclone to work: it gives an authentication error but somehow still manages to create an empty bucket on R2. I cannot open the bucket folder using WinSCP. Would love to see if anyone can look at the documentation and see whether it can be made to work with XenForo, whether the Wrangler tool would be required, and so on!

S3 Compatibility · Cloudflare R2 docs
 
The more I learn about it, the more I am leaning towards staying on S3 for now. Even with the free tier, it seems cheaper for me to stay. Plus, the API seems complicated and not fully S3-compatible, and there's npm install work involved.
Pass for now.
 
So R2 now works with WinSCP at least; back then it was not even opening existing directories on R2. I assume Cloudflare rolled out changes to their S3 API and it works now. The rclone beta (curl https://rclone.org/install.sh | sudo bash -s beta) also has the fixes to support Cloudflare R2: it shows authentication errors on regular builds, but I am able to upload files to R2 buckets using the beta. I guess XenForo support won't be there until they add support for public buckets and add more functionality to their S3 API, but for now, some improvement since the beta release.

Just noticed: Cloudflare has added rclone to their support documentation. https://developers.cloudflare.com/r2/examples/rclone/
 
I have an issue with loading complete attachments (PDF, image). The file is available in full in the S3 storage, but the image itself is not loaded properly. If I open the attachment at its URL, only about the first 10% is loaded, and it's not always exactly the same amount. Small images work, and so does streaming MP3 files.
Does anyone have a hint for me on where to investigate? I've run out of ideas as to where this strange behaviour comes from.

In AWS I moved to a t4g environment, and after that this error occurred.
 
t4g shouldn't affect it - that only reduces your available CPU if you go too heavy. Is this external or internal data? It sounds like a caching issue either way. If you're using CloudFront to serve externally, you can invalidate the item, but I would switch (and already have switched) to using Cloudflare. As for streaming files, depending on your headers, streaming is really a matter for your browser.
 
It's internal data. I am using Cloudflare at the moment. Streaming works.

Edit: in addition, the add-on page is not loading fully. The browser keeps loading indefinitely and the page is not fully displayed.

I just checked max_execution_time and raised it to 6000; no change.
 