Amazon AWS S3 Help...

MattRock

Member
I'm looking for the best way to have two web servers load balanced in AWS. My main concern is the "data" and "internal_data" directories. I'd like to store these folders in S3 buckets mapped into the root folder. Does anyone have any information on how to properly mount an S3 bucket to replace these folders, and how to set up the permissions correctly? I can mount the buckets just fine, and can see the files there, but they are not accessible through HTTP. It has to be a permissions thing, but I just can't figure it out!
 
I think I figured it out using FUSE. I had to issue this command, where 498 is the uid and gid of the NGINX user:
s3fs -o allow_other,uid=498,gid=498 bucketname foldername
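The uid and gid of the web server user can differ between machines, so here is a rough sketch that looks them up instead of hardcoding 498. The user name "nginx" and the helper names s3fs_opts and mount_bucket are my own assumptions, not anything s3fs provides; bucketname and foldername are the same placeholders as in the command above.

```shell
# Sketch only: "nginx", "bucketname" and "foldername" are placeholders.
WEB_USER="${WEB_USER:-nginx}"

# Print the -o option string for s3fs, using the numeric uid and gid of a user.
s3fs_opts() {
    printf 'allow_other,uid=%s,gid=%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Mount a bucket ($1) on a folder ($2) so that the given user ($3) owns the files.
mount_bucket() {
    s3fs -o "$(s3fs_opts "$3")" "$1" "$2"
}

# Usage (not run here):
#   mount_bucket bucketname foldername "$WEB_USER"
```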
 
Just an update... I found that FUSE (s3fs) was fairly unreliable. I mapped my data and internal_data folders using s3fs and set them up to automatically map after restart using /etc/fstab. Sometimes when the server would reboot, I would have to unmount and remount the s3fs folders. Also, I found that occasionally, the data folder would lose its s3fs mapping, then I would have to unmap and remap manually. This was terrifying for a production environment.

I found that a more reliable way was to keep the folders on each server and sync them with S3 behind the scenes. This can be done with s3cmd. Here is a great article I used to do this:

http://www.wong101.com/tech-cloud/configure-s3cmd-cron-automated-backup

This allows me to spin up multiple instances of my web server, and just make sure that the files are synced to the S3 bucket. Using this, I actually am synchronizing my entire web root, because why not? Also, I don't have to copy the web files over to each server, because I can just do the initial sync to grab them from the S3 bucket. Pretty sweet, and it runs great!
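For anyone who wants the short version, this is roughly what the two directions of the sync look like with s3cmd. It is only a sketch: the bucket URL, the web root path and the function names are made up for illustration, so adjust them to your own setup.

```shell
# Sketch only: the bucket URL and local path are hypothetical.
# The trailing slashes make s3cmd sync the contents of the directory
# rather than the directory itself.
BUCKET_URL="${BUCKET_URL:-s3://bucketname/webroot/}"
WEB_ROOT="${WEB_ROOT:-/var/www/html/}"

# Initial sync on a fresh server: download the bucket contents into the web root.
pull_from_s3() {
    s3cmd sync "$BUCKET_URL" "$WEB_ROOT"
}

# Ongoing sync: upload local changes back to the bucket.
push_to_s3() {
    s3cmd sync "$WEB_ROOT" "$BUCKET_URL"
}
```

A new instance would call pull_from_s3 once at startup, and push_to_s3 could then run from cron.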
 
When do you sync, every ten minutes?

I would use the AWS CLI tools, which are more reliable than third-party ones.

When I deploy a new web server I run "aws s3 sync /data s3://databucket", and I also sync every ten minutes, but that adds a lot of CPU overhead and data transfer.
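One thing that may help with the overhead is making sure a new sync never starts while the previous one is still running. Below is a minimal sketch of that idea using a lock directory. The lock path, the script location in the crontab comment and the function name run_sync are assumptions of mine; /data and s3://databucket are the names from the command above.

```shell
#!/bin/sh
# Sketch only: paths, bucket name and lock location are placeholders.
SRC="${SRC:-/data}"
DEST="${DEST:-s3://databucket}"
LOCKDIR="${LOCKDIR:-/tmp/s3-sync.lock}"

run_sync() {
    # mkdir is atomic, so the directory acts as a lock: it fails if a
    # previous run has not finished (or was killed and left the lock behind).
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "previous sync still running, skipping" >&2
        return 1
    fi
    aws s3 sync "$SRC" "$DEST"
    status=$?
    rmdir "$LOCKDIR"
    return "$status"
}

# Hypothetical crontab entry, assuming this file is saved as
# /usr/local/bin/s3-sync.sh and ends with a call to run_sync:
#   */10 * * * * /usr/local/bin/s3-sync.sh
```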
 
It only syncs files that are missing or have changed; that is the idea of the sync command. The first time you sync an empty server, all files are copied to it. The next time you sync the server, only changed files are transferred. You can also narrow this further, for example by syncing only when the file dates are different.
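If the tool in question is the AWS CLI, its --dryrun flag lists what a sync would transfer without copying anything, and --size-only makes file size the only comparison criterion. The sketch below wraps both; the function names are made up, and the paths would be whatever you normally sync.

```shell
# Sketch only: the function names are hypothetical wrappers around "aws s3 sync".

# List what a sync from $1 to $2 would transfer, without copying anything.
preview_sync() {
    aws s3 sync "$1" "$2" --dryrun
}

# Same preview, but compare files by size alone and ignore timestamps.
preview_sync_size_only() {
    aws s3 sync "$1" "$2" --dryrun --size-only
}

# Usage (not run here):
#   preview_sync /data s3://databucket
```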
 