What is the most efficient way to copy selected image files from an EC2 instance (Ubuntu 20.04) in production to an S3 bucket, checking first that each file exists on the EC2 instance?
This is a one-time operation. The S3 bucket will store user-uploaded images that are resized on the fly, so I no longer need to pre-process each image into multiple files at different sizes. I only need to copy the original files to S3; the folder will be deleted from EC2 later.
I have a database table with the original file names. I need to check whether each file exists on EC2 and, if so, copy it to the S3 bucket. The image folder is about 20 GB in total, and the table holds around 40k file names.
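For context, the existence check itself is cheap and can run on the instance without loading the API at all. Here is a minimal shell sketch, assuming the 40k names have been exported from the table into a plain text file (one name per line); the file names and the image directory path are placeholders, not my real paths:

```shell
# Sketch: filter a list of file names down to those that actually exist
# in a directory. Prints only the names that are present on disk.
# Usage: filter_existing NAMES_FILE IMAGE_DIR
filter_existing() {
  names_file=$1
  image_dir=$2
  while IFS= read -r name; do
    # -f: true only if the path exists and is a regular file
    [ -f "$image_dir/$name" ] && printf '%s\n' "$name"
  done < "$names_file"
}

# Example (paths are placeholders):
# filter_existing names.txt /var/www/images > existing.txt
```

The resulting `existing.txt` can then drive whatever upload step comes next, so only files that really exist get transferred.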
I thought about downloading the whole image folder (~20 GB) to my local machine over SFTP or SSH, then running a function in my Laravel 9 API on a local server to select the files. After that I would upload the processed folder to S3.
Would this be the most cost-effective solution without overloading the production server? What is the best way to upload the folder to S3? Its final size should be around 10 GB, so I guess I can't upload it through the AWS console. Maybe run a function to upload it in batches?
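One option I'm considering, to avoid the local round-trip entirely, is running the AWS CLI on the EC2 instance itself and feeding it the filtered list of names. A sketch, assuming the AWS CLI is installed and credentials are configured on the instance; the bucket name, list file, and directory are placeholders. It prints the commands by default (a dry run) and only executes them when `RUN=1` is set:

```shell
# Sketch: upload only the files named in a list, straight from the EC2
# instance to S3, skipping the local download. Placeholders throughout;
# requires the AWS CLI and credentials on the instance.
# Usage: upload_listed NAMES_FILE IMAGE_DIR BUCKET
upload_listed() {
  names_file=$1
  image_dir=$2
  bucket=$3
  while IFS= read -r name; do
    if [ "${RUN:-0}" = "1" ]; then
      # Real upload: one aws s3 cp per file
      aws s3 cp "$image_dir/$name" "s3://$bucket/$name"
    else
      # Dry run: just show what would be executed
      printf 'aws s3 cp %s s3://%s/%s\n' "$image_dir/$name" "$bucket" "$name"
    fi
  done < "$names_file"
}

# Example dry run (placeholders):
# upload_listed existing.txt /var/www/images my-images-bucket
```

Since the transfer would go over AWS's network rather than my home connection, I assume this would also be much faster than pulling 20 GB down and pushing 10 GB back up.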
The S3 bucket is not in production yet, and the API can connect to it in dev mode.
Edit: I also realized that downloading files over SCP/SFTP is slow (300-400 KB/s in WinSCP). Is there a faster way?
The `aws` utility? docs.aws.amazon.com/cli/latest/reference/s3/cp.html