Backend: File upload, add generating S3 presigned urls for PUT and GET #1170

Closed

pbn4 opened this issue Apr 23, 2021 · 5 comments

@pbn4
Contributor

pbn4 commented Apr 23, 2021

Blocker for #1153

@seanmalbert
Collaborator

Hey @pbn4,

Can you flesh out your thoughts on this, so we can attempt to size it? Were you thinking of a direct upload solution like the one outlined here: https://devcenter.heroku.com/articles/s3-upload-node#direct-uploading? What about bucket strategy (i.e. one bucket per client, or can we get away with one bucket for all?) and file naming conventions?

From what I can tell, we currently aren't using the aws-sdk package to interact with S3 storage, so would this be the first implementation of this?

@pbn4 pbn4 added the discussion label May 5, 2021
@pbn4
Contributor Author

pbn4 commented May 5, 2021

Were you thinking of a direct upload solution like the one outlined here: https://devcenter.heroku.com/articles/s3-upload-node#direct-uploading?

Yes, this is exactly what I wanted us to implement.

What about bucket strategy (i.e. one bucket per client, or can we get away with one bucket for all?) and file naming conventions?

I'd go with file naming conventions and two buckets (one for public reads and one for private reads). Right now there are two use cases I know of:

  • the listing hero image (Listings Management Fieldset: Listing Photo #1177), which could follow a pattern like /listing_<listing_id>/images/hero-image-name.jpg; this is a public-read, private-write bucket (the pattern does not matter much here as long as it's unique),
  • uploading paper applications per listing (cc: @slowbot), which could follow the same pattern, but reading those would require a presigned URL too (same as for writing).

From what I can tell, we currently aren't using the aws-sdk package to interact with S3 storage, so would this be the first implementation of this?

Yes, this would be the first implementation.

There are React libraries for the frontend that handle this kind of thing, e.g. https://react-dropzone-uploader.js.org/ (see its S3 uploads manual). The backend could return a standardized message in response to a presigning request. It's up to the frontend folks to evaluate whether the tool suits our needs, though.
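
To make the proposal concrete, here is a minimal sketch of the server-side presigning, assuming aws-sdk v2 and its getSignedUrlPromise helper; the bucket names, key layout, and response shape below are placeholders, not decisions:

```ts
// Sketch only: server-side S3 presigning with aws-sdk v2.
import S3 from "aws-sdk/clients/s3";

const s3 = new S3({ region: process.env.AWS_REGION });

// Presigned PUT URL for uploads, e.g. a listing hero image.
export async function createUploadUrl(listingId: string, fileName: string) {
  const key = `listing_${listingId}/images/${fileName}`;
  const url = await s3.getSignedUrlPromise("putObject", {
    Bucket: "bloom-public-assets", // assumed: the public-read, private-write bucket
    Key: key,
    Expires: 300, // URL validity in seconds
  });
  // A standardized message the frontend uploader can consume
  return { url, key, method: "PUT" };
}

// Presigned GET URL for private reads, e.g. paper applications.
export async function createDownloadUrl(key: string): Promise<string> {
  return s3.getSignedUrlPromise("getObject", {
    Bucket: "bloom-private-assets", // assumed: the private-read bucket
    Key: key,
    Expires: 300,
  });
}
```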

@seanmalbert
Collaborator

When we were discussing this last week, @jaredcwhite mentioned using Cloudinary. Giving it more thought, I think this is the way to go: it cuts down on the overhead of having to deal with image transformations in the future (and allows us to do them right away). Plus, there's a free tier that gives us 25 credits; 1 credit is good for 1,000 transformations, 1 GB of storage, or 1 GB of bandwidth. Transformations only count the first time an asset is transformed, so requesting an asset with a transformation specified in its URL won't count against us each time. If storage or bandwidth becomes an issue, Cloudinary also allows us to specify an S3 bucket to cut down on costs; given the low number of listings, I don't think it'll be an issue.

I created a Cloudinary account for Exygy to try out some things, and I think it'll work well. I can add the API secret to Heroku, so you can grab it for your local environment there too. Cloudinary also supports signed URLs for documents we want to keep private, but I think we only need to handle the public use case of listings for MVP.

So here's what I think should happen (attempting to write this so that anyone can pick this up). This outline is in broad strokes, so if you need any clarification on any part, please ask.

Backend

  1. There is already some asset definition on the listing entity and an asset table, which I think can be repurposed.
  2. Create an assets module; you can create the structure by referencing other modules like listings or preferences.
  3. asset.entity.ts should specify these fields:
    • id
    • created_at
    • updated_at
    • file_id (this will be Cloudinary's public_id)
    • label
  4. Because we want to be able to use assets for other entities, don't specify listing_id as a column.
  5. For the controller/service, I think we only need POST/create to start, since the listing itself will be responsible for getting its images.
  6. Upon receiving the asset, the service needs to handle two operations (see the sketch after this list):
    1. performing a signed upload stream to Cloudinary (see the documentation at https://github.com/cloudinary/cloudinary_npm#cloudinaryupload_stream; I believe they have promisified versions of this),
    2. using the response of the upload to create a record with the public_id; you can use the filename as the label.
  7. The response of the create service is to be used by a FileUpload component added to ui-components/src/forms, so file_id and label should be sufficient (you can construct the asset URL from the public ID, or you could return the full URL).
  8. The listing entity contains a JSON asset column. For now we can either keep using this (since we're still keeping the file_id and label keys) or save them to the asset table. To minimize the amount of work now, I think we can stick with what's there.
  9. On getting a listing, the file ID should now be the public ID, and we will have consistent sizing (at least to start with), so when you fetch the listing you can construct the image URL server-side and specify the crop method (c_fill will probably be good) plus the width and height; the URL would contain, for example, /h_500,w_500,c_fill/.
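
For steps 6 and 9, the service logic could look roughly like this, assuming the cloudinary npm package's v2 API and its bundled TypeScript types; the promisified wrapper and the createAsset/imageUrl names are illustrations, not settled interfaces:

```ts
import { v2 as cloudinary, UploadApiResponse } from "cloudinary";

// Step 6.1: wrap the library's callback-based upload_stream in a Promise.
function uploadStream(buffer: Buffer): Promise<UploadApiResponse> {
  return new Promise((resolve, reject) => {
    const stream = cloudinary.uploader.upload_stream((error, result) =>
      error || !result ? reject(error) : resolve(result)
    );
    stream.end(buffer);
  });
}

// Step 6.2: upload, then build the record to persist in the asset table.
export async function createAsset(file: Express.Multer.File) {
  const result = await uploadStream(file.buffer);
  return { file_id: result.public_id, label: file.originalname };
}

// Step 9: construct the image URL server-side with a fill crop,
// yielding delivery URLs containing /h_500,w_500,c_fill/.
export function imageUrl(fileId: string): string {
  return cloudinary.url(fileId, { width: 500, height: 500, crop: "fill", secure: true });
}
```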

Future considerations:
At a later date, we may want to upload the files in chunks. For now, if we can keep the file size to a reasonable limit, we should be fine.

Frontend

  1. In ui-components/src/forms create a FileUpload.tsx which uses the react-dropzone-uploader package linked above in @pbn4's comment (a sketch follows this list).
  2. In addition to some of the standard props the other form inputs receive, it should accept:
    • upload url
    • accepts (the type of assets it should accept, defaulting to "image/*")
    • defaultValues
  3. There are a myriad of examples out there using react-dropzone with Cloudinary, most of which show direct unsigned uploads, so if you're going off an example you found, the difference is that you're going to send the file(s) to our backend for the secure upload to Cloudinary.
  4. As files are added, you should be able to check each file's size and keep track of the total. We want to notify the user when the limit is exceeded and not upload those files.
  5. You should get back from the response a file_id and label; these are the values that will get saved with the listing.
  6. Since the server-side listing service should be constructing the imageUrl for ImageCard (at least for now), you shouldn't have to change anything client-side to render the image for the listing.
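
A rough sketch of that component, using react-dropzone-uploader's getUploadParams/onChangeStatus hooks; the prop names, the size limit, and the backend response shape are assumptions to settle during implementation:

```tsx
import React from "react"
import Dropzone, { IFileWithMeta, StatusValue } from "react-dropzone-uploader"
import "react-dropzone-uploader/dist/styles.css"

interface FileUploadProps {
  uploadUrl: string // the backend POST /assets endpoint (assumed route)
  accepts?: string
  onUploaded: (asset: { file_id: string; label: string }) => void
}

const FileUpload = ({ uploadUrl, accepts = "image/*", onUploaded }: FileUploadProps) => {
  // Send each file to our backend, which performs the signed upload to Cloudinary
  const getUploadParams = () => ({ url: uploadUrl })

  const handleChangeStatus = ({ xhr }: IFileWithMeta, status: StatusValue) => {
    if (status === "done" && xhr) {
      // The backend responds with the file_id and label to save with the listing
      onUploaded(JSON.parse(xhr.response))
    }
  }

  return (
    <Dropzone
      getUploadParams={getUploadParams}
      onChangeStatus={handleChangeStatus}
      accept={accepts}
      maxSizeBytes={5 * 1024 * 1024} // assumed per-file limit (step 4)
    />
  )
}

export default FileUpload
```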

@pbn4
Contributor Author

pbn4 commented May 18, 2021

@seanmalbert OK by me. A side note: we do not have to push a stream through the server for private uploads; Cloudinary supports the same flow as the one I proposed with S3: create a presigned upload URL server-side and expose it to the frontend (manual), so the upload can happen browser-side.
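
For reference, that browser-side flow could be signed like this; a sketch using cloudinary.utils.api_sign_request, where the endpoint shape and the CLOUDINARY_KEY variable name are assumptions:

```ts
import { v2 as cloudinary } from "cloudinary";

// Sign the upload params server-side so the browser can upload
// directly to Cloudinary without the file passing through us.
export function createUploadSignature() {
  const timestamp = Math.round(Date.now() / 1000);
  const signature = cloudinary.utils.api_sign_request(
    { timestamp },
    process.env.CLOUDINARY_SECRET as string // set in Heroku config vars
  );
  // The browser then POSTs the file plus api_key, timestamp, and signature to
  // https://api.cloudinary.com/v1_1/<cloud_name>/image/upload
  return { timestamp, signature, apiKey: process.env.CLOUDINARY_KEY };
}
```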

@seanmalbert
Collaborator

I added the Cloudinary account keys to Heroku's config vars on https://dashboard.heroku.com/apps/bloom-reference-backend/settings. The only two that need to be kept secure are CLOUDINARY_SECRET and CLOUDINARY_ENV (if you need that one).
