caching of results, operating on external resources #5
Hi Andy,

Thanks for your feedback! I certainly don't think you're hijacking anything. I would like to provide some justification for the design. Portions of this should perhaps be moved into the README so the tradeoffs are made immediately clear.

Caching

The lack of a caching system is a feature. There are two possible caching options: (a) an in-memory/local cache, or (b) an external/global store such as S3. Both have disadvantages, in that they limit throughput and complicate the retrieval logic. Further, it's trivial to set up a CDN and/or a proxy cache (e.g. Varnish) in front of Halfshell. Either option would do a better job of caching than a cache built into Halfshell itself, as neither would consume memory or network resources that could otherwise be used to serve requests.

Source Versatility

The decision to require configuration of each source (be it a filesystem, S3, or a URL root) is a feature and quite deliberate. At Oyster, the images served from Halfshell are publicly accessible (via a CDN), so many of the design decisions are intended to limit access/abuse. For instance, we considered using something like the following for the URL scheme:
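A sketch of that sort of scheme, where the bucket name is taken straight from the request (the host and parameter names here are illustrative, not an actual proposal):

```
GET http://resize.example.com/s3/<bucket-name>/<path/to/image.jpg>?w=800&h=600
```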
This would have been easier than setting up a separate route for each S3 bucket, but it would have had the disadvantage of exposing a resizing server that can be (ab)used to process images from any S3 bucket. There are other examples of limits imposed through explicit configuration as well.

I appreciate the feedback and I think this is a very worthwhile conversation. We've tried to design Halfshell to be quite versatile, but have chosen to use explicit configuration to set the boundaries of that versatility rather than exposing a service where all configuration is established in the request. I'd like to keep that philosophy intact, but continue to make Halfshell full-featured. If you have any feature suggestions or patches, feel free to contribute. Thanks again.
Your explanation hints at the reasons my solution is closed-source; it's really a back-end service that drops into existing infrastructure and so doesn't meet the same goals.

Have you given any thought to wrapping the ImageMagick bindings more thinly? I assume that was also a decision motivated by security concerns...

Perhaps I'll contribute an extension to Halfshell to meet my needs; I wouldn't mind piggy-backing on someone else's hard work. :-)

Thanks for the detailed response.
The ImageMagick bindings are currently the least considered part of the design. In fact, I'm considering doing away with ImageMagick altogether. I've started investigating libvips and GraphicsMagick and have a preliminary implementation.

What do you mean by wrapping the bindings more thinly? Even if we continue to use ImageMagick, one of the things I'd like to do is use the MagickCore API instead of MagickWand. If you have an opinion on the matter, I'd like to hear it.

Can we close this issue and open a new one to discuss the image bindings?
Closing and moving to #8.
I've implemented a tool (currently closed-source) that solves the same problems, using ImageMagick to perform the transformations.
One feature that makes the service more powerful is the caching of the resulting image in an S3 bucket. That bucket can have various lifecycle attributes to auto-expire the cache. The service can return an HTTP 302 redirect to the cached object (or to a CloudFront distribution pointing at the cache), or it can stream the object to the client itself.
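A minimal sketch of that redirect-or-stream decision, assuming a hypothetical cache-lookup helper (the real service is closed-source, so all names here are illustrative):

```go
package main

import "net/http"

// lookupCache stands in for the closed-source cache lookup: it returns the
// public URL of the cached S3 (or CloudFront) object, if one exists.
func lookupCache(key string) (cachedURL string, ok bool) {
	return "", false // always a miss in this sketch
}

// serveImage sketches the redirect-or-stream behavior described above.
func serveImage(w http.ResponseWriter, r *http.Request) {
	if cachedURL, ok := lookupCache(r.URL.Path); ok {
		// Hand the client a 302 to the cached object; the bucket's
		// lifecycle rules auto-expire stale entries over time.
		http.Redirect(w, r, cachedURL, http.StatusFound)
		return
	}
	// Cache miss: transform the image (e.g. via ImageMagick), upload the
	// result to the cache bucket, then stream it to the client directly.
	// Transformation and upload are omitted from this sketch.
	w.Header().Set("Content-Type", "image/jpeg")
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/", serveImage)
	http.ListenAndServe(":8080", nil)
}
```

Redirecting offloads the byte transfer to S3/CloudFront at the cost of an extra client round trip; streaming avoids the round trip but ties up the service's own bandwidth.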
My version takes an escaped URL as the source input. Since the URL can point to S3, CloudFront, or any other image on the web (or on a local host), it's quite versatile.
Here's an example URL containing an escaped source URL. It is built up with the following pseudocode, and the result is more cacheable than a URL with a query string yet still carries all of the requested image transformations.
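Reconstructed from the example URL below, the construction amounts to double-escaping: escape the source URL once, append the transformation parameters, then escape the whole string again so it travels as a single path segment. In Go, roughly:

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	source := "http://www.wallpaperswala.com/wp-content/gallery/bill-gates/cool-bill-gates.jpg"

	// First pass: escape the source URL itself, then append the
	// requested transformations as ;key=value pairs.
	inner := url.QueryEscape(source) + ";geometry=800x600;colors=256"

	// Second pass: escape the whole string so it survives as a single
	// path segment (the source's % signs become %25...).
	segment := url.QueryEscape(inner)

	fmt.Println("http://somewhere.com/halfshellesque/" + segment)
}
```

Running this prints the example URL below.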
http://somewhere.com/halfshellesque/http%253A%252F%252Fwww.wallpaperswala.com%252Fwp-content%252Fgallery%252Fbill-gates%252Fcool-bill-gates.jpg%3Bgeometry%3D800x600%3Bcolors%3D256
The path-info portion of the above URL is ideally hashed for use as the key in the S3 cache, and a simple test of the source URL's domain would allow one to choose the lifecycle of the cache item based on, for example, whether the image comes from your organization or from elsewhere on the web.
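As a sketch of that keying scheme (the hash choice, the domain test, and the prefix names are all illustrative):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
	"strings"
)

// cacheKey hashes the path-info portion of the request URL so it can be
// used directly as an S3 object key.
func cacheKey(pathInfo string) string {
	sum := sha256.Sum256([]byte(pathInfo))
	return hex.EncodeToString(sum[:])
}

// lifecyclePrefix chooses an S3 key prefix, each prefix carrying its own
// lifecycle rule: images from our own domain are kept longer than images
// pulled from elsewhere on the web. The domain and prefixes are illustrative.
func lifecyclePrefix(sourceURL string) string {
	u, err := url.Parse(sourceURL)
	if err == nil && strings.HasSuffix(u.Hostname(), "example.org") {
		return "internal/"
	}
	return "external/"
}

func main() {
	source := "http://www.wallpaperswala.com/wp-content/gallery/bill-gates/cool-bill-gates.jpg"
	pathInfo := url.QueryEscape(url.QueryEscape(source) + ";geometry=800x600;colors=256")
	fmt.Println(lifecyclePrefix(source) + cacheKey(pathInfo))
}
```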
Anyway, I don't mean to hijack the project; I just thought I could contribute some experience and offer some ideas...