Hacker News
Content-aware image resize library (github.com/esimov)
614 points by boyter on Jan 30, 2018 | 78 comments



Happy to see this here! Seam carving is one of those neat little algorithms that hits a sweet spot: it combines a few moderately complex algorithms, comes from a readable paper, and gives very satisfying results. I had a lot of fun implementing it in undergrad.

One thing that I always wondered is how Photoshop managed to make it so fast that you can resize in real-time. If n is the width or height of the image, then the dynamic programming part is O(n^2) and needs to be recomputed after every seam removed. Since every seam is a single pixel wide, resizing the image by a non-trivial amount (say half) is O(n^3). There are other papers that remove multiple seams at a time but the quality isn't as good. GPU acceleration perhaps?
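To make the cost argument concrete, here is a minimal sketch of the dynamic programming step for one vertical seam. The function name `find_vertical_seam` and the list-of-lists energy map are assumptions for illustration, not code from any library mentioned here. The table has one cell per pixel, which is the O(n^2) part; repeating it for roughly n/2 seams gives the O(n^3) total.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the cheapest 8-connected vertical seam."""
    height, width = len(energy), len(energy[0])
    # cost[y][x] = cheapest total energy of any seam that ends at (y, x)
    cost = [list(energy[0])]
    for y in range(1, height):
        prev = cost[-1]
        row = []
        for x in range(width):
            lo, hi = max(0, x - 1), min(width - 1, x + 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack upwards from the cheapest cell in the bottom row
    x = min(range(width), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(0, x - 1), min(width - 1, x + 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


energy = [[5, 1, 5],
          [5, 5, 1],
          [5, 1, 5]]
print(find_vertical_seam(energy))
```

Every call touches each pixel once, so any real-time implementation presumably has to avoid redoing this full pass per seam.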

Another thing I learned while testing seam carving extensively is that it works nicely in certain scenes/situations, but tends to break down most of the time. The two most common failure scenarios are: 1) lines that are off the horizontal by more than a few degrees get cut off; 2) objects lose their proportions, and even when the manipulation is not directly obvious, it tends to feel off (in the uncanny valley sense).

I expect some interesting work to use deep learning for content-aware resizing, since neural nets could theoretically be more semantically and holistically aware of objects in the image.


Well, there is this paper called "Real-time content-aware image resizing", maybe that describes the method you are curious about?

> In this paper, we present a more efficient algorithm for seam based content-aware image resizing, which searches seams through establishing the matching relation between adjacent rows or columns. We give a linear algorithm to find the optimal matches within a weighted bipartite graph composed of the pixels in adjacent rows or columns.

https://web.archive.org/web/20110707030836/http://vmcl.xjtu....


That's the one I implemented -- with a bit of low-level hackery, it could run fast enough to resize at ~5-8 fps on mobile back in 2013. But the result didn't look great on most images if you resized by more than ~15%. The technique works by calculating the "importance" field of the image once, and all the seams upfront. Then you can just number the seams from 1 to n and remove the first m seams when you resize the image by m. The runtime is then O(n^2) instead of O(n^3). In contrast, with regular seam carving, you'd often update the "importance" field every time you remove a seam. The result is that the seams that get removed are less likely to introduce artifacts. I think it's a similar problem to the forward energy vs. backward energy problem discussed elsewhere in the thread, but worse.
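A rough sketch of the "number the seams once, then drop the first m" idea follows. The `seam_order` map is hypothetical: it assumes some earlier pass has already labelled every pixel with the number of the seam it belongs to, so each row contains the labels 1 to n exactly once. How that map is computed is not shown.

```python
def resize_with_seam_map(image, seam_order, m):
    """Drop every pixel whose precomputed seam number is <= m (seams start at 1)."""
    return [
        [pixel for pixel, order in zip(row, orders) if order > m]
        for row, orders in zip(image, seam_order)
    ]


image = [[10, 11, 12],
         [20, 21, 22]]
# Hypothetical labels: seam 1 passes through (0, 1) and (1, 0), and so on
seam_order = [[2, 1, 3],
              [1, 2, 3]]
print(resize_with_seam_map(image, seam_order, 1))
print(resize_with_seam_map(image, seam_order, 2))
```

Resizing then becomes a simple filter per frame, which is why the interactive part can be fast once the map exists.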


Hmm, then I don't know. Still, they can't have stood still since then, and must have come up with new tricks.

I wonder if you could do a kind of quad-tree version where you home in on the precise seam and minimise recalculations to bubbling new energy up the tree. That might also give a way to weigh global-to-local structure.


I also thought about augmenting this algorithm with the semantic information output of a neural network, but couldn't come up with a good way to generate training data for semantic segmentation which wouldn't distort objects.

For example, if you consider an image of a roof, the roof pixels will all be semantically similar, but if you remove any of them just based on that, the regular structure will be distorted. Do you have an idea how to solve this?


There's one paper I found that explores this approach: https://arxiv.org/pdf/1708.02731.pdf

I know little about deep learning though, so I can't really comment on it.

For situations like the roof problem you describe, I think PatchMatch is more likely to work out well. The downside is that it's considerably harder to implement correctly than seam carving. http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/patchmatch....


Do you really need to regenerate the whole dynamic programming table though? It seems to me that it could be kept more conservative and only recalculate the needed parts.


IIRC, the "energy" of a pixel may affect up to 3 pixels below it (left, right, center), so each pixel of a removed seam requires recalculation downwards in the shape of a funnel that spreads to the left and right at up to 45°. In the worst case, this means the pixels to recalculate form a triangle with a top-row pixel as its tip and a 90° angle there.

So, for a landscape-oriented image of aspect ratio 2:1 that would mean re-calculating up to 50% of its pixels. For narrower ones, that percentage is even higher, so not too much to gain.
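The 50% figure can be checked with a few lines of arithmetic. This is only a sketch under the assumptions of the comment above: the funnel starts at a top-row pixel in the middle column, widens by one pixel per side per row, and is clipped at the image borders. The function name is made up for illustration.

```python
def worst_case_recalc_fraction(width, height):
    """Fraction of pixels inside a 45-degree funnel starting at the top-centre pixel."""
    tip = width // 2
    affected = 0
    for row in range(height):
        left = max(0, tip - row)            # clip the funnel at the left border
        right = min(width - 1, tip + row)   # clip the funnel at the right border
        affected += right - left + 1
    return affected / (width * height)


print(worst_case_recalc_fraction(200, 100))  # 2:1 landscape
print(worst_case_recalc_fraction(100, 100))  # square
```

For the 2:1 case the funnel covers exactly half of the pixels, and for the square case three quarters, which matches the claim that narrower images gain even less.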


Sure, the worst case would have a large number of re-calculations. But it seems like the common case would converge quickly.


Just saying, but GIMP had this for years with Liquid Rescale plugin (I think before even Photoshop): http://liquidrescale.wikidot.com/


Neither GIMP nor Photoshop is a library ("just sayin'"). It sounds like there were earlier libraries, though: those would be a more relevant callback.

http://liblqr.wikidot.com/


I remember first seeing content aware image scaling at Adobe MAX in Barcelona I think it was. We were completely dumbfounded by what was surely magic happening on stage. When they showed removing objects we just lost it, jaws on the floor and all. That was a fun conference.


I'm the author of this library. The algorithm makes it possible to remove an object from the image, but the object first has to be localized somehow. After localization, a higher energy value can be applied to that area, which means the seam carver will avoid that part. I will implement a face detection algorithm so that faces are automatically excluded from being altered by the carver.


There are a few interesting things to explore in that field, with different twists:

https://algorithmia.com/algorithms/opencv/SmartThumbnail

https://blog.twitter.com/engineering/en_us/topics/infrastruc...


Out of curiosity, liblqr was not good enough for you? :)


See also the Patch-Match algorithm:

http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/index.php

(Licensed by Adobe for noncommercial research use only.)


Yeah Adobe sure had a lot of magic to show in that conference.


It's super fun but I can't think of many real-world uses. Maybe in subtle, element-wise stretching and reshaping as part of more complex editing? Squeezing images is a huge no-no in any graphic design, and while content-aware resizing is a neat trick to avoid the most glaring problems, it's still image squeezing. You'd rather just crop the image or choose a different one.


> Squeezing images is a huge no-no in any graphic design

Surely this is true in journalism. I find it hard to believe it's true in ad work.


I think the previous comment was about changing the aspect ratio. And while you'd rather crop or choose another image, content aware resizing is now the 3rd option


I wrote a GUI for another seam carving library back in 2009[1], and although it is in archive mode, it looks like you can still access the source as well as the Windows / Mac binaries. Just tested it and it still works!

Not as fancy as Photoshop I'm sure, but it does have the ability to paint a mask of regions to keep / remove to aid the algorithm and get the desired result. Multi-threaded too!

[1] https://code.google.com/archive/p/seam-carving-gui/


Thanks for that. I used it way back when (mid 2010) to make it look like a 30' cliff jump was a 60' one. Really impressed my friends with it.


High praise indeed - you know the software is good when you can use it to lie to your friends.


Now that this thread is essentially dead, I can let out my little secret - I'm not the bad boy you think I am. I told them the truth eventually - about 30 seconds after showing them the doctored one. I hope this doesn't change your opinion of me.


That's cool! I tried it, and the app is really nice to use, but the picture still came out looking weird. See my other comment below.


Came for the “seam carving”... okay.

Didn’t expect image expanding... was blown away.

Mix in temporal energy plus a ML objects DB, put that in a chip, and turn decades of 4:3 TV into widescreen!


I'm already quite interested in resizing wide images to square images for Convolutional Neural Networks training


I actually also implemented seam carving in JS a while ago with a live demo in the browser http://davidalbertoadler.com/projects/seam-carving-js/ https://github.com/mfbx9da4/seam-carving-js


Would this still be covered under the original patent until 2029 [1]?

[1] https://www.google.com/patents/US8213745


That patent was filed in 2009, and appears to be a copy of a 2007 paper http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.570... with no novel contributions, and no overlap in authorship.


This one has one of the original authors and was filed in 2008: https://www.google.com/patents/US8280191 not sure about differences in claims. IANAPA


     2008: Original Assignee: Adobe Systems Incorporated

     2009: Original Assignee: Eastman Kodak Company

Perhaps the Kodak patent covers a hardware-implementation for their cameras?


_Eastman_ Kodak manufactures and distributes cinema film. The camera division is the _other_ Kodak.


The algorithm is not based on this patent. I made a reference on the github page to the original paper the project is based on.


Not in New Zealand, France or a host of other countries without software patents.

Where is the creator located?


Seam carving is fascinating, and the first version of the algorithm especially is really simple to implement. However, that algorithm introduces certain artefacts - the boat in this picture[0] makes it pretty obvious.

The cause of these artefacts is mainly that the original algorithm did not look at the energy that was introduced by removing a seam: each time this happens, the pixels adjacent to this seam become neighbours, which creates a new energy gradient. So sometimes removing the least-energy seam would produce a net increase in energy.

The original authors of the seam carving paper realised this[1], which led to the obvious fix for it: instead of only looking at the current energy of a picture, and removing the least-energy seam, look at how much of a net energy difference the removal of a seam would make, aka "forward energy".
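As a rough illustration of forward energy, the sketch below builds the cumulative cost map for vertical seams on a grayscale grid. It follows the formulation as I understand it from [1]: each step is charged for the new edges created when the removed pixel's neighbours become adjacent. The function name and the clamped border handling are my own assumptions.

```python
def forward_energy_cost(gray):
    """Cumulative forward-energy map for vertical seams on a grayscale grid."""
    height, width = len(gray), len(gray[0])
    cost = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Row neighbours, clamped at the borders (an assumption)
            left = gray[y][max(x - 1, 0)]
            right = gray[y][min(x + 1, width - 1)]
            # Edge created when left and right become neighbours
            c_up = abs(right - left)
            if y == 0:
                cost[y][x] = c_up
                continue
            above = gray[y - 1][x]
            options = [cost[y - 1][x] + c_up]
            if x > 0:
                # Seam arrives from the upper left: 'above' now touches 'left'
                options.append(cost[y - 1][x - 1] + c_up + abs(above - left))
            if x < width - 1:
                # Seam arrives from the upper right: 'above' now touches 'right'
                options.append(cost[y - 1][x + 1] + c_up + abs(above - right))
            cost[y][x] = min(options)
    return cost


print(forward_energy_cost([[0, 0, 9],
                           [0, 0, 9]]))
```

The cheapest seam is then backtracked from the bottom row exactly as with the original energy; only the per-step cost changes.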

A few years later another paper came out that used "forward gradient difference maps" which supposedly work even better, but to be honest the formulas described in that paper are too complex for me to understand[2]. Conceptually though, I think they just extended the original energy function (a simple Sobel operator[3]) with a few additional terms, such as gradient orientation:

> The energy function measures the curvature inconsistency between the pixels that become adjacent after seam removal, and involves the difference of gradient orientation and magnitude of the pixels. Our objective is to minimize the differences induced by the removed seam, and the optimization is performed by dynamic programming based on multiple cumulative energy maps, each of which corresponds to the seam pattern associated with a pixel. The proposed technique preserves straight lines and regular shapes better than the original and improved seam carving, and can be easily combined with other types of energy functions within the seam carving framework

With that in mind, it shouldn't be that complicated to come up with more improvements to seam-carving: just stack on different energy functions (either forward or current energy), and compute the ideal seam based on the (weighted) sum of them.
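A small sketch of that stacking idea: a Sobel-based energy map plus a helper that forms a weighted sum of several maps. Both function names are hypothetical, the image is a plain grayscale list of lists, and border pixels are handled by clamping coordinates, which is just one possible choice.

```python
def sobel_energy(gray):
    """Gradient magnitude (|gx| + |gy|) from 3x3 Sobel kernels."""
    height, width = len(gray), len(gray[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour
        return gray[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    energy = []
    for y in range(height):
        row = []
        for x in range(width):
            gx = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y, x - 1) - px(y + 1, x - 1))
            gy = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y - 1, x) - px(y - 1, x + 1))
            row.append(abs(gx) + abs(gy))
        energy.append(row)
    return energy


def combine_energies(maps, weights):
    """Weighted per-pixel sum of several equally sized energy maps."""
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[y][x] for m, w in zip(maps, weights)) for x in range(width)]
        for y in range(height)
    ]


gray = [[0, 0, 10, 10]] * 3
print(sobel_energy(gray)[1])
```

The combined map can be fed to the same seam search as any single energy function; choosing the weights is the open question.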

[0] https://user-images.githubusercontent.com/883386/35498498-3c...

[1] http://www.eng.tau.ac.il/~avidan/papers/vidret.pdf

[2] http://cvlab.postech.ac.kr/~hyeonwoonoh/acmmm2012.pdf

[3] https://en.wikipedia.org/wiki/Sobel_operator


Seam carving is also slow and doesn't work well for stretching images by more than around 50%. Moreover, computing the optimal sequence of carving steps using dynamic programming is not practical for large images.


Looks really interesting. I'd love to see an animated version that shrinks from large to small.


I wrote a version of this in javascript with a live demo you can play with at https://alexander.soto.io/seam-carving


Your JavaScript version is very easy to use, and I'm so glad I can upload my own images instead of just using examples!

The algorithm itself looks like it needs face (and body) detection to be really useful, though.

Example: My girlfriend and I on holiday in Australia.

Original:

http://peterburk.free.fr/seamCarving/lighthouse.jpg

Shrunk:

http://peterburk.free.fr/seamCarving/lighthouse_seamCarving....


Am I the only one getting a laugh out of this? I also tried it on myself and the result was funny.


With a little change to the algorithm, you can mark areas as "must not change" and "must change", basically manually marking areas as "high energy" and "low energy".
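A minimal sketch of that change, assuming the energy map and the mask are same-sized lists of lists. The mask encoding (1 for "must not change", -1 for "must change", 0 for untouched) and the size of the bias are arbitrary choices for illustration.

```python
PROTECT, NEUTRAL, REMOVE = 1, 0, -1


def apply_mask(energy, mask, bias=1_000_000):
    """Push protected pixels to very high energy and doomed pixels to very low."""
    return [
        [e + bias * m for e, m in zip(energy_row, mask_row)]
        for energy_row, mask_row in zip(energy, mask)
    ]


energy = [[5, 5, 5]]
mask = [[PROTECT, NEUTRAL, REMOVE]]
print(apply_mask(energy, mask, bias=100))
```

Seams then route around the protected region and are drawn through the region marked for removal, as long as the bias dominates the natural energy values.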


In all fairness, the README says “Todo: face detection”


That's true, and I agree with the author!

It will be more complicated than it seems, though, because I just tried the GUI version from gabeiscoding above, and got the following result:

http://peterburk.free.fr/seamCarving/lighthouse_seamCarvingG...


Yeah, but the lighthouse looks great!


That's really neat!

With my tongue somewhat in cheek, I wanted to point out that you may want to be careful with that sample pic of the Eiffel Tower at night!

See: https://petapixel.com/2017/10/14/photos-eiffel-tower-night-i...


Unable to resize on mobile. Consider giving some margin or padding on the right of the image so one can drag the line to resize.


That's _extremely_ cool! Thanks for sharing


Super awesome!



Try this short video from SIGGRAPH 2007: https://www.youtube.com/watch?v=vIFCV2spKtg



Flash implementation from 2007 here: http://rsizr.com


I am always blown away by this stuff, but it still applies some distortion to the image in a way that feels slightly unnatural.

It's similar to the effect that changing focal length on a camera has, e.g. wider-angle lenses seem to squish far-away objects and enlarge closer objects. However, this affects the entire image in a consistent way whereas changing focal length does not.

I wonder if this could be used to simulate focal length changes?


Nice implementation. As with face detection, text detection would also be nice, to avoid distorting / aliasing text (like in advertisements).


How well can it remove objects from images? Can it remove an object in a cluttered environment? I'm thinking of image augmentation for deep CNNs. Normally, augmentation can help with invariance to small rotations, translations and flips, but if we could remove objects from images, it might be useful for creating many examples from one image, for image-based reasoning.


The algorithm has the ability to remove image parts based on the energy level of each object, calculated by a simple Sobel operator. So in order to remove an object you have to either localize the object and give it a higher energy value, or train a CNN to localize some specific object type and then again increase the energy value. Right now I'm working on implementing the face detection part. So if anyone is interested in extending the library even further and integrating a CNN, they are more than welcome.


I read the paper a long time ago, but if I remember correctly, it does not remove objects. The way it works is that it iteratively finds a "seam" - a pixel-level path that goes perpendicular to the direction of resizing - that has the lowest "energy" and removes it. So objects are preserved and "empty space" is reduced.
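The removal step itself is tiny. Here is a sketch, assuming the image is a list of rows and the seam is one column index per row; the function name is made up.

```python
def remove_vertical_seam(image, seam):
    """Delete one pixel per row, shrinking the image width by exactly one."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


image = [[1, 2, 3],
         [4, 5, 6]]
print(remove_vertical_seam(image, [0, 1]))
```

The full algorithm repeats "compute energy, find the cheapest seam, remove it" until the target width is reached, so low-energy "empty space" disappears first.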


Surprisingly, imagemagick already supports this.


Link: http://www.imagemagick.org/Usage/resize/#liquid-rescale. Seems like it was added as an experimental feature in v6.3.8-4. It requires compiling with the option `--with-liblqr`.


Not to downplay the cool factor of a new implementation, Liquid Rescale (long-time GIMP plugin) also has protective masks, which prevent or delay the laying of seams in some parts of the image. Caire seems to intend to do a more automagical subset of that, with the "Face detection" todo, but the better short term solution is to accept masks directly.

http://liquidrescale.wikidot.com/en:examples


Looks cool, will check it out soon. I remember I created an image resize web application using seam carving during college for my software engineering project, as I was fascinated by the algorithm. Not sure if it's in a working state currently. https://github.com/pixation/pixation


Loosely tangential: can anyone recommend an image library management system they've had success with? Looking for uploading, thumbnailing, metadata search, access levels and ideally some on-the-fly transformations like Cloudinary offers.


I would suggest having a look at Uploadcare, https://stackshare.io/uploadcare/how-uploadcare-built-a-stac... (though I am biased as a CEO)


Is anyone aware of applying a similar technique to computer models, i.e. fake parametric design?


Could we use any direction for pixel removal, rather than only vertical or horizontal?


You either shrink the picture horizontally by one pixel or vertically by one pixel; seams must be 8-connected and always advance by 1 pixel in the opposite direction to the one being removed. It's a heuristic, and by doing it that way you can guarantee that exactly 1px is removed.
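Those constraints are easy to state as a check. This is a sketch for a vertical seam given as one column index per row, so "advance by 1 pixel" is implied by the representation and only the 8-connectivity needs testing. The function name is hypothetical.

```python
def is_valid_vertical_seam(seam, width):
    """True if every index is in range and consecutive rows differ by at most 1."""
    if any(x < 0 or x >= width for x in seam):
        return False
    return all(abs(a - b) <= 1 for a, b in zip(seam, seam[1:]))


print(is_valid_vertical_seam([1, 2, 1], 3))
print(is_valid_vertical_seam([0, 2, 1], 3))
```

Because the seam has exactly one pixel in each row, removing it shortens every row by one and the image stays rectangular.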

These days it's a popular algorithm to be implemented as a midterm project in computational photography courses at top schools.


Well, think about what it would mean to remove a seam diagonally: the edges of the picture would no longer line up. The only way to avoid this is to always remove a seam along one of the axes at a time.


Just run the algorithm twice?


Better yet, scale the image proportionally down to the largest of the two target sides, then run the algorithm once in the direction that needs to shrink.
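One way to read that suggestion, as a sketch: pick the uniform scale factor that leaves both sides at least as large as the target, then carve away the leftover in the one direction that is still too big. The function name is made up, and the sketch only covers shrinking.

```python
def plan_resize(src_w, src_h, dst_w, dst_h):
    """Return the intermediate size plus the seams left to remove per direction."""
    # The larger ratio keeps both sides >= the target after uniform scaling
    scale = max(dst_w / src_w, dst_h / src_h)
    mid_w, mid_h = round(src_w * scale), round(src_h * scale)
    return (mid_w, mid_h), mid_w - dst_w, mid_h - dst_h


# 4:3 source to a 16:9 target
print(plan_resize(1600, 1200, 800, 450))
```

In this example the image is first scaled to 800x600 and only 150 horizontal seams remain to be carved, instead of carving in both directions from the full-size original.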


It looks promising, but the current result isn't great. Use cases for something like this are very specific... perhaps mass conversion of thousands of photos from 4:3 into 16:9...


Is this still the best approach to this problem? I would think that a GAN or similar would be better for keeping the correct features for things like faces etc.


Time to write an iOS app and make millions?


I think they had this in Photoshop circa 1999, or I def. remember seeing a demo of it by Adobe around that time.

Anyways, neat! Nice contribution.


It was much more recent than that. Content Aware Fill was introduced in Photoshop CS5 which was released in 2010. Perhaps you were thinking of the Spot Healing Brush of Photoshop which is somewhat similar but much more manual and limited.


I distinctly remember the tech being demo'd in 1999-2000. Huge amounts of "wow's" being echo'd in the auditorium. This was around when PS 4.5/5 was out. I guess it was a feature to be slated for PS 6?


Seam carving is cool and has a lot of potential, especially for certain kinds of images. I can't wait until we see something like it in pro photo editors like Affinity Photo.


Photoshop has had it for ages; Adobe even bought the original research.



