Happy to see this here! Seam carving is one of those neat little algorithms that hits a sweet spot: it's a combination of a few moderately complex techniques, comes from a readable paper, and gives very satisfying results. I had a lot of fun implementing it in undergrad.
One thing I always wondered is how Photoshop managed to make it so fast that you can resize in real time. If n is the width or height of the image, then the dynamic programming part is O(n^2) and needs to be recomputed after every seam removal. Since every seam is a single pixel wide, resizing the image by a non-trivial amount (say, to half its width) is O(n^3). There are other papers that remove multiple seams at a time, but the quality isn't as good. GPU acceleration, perhaps?
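For anyone who hasn't implemented it: the dynamic programming step in question is just a cumulative-cost table plus a backtrack. A minimal sketch in Go (grayscale energy as a plain float matrix; nothing here is taken from any particular library):

```go
package seam

// MinCostSeam finds the lowest-cost vertical seam in an energy map
// (energy[row][col]). One pass of this DP is O(width*height); classic
// seam carving reruns it after every seam removal, which is where the
// extra factor of n comes from.
func MinCostSeam(energy [][]float64) []int {
	h, w := len(energy), len(energy[0])

	// cost[y][x] = minimal cumulative energy of a seam ending at (x, y).
	cost := make([][]float64, h)
	for y := range cost {
		cost[y] = make([]float64, w)
		copy(cost[y], energy[y])
	}
	for y := 1; y < h; y++ {
		for x := 0; x < w; x++ {
			best := cost[y-1][x]
			if x > 0 && cost[y-1][x-1] < best {
				best = cost[y-1][x-1]
			}
			if x < w-1 && cost[y-1][x+1] < best {
				best = cost[y-1][x+1]
			}
			cost[y][x] += best
		}
	}

	// Backtrack from the cheapest bottom-row pixel; the seam moves at
	// most one column per row, which keeps it 8-connected.
	seam := make([]int, h)
	for x := 1; x < w; x++ {
		if cost[h-1][x] < cost[h-1][seam[h-1]] {
			seam[h-1] = x
		}
	}
	for y := h - 2; y >= 0; y-- {
		prev := seam[y+1]
		best := prev
		for _, nx := range []int{prev - 1, prev + 1} {
			if nx >= 0 && nx < w && cost[y][nx] < cost[y][best] {
				best = nx
			}
		}
		seam[y] = best
	}
	return seam
}
```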
Another thing I learned while testing seam carving extensively is that it works nicely in certain scenes/situations but tends to break down most of the time. The two most common scenarios are: 1) lines that are off from the horizontal by more than a few degrees get cut off; 2) objects lose their proportions, and even when the manipulation is not directly obvious, it tends to feel off (in the uncanny-valley sense).
I expect some interesting work to use deep learning for content-aware resizing, since neural nets could theoretically be more semantically and holistically aware of objects in the image.
Well, there is this paper called "Real-time content-aware image resizing", maybe that describes the method you are curious about?
> In this paper, we present a more efficient algorithm for seam based content-aware image resizing, which searches seams through establishing the matching relation between adjacent rows or columns. We give a linear algorithm to find the optimal matches within a weighted bipartite graph composed of the pixels in adjacent rows or columns.
That's the one I implemented -- with a bit of low-level hackery, it could run fast enough to resize at ~5-8 fps on mobile back in 2013. But the result didn't look great on most images if you resized by more than ~15%. The technique works by calculating the "importance" field of the image once, and all the seams upfront. Then you can just number the seams from 1 to n and remove the first m seams when you resize the image by m pixels. The runtime is then O(n^2) instead of O(n^3). In contrast, regular seam carving typically updates the "importance" field every time a seam is removed, so the seams that get removed are less likely to introduce artifacts. I think it's a similar problem to the forward energy vs. backward energy trade-off discussed elsewhere in the thread, but worse.
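If it helps to picture it: once every pixel has been assigned to a numbered seam up front (an "order" or retargeting map, computed once), each resize is just a filter pass. A rough sketch, assuming such a precomputed map is available (not taken from that paper's code):

```go
package seam

// Retarget shrinks an image by m columns using a precomputed seam-order
// map: order[y][x] is the 1-based index of the seam that pixel (x, y)
// belongs to. Pixels on seams 1..m are dropped and everything else is
// kept, so each resize is a single O(width*height) pass. Assumes
// 0 <= m < image width.
func Retarget[T any](img [][]T, order [][]int, m int) [][]T {
	out := make([][]T, len(img))
	for y, row := range img {
		kept := make([]T, 0, len(row)-m)
		for x, px := range row {
			if order[y][x] > m {
				kept = append(kept, px)
			}
		}
		out[y] = kept
	}
	return out
}
```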
Hmm, then I don't know. Still, they surely haven't stood still since then and have come up with new tricks.
I wonder if you could do a kind of quad-tree version where you home in on the precise seam and minimise recalculations by bubbling new energy up the tree. That might also give a way to weight global versus local structure.
I also thought about augmenting this algorithm with the semantic information output of a neural network, but couldn't come up with a good way to generate training data for semantic segmentation which wouldn't distort objects.
For example, if you consider an image of a roof, the roof pixels will all be semantically similar, but if you remove any of them just based on that, the regular structure will be distorted. Do you have an idea how to solve this?
I know little about deep learning though, so I can't really comment on it.
For situations like the roof problem you describe, I think PatchMatch is more likely to work out well. The downside is that it's considerably harder to implement correctly than seam carving. http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/patchmatch....
Do you really need to regenerate the whole dynamic programming table, though? It seems to me that it could be kept and only the affected parts recalculated.
IIRC, a pixel's "energy" may affect up to 3 pixels below it (left, center, right), so each pixel of a removed seam requires re-calculations downwards in the shape of a funnel spreading to the left and right at a maximum of 45°. In the worst case, this means the re-calculated pixels form a triangle with a top-row pixel as its tip and a 90° angle there.
So, for a landscape-oriented image with a 2:1 aspect ratio that would mean re-calculating up to 50% of its pixels in the worst case. For narrower images, that percentage is even higher, so there isn't too much to gain.
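If someone does want to experiment with the incremental version, the bookkeeping is roughly this (a hypothetical helper, ignoring the column shift to the right of the removed seam):

```go
package seam

// DirtyBounds returns, per row, a conservative column range of DP entries
// that can change after removing the given vertical seam (column shifting
// aside). Each entry depends on the three entries above it, so the region
// widens by at most one column per side per row: the 45-degree funnel
// described above.
func DirtyBounds(seamCols []int, width int) [][2]int {
	bounds := make([][2]int, len(seamCols))
	lo, hi := seamCols[0], seamCols[0]
	for y, x := range seamCols {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
		// Widen by one column per side per row for the diagonal deps.
		if lo > 0 {
			lo--
		}
		if hi < width-1 {
			hi++
		}
		bounds[y] = [2]int{lo, hi}
	}
	return bounds
}
```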
Neither GIMP nor Photoshop is a library ("just sayin'"). It sounds like there were other, earlier libraries: those would be a more relevant callback.
I remember first seeing content aware image scaling at Adobe MAX in Barcelona I think it was. We were completely dumbfounded by what was surely magic happening on stage. When they showed removing objects we just lost it, jaws on the floor and all. That was a fun conference.
I'm the author of this library. The algorithm makes it possible to remove an object from the image; it only needs to be localized somehow. After localization, a higher energy value can be applied to that area, which means that part will be avoided by the seam carver. I will implement a face detection algorithm to automatically keep faces from being altered by the carver.
It's super fun but I can't think of many real-world uses. Maybe in subtle, element-wise stretching and reshaping as part of more complex editing? Squeezing images is a huge no-no in any graphic design, and while content-aware resizing is a neat trick to avoid the most glaring problems, it's still image squeezing. You'd rather just crop the image or choose a different one.
I think the previous comment was about changing the aspect ratio. And while you'd rather crop or choose another image, content aware resizing is now the 3rd option
I wrote a GUI for another seam carving library back in 2009[1], and it looks like, although it's in archive mode, you can still access the source as well as the Windows/Mac binaries. Just tested it and it still works!
Not as fancy as photoshop I'm sure, but does have the ability to paint a mask of regions to keep / remove to aid the algorithm and get the desired result. Multi-threaded too!
Now that this thread is essentially dead, I can let out my little secret - I'm not the bad boy you think I am. I told them the truth eventually - about 30 seconds after showing them the doctored one. I hope this doesn't change your opinion of me.
Seam carving is fascinating, and especially the first version of the algorithm is really simple to implement. However, that algorithm introduces certain artefacts - the boat in this picture[0] makes it pretty obvious.
The cause of these artefacts is mainly that the original algorithm did not look at the energy that is introduced by removing a seam: each time a seam is removed, the pixels adjacent to it become neighbours, which creates a new energy gradient. So sometimes removing the least-energy seam would produce a net increase in energy.
The original authors of the seam carving paper realised this[1], which led to the obvious fix: instead of only looking at the current energy of a picture and removing the least-energy seam, look at how much of a net energy difference the removal of a seam would make, aka "forward energy".
A few years later another paper came out that used "forward gradient difference maps", which supposedly work even better, but to be honest the formulas described in that paper are too complex for me to understand[2]. Conceptually though, I think they just extended the original energy function (a simple Sobel operator[3]) with a few additional terms that take gradient orientation and magnitude into account:
> The energy function measures the curvature inconsistency between the pixels that become adjacent after seam removal, and involves the difference of gradient orientation and magnitude of the pixels. Our objective is to minimize the differences induced by the removed seam, and the optimization is performed by dynamic programming based on multiple cumulative energy maps, each of which corresponds to the seam pattern associated with a pixel. The proposed technique preserves straight lines and regular shapes better than the original and improved seam carving, and can be easily combined with other types of energy functions within the seam carving framework
With that in mind, it shouldn't be that complicated to come up with more improvements to seam-carving: just stack on different energy functions (either forward or current energy), and compute the ideal seam based on the (weighted) sum of them.
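Mechanically, "stacking" energy functions really is as simple as it sounds - something along these lines (a sketch over a grayscale matrix, with hypothetical energy terms supplied by the caller):

```go
package seam

// EnergyFunc maps an image (here just a grayscale matrix) to a per-pixel
// energy map of the same dimensions.
type EnergyFunc func(img [][]float64) [][]float64

// Combined returns the weighted sum of several energy functions, which is
// the "stack different energy terms" idea: e.g. a Sobel-based term plus a
// forward-energy term plus a saliency term, each with its own weight.
func Combined(weights []float64, fns []EnergyFunc) EnergyFunc {
	return func(img [][]float64) [][]float64 {
		h, w := len(img), len(img[0])
		total := make([][]float64, h)
		for y := range total {
			total[y] = make([]float64, w)
		}
		for i, fn := range fns {
			e := fn(img)
			for y := 0; y < h; y++ {
				for x := 0; x < w; x++ {
					total[y][x] += weights[i] * e[y][x]
				}
			}
		}
		return total
	}
}
```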
Seam carving is also slow and doesn't work well for stretching images beyond around 50%. Moreover, computing the optimal sequence of carving steps with dynamic programming is not practical for large images.
With a little change to the algorithm, you can mark areas as "must not change" and "must change", basically manually marking areas as "high energy" and "low energy".
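In practice that change is just a bias pass over the energy map before any seams are computed - roughly like this (a sketch, not any particular library's API):

```go
package seam

// ApplyMask biases an energy map with user-supplied masks before seams are
// computed. Mask values: 0 = neutral, 1 = "must not change" (seams will
// avoid it), -1 = "must change" (seams will be pulled through it).
func ApplyMask(energy [][]float64, mask [][]int8) {
	const bias = 1e6 // larger than any realistic gradient energy
	for y := range energy {
		for x := range energy[y] {
			switch mask[y][x] {
			case 1:
				energy[y][x] += bias
			case -1:
				energy[y][x] -= bias
			}
		}
	}
}
```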
I am always blown away by this stuff, but it still applies some distortion to the image in a way that feels slightly unnatural.
It's similar to the effect that changing focal length on a camera has, e.g. wider-angle lenses seem to squish far-away objects and enlarge closer ones. However, this affects the entire image in a consistent way, whereas changing focal length does not.
I wonder if this could be used to simulate focal length changes?
How well can it remove objects from images? Can it remove an object in a cluttered environment? I'm thinking of image augmentation for deep CNNs. Normally, augmentation can help with invariance to small rotations, translations and flips, but if we could remove objects from images, it might be useful for creating many examples from one image, for image-based reasoning.
The algorithm has the ability to remove image parts based on the energy level of each object, calculated by a simple Sobel operator. So in order to remove an object you have to either localize the object and give it a higher energy value, or train a CNN to localize some specific object type and then, again, increase the energy value. Right now I'm working on implementing the face detection part. So if anyone is interested in extending the library even further and integrating a CNN, they're more than welcome.
I read the paper a long time ago, but if I remember correctly, it does not remove objects. The way it works is that it iteratively finds a "seam" - a pixel-level path running perpendicular to the direction of resizing - that has the lowest "energy" and removes it. So objects are preserved and "empty space" is reduced.
Not to downplay the cool factor of a new implementation, Liquid Rescale (long-time GIMP plugin) also has protective masks, which prevent or delay the laying of seams in some parts of the image. Caire seems to intend to do a more automagical subset of that, with the "Face detection" todo, but the better short term solution is to accept masks directly.
Looks cool, will check it out soon. I remember I created an image-resizing web application using seam carving during college for my software engineering project, as I was fascinated by the algorithm. Not sure if it's in a working state currently. https://github.com/pixation/pixation
Loosely tangential: can anyone recommend an image library management system they've had success with? Looking for uploading, thumbnailing, search metadata, access levels and ideally some on-the-fly transformations like Cloudinary offers.
You either shrink the picture horizontally by one pixel or vertically by one pixel; seams must be 8-connected and always advance by 1 pixel in the direction perpendicular to the one being removed. It's a heuristic, and by constraining it that way you can guarantee that removing a seam shrinks the image by exactly 1px.
These days it's a popular algorithm to be implemented as a midterm project in computational photography courses at top schools.
Well, think about what it would mean to remove a seam diagonally: the edges of the picture would no longer line up. The only way to avoid this is to always remove a seam along one of the axes at a time.
Better yet, scale the image down proportionally until the dimension that needs less shrinking matches its target, then run the algorithm once in the direction that still needs to shrink.
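For concreteness, the planning step of that approach could look something like this (a hypothetical helper; it only handles shrinking):

```go
package seam

// PlanResize sketches the "scale first, then carve once" idea at the level
// of dimensions only: it returns the intermediate size to scale the image
// to uniformly, plus how many seams are then left to carve. At most one of
// carveW/carveH will be non-zero.
func PlanResize(srcW, srcH, dstW, dstH int) (scaleW, scaleH, carveW, carveH int) {
	sx := float64(dstW) / float64(srcW)
	sy := float64(dstH) / float64(srcH)
	// Use the larger of the two scale factors, so one dimension already
	// matches its target after uniform scaling and only the other one
	// still has to be reduced by seam carving.
	s := sx
	if sy > s {
		s = sy
	}
	scaleW = int(float64(srcW)*s + 0.5)
	scaleH = int(float64(srcH)*s + 0.5)
	carveW = scaleW - dstW
	carveH = scaleH - dstH
	return
}
```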
It looks promising, but the current result isn't great. Use cases for something like this are very specific... perhaps mass conversion of thousands of photos from 4:3 into 16:9...
Is this still the best approach to this problem? I would think that a GAN or similar would be better for keeping the correct features for things like faces etc.
It was much more recent than that. Content Aware Fill was introduced in Photoshop CS5 which was released in 2010. Perhaps you were thinking of the Spot Healing Brush of Photoshop which is somewhat similar but much more manual and limited.
I distinctly remember the tech being demo'd in 1999-2000. Huge amounts of "wow's" being echo'd in the auditorium. This was around when PS 4.5/5 was out. I guess it was a feature to be slated for PS 6?
Seam carving is cool and has a lot of potential, especially for certain kinds of images. I can't wait until we see something like it in pro photo editors like Affinity Photo.