Artificial Intelligence,
Generative Fill and
Underwater Photography


[Photo]

Artificial intelligence has been in the news lately, especially generative AI.  It seems like every industry is trying to put this technology into its products.  The idea of generative AI (GAI) is that a system can produce text, images or other media in response to original content and/or prompts (user-supplied instructions that steer the process).  Public versions such as ChatGPT or DALL-E are available online for anyone to use.  By working with a massive database, parsing the meaning of the prompts, and using a predictive model to figure out which word or other data element is most likely to follow the last one, GAI systems produce impressive results.
There are serious legal and ethical concerns about this technology.  While text GAI can generate flawless output from a purely linguistic point of view, the underlying meaning can be completely wrong or outdated.  Furthermore, these systems typically work by harvesting online content, often without the consent of the original creators.  Both of these issues are beyond the scope of this article, and fortunately they are less applicable to the processes described below.

I’m an underwater photographer.  And while my work depends on my diving skills and my ability to produce well-lit, well-composed images in the camera, a huge part of it has always been post-production.  More than most topside photographers, I spend hours dealing with the image degradation that comes from shooting through a dense medium, one that absorbs far more light than air does, absorbs different parts of the visible spectrum at different rates, and is full of reflective suspended particles.  I consider that to be part of the art.

So when Adobe released a beta version of Photoshop containing a “generative fill” command, I figured I would see what it could do with underwater images.  I was very impressed.

One task that I had often done manually was adding extra background.  This was necessary if you rotated a photo a small amount for the sake of composition but didn’t want to crop it further, because that would leave the subject too constrained or cut off.  This was easy for small areas with bland backgrounds.  If there was a lot of detail in the background, just using something like the clone tool made the image look obviously “faked”, so I worked out other approaches.  But those only worked for very small areas around the margins.  What if we wanted to dramatically increase the negative space?  This seemed like a job for GAI, where the computer will actually produce high-resolution image detail based on the original image alone, or along with user-supplied prompts.

I first tried this without prompts on some topside photos.  What you do is take an image, expand the canvas, then select the empty space plus a small sliver of the original image and apply the generative fill tool.  Each time you click, you get three versions to choose from.  The image now fills the entire new canvas, based on what the computer finds at the margins of the selection.
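A quick aside for the technically inclined: Photoshop’s generative fill is a point-and-click feature, and Adobe doesn’t publish the model behind it.  But the same basic outpainting recipe can be sketched with open-source tools.  The short Python example below uses the Hugging Face “diffusers” library; the file names, border sizes, model choice and prompt are placeholder assumptions of mine, not anything Adobe uses, and the working resolution is far below what Photoshop produces.

# Rough outpainting sketch using the open-source diffusers library.
# This is NOT Photoshop's generative fill, just the same general idea:
# enlarge the canvas, mask the empty border (plus a thin strip of the
# original edge for blending), and let an inpainting model invent the
# new pixels.
from PIL import Image, ImageDraw
import torch
from diffusers import StableDiffusionInpaintPipeline

original = Image.open("reef_shot.jpg").convert("RGB")   # placeholder file name
pad = 256        # new border on every side, in pixels (arbitrary choice)
overlap = 32     # strip of real image the model is allowed to repaint

# 1. Expand the canvas and paste the original photo in the middle.
new_size = (original.width + 2 * pad, original.height + 2 * pad)
canvas = Image.new("RGB", new_size)
canvas.paste(original, (pad, pad))

# 2. Build a mask: white = "generate here", black = "keep these pixels".
mask = Image.new("L", new_size, 255)
ImageDraw.Draw(mask).rectangle(
    [pad + overlap, pad + overlap,
     pad + original.width - overlap, pad + original.height - overlap],
    fill=0,
)

# 3. Run an off-the-shelf inpainting model (model choice is arbitrary;
#    this one works at 512x512, so aspect ratio is ignored here to keep
#    the sketch short).  Requires a GPU.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

candidates = pipe(
    prompt="open blue water, natural light",    # can be left generic
    image=canvas.resize((512, 512)),
    mask_image=mask.resize((512, 512)),
    num_images_per_prompt=3,                    # three variations, like Photoshop
).images

for i, img in enumerate(candidates):
    img.save(f"expanded_{i}.jpg")

Photoshop does all of this (and much more, at far higher resolution) behind a single button, but the expand-the-canvas, mask-the-border, generate loop is the same basic recipe.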

Below is a topside photo that I loved, but it felt cramped, especially with the hand on the left side of the shot extending right to the border.  I had tried expanding the background manually, but it took a lot of work and still didn’t look great.  This was a perfect job for generative fill.


For underwater photos, I was able to do a similar manipulation, and I found an additional benefit.  If your subject has a little negative space around it, and you select only background pixels of the original image, then you will get the effect of significantly increased water clarity (visibility).  The reason for this is that you now have a high-resolution image with more background and a relatively smaller subject.  BUT, the original subject retains its resolution, detail and contrast.  If you had instead backed up from the subject during the dive to produce an image with the same ratio of subject size to canvas size, you would be adding a lot more water between the subject and the lens.  And unless you were in crystal-clear water with even ambient lighting (something that virtually never happens), the subject would be dimmer, less contrasty and less detailed because of the water column between the lens and the subject.  With GAI, since you have retained the contrast and clarity of the subject, your brain interprets the image as having been shot in much clearer water, as you can see in the two photos below.

[Photo]

[Photo]


The photos above show what you get if your subject doesn’t extend to the edges of the original image.  If it does, and it is cut off at the border, then the system has to try to recreate something more complex than just background.  Sometimes this is very accurate, as in the photo below of me snorkeling, where the system has generated the bottom half of my body underwater.  For this one, I did a much larger fill, and the new image included a generated horizon.  I didn’t “ask” for a horizon with a prompt; the system just figured out that a horizon would look right there, based on probability and the underlying massive image database.  I am no longer snorkeling a few feet off a dock, but in the middle of the ocean!  And since the system is creating new, generated pixels, the resulting image is huge, with a lot of zoomable detail and a sharp subject.  This image was downsized for online use, but the large files are great for big prints.

[Photo]


Another approach works when there is enough detail in the margins for the system to “get creative”: instead of just filling in drab negative space to give the impression of distance, it comes up with interesting possible environments.  Here is a selfie that I took while snorkeling under a dock in a lake with a lot of bottom grass, followed by what GAI came up with.

[Photo]


Sometimes the system gets this right, and sometimes it makes guesses that aren’t what we would call accurate.  For example, below is the original tight crop of the face of a fish.  My original attempt (with no prompts) resulted in the system guessing “wrong”.  Even though a human recognizes this as a fish, the system has no real knowledge – it just makes assumptions based on a massive image database.  For this photo, the closest match (used to generate the neck) looked more amphibian.

[Photo]


So I decided to try giving the system a prompt for this photo, and suggested that it create a neck that looked like a snake.  I also gave it more space on the right side of the frame, and it came up with this amazing but disturbing image at the top of this page! I tried that again with a different fish, and got the effect below.

[Photo]
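For readers following along with the open-source sketch from earlier, that kind of nudge is nothing more than a more descriptive prompt for the masked area; the wording below is just a paraphrase of the idea, not the exact prompt I used in Photoshop.

# Same pipeline, canvas and mask as the earlier sketch, but with a
# descriptive prompt steering what gets generated in the masked area.
steered = pipe(
    prompt="the fish's head continuing into a long, scaly, snake-like neck",
    image=canvas.resize((512, 512)),
    mask_image=mask.resize((512, 512)),
).images[0]
steered.save("snake_neck_fish.jpg")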


I’m having a lot of fun with this, and these images were the result of just a few days of playing around.  I am continuing to experiment and to figure out various ways of tweaking the output to get better results.  These features are now out of beta testing, so if you have the current version of Adobe Photoshop, they are available for use, along with some helpful tutorials.

And one final thing.  I want to address a common question that I get when I lecture about post-production work on images.  Is this “cheating”?  Some photographers feel that any sort of image manipulation is inappropriate, and that the photo is what comes out of the camera.  I personally disagree. 

We spend huge amounts of money on better and better technology to improve the quality of our images - high-end sensors, fast lenses, powerful strobes, etc.  Why is technology used AFTER the shutter is pressed any different in principle?  I am just trying to make the best images that I can, so why is cleaning up a shot by masking out a bit of backscatter off limits, when angling my strobes outward to do the same thing is fair play?

The one exception I make is if the post-processing is done to deceive.  For example, if someone is trying to sell a dive trip and they put a whale shark in the background of a reef shot, that’s not right.  But other than that, my work isn’t done until I publish the photo.

Give this a try; you will be surprised how many of your “meh” images you can salvage with generative fill!

Michael Rothschild
