8-bit to half float CopyCat

This machine learning model is for Nuke’s CopyCat/Inference tools. It attempts to convert common images of limited dynamic range and bit depth (3-channel RGB, 8-bit) into images with a wider dynamic range, i.e. as if they came from:

  • an EXR rendered from a CG application
  • a professional camera

It specifically recovers:

  • clamped highlights (to maintain highlight detail when graded by a colourist)
  • 8-bit colour banding (to pass broadcast quality checks)
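
To illustrate the two problems listed above, here is a minimal sketch of what 8-bit storage does to a smooth ramp. It is an illustration only: the function names are made up for this example, and it ignores the sRGB transfer curve that a real JPEG or PNG would also apply.

```python
def to_8bit(value):
    """Encode a float as an 8-bit code: clamp to 0.0-1.0, then round to 0-255."""
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * 255)


def from_8bit(code):
    """Decode an 8-bit code back to a float in 0.0-1.0."""
    return code / 255.0


# A smooth ramp from 0.0 to 2.0 (anything over 1.0 is highlight detail).
ramp = [2.0 * i / 4095 for i in range(4096)]
decoded = [from_8bit(to_8bit(v)) for v in ramp]

# Banding: 4096 distinct input values collapse into at most 256 steps.
unique_levels = len(set(decoded))

# Clamping: every value over 1.0 comes back as exactly 1.0.
brightest = max(decoded)

print(unique_levels, brightest)
```

The model's job is to make a plausible guess at what was lost in both of those steps.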

Artists might source low-fidelity images (JPEGs, PNGs, etc.) from:

  • web images, e.g. from Flickr
  • generative AI image software

Limitations:

  • This model cannot guess accurate detail in the way an M.L. ‘in-fill’ model would.
  • It will, however, provide more information than the procedural key/blur and grade method compositors previously used.
  • Try grading the input image to adjust the level of highlights above 1.0 in the resulting image.
  • It is a tool rather than a magic bullet. An artist might want to create a difference mask with the output to then vary the strength of the output in different areas.
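
The difference mask idea in the last point can be sketched per pixel as follows. This is only one way to do it, written in plain Python rather than as Nuke nodes; the function names, the threshold and the sample values are all assumptions for illustration.

```python
def difference_mask(original, expanded, threshold=0.05):
    """1.0 where the model changed the pixel by more than threshold, else 0.0."""
    return [1.0 if abs(e - o) > threshold else 0.0
            for o, e in zip(original, expanded)]


def mix(original, expanded, mask, strength=1.0):
    """Move from the original towards the model output by mask * strength."""
    return [o + (e - o) * m * strength
            for o, e, m in zip(original, expanded, mask)]


# Hypothetical pixel values: two mid-tones and two clamped highlights.
original = [0.2, 0.5, 1.0, 1.0]
expanded = [0.2, 0.5, 1.8, 3.0]  # pretend model output

mask = difference_mask(original, expanded)
result = mix(original, expanded, mask, strength=0.5)
print(mask, result)
```

In practice the mask would probably be softened or painted before use, so the strength can differ between, say, a window and a specular highlight.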

Comparison of Online Jpegs and A.I. images 1 2


Instructions:

  • Download the model (.cat file) from GitHub here.
  • In Nuke, load the model file with the Inference node and the example script.
  • The example script wraps the Inference node with an OCIOColorspace (ADX10 log) conversion and an Add colour transformation on either side.
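
The wrapping in the example script follows a general pattern: encode, run the model, decode. Here is a rough sketch of that pattern, under some loud assumptions: the log curve is a simple stand-in and NOT the real ADX10 transform (OCIO handles that in Nuke), the model is a pass-through placeholder, and the order of the log and Add steps is a guess. Only the 0.18 value comes from the training notes.

```python
import math

ADD_OFFSET = 0.18  # the Add value quoted in the training notes


def log_encode(x):
    """Stand-in log curve (hypothetical, not ADX10)."""
    return math.log2(max(x, 0.0) + 1.0)


def log_decode(y):
    """Inverse of the stand-in curve above."""
    return 2.0 ** y - 1.0


def placeholder_model(value):
    """Stands in for the Inference node; the real model changes the value."""
    return value


def expand(pixel, model=placeholder_model):
    encoded = log_encode(pixel) + ADD_OFFSET     # into the space the model was trained in
    predicted = model(encoded)                   # the Inference node would run here
    return log_decode(predicted - ADD_OFFSET)    # undo the Add, then undo the log


print(expand(0.5), expand(4.0))
```

The point is that whatever is applied before the Inference node has to be undone, in reverse order, after it.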

Training Methodology

It’s a fairly simple cat:

  • Trained in Nuke on available Arri Alexa sample footage from Arri’s website.
  • Trained in 2K DCP / log. Old-school Kodak ADX10 was chosen as it has a limited range and does not change the gamut.
  • An Add adjustment of 0.18 was used to compensate for negative pixels.
  • The main input was a JPEG rendered as: ACES sRGB – Texture
  • The ground truth was the original EXR AP0 image.
  • Additionally, a posterized (512) ACES 1.0 SDR – SDR video image was blended in with a key of values below 0.2.
    • This protected black areas of the image and prevented grain with crazy values.
  • CopyCat settings:
    • Model size: large
    • Crop: between 128 & 512
    • Total steps: 85000 (approx. 2.5 days of training on an RTX 3090)
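
The posterized blend in the training notes can be read in more than one way. Here is one possible reading as a per-pixel sketch: shadows (below 0.2) take the 512-level posterized value, everything else keeps the main 8-bit input. The posterize formula, the hard-edged key and the choice of which image the key is pulled from are all assumptions, not the settings of the actual Nuke script.

```python
def posterize(value, levels):
    """Quantise a 0.0-1.0 value to the given number of levels (assumed formula)."""
    clamped = min(max(value, 0.0), 1.0)
    return round(clamped * (levels - 1)) / (levels - 1)


def shadow_key(value, threshold=0.2):
    """Hard key: 1.0 below the threshold, 0.0 otherwise (a real key may be soft)."""
    return 1.0 if value < threshold else 0.0


def build_training_input(main_8bit, sdr_video):
    """Use the finer 512-level image in the shadows, the main input elsewhere."""
    key = shadow_key(sdr_video)
    protected = posterize(sdr_video, 512)
    return key * protected + (1.0 - key) * main_8bit


print(build_training_input(0.5, 0.6))  # mid-tone: main input passes through
print(build_training_input(0.1, 0.1))  # shadow: 512-level value is used
```

With 512 levels the steps in the shadows are about half the size of 8-bit steps, which fits the stated aim of protecting the black areas.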

Caveat

For commercial jobs, remember to check that all images used have an unrestricted licence attached to them.


Footnotes:

  1. Source images: Flickr (licence link): top image, 2nd image
  2. Source image: Google Imagen3
