Estimating the memory usage of ImageMagick

Firstly, if all you want to do is resize images, we recommend you use vipsthumbnail instead of ImageMagick. The memory footprint of vipsthumbnail is so much lower that you may be able to run it without worrying too much about its memory usage.

However, if you need all the features of ImageMagick, then keep reading: we will show you how to train a model to accurately estimate the memory usage of an ImageMagick command on the system it'll be running on.

A few things you should know

  1. Estimating ImageMagick's memory usage is a dark art. Too many variables are involved for a simple universal calculation: the operating system, processor architecture, hardware emulation and ImageMagick version all affect how much memory ImageMagick consumes. In our experience, the theoretical calculations you might find on StackOverflow – which use image size, image format and quantum depth (e.g. Q8/Q16) to arrive at memory estimates – are often very different from the real memory usage you will observe on your system.
  2. Memory usage varies wildly between systems (even when using the exact same ImageMagick binary). Therefore, any constants and coefficients you derive for your memory estimation formula should come from the same system you'll be performing your image transformations on.
  3. Think of "space required" instead of "memory required". ImageMagick abstracts over memory, so rather than thinking about memory, think about space instead.

Space requirements

When you run an ImageMagick command, a certain amount of space, measured in bytes, is required to run it.

Where does ImageMagick allocate space?

ImageMagick allocates space across several tranches, and will try to stay within the faster tranche(s) where possible. It spills into the next tranche only after filling the current one, e.g. if 90MB of space is required, ImageMagick may allocate 30MB in RSS, 30MB in CACHE and 30MB in DISK.

The order of precedence is as follows:

  • RSS (incredibly fast).
    Note 1: RSS (Resident Set Size) is essentially the portion of the process's memory that lives in RAM.
    Note 2: There is a minimum RSS ImageMagick requires for a given transformation, which is a function of the input image and the transformation. This cannot be lowered with -limit (see "So I just set the -limit flag, right?" below).
  • CACHE (very fast).
  • DISK (very slow).

ImageMagick does not knowingly allocate memory to CACHE(2): this type of allocation occurs when ImageMagick allocates to DISK(3), but the OS transparently keeps those pages in the page cache (CACHE(2)) when it sees enough free RAM is available (e.g. within the cgroup), avoiding a trip to DISK(3).

Empirically we've found CACHE(2) adds a 100% time penalty, whereas DISK(3) adds a 2500% time penalty, compared to storing in RSS(1).
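If you'd like to get a feel for these penalties on your own system, one rough check is to compare the elapsed time of the same command with ImageMagick's default limits and with the -limit memory 0 -limit map 0 flags discussed below, which force allocations out of RSS and into CACHE and/or DISK (input.jpg is a placeholder for one of your own images):

$ /usr/bin/time -f %e ./magick input.jpg -resize 400 png:output

$ /usr/bin/time -f %e ./magick -limit memory 0 -limit map 0 input.jpg -resize 400 png:output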

What's the minimum memory (i.e. RSS) ImageMagick requires?

There's a theoretical and a practical answer to this question:

For the theoretical answer, you can measure the minimum RSS ImageMagick requires by running:

$ /usr/bin/time -f %M ./magick input.jpg -resize 400 png:output

462112

$ /usr/bin/time -f %M ./magick -limit memory 0 -limit map 0 input.jpg -resize 400 png:output

96848

The first command tells you how much space is required for the transformation (GNU time's %M reports the peak resident set size, in kilobytes).

The second command tells you the peak RSS ImageMagick used when instructed to use as little RSS as possible (by passing the -limit memory 0 -limit map 0 options): the additional space will be allocated from DISK and/or CACHE.

Here we can see ImageMagick requires roughly 451MB of space (462112 KB) for the transformation, and a minimum of roughly 95MB of RSS (96848 KB) to accommodate its peak memory usage.

However: this is only a theoretical answer...

Peak RSS usage does not equate to how much memory you realistically require!

You must still consider the following two factors when limiting the amount of memory you're making available to ImageMagick:

  1. Memory fragmentation: if you are only providing the bare minimum memory, you must (somehow) ensure this memory isn't fragmented, as ImageMagick will likely want to allocate large arrays (which obviously requires contiguous memory).
  2. Time penalty of spilling to disk: if the total operation requires say 451MB of space, but you only provide the bare minimum of 95MB of memory, then you'll incur huge time penalties as the transformation spills to disk: in our findings, the wall clock time of the operation can inflate by around 15x.

So I just set the -limit flag, right?

Nope! Even if you know how much you want to limit ImageMagick's memory usage to, the -limit memory flag isn't actually very effective at limiting memory, for the following two reasons:

  1. The -limit memory flag does not actually keep ImageMagick out of RAM: ImageMagick will simply spill to CACHE instead, which is still backed by RAM (assuming there's enough available memory on the system).
  2. The -limit memory flag behaves discretely, not continuously: e.g. for an image that requires 451MB of space, you may be able to limit the memory to 50MB, 129MB and 451MB – but not to values in between.

    Worse yet: ImageMagick picks the closest boundary, which may result in your -limit memory being rounded up by several hundred MBs, making it fairly useless as a "limit" flag! (The sketch below shows how to observe this rounding on your own system.)
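A quick way to see where these boundaries fall on your own system is to sweep a few -limit memory values and record the peak RSS actually used (input.jpg and the candidate limits are placeholders):

$ for limit_mb in 50 100 150 200 250 300; do
    printf '%s MB limit -> peak RSS (KB): ' "$limit_mb"
    /usr/bin/time -f %M ./magick -limit memory "${limit_mb}MiB" input.jpg -resize 400 png:output
  done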

Introducing cgroups...

The solution to ImageMagick's unreliable -limit memory flag – if you're on Linux – is to use Linux Control Groups (aka cgroups).

Using cgroups, we can set hard limits on ImageMagick's memory usage.
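As a minimal sketch, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory and cgexec from the cgroup-tools package (the cgroup name "magick" and the 500MB limit are purely illustrative), running ImageMagick inside a hard-limited cgroup looks like this:

$ sudo mkdir /sys/fs/cgroup/memory/magick

$ echo $((500 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/memory/magick/memory.limit_in_bytes

$ sudo cgexec -g memory:magick ./magick input.jpg -resize 400 png:output

(On cgroup v2 the equivalent knob is memory.max.)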

Unfortunately, this introduces two new challenges:

1. OOMs if you underestimate the memory!

Of course, if you underestimate the memory usage for an ImageMagick operation and assign too small a cgroup, then the ImageMagick process will OOM.

2. Memory fragmentation.

Cgroups appear to bind themselves to specific block(s) of memory when memory.limit_in_bytes is set (discussed below). Sometimes these block(s) are more fragmented than at other times, such that given several cgroups with the same memory.limit_in_bytes, one cgroup can consistently fail to run an ImageMagick command while another consistently succeeds in running the same command.

Furthermore, as ImageMagick runs it will further fragment this limited number of blocks.

Since ImageMagick requires large blocks of contiguous memory, it will OOM when the cgroup's blocks become (or start out) fragmented. We advise including some additional buffer to account for this.

How much memory.limit_in_bytes should I give the cgroup?

Granting any less than the total space required for an ImageMagick operation (i.e. 451MB in the example we've been using) will cause ImageMagick to spill into DISK, which comes with a 2500% time penalty compared to RSS based on our findings.

As such, we always want DISK to be 0.

We also want CACHE to be 0, since intentionally using CACHE is nonsensical as this is equivalent to saying "I know the cgroup has memory available for RSS, but I'd like to use that memory for CACHE instead, which is 2x slower".

Since DISK must be 0, and CACHE must be 0, that means we must plan for everything to fit into RSS.

The value of memory.limit_in_bytes must therefore be equal to (or a little higher than) the output of /usr/bin/time -f %M ./magick ...args-without-limit-flags... multiplied by 1024.
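For example, a sketch of deriving the limit on your own system (the 20000KB of headroom is an illustrative buffer for the fragmentation issue described above, and this assumes the command writes nothing else to stderr):

$ peak_kb=$(/usr/bin/time -f %M ./magick input.jpg -resize 400 png:output 2>&1 >/dev/null)

$ echo $(( (peak_kb + 20000) * 1024 )) | sudo tee /sys/fs/cgroup/memory/magick/memory.limit_in_bytes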

So: How do I estimate the space required by an ImageMagick command?

It's difficult to produce an equation that precisely predicts how much space ImageMagick will require.

The space required by an ImageMagick command depends on so many variables – the environment you're running ImageMagick on, the input image, how evenly the image fits into the various pixel buffers, the nature of the command being run, and so forth – that there isn't realistically a single equation to answer it.

Instead: the best way we've found is to produce a linear model (from a simple linear regression).

We did this by taking a sample of images in various formats and sizes, and using the peak RSS as the "continuous label" with the pixel count as the model's only "feature". We then performed a linear regression (i.e. drawing a straight line of best fit through our scatter graph of memory_used/image_pixel_count data points) to arrive at a simple linear equation.

You can optimise this approach by performing a separate linear regression per image format, ultimately producing a new linear model per image format. This will allow you to reduce overestimations while still avoiding underestimations. In effect, the image format becomes another "feature" of your model.

Since we want to avoid underestimating memory at all costs (as this will result in an OOM), we adjust the regression line such that all observations fit beneath it. This means we will overestimate space requirements for some images, but hopefully never underestimate.

Our linear model will look something like:

space_required = constant + pixel_count * coefficient

Building a linear model for the input

For demonstrative purposes, the ImageMagick command we want to estimate will be a "resize and transcode" operation, e.g. taking a JPEG image of any size, and outputting it as a PNG with reduced dimensions.

We'll start by outputting a 1 pixel image, just to remove the output image's dimensions from the equation (we'll focus on this separately):

$ /usr/bin/time -f %M:%e ./magick input.jpg -resize 1 png:output

We then create our initial samples: many image formats, but with the same number of pixels, just to identify the most expensive format.
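A sketch of how you might gather these measurements, assuming samples/ is a hypothetical directory of images that all share the same pixel count but differ in format:

$ for f in samples/*; do
    printf '%s: ' "$f"
    /usr/bin/time -f '%M KB peak RSS' ./magick "$f" -resize 1 png:output
  done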

Our findings show that JPEG 2000 (JPF) images require the most space, when compared to JPG, JPS, PNG, HEIC and GIF.

We then create more samples: many images with different dimensions, all as lossless JPEG 2000 (as this appears to be the most expensive format).
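A sketch of collecting the (pixel count, peak RSS) pairs, assuming samples-jp2/ is a hypothetical directory of lossless JPEG 2000 files of varying dimensions:

$ for f in samples-jp2/*; do
    read w h <<< "$(./magick identify -format '%w %h' "$f")"
    peak_kb=$(/usr/bin/time -f %M ./magick "$f" -resize 1 png:output 2>&1 >/dev/null)
    echo "$((w * h)),$peak_kb" >> samples.csv
  done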

Our findings show a perfectly linear relationship between pixel count and space required.
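You can reproduce the fit from the samples.csv gathered above with a small least-squares script; this is just a sketch (any regression tool will do, awk simply keeps everything on the command line). It also prints an adjusted intercept, bumped up so that no sample sits above the line, which we'll use shortly:

$ awk -F, '
    { x[NR] = $1; y[NR] = $2; sx += $1; sy += $2; sxx += $1*$1; sxy += $1*$2 }
    END {
      n = NR
      slope     = (n*sxy - sx*sy) / (n*sxx - sx*sx)
      intercept = (sy - slope*sx) / n
      # shift the intercept up so every sample sits on or below the line
      adjusted = intercept
      for (i = 1; i <= n; i++) {
        c = y[i] - slope*x[i]
        if (c > adjusted) adjusted = c
      }
      printf "least squares:  y = %.4fx + %d\n", slope, intercept
      printf "adjusted line:  y = %.4fx + %d\n", slope, adjusted
    }' samples.csv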

The linear regression line is as follows (y is space required in KB, x is pixel count):

y = 0.0249x + 16105

We then adjust the constant to ensure all our samples fall below the line. The updated equation is:

y = 0.0249x + 21000

This gives us a good estimation for transcoding and resizing images to 1-pixel PNGs!

Now we need to consider the space required for the output...

Building a linear model for the output

Our findings show that if the output image is sufficiently smaller than the input image, no additional space is required. Beyond a certain point, additional space is required, growing linearly as before.

We found that where x is the input image's pixel count, if the output image's pixel count stays below y then no additional space is required:

y = 0.4058x + 525009

If the output pixels exceed y, we take the difference and pass it through our linear function for determining space required for input images, above.

Putting it all together:

space_required_for_input_kb = 0.0249 * input_pixel_count + 21000

pixels_shared_with_output = 0.4058 * input_pixel_count + 525009

space_required_for_output_kb = max(0, 0.0249 * (output_pixel_count - pixels_shared_with_output) + 21000)

space_required_kb = space_required_for_input_kb + space_required_for_output_kb
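These formulas are easy to wrap into a small helper. A sketch using the coefficients above (the function name is ours; awk is used only for the floating-point arithmetic):

estimate_space_kb() {
  awk -v in_px="$1" -v out_px="$2" 'BEGIN {
    input_kb  = 0.0249 * in_px + 21000
    shared_px = 0.4058 * in_px + 525009
    output_kb = 0.0249 * (out_px - shared_px) + 21000
    if (output_kb < 0) output_kb = 0      # max(0, ...) from the formula above
    printf "%d\n", input_kb + output_kb
  }'
}

# e.g. a 12-megapixel input resized down to 1 megapixel:
$ estimate_space_kb 12000000 1000000

Multiply the result by 1024 (plus whatever fragmentation buffer you've settled on) to get the memory.limit_in_bytes value for the cgroup.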

Wrapping up

We hope you found this article useful.

We've been using this approach in the wild for our Image Upload API, and have found it works well. As with any machine learning model, be warned: the accuracy of your model is going to greatly depend on the training data you've fed into it. If you're going to use this approach, make sure you use a large training set!