At work I was tasked with implementing a retention policy for our images in Amazon Elastic Container Registry (ECR). I first wanted to see how many images were in there, how many had tags and how much space they were taking up. While this information is available via the AWS Web GUI, it involved a lot of clicking back and forth when trying to tally up all the different repos and their images.
I asked a friend who does a lot of AWS work what he recommended and he replied:
I often see folks developing tools for AWS and posting them in the r/aws subreddit. For example this post.
I liked the idea of leveraging prior art, especially if it is open source and I can improve it and re-share it. I quickly found this starter gist by RafaelWO. Then I noticed it’d been forked and improved upon by nbr23. I love it!
I needed a bit more detail: it didn’t run against both public and private ECR repos and didn’t give a breakdown of tagged vs. untagged images, so I made my own fork. After some (probably too much ;) hacking, here’s what my output looks like, which it produces for both your private and public repos* :
PUBLIC_REPOSITORY              SIZE         IMAGE_COUNT  TAGGED/UNTAGGED
foo/bar-healthcheck            <no-images>  0            0/0
foo/curator                    40.82MB      1            1/0
foo/http-redirector            53.81MB      1            1/0
foo/bar-server                 53.86MB      1            1/0
foo/ci-images                  78.62MB      1            1/0
foo/auto-ssh                   232.85MB     6            3/3
foo/volume-monitor             354.26MB     1            1/0
foo/corge                      405.79MB     2            2/0
foo/bar-quux                   2.82TB       25373        12269/13104
foo/bar-sentinel               3.10TB       25472        12306/13166
foo/bar-api                    3.38TB       25494        12312/13182
foo/bar-smang                  3.99TB       25267        12323/12944
-----------------------------  --------     --------     --------
TOTAL                          14.45TB      152413       74000/78413
* NB – the TOTAL row at the bottom isn’t accurate in this sample output – it works IRL!
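To give a feel for where numbers like these can come from, here is a minimal sketch (not the gist's actual code) that derives one row's figures from a saved describe-images response. It assumes the JSON has a top-level imageDetails array whose entries carry imageSizeInBytes and an optional imageTags list; the function name repo_stats is made up for illustration.

```shell
# repo_stats FILE
# Hypothetical helper: prints "BYTES IMAGE_COUNT TAGGED/UNTAGGED" for one
# saved describe-images JSON file. An image counts as tagged when its
# imageTags list exists and is non-empty.
repo_stats() {
  jq -r '
    (.imageDetails // []) as $imgs
    | ($imgs | map(select((.imageTags // []) | length > 0)) | length) as $tagged
    | [ ($imgs | map(.imageSizeInBytes // 0) | add // 0),
        ($imgs | length),
        "\($tagged)/\(($imgs | length) - $tagged)" ]
    | map(tostring) | join(" ")
  ' "$1"
}
```

Calling repo_stats on one cached file prints the raw byte total, so turning it into something like 40.82MB is a separate formatting step.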
Running the script
The script is dead simple to run. Make sure you have the aws command line app installed and that you’re logged in (aws login). I think the only specialty tool you’ll need is jq. If you don’t know about this tool already, today is your lucky day: it’s amazing! Download or clone my gist and call ./aws_ecr_stats.sh. You’ll see a nice ASCII table output, similar to the one above.
Note that the script uses a JSON file per repo to cache the results in the cache_public and cache_private directories. If you lose connectivity or cancel the script while it is running, it might leave behind a zero-byte cache file. Simply delete that file and re-run the script.
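If you would rather not hunt for empty cache files by hand, something like the following could do it. This is a sketch of my own, not part of the gist: the helper name purge_empty_cache is hypothetical, and it assumes the cache files end in .json.

```shell
# purge_empty_cache DIR...
# Hypothetical helper: prints and deletes every zero-byte *.json file in the
# given cache directories so the next run fetches those repos again.
purge_empty_cache() {
  local dir
  for dir in "$@"; do
    # Skip directories that do not exist yet (e.g. before the first run).
    [ -d "$dir" ] || continue
    find "$dir" -type f -name '*.json' -size 0 -print -delete
  done
}
```

For example, purge_empty_cache cache_private lists each empty file it removes and leaves the non-empty ones alone.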
If you run the script a 2nd (3rd or 4th etc.) time, it will use these cache files to speed up calculating the table. For repos with hundreds of thousands of images, you want this! It also enabled me to do development much faster.
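The cache-or-fetch idea can be sketched roughly like this. It is not the gist's code: both function names are hypothetical, the foo/bar to foo_bar.json file naming is an assumption, and the fetch shown covers private repos only. Unlike the behaviour described above, this variant writes to a temporary file first, so an interrupted fetch should not leave an empty cache file.

```shell
# fetch_repo_json REPO
# Hypothetical fetch step for a private repo, using the AWS CLI.
fetch_repo_json() {
  aws ecr describe-images --repository-name "$1" --output json
}

# cached_repo_json CACHE_DIR REPO
# Prints the repo's JSON, calling AWS only when no non-empty cache file exists.
cached_repo_json() {
  local cache_dir="$1" repo="$2" file
  # Assumed naming: slashes in the repo name become underscores.
  file="$cache_dir/$(printf '%s' "$repo" | tr '/' '_').json"
  if [ ! -s "$file" ]; then
    mkdir -p "$cache_dir"
    # Fetch into a temp file and rename only on success.
    fetch_repo_json "$repo" > "$file.tmp" && mv "$file.tmp" "$file" || {
      rm -f "$file.tmp"
      return 1
    }
  fi
  cat "$file"
}
```

A second call such as cached_repo_json cache_private foo/bar-api would then read the file from disk instead of going back to AWS.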
Deleting images
One of my primary goals, after seeing how many images we had, was doing a bulk delete of most of the images. While ultimately our retention policy will handle this, it felt great to explore the images with jq and then queue them up for bulk deletion.
Speaking of jq, you’ll very likely need to use it to generate the file for the delete script, which expects CSV lines in this format: "DATE","sha256:HASH","repo/path". So, for example:
"2018-08-24T10:28:12-07:00","sha256:43348e8cd8b809213565447b773ff1abc7e2919503880d9cad4549f313b88f84","foo-bar/quux-smang"
"2020-08-12T07:10:32-07:00","sha256:a06f9a68ee807827cadb7887cdeac9abd8376f9e2711475ecec8edbbc0a3ddb6","foo-bar/bash-dashboard"
To generate this with jq, you could simply select on .imageTags != null to find all tagged images, exclude any from this year (grep -v '2026-'), and append the result to a file:
jq -r \
'.imageDetails[] | select(.imageTags != null) | [.imagePushedAt, .imageDigest, .repositoryName, .imageTags[]?]|@csv' \
foo_bar-smang.json \
| sort | grep -vE '2026-' \
>> tagged.images.to.delete.csv
Or you could choose to find all tagged images but exclude any with SemVer-style tags using a regex (the pattern here matches four dot-separated numeric parts, such as 1.2.3.456):
jq -r \
'.imageDetails[] | select(.imageTags != null) | [.imagePushedAt, .imageDigest, .repositoryName, .imageTags[]?]|@csv' \
foo_bar-api.json \
| sort | grep -vE '2026-' \
| grep -vE '"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{3,20}"' \
>> tagged.images.to.delete.csv
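Before deleting anything, I find it reassuring to see how many images are queued per repo. A rough sketch, assuming the CSV layout shown earlier with the repo name in the third field (queued_per_repo is a made-up name, not something in the gist):

```shell
# queued_per_repo CSV_FILE
# Hypothetical helper: prints "COUNT REPO" for each repository in the CSV,
# sorted by repo name. Splits on the "," between quoted fields and strips
# the trailing quote when the repo is the last field on the line.
queued_per_repo() {
  awk -F'","' '{ repo = $3; sub(/".*$/, "", repo); count[repo]++ }
       END { for (r in count) print count[r], r }' "$1" | sort -k2
}
```

One thing to watch: the commands above append with >>, so running the same one twice puts every line in the file twice. Piping the file through sort -u first should take care of that.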
When you have a set of public images to delete and a set of private images to delete, and the CSV files have the correct format, call the delete script. The script accepts the name of the CSV file and either public or private as input:
./batch.delete.images.sh CSV_FILE_NAME PUBLIC_PRIVATE > LOG_FILE
For example:
./batch.delete.images.sh tagged.images.to.delete.csv private > tagged.images.to.delete.log
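If you want a pre-flight check before the real run, the sketch below may help. It is my own illustration rather than anything in the gist: bad_csv_lines and preview_deletes are hypothetical names, the format check assumes a 64-character hex digest, and the preview only prints AWS CLI commands built from each line. It runs nothing and may not match what batch.delete.images.sh does internally.

```shell
# bad_csv_lines CSV_FILE
# Prints, with line numbers, every line that does not start with
# "DATE","sha256:<64 hex chars>","repo/path". Extra trailing fields are allowed.
# Exits non-zero when every line is well formed (grep found nothing to print).
bad_csv_lines() {
  grep -nvE '^"[0-9]{4}-[0-9]{2}-[0-9]{2}T[^"]+","sha256:[0-9a-f]{64}","[^"]+"(,.*)?$' "$1"
}

# preview_deletes CSV_FILE public|private
# Prints (does not run) one batch-delete-image command per CSV line.
preview_deletes() {
  local csv="$1" service
  case "$2" in
    public)  service="ecr-public" ;;
    private) service="ecr" ;;
    *) echo "second argument must be public or private" >&2; return 2 ;;
  esac
  awk -F'","' -v svc="$service" '{
    repo = $3; sub(/".*$/, "", repo)
    printf "aws %s batch-delete-image --repository-name %s --image-ids imageDigest=%s\n", svc, repo, $2
  }' "$csv"
}
```

For example, bad_csv_lines tagged.images.to.delete.csv prints nothing when the file is clean, and preview_deletes tagged.images.to.delete.csv private | head shows what the first few deletions would target.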
Best of luck and comment here or on the gist with any questions!