Reducing Memory Footprint While Creating Archives in Django
Python's built-in zipfile module is commonly used to create archives. However, there is a catch when the archive is assembled in memory: if the files being zipped are larger than the available memory, the process can easily run out of memory.
I was building a feature that zips files and uploads the resulting archive to our Django backend storage. After digging around the internet, I have summarized the approach I used to support this feature.
Use NamedTemporaryFile Instead Of Memory
A NamedTemporaryFile lives in secondary storage (normally on disk) rather than in main memory, so writing the archive to it does not hold the whole archive in RAM.
Pseudocode
- Create a NamedTemporaryFile
- Create a ZipFile with the NamedTemporaryFile as its output file
- Write files into ZipFile
- Move the cursor of the NamedTempFile back to the beginning
- Wrap it with Django File
- Inject a file name
- Upload it to storage
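The first four steps need only the standard library, so they can be sketched without Django. This is a minimal sketch, not a definitive implementation: the helper name build_zip_tempfile and its source_paths argument are made up for illustration, and it assumes the inputs are regular files on disk.

```python
import os
import tempfile
import zipfile


def build_zip_tempfile(source_paths):
    """Zip the given files into a NamedTemporaryFile and return it, rewound.

    Hypothetical helper for illustration. The caller must close the
    returned file, which also removes it from disk.
    """
    # Step 1: the temporary file is the archive's backing store.
    tmp = tempfile.NamedTemporaryFile(suffix=".zip")
    try:
        # Step 2: ZipFile writes straight into the temporary file.
        with zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in source_paths:
                # Step 3: write() reads each source file from disk,
                # so we never pass the file contents in as one bytes object.
                archive.write(path, arcname=os.path.basename(path))
        # Step 4: rewind so the next reader starts at the first byte.
        tmp.seek(0)
    except Exception:
        tmp.close()
        raise
    return tmp
```

Closing the ZipFile does not close the temporary file it was given, which is why the file can still be rewound and handed on to the remaining steps.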
Sample Code
Assume we have a model as in models.py; the logic that uses secondary storage to create zip files lives in zipping.py.
I hope this helps. Please leave me a comment if you have a better idea for reducing memory load when zipping files.
First published on 2018-09-11
Republished on Hackernoon