There are quite a number of unnecessary memory allocations on hot paths that could be moved to the stack.
This really hurts in multi-threaded cases where other threads allocate a lot of memory at the same time.
Ideas for improvements:
- Pipeline barriers should not use std::vector
- Blur filter weights can be calculated in constructors
- Basically all usages of std::vector where we resize on a regular basis or, even worse, use .push_back without a proper .reserve should be reviewed
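
A rough sketch of what these ideas could look like follows. All names here (`Barrier`, `StackList`, `BlurFilter`, `collectIds`) are hypothetical and not taken from the code base, and the Gaussian kernel is only an assumption about how the blur weights are computed. `StackList` also assumes the element type is default-constructible and that a sensible upper bound on the element count is known at compile time.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for whatever barrier description the renderer uses.
struct Barrier {
    int srcStage;
    int dstStage;
};

// Fixed-capacity list whose storage lives inside the object, so a local
// instance sits on the stack and never touches the heap.
template <typename T, std::size_t Capacity>
class StackList {
public:
    void push_back(const T& value) {
        assert(size_ < Capacity && "StackList capacity exceeded");
        items_[size_++] = value;
    }
    std::size_t size() const { return size_; }
    const T* data() const { return items_.data(); }
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    std::array<T, Capacity> items_{};
    std::size_t size_ = 0;
};

// Blur weights are computed once in the constructor instead of per draw.
// Assumes a normalized Gaussian kernel with an odd or even tap count.
template <std::size_t Taps>
class BlurFilter {
public:
    explicit BlurFilter(float sigma) {
        float sum = 0.0f;
        for (std::size_t i = 0; i < Taps; ++i) {
            // Offset of this tap from the kernel center.
            const float x = static_cast<float>(i) - static_cast<float>(Taps - 1) / 2.0f;
            weights_[i] = std::exp(-(x * x) / (2.0f * sigma * sigma));
            sum += weights_[i];
        }
        for (float& w : weights_) w /= sum;  // normalize so the weights sum to 1
    }
    const std::array<float, Taps>& weights() const { return weights_; }

private:
    std::array<float, Taps> weights_{};
};

// Where a std::vector is still required, reserve once up front.
std::vector<int> collectIds(std::size_t count) {
    std::vector<int> ids;
    ids.reserve(count);  // one allocation instead of repeated regrowth
    for (std::size_t i = 0; i < count; ++i) ids.push_back(static_cast<int>(i));
    return ids;
}
```

A call site would then declare something like `StackList<Barrier, 8> barriers;` locally and pass `barriers.data()` and `barriers.size()` to the API. The capacity of 8 is an arbitrary placeholder; the real bound would have to come from profiling the actual barrier counts.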