I think I managed to reduce GNU #Mes memory usage by quite a bit, just by being stubborn about understanding every single detail of the things I use.
3 lines of code for up to a 12% reduction in memory usage when running a simple program.
It was hard to catch but easy to fix: the TBYTES memory layout just had a size calculation error.
I think I did it right but I still need to test a little bit more.
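
To give a feel for how a size calculation in a bytes-like layout can quietly waste memory, here is a rough sketch. It is not the actual Mes code: the cell struct, the layout (one header cell followed by payload cells) and both function names are made up for illustration, and the real error may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap cell of three machine words; not the real Mes struct. */
struct cell
{
  long type;
  long car;
  long cdr;
};

enum { CELL_SIZE = sizeof (struct cell) };

/* Assumed layout: one header cell (type + length), then the payload
   packed into whole cells.  */

/* Loose version: always adds one payload cell, even when the payload
   already fills its cells exactly (or is empty).  */
size_t
bytes_cells_loose (size_t length)
{
  return 1 + length / CELL_SIZE + 1;
}

/* Tight version: rounds the payload up to whole cells and nothing more.  */
size_t
bytes_cells_tight (size_t length)
{
  return 1 + (length + CELL_SIZE - 1) / CELL_SIZE;
}

/* Small self-check showing where the two formulas agree and differ.  */
void
bytes_cells_selfcheck (void)
{
  assert (bytes_cells_loose (1) == bytes_cells_tight (1));
  assert (bytes_cells_loose (CELL_SIZE) == bytes_cells_tight (CELL_SIZE) + 1);
}
```

In this toy version the two formulas only differ when the payload is empty or an exact multiple of the cell size, so the waste is one cell per affected object. That kind of small per-object overhead is easy to miss and cheap to remove once found.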