@Jona Looks great! Yes, it should be possible. But I think you would need to replace [ofTouchListener] with [_locRcv]s that listen for the event from the main patch, so that each module interface responds to the mouse click according to the render order (just like it's done in the draggableShapes example).
[_locRcv] should be used one level below the main patch. That way it uses the main patch's local variable name and can still communicate with other abstractions. Also note that if you use [ofMouseListener], it will not handle multitouch on mobile devices (it will respond to one finger at a time). That's why I used [ofTouchListener] in the pdgui abstraction. But if you're targeting desktop only, [ofMouseListener] is enough and easier to handle.
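To illustrate what I mean by handling the click "according to the render order", here is a rough sketch in Python. It is not Pd or ofelia code, and `Module`, `dispatch_click` and `bring_to_front` are made-up names just for illustration. The assumption is that the main patch receives the click once and offers it to the modules from the topmost one down, so only the module drawn on top reacts.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical module window: a rectangle plus its render order."""
    name: str
    x: float
    y: float
    w: float
    h: float
    order: int  # higher value = drawn later = appears on top

    def contains(self, px, py):
        # True if the point lies inside this module's rectangle.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def dispatch_click(modules, px, py):
    """Return the topmost module under the point, or None if nothing was hit."""
    for m in sorted(modules, key=lambda m: m.order, reverse=True):
        if m.contains(px, py):
            return m  # modules drawn underneath never see the click
    return None


def bring_to_front(modules, target):
    """Give the target the highest render order so it is drawn last."""
    target.order = max(m.order for m in modules) + 1


mods = [
    Module("osc", 0, 0, 100, 100, order=0),
    Module("filter", 50, 50, 100, 100, order=1),
]
hit = dispatch_click(mods, 60, 60)  # the point lies inside both rectangles
print(hit.name)  # filter
```

In the patch the same idea would be the main patch sending the event down and each module checking whether it is the topmost one under the pointer, but how exactly you wire that up is up to you.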
If you really want to build a modular environment out of multiple modules, I suggest you first work out what all modules have in common and build a minimal module that has only those common attributes (e.g. window bar, interface section, render order). Then it will be easier to maintain and to create other modules later on, since you only need to add the non-common part on top of your minimal module.
I think creating such a large system requires thorough planning and a clear idea of how things should work in the first place. Otherwise you will likely keep running into unexpected problems and have to rework things many times.
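As a rough picture of the "minimal module" idea, here is a sketch in Python (again not Pd or ofelia code; `BaseModule`, `GainModule` and the 20 px title bar are assumptions I made up for the example). The base holds only the common parts (window bar, interface section, render order) and a specific module adds just its own interface on top.

```python
class BaseModule:
    """Hypothetical minimal module: only what every module shares."""

    TITLE_BAR_HEIGHT = 20  # assumed height of the window bar in pixels

    def __init__(self, title, x, y, width, height, order=0):
        self.title = title
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.order = order  # render order, higher = drawn on top

    def _in_columns(self, px):
        return self.x <= px < self.x + self.width

    def in_title_bar(self, px, py):
        # The window bar is the top strip, e.g. used for dragging.
        return self._in_columns(px) and self.y <= py < self.y + self.TITLE_BAR_HEIGHT

    def in_interface(self, px, py):
        # The interface section is everything below the window bar.
        top = self.y + self.TITLE_BAR_HEIGHT
        return self._in_columns(px) and top <= py < self.y + self.height

    def draw_interface(self):
        # Non-common part: each concrete module overrides this.
        return []

    def draw(self):
        # Common part first, then whatever the concrete module adds.
        return [f"bar:{self.title}"] + self.draw_interface()


class GainModule(BaseModule):
    """Example module that only adds its own control."""

    def __init__(self, title, x, y, width, height, order=0, gain=0.5):
        super().__init__(title, x, y, width, height, order)
        self.gain = gain

    def draw_interface(self):
        return [f"slider gain={self.gain}"]


g = GainModule("gain", 10, 10, 120, 80)
print(g.draw())  # ['bar:gain', 'slider gain=0.5']
```

In Pd terms the base would be one abstraction that every module contains, so a fix to the window bar or the render-order handling only has to be made in one place.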
P.S.: You probably know this, but your patch currently uses the left audio channel only.