Think of the whole thing as a graphics card that sits in a different PC. Your app wants to draw its content on the remote screen, and only its own content inside its own window. This is not screen sharing. Your app cannot touch any other apps.
X11 is the connection between your app and the remote graphics card. The card may just as well be the local one; the protocol is the same either way.
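To make the "local or remote, it is the same" point concrete, here is a rough sketch of how an X11 display name (the value of the DISPLAY variable, in the form host:display.screen) maps to a connection endpoint. The names `XDisplay` and `parse_display` are made up for this illustration and are not part of any X library. The socket path is the location conventionally used on Linux and is an assumption here; TCP port 6000 plus the display number is the standard X11 convention. The sketch ignores less common forms of the display name.

```python
from typing import NamedTuple


class XDisplay(NamedTuple):
    """Hypothetical container for the parts of a DISPLAY value."""
    host: str      # empty string means "this machine"
    display: int   # which X server on that host
    screen: int    # which screen of that server
    endpoint: str  # where a client would connect (illustrative string)


def parse_display(value: str) -> XDisplay:
    """Split a display name of the form [host]:display[.screen]."""
    host, sep, rest = value.rpartition(":")
    if not sep:
        raise ValueError(f"not a display name: {value!r}")
    number, _, screen = rest.partition(".")
    display = int(number)
    screen_no = int(screen) if screen else 0
    if host in ("", "unix"):
        # Local server: a Unix domain socket (path assumed, typical on Linux).
        endpoint = f"unix:/tmp/.X11-unix/X{display}"
    else:
        # Remote server: TCP, port 6000 + display number.
        endpoint = f"tcp:{host}:{6000 + display}"
    return XDisplay(host, display, screen_no, endpoint)


if __name__ == "__main__":
    for name in (":0", "remotehost:1.0"):
        print(name, "->", parse_display(name).endpoint)
```

Either way the app speaks the same protocol afterwards; only the transport underneath differs.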
Technically, a window manager (wm) is not needed. The app and X11 would work without one.
Shouldn’t window managers abstract all that for the software?
The wm does not interrupt or change any communication between the app and the screen. It only adds decoration, control buttons and the like. For example, it draws the window borders around the app’s own window area.