You would need to run the LLM on the system that has the GPU (your main PC). The front-end (typically a WebUI) could run in a Docker container and make API calls to your LLM system. Unfortunately, that requires the model to stay loaded in VRAM on your main PC, which severely limits what else you can do with that GPU.
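If it helps, here's roughly what that front-end-to-backend API call looks like, assuming the LLM server on your main PC exposes an OpenAI-compatible endpoint (llama.cpp server, Ollama, and vLLM all can). The IP address, port, and model name are placeholders for your setup:

```python
# Minimal sketch: a front-end container calling an LLM server over the LAN.
# Assumes an OpenAI-compatible HTTP API; host, port, and model are placeholders.
import requests

LLM_HOST = "http://192.168.1.50:8000"  # main PC running the GPU-backed LLM

resp = requests.post(
    f"{LLM_HOST}/v1/chat/completions",
    json={
        "model": "local-model",
        "messages": [{"role": "user", "content": "Hello from the Docker front-end"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The WebUI does essentially the same thing under the hood; you just point it at the main PC's address instead of localhost.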