The vllm-metal inference backend in Docker Model Runner on macOS unconditionally sets trust_remote_code=True when loading model tokenizers, and runs without sandboxing. This causes transformers.AutoTokenizer.from_pretrained() to import and execute arbitrary Python files included in any model pulled from an OCI registry, resulting in arbitrary code execution on the Docker host as the Docker Desktop user when inference is triggered.
Any container on the Docker network can trigger this by calling the model-runner.docker.internal API to pull a malicious model and request inference.
Docker Model Runner enabled with the vllm-metal inference backend on macOS
Disable Docker Model Runner or only run trusted containers on Docker Desktop instances where Model Runner is enabled.