vLLM-mlx – 65 tok/s LLM inference on Mac with tool calling and prompt caching (github.com) | 1 point by raullen | Feb 26, 2026 | 1 comment on HN