Local LLMs

LLM Leaderboards

Local LLM Applications

Apps and Utilities

Frameworks

  • MLX - an array framework for machine learning on Apple silicon.

Articles

Resources

Performance Testing

  • asitop - command-line performance monitor for Apple silicon (CPU, GPU, and memory utilization, power draw).

Results

Oct 2024

Apple M2 Max (4 efficiency + 8 performance CPU cores, 30 GPU cores), 64 GB RAM

mlx_lm.generate --model mlx-community/Llama-3.2-3B-Instruct-4bit \
    --max-tokens 2048 \
    --temp 0.7 \
    --prompt "Write a SwiftUI demo app in Swift. Show the swift code"
Prompt: 48 tokens, 461.703 tokens-per-sec
Generation: 829 tokens, 128.740 tokens-per-sec
Peak memory: 1.808 GB
mlx_lm.generate --model mlx-community/Llama-3.2-1B-Instruct-4bit \
    --max-tokens 2048 \
    --temp 0.7 \
    --prompt "Write a SwiftUI demo app in Swift. Show the swift code"
Prompt: 48 tokens, 844.210 tokens-per-sec
Generation: 827 tokens, 314.059 tokens-per-sec
Peak memory: 0.686 GB
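
To put the two runs side by side, here is a minimal sketch that computes the ratios between them. The numbers are copied from the output above; the variable names and the ratio helper are hypothetical, added only for illustration, and are not part of mlx_lm.

```shell
#!/bin/sh
# Figures copied from the two runs above:
# generation tokens-per-sec and peak memory in GB.
GEN_3B=128.740; MEM_3B=1.808
GEN_1B=314.059; MEM_1B=0.686

# ratio A B: print A/B rounded to two decimals.
# (Hypothetical helper for this sketch.)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

echo "1B generation speed vs 3B: $(ratio "$GEN_1B" "$GEN_3B")x"
echo "3B peak memory vs 1B: $(ratio "$MEM_3B" "$MEM_1B")x"
```

On these figures, the 1B model generates roughly 2.4 times faster than the 3B model, and the 3B model peaks at roughly 2.6 times the memory. Both come from a single run at --temp 0.7, so treat the ratios as indicative rather than as a rigorous benchmark.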

© Mark Norgren. Some rights reserved.

Build Date: 2025-06-06

3f535e3