Local LLMs
LLM Leaderboards
Local LLM Applications
- LM Studio - GUI for downloading and running LLMs locally
- ollama - CLI and local server for running LLMs on macOS, Linux, and Windows (see the example after this list)
- Chat with MLX - chat UI for running local models on Apple silicon via MLX
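As a quick example, ollama downloads and chats with a model in one command. The model tag below is illustrative; any tag from the ollama library works:

```sh
# Download (if not cached) and open an interactive chat with a small Llama model.
# The tag is illustrative; browse the ollama model library for alternatives.
ollama run llama3.2:3b
```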
Apps and Utilities
- TextGen - text generation plugin for Obsidian
Frameworks
- MLX - Apple's machine learning framework for Apple silicon; see the setup sketch below.
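The benchmarks under Results below use the mlx-lm package, which layers a CLI (and a small Python API) on top of MLX. A minimal setup sketch, assuming pip on an Apple-silicon Mac; the model name is one of the mlx-community quantized builds benchmarked below:

```sh
# Install the LLM tooling built on MLX (Apple silicon only)
pip install mlx-lm

# Smoke test: fetch a 4-bit quantized model from mlx-community
# and generate a short completion
mlx_lm.generate --model mlx-community/Llama-3.2-1B-Instruct-4bit \
  --max-tokens 64 \
  --prompt "Hello"
```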
Articles
Resources
Performance Testing
- asitop - performance monitor (CPU, GPU, and ANE utilization) for Apple silicon; usage sketch below.
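asitop is a pip package that samples macOS's built-in powermetrics tool, so it prompts for sudo on launch. Run it in a second terminal while a generation job is in flight:

```sh
# Install the monitor
pip install asitop

# Launch it; asitop asks for sudo because it wraps powermetrics
asitop
```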
Results
Oct 2024
Apple M2 Max (4 efficiency + 8 performance CPU cores, 30-core GPU), 64 GB RAM
```sh
mlx_lm.generate --model mlx-community/Llama-3.2-3B-Instruct-4bit \
  --max-tokens 2048 \
  --temp 0.7 \
  --prompt "Write a SwiftUI demo app in Swift. Show the swift code"
```
```
Prompt: 48 tokens, 461.703 tokens-per-sec
Generation: 829 tokens, 128.740 tokens-per-sec
Peak memory: 1.808 GB
```
```sh
mlx_lm.generate --model mlx-community/Llama-3.2-1B-Instruct-4bit \
  --max-tokens 2048 \
  --temp 0.7 \
  --prompt "Write a SwiftUI demo app in Swift. Show the swift code"
```
```
Prompt: 48 tokens, 844.210 tokens-per-sec
Generation: 827 tokens, 314.059 tokens-per-sec
Peak memory: 0.686 GB
```
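Takeaway: the 1B 4-bit model generates roughly 2.4× faster than the 3B (314 vs. 129 tokens per second) with about 2.6× lower peak memory (0.69 GB vs. 1.81 GB).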