Nex-N2-mini-Turbo-Phase-Twin By Frosty40 Runs Local AI Fast

Nex-N2-mini-Turbo-Phase-Twin is a new release that brings a large language model to Intel Arc graphics cards. It packages the Qwen3.5-35B-A3B model into a single file that holds two different precision levels. Users can choose which precision to load without downloading anything extra.

Developer Frosty40 created this project to support local AI on Intel hardware. The model is tuned to run efficiently on Intel Arc GPUs through a custom build of llama.cpp, which lets the software detect the installed hardware and pick settings suited to it.

Features and hardware compatibility

Key Features
  • Tuned specifically for Intel Arc GPUs.
  • Carries two expert precision levels in one file.
  • Runs multimodal reasoning and vision tasks.
  • Supports single or dual graphics card setups.
  • Includes automatic hardware detection and calibration.

This tool is aimed at people who run AI tasks locally on Intel graphics hardware and want fast performance. It benefits those who need multimodal reasoning and vision capabilities directly on their own machines. Users with one or two 16 GB graphics cards can expect high token speeds without relying on cloud services.

Important setup details

The software requires a custom llama.cpp build because standard versions cannot load this specific file type. Users must disable the prompt cache and context checkpoints; otherwise the model becomes incoherent after the first turn. An included launcher script automatically detects graphics memory and adjusts the setup accordingly.
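The kind of decision such a launcher makes can be sketched in Python. This is only an illustration, not the project's actual script: the names `LaunchPlan` and `plan_launch`, the "low"/"high" precision labels, and the rule tying precision to card count are all assumptions. The only details taken from the setup notes are the 16 GB cards, the one-or-two-card layouts, and the requirement that prompt cache and context checkpoints stay off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchPlan:
    gpu_count: int            # number of cards the model is spread across (1 or 2)
    precision: str            # hypothetical label for the precision level to load
    prompt_cache: bool        # kept False, per the setup notes
    context_checkpoints: int  # kept at 0, per the setup notes


def plan_launch(vram_gib: list[float], min_card_gib: float = 16.0) -> LaunchPlan:
    """Pick a launch plan from detected per-card memory (illustrative only)."""
    # Keep only cards that meet the assumed minimum memory size.
    usable = [v for v in vram_gib if v >= min_card_gib]
    if not usable:
        raise ValueError("no graphics card with enough memory was detected")

    # The release targets single or dual card setups, so cap at two cards.
    gpu_count = min(len(usable), 2)

    # Assumption: load the higher precision level only when two cards are present.
    precision = "high" if gpu_count == 2 else "low"

    # Prompt cache and context checkpoints are always disabled.
    return LaunchPlan(gpu_count, precision, prompt_cache=False, context_checkpoints=0)


if __name__ == "__main__":
    print(plan_launch([16.0]))
    print(plan_launch([16.0, 16.0]))
```

Whatever thresholds the real launcher uses, the two caching options stay disabled in every plan, since the setup notes say the model loses coherence after the first turn otherwise.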

"Made this for the A770 crowd... reports are good, its setup to work well with one card, and really come into its own with 2." - Source: Reddit