Ettin-Reranker-1b-V1 Delivers Speedy Relevancy Checks Locally

Ettin-Reranker-1b-V1 is a new cross-encoder that scores pairs of texts so that search results can be reranked and retrieval quality improved. It is a 1-billion-parameter transformer that handles sequences of up to 7,999 tokens. The model uses a ModernBERT backbone and was fine-tuned on 143 million query-passage pairs for pointwise relevance scoring.
Tom Aarsen developed the Ettin Reranker family, and this 1B version strikes a balance between accuracy and hardware demands. It was trained in bfloat16 with Flash Attention 2 for one epoch on the large-scale `ettin-reranker-v1-data` dataset. The release targets anyone who wants to replace massive cloud rerankers with a local checkpoint that still reaches a mean NDCG@10 of 0.6114 on the MTEB retrieval benchmark.
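For readers unfamiliar with the metric, NDCG@10 rewards a ranking for placing relevant documents near the top of the first ten results. The sketch below shows one common formulation (linear gain, log2 discount). It is an illustration only: the benchmark's own implementation may differ in details, and the function names are made up for this example. It also assumes the list passed in contains every judged document, so the ideal ordering can be derived from it.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    # enumerate() starts at 0, so position 1 is discounted by log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels in the order a reranker returned the documents.
    print(ndcg_at_k([3, 2, 1]))  # already in ideal order
    print(ndcg_at_k([0, 1]))     # the only relevant document is ranked second
```

A score of 1.0 means the ranking matches the ideal order; the reported 0.6114 is an average of such per-query scores across the benchmark's retrieval tasks.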
Fast, local-friendly reranking with ModernBERT
- 1B-parameter cross-encoder built on the ModernBERT architecture.
- Supports inputs up to 7,999 tokens in length.
- Scores 189 text pairs per second on an RTX 3090.
- MTEB retrieval mean NDCG@10 of 0.6114.
- Trained on 143 million high-quality (query, document) pairs.
- Requires Flash Attention 2 for best throughput.
- Licensed under Apache 2.0 for commercial use.
- Runs easily via the Sentence Transformers library.
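As a rough illustration of the Sentence Transformers route, the sketch below wraps the library's `CrossEncoder` class in a small reranking helper. The model ID is a placeholder, not the confirmed repository name, and `load_scorer` and `rerank` are hypothetical helper names introduced here. The library import is done lazily so the ranking logic can be tried with any scoring function.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Placeholder: replace with the actual Hugging Face ID or a local path.
MODEL_ID = "path-or-hub-id-of-ettin-reranker-1b-v1"

PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def load_scorer(model_id: str = MODEL_ID) -> PairScorer:
    """Load the cross-encoder and return its pair-scoring function."""
    # Imported lazily so rerank() works without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_id)
    return model.predict  # scores a list of (query, document) pairs


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: PairScorer,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, document) pair and sort by descending score."""
    pairs = [(query, doc) for doc in documents]
    scores = score_pairs(pairs)
    ranked = sorted(
        zip(documents, (float(s) for s in scores)),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


if __name__ == "__main__":
    scorer = load_scorer()  # downloads or loads the checkpoint
    candidates = [
        "ModernBERT is an encoder architecture.",
        "Bananas are rich in potassium.",
    ]
    for doc, score in rerank("What is ModernBERT?", candidates, scorer):
        print(f"{score:.4f}  {doc}")
```

In a typical pipeline, a fast first-stage retriever would supply the candidate list and only the top few dozen hits would be passed to `rerank`. The `CrossEncoder` class also offers a `rank` method that covers the same ground; the helper above just makes the scoring and sorting steps explicit.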
Professionals who keep all data on-premises will benefit the most, as the model processes documents without a network call. Small agencies can integrate it into local retrieval pipelines and skip recurring cloud expenses. Hobbyists with a single RTX 3090 get a reranker that is fast enough for interactive use while still catching subtle relevance signals.
What developers should know
Training took 20 hours on a single high-end GPU and ran for exactly one epoch, so further task-specific fine-tuning remains viable. Throughput drops noticeably without bfloat16 or Flash Attention 2: on a consumer card the best configuration reaches 189 pairs per second, while CPU-only inference falls to about two pairs per second. The model was evaluated against 13 other public rerankers and came out ahead of similarly sized alternatives, though the 4B Qwen3-based teacher model still holds the top spot.
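To put the throughput figures in perspective, the back-of-the-envelope sketch below converts pairs per second into the time needed to rerank a candidate list. It assumes throughput stays constant regardless of text length and batch size, which will not hold exactly in practice; the function name is made up for this example.

```python
GPU_PAIRS_PER_SECOND = 189.0  # RTX 3090, best configuration (from the article)
CPU_PAIRS_PER_SECOND = 2.0    # approximate CPU-only figure (from the article)


def rerank_latency_seconds(num_pairs: int, pairs_per_second: float) -> float:
    """Estimated time to score num_pairs at a fixed throughput."""
    if pairs_per_second <= 0:
        raise ValueError("pairs_per_second must be positive")
    return num_pairs / pairs_per_second


if __name__ == "__main__":
    for n in (10, 100, 1000):
        gpu = rerank_latency_seconds(n, GPU_PAIRS_PER_SECOND)
        cpu = rerank_latency_seconds(n, CPU_PAIRS_PER_SECOND)
        print(f"{n:>5} candidates: GPU ~{gpu:.2f} s, CPU ~{cpu:.0f} s")
```

Under these assumptions, reranking 100 candidates takes roughly half a second on the GPU and about 50 seconds on CPU, which is why the GPU configuration is the one suited to interactive use.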
"It computes scores for pairs of texts, which can be used for text reranking and semantic search." — Source: Hugging Face