SecureCode v2: 8 AI Models Stop 45% Code Risks

Developer Scthornton has unveiled SecureCode v2, a dataset designed to address critical security vulnerabilities in AI-generated code. With 1,215 rigorously validated examples spanning 11 programming languages, the project targets an alarming statistic: AI coding assistants are reported to produce vulnerable code in 45% of cases (Yikes!).
Comprehensive Security Model Collection
The SecureCode v2 project introduces eight security-focused code generation models ranging from 3 billion to 20 billion parameters. These models are specifically trained on real-world security incidents, providing developers with AI assistants that understand and prevent common vulnerability patterns.
The eight models in the collection are:
- DeepSeek-Coder 6.7B - SecureCode Edition
- CodeLlama 13B - SecureCode Edition
- StarCoder2 15B - SecureCode Edition
- Llama 3.2 3B - SecureCode Edition
- Qwen 2.5-Coder 7B - SecureCode Edition
- IBM Granite 20B Code - SecureCode Edition
- CodeGemma 7B - SecureCode Edition
- Qwen 2.5-Coder 14B - SecureCode Edition
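To make "common vulnerability patterns" concrete, here is a minimal sketch of one such pattern, SQL injection, next to its usual fix. It is an illustration only and is not taken from the SecureCode v2 dataset or from any model's output; the table, the function names, and the demo data are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, username):
    # Vulnerable pattern: user input is concatenated into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Safer pattern: a parameterized query keeps the input as data,
    # never as part of the SQL statement itself.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, payload))  # leaks every row
    print("safe:  ", find_user_safe(conn, payload))    # matches nothing
```

A security-aware assistant would be expected to produce the second form by default, which is the kind of behavior the collection is aiming for.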
Incident-Grounded Training Approach
'I built the SecureCode model collection to solve this problem. Eight security-aware code models, trained on real-world breach patterns, designed to generate secure code by default,' states Scthornton. Each model is trained on a dataset grounded in actual security incidents, such as the 2017 Equifax breach, which the project says cost $1.4 billion because of a single vulnerable code pattern.
The developer continues: 'Every model in this collection was trained on lessons learned from real breaches. When you use them, you're getting security knowledge that companies paid for with incident response costs, regulatory penalties, and reputational damage.'
Unique Dataset Characteristics
The SecureCode v2 dataset distinguishes itself through:
- 100% incident grounding in real security breaches
- 4-turn conversational training structure
- Comprehensive operational security guidance
- Coverage of OWASP Top 10:2025 vulnerabilities
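The article does not spell out what a 4-turn record looks like, so the following is only a rough sketch of how such a conversation could be represented and checked. The `Turn` class, the role names, the alternating user/assistant order, and the sample text are all assumptions made for illustration, not the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    role: str     # assumed to be "user" or "assistant"
    content: str  # the message text for this turn


# Assumed turn order: two user/assistant exchanges, four turns in total.
EXPECTED_ROLES = ("user", "assistant", "user", "assistant")


def is_four_turn(conversation):
    """Return True if the conversation has exactly four non-empty turns
    in the assumed user/assistant/user/assistant order."""
    if len(conversation) != len(EXPECTED_ROLES):
        return False
    return all(
        turn.role == expected and bool(turn.content.strip())
        for turn, expected in zip(conversation, EXPECTED_ROLES)
    )


# Hypothetical record, written for this sketch only.
example_record = [
    Turn("user", "How do I look up a user by name in SQL?"),
    Turn("assistant", "Use a parameterized query rather than string concatenation."),
    Turn("user", "What else should I do in production?"),
    Turn("assistant", "Add input validation, least-privilege DB accounts, and logging."),
]

if __name__ == "__main__":
    print(is_four_turn(example_record))
```

A check like this could serve as a quick sanity filter before fine-tuning, though the real dataset may well use different field names and carry extra metadata.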
Learn More About SecureCode v2
- Hugging Face dataset: SecureCode v2 dataset
- Model collection: 8 Models
- Project paper: SecureCode v2 arXiv paper