You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Cyber-Quack-14B

Introduction

Cyber-Quack-14B is a specialized fine-tune of the Qwen2.5-14B architecture, optimized for Red Team operations, penetration testing assistance, and offensive security research. It was trained to provide direct, technical, and actionable exploit paths while minimizing the standard "AI Safety" refusals common in general-purpose models when discussing authorized security testing.

The model retains its native Tool-Calling capabilities, allowing it to interface with external security scanners and custom scripts via an XML-based schema.

Model Details

Base Model: Qwen2.5-14B
Training Framework: Axolotl (LoRA)
Training Hardware: 1x NVIDIA RTX 5090 (32GB VRAM)
Context Length: 131,072 tokens (Optimized for 16k-32k in local deployment)

Training & Datasets

Cyber-Quack was trained on a curated mix of offensive security data focused on:

Vulnerability Research: Deep analysis of CVEs, stack/heap overflows, and memory corruption logic.
CTF Methodology: Fine-tuned on Hack The Box (HTB) and OSCP-style attack chains (Enumeration -> Exploitation -> Privilege Escalation).
Security Datasets: Includes CyberSecurity-Dataset-Fenrir-v2.0, CyberSecurityEval, and PTF-ID-Bench.
Active Directory: Specialized focus on Kerberos attacks (Golden/Silver Tickets), lateral movement, and GPO abuse.
** licanKiraz0/Cybersecurity-Dataset-Fenrir-v2.0
** darkknight25/Vulnerable_Programming_Dataset
** CyberNative/CyberSecurityEval
** bdas-secure/ptf-id-bench

Verified Capabilities (Testing Results)

The model has been verified through a series of "Zero-Shot" Red Team prompts:

The "Smiley-Shell" Test: Correctly identified the vsftpd 2.3.4 backdoor trigger (:)) and provided the specific Metasploit module path without refusal.
AD Persistence: Successfully detailed the mechanics of a Golden Ticket attack, including the role of the KRBTGT account and TGT forgery.
Exploit Dev: Capable of generating Python-based exploit skeletons for stack-based buffer overflows, including proper padding and EIP overwrite logic.
Tool Calling: Successfully generates <tools> and <tool_call> blocks to interface with scanners like Nmap.

Rigorous Security Benchmarking (FAITH Framework)

Cyber-Quack-14B was evaluated head-to-head against Cisco's specialized Foundation-Sec-8B-Reasoning model using the open-source FAITH evaluation suite. Testing was conducted in a highly constrained execution environment (max_completion_tokens: 15, temperature: 0.3) to assess both raw security domain intelligence and structural compliance.

Performance vs. Cisco Foundation-Sec-8B

Benchmark Split	Cisco 8B Target	Cyber-Quack-14B (Lenient Accuracy*)	Status / Performance Delta
Deep Reasoning (`secbench-mcqa-eng-reasoning`)	41.1%	67.4%	🔥 +26.3% (Total Domination)
CyberMetric-2000 (Standards/Compliance)	~75.0%	85.3%	👑 +10.3% (Victory)
SecBench (MCQA) (Core Security)	~70.0%	74.4%	🎉 +4.4% (Victory)
SecEval (Enterprise Technical Controls)	84.8%	86.8%	🎯 +2.0% (Victory)
MMLU-Security (Computer Security Subset)	78.2%	71.0%	📈 Competitive Parity
Root Cause Mapping (`ctibench-rcm`)	75.3%	45.2%	🧠 Data Mapping Gap

*Note on Lenient vs. Strict Evaluation: Due to strict output constraints, Cyber-Quack-14B frequently provides the correct choice but wraps it in a tight conversational prefix (e.g., "Answer: B" or "Choice: A"), which passes Lenient Accuracy but falls outside of FAITH's rigid regex-anchored strict parser filters. True operational reasoning and security intelligence are reflected in the lenient column.

Core Architecture Insights

Elite Multi-Hop Logic: Smashed Cisco's reasoning target by over 26%, verifying the model's high-tier capability when stepping through complex, multi-stage attack paths.
Deep Compliance Retention: The 14B Q8_0 weights cleanly preserved federal, cryptographic, and enterprise control logic, maintaining a clear edge over standard 8B architectures on frameworks like NIST and ISO.
Granular Vulnerability Mapping: While displaying world-class general exploitation and engineering logic, a data mapping gap was identified regarding direct, verbatim CVE-to-CWE lookup matrices—an area prime for future programmatic fine-tuning.

Recommended Deployment (llama.cpp)

To run Cyber-Quack at high performance (targeting ~50+ tokens/sec on high-end hardware), use the following configuration:

./llama-server \
  -m cyber-qwen-2.5-14b-f16.gguf \
  -ngl 70 \
  -c 12000 \
  --flash-attn on

Downloads last month: 4

Safetensors

Model size

15B params

Tensor type

BF16

Model tree for jabbatheduck/Cyber-Quack-14B

Base model

Qwen/Qwen2.5-14B

Finetuned

(107)

this model