On-premises security scanner powered by local LLMs.
OVERVIEW
TorchSight scans files for sensitive data, credentials, malicious payloads, and compliance violations using a locally running LLM. No data leaves your machine.
It classifies documents into 7 categories (PII, credentials, financial, medical, confidential, malicious, and safe) and 51 subcategories.
CATEGORIES
Malicious (14) injection, exploit, shell, phishing, prompt injection, supply chain, SSRF, SSTI, XXE
Confidential (9) classified, military, intelligence, weapons systems, nuclear, internal, M&A
Credentials (8) passwords, API keys, tokens, private keys, connection strings, cloud config
PII (6) identity, contact, government ID, biometric, behavioral, metadata
Safe (6) documentation, code, config, media, email, business
Financial (4) credit cards, bank accounts, transactions, tax records
Medical (4) diagnosis, prescription, lab results, insurance
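The per-category counts above can be cross-checked against the 7 categories and 51 subcategories quoted in the overview. The following is a minimal Python sketch of that check; the name `SUBCATEGORY_COUNTS` is illustrative and not a TorchSight identifier.

```python
# Subcategory counts per category, copied from the list above.
# The variable name is illustrative, not part of TorchSight.
SUBCATEGORY_COUNTS = {
    "malicious": 14,
    "confidential": 9,
    "credentials": 8,
    "pii": 6,
    "safe": 6,
    "financial": 4,
    "medical": 4,
}

total_categories = len(SUBCATEGORY_COUNTS)
total_subcategories = sum(SUBCATEGORY_COUNTS.values())

print(total_categories, total_subcategories)  # prints: 7 51
```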
BEAM MODEL
Base model: Qwen 3.5 27B
Method: LoRA (r=128, α=256)
Training data: 78,358 samples
Sources: 18 (all permissive)
Category accuracy: 95.1%
vs Claude Opus 4: 79.9%
vs Gemini 2.5 Pro: 75.4%
Default quantization: q4_K_M (~17 GB)
License: Apache 2.0
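As a rough sanity check on the ~17 GB figure, the size of a quantized model is approximately the parameter count times the average bits per weight. The sketch below assumes an average of about 4.85 bits per weight for q4_K_M; that value is a commonly cited approximation, not a number published by TorchSight, and the real file size also depends on the tensor mix and metadata.

```python
def estimated_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 27e9        # 27B parameters, taken from the base model name
Q4_K_M_BPW = 4.85    # ASSUMPTION: approximate average bits per weight

size = estimated_size_gb(PARAMS, Q4_K_M_BPW)
print(f"{size:.1f} GB")
```

Under these assumptions the estimate lands a little below 17 GB, which is in line with the figure listed above.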
SUPPORTED FILES
Text — txt, csv, json, xml, yaml, toml, log, md, sql, env
Code — py, rs, go, java, js, ts, c, cpp, rb, php, sh
Documents — pdf, docx, doc, xlsx, xls, pptx
Images — png, jpg, gif, bmp, tiff, webp
Email — eml, msg, mbox, pst, ost
Secrets — pem, key, crt, pub, env
CI/CD
# GitHub Actions
- run: torchsight /path --format sarif --fail-on high
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: torchsight_report.sarif
# Scan git diff
$ torchsight --diff HEAD~1 --fail-on medium
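
If a pipeline needs a custom gate on the report instead of relying on `--fail-on`, the SARIF file can be inspected directly. The sketch below counts results by their SARIF `level` and assumes the report follows the standard SARIF 2.1.0 layout (`runs[].results[].level`). How TorchSight maps its own severities such as `high` or `medium` onto SARIF levels is not stated here, so the function names and the blocking policy are hypothetical.

```python
import json
from collections import Counter

def count_levels(sarif: dict) -> Counter:
    """Count results per SARIF level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # A result without an explicit level is counted as "warning",
            # the value SARIF 2.1.0 falls back to when none is specified.
            counts[result.get("level", "warning")] += 1
    return counts

def should_fail(counts: Counter, blocking=("error",)) -> bool:
    """Return True if any result has a blocking level (hypothetical policy)."""
    return any(counts[level] > 0 for level in blocking)

# Tiny in-memory example. In CI the dict would instead be loaded from
# torchsight_report.sarif with json.load().
example = json.loads("""
{"version": "2.1.0",
 "runs": [{"results": [{"level": "error"}, {"level": "note"}, {}]}]}
""")

counts = count_levels(example)
print(dict(counts), should_fail(counts))
```

A CI step could call `sys.exit(1)` when `should_fail` returns `True`, which lets the blocking levels be tuned per repository.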