Heiko f60de7c2da Add SSH scan support with BSI TR-02102-4 compliance
- SSH scanning via ssh-audit (KEX, encryption, MAC, host keys)
- BSI TR-02102-4 and IANA compliance validation for SSH
- CSV/Markdown/reST reports for SSH results
- Unified compliance schema and database views
- Code optimization: modular query/writer architecture
2026-01-23 11:05:01 +01:00

compliance-scan - Detailed Guide

LLM-optimized developer reference for compliance scanning.

Quick Reference: Entry points, architecture, workflows, compliance logic, and extension guide.

Entry Points

| Function | Module | Purpose |
| --- | --- | --- |
| scan_tls(hostname, port) | sslysze_scan.scanner | TLS/SSL scan |
| scan_ssh(hostname, port) | sslysze_scan.ssh_scanner | SSH scan |
| write_scan_results(...) | sslysze_scan.db.writer | Persist scan data (core) |
| fetch_scan_data(db_path, id) | sslysze_scan.reporter.query | Retrieve scan for reporting |
| fetch_scans(db_path) | sslysze_scan.reporter.query | List all scans |
| fetch_scan_metadata(db, id) | sslysze_scan.reporter.query | Get scan metadata only |
| check_compliance(db_path, id) | sslysze_scan.db.compliance | Validate against BSI/IANA |
| generate_csv_reports(...) | sslysze_scan.reporter | Generate CSV reports |
| generate_markdown_report(...) | sslysze_scan.reporter | Generate Markdown report |

Architecture

CLI (commands/) → Scanner (scanner.py, ssh_scanner.py)
                ↓
              Database (db/writer.py, db/compliance.py)
                ↓
              Reporter (reporter/query.py, reporter/*.py)

Key Modules:

  • commands/ - CLI command handlers (scan, report, update-iana)
  • scanner.py - TLS/SSL scanning via SSLyze
  • ssh_scanner.py - SSH scanning via ssh-audit
  • db/writer.py - Core database write operations (scan records, host info)
  • db/tls_writer.py - TLS-specific database operations
  • db/compliance.py - BSI/IANA compliance validation
  • db/compliance_config.py - Compliance check configurations
  • db/generic_compliance.py - Generic compliance logic
  • reporter/query.py - Database queries for reporting
  • reporter/*.py - Report generation (CSV, Markdown, reStructuredText)

CLI Commands

Scan

compliance-scan scan <hostname>:<port>[,<port>...] [--print] [-db <path>]

Examples:

compliance-scan scan example.com:443,636
compliance-scan scan [2001:db8::1]:22 --print
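
The target syntax above, including bracketed IPv6 literals, could be parsed roughly as follows. This is a minimal sketch: the function name parse_host_ports_sketch and its error messages are hypothetical, and the project's actual parser may behave differently.

```python
def parse_host_ports_sketch(target: str) -> tuple[str, list[int]]:
    """Split '<hostname>:<port>[,<port>...]' into (hostname, ports).

    Hypothetical helper for illustration only.
    """
    if target.startswith("["):
        # Bracketed IPv6 literal, e.g. [2001:db8::1]:22
        host, sep, rest = target[1:].partition("]:")
        if not sep:
            raise ValueError(f"expected ']:' after IPv6 address in {target!r}")
    else:
        # Hostname or IPv4 address: the last colon separates the port list
        host, sep, rest = target.rpartition(":")
        if not sep:
            raise ValueError(f"missing port in {target!r}")
    if not host:
        raise ValueError(f"missing hostname in {target!r}")

    ports = [int(part) for part in rest.split(",")]  # int() rejects empty parts
    for port in ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return host, ports


tls_target = parse_host_ports_sketch("example.com:443,636")
ssh_target = parse_host_ports_sketch("[2001:db8::1]:22")
```

The brackets are stripped from the IPv6 address so the returned host can be passed directly to a resolver or socket call.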

Report

compliance-scan report [scan_id] -t <csv|md|rest> [-o <file>] [--output-dir <dir>]

Examples:

compliance-scan report -t md -o report.md
compliance-scan report 5 -t csv --output-dir ./reports
compliance-scan report --list

Update IANA

compliance-scan update-iana [-db <path>]

Database

Scan Data Tables

| Table | Columns | Purpose |
| --- | --- | --- |
| scans | scan_id, hostname, ports, timestamp, duration | Scan metadata |
| scanned_hosts | scan_id, fqdn, ipv4, ipv6 | Resolved addresses |
| scan_cipher_suites | scan_id, port, tls_version, cipher_suite_name, accepted | TLS cipher results |
| scan_supported_groups | scan_id, port, group_name | Elliptic curves/DH groups |
| scan_certificates | scan_id, port, position, subject, key_type, key_bits | Certificate chain |
| scan_ssh_kex_methods | scan_id, port, kex_method_name | SSH key exchange |
| scan_ssh_encryption_algorithms | scan_id, port, encryption_algorithm_name | SSH encryption |
| scan_ssh_mac_algorithms | scan_id, port, mac_algorithm_name | SSH MAC |
| scan_ssh_host_keys | scan_id, port, host_key_algorithm, key_type, key_bits | SSH host keys |
| scan_compliance_status | scan_id, port, check_type, item_name, passed | Compliance results |

Reference Data Tables

| Table | Source | Purpose |
| --- | --- | --- |
| iana_tls_cipher_suites | IANA TLS | Cipher suite recommendations |
| iana_tls_supported_groups | IANA TLS | Group recommendations |
| iana_ssh_kex_methods | IANA SSH | SSH KEX recommendations |
| bsi_compliance_rules | BSI TR-02102-* | Unified compliance rules |
| bsi_tr_02102_1_key_requirements | BSI TR-02102-1 | Certificate key sizes |

Unified BSI Schema

The bsi_compliance_rules table consolidates all BSI TR-02102-2 and TR-02102-4 compliance data.

| Column | Type | Description |
| --- | --- | --- |
| standard | TEXT | TR-02102-2, TR-02102-4 |
| category | TEXT | cipher_suite, dh_group, ssh_kex, etc. |
| algorithm_name | TEXT | Algorithm/cipher/method name |
| additional_param | TEXT | Optional context (e.g., TLS version) |
| valid_from | INTEGER | Start year |
| valid_until | INTEGER | End year (NULL = no expiration) |
| specification | TEXT | Reference (RFC, etc.) |
| notes | TEXT | Additional remarks |
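
To make the schema concrete, the sketch below creates the table in an in-memory SQLite database and selects the rules valid in a given year. The rows are made-up placeholders, not actual BSI data, and the helper rules_valid_in is hypothetical. It assumes that valid_until is inclusive, which the schema description does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bsi_compliance_rules (
        standard TEXT, category TEXT, algorithm_name TEXT,
        additional_param TEXT, valid_from INTEGER, valid_until INTEGER,
        specification TEXT, notes TEXT
    )
    """
)

# Placeholder rows for illustration only (NOT real BSI recommendations).
conn.executemany(
    "INSERT INTO bsi_compliance_rules VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("TR-02102-4", "ssh_kex", "example-kex-a", None, 2023, None, None, "placeholder"),
        ("TR-02102-4", "ssh_kex", "example-kex-b", None, 2023, 2025, None, "placeholder"),
        ("TR-02102-2", "cipher_suite", "EXAMPLE_SUITE", "TLS 1.2", 2023, None, None, "placeholder"),
    ],
)


def rules_valid_in(conn: sqlite3.Connection, category: str, year: int) -> list[str]:
    """Return algorithm names of one category that are valid in `year`.

    Assumption: valid_until is inclusive; NULL means no expiration.
    """
    cursor = conn.execute(
        "SELECT algorithm_name FROM bsi_compliance_rules "
        "WHERE category = ? AND valid_from <= ? "
        "AND (valid_until IS NULL OR valid_until >= ?) "
        "ORDER BY algorithm_name",
        (category, year, year),
    )
    return [row[0] for row in cursor]


kex_2025 = rules_valid_in(conn, "ssh_kex", 2025)
kex_2026 = rules_valid_in(conn, "ssh_kex", 2026)
```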

Views

| View | Purpose |
| --- | --- |
| v_compliance_tls_cipher_suites | TLS cipher suites + compliance flags |
| v_compliance_tls_supported_groups | TLS groups + compliance flags |
| v_compliance_tls_certificates | Certificates + key size compliance |
| v_compliance_ssh_kex_methods | SSH KEX + compliance flags |
| v_compliance_ssh_encryption_algorithms | SSH encryption + compliance flags |
| v_compliance_ssh_mac_algorithms | SSH MAC + compliance flags |
| v_compliance_ssh_host_keys | SSH host keys + compliance flags |
| v_summary_port_compliance | Aggregated compliance per port |
| v_summary_missing_bsi_groups | Missing BSI-approved groups |
| v_summary_missing_iana_groups | Missing IANA-recommended groups |

Workflows

Scan

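from datetime import UTC, datetime

from sslysze_scan.db.compliance import check_compliance
from sslysze_scan.db.writer import write_scan_results
from sslysze_scan.scanner import scan_tls
from sslysze_scan.ssh_scanner import scan_ssh

db_path = "compliance_status.db"

# 1. Parse CLI (parse_host_ports is the CLI helper that splits host and ports)
hostname, ports = parse_host_ports("example.com:443,22")

# 2. Perform scans; capture the start time before scanning begins
scan_start_time = datetime.now(UTC)
tls_result, tls_duration = scan_tls(hostname, 443)
ssh_result, ssh_duration = scan_ssh(hostname, 22)
scan_results = {443: tls_result, 22: ssh_result}

# 3. Write to database
scan_id = write_scan_results(
    db_path=db_path,
    hostname=hostname,
    ports=ports,
    scan_results=scan_results,
    scan_start_time=scan_start_time,
    scan_duration=tls_duration + ssh_duration,
)

# 4. Check compliance
check_compliance(db_path, scan_id)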

Report

# 1. Fetch data (uses views internally)
data = fetch_scan_data(db_path, scan_id)

# 2. Generate report
generate_csv_reports(db_path, scan_id, output_dir="./reports")
generate_markdown_report(db_path, scan_id, output_file="report.md")

Compliance

Configuration

Compliance checks are defined in db/compliance_config.py via COMPLIANCE_CONFIGS:

  • cipher_suites - TLS cipher suite validation (only accepted cipher suites, unique per port)
  • supported_groups - TLS group validation (all discovered groups)
  • ssh_kex - SSH key exchange validation
  • ssh_encryption - SSH encryption validation
  • ssh_mac - SSH MAC validation
  • ssh_host_keys - SSH host key validation

Each config maps scan tables to IANA/BSI reference tables.

Filtering: Cipher suites use scan_filter_column: "accepted" with scan_filter_value: 1 to check only accepted cipher suites. Other checks evaluate all discovered items.

Duplicate Prevention: The compliance query uses DISTINCT on (port, algorithm_name) to count unique algorithms per port. Cipher suites tested across multiple TLS versions are counted once.
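
The filtering and DISTINCT behaviour described above can be sketched as follows. The scan_filter_column and scan_filter_value keys come from the text; the other keys (scan_table, name_column) and the function unique_items are assumptions made for this example and may not match the real COMPLIANCE_CONFIGS layout.

```python
import sqlite3

# Hypothetical config entry; only the two scan_filter_* keys are documented.
CIPHER_SUITE_CONFIG = {
    "scan_table": "scan_cipher_suites",   # assumed key name
    "name_column": "cipher_suite_name",   # assumed key name
    "scan_filter_column": "accepted",
    "scan_filter_value": 1,
}


def unique_items(conn, config, scan_id):
    """Return unique (port, name) pairs, applying the optional filter.

    Identifiers are interpolated from trusted config; values are bound.
    """
    sql = (
        f"SELECT DISTINCT port, {config['name_column']} "
        f"FROM {config['scan_table']} WHERE scan_id = ?"
    )
    params = [scan_id]
    if config.get("scan_filter_column"):
        sql += f" AND {config['scan_filter_column']} = ?"
        params.append(config["scan_filter_value"])
    sql += f" ORDER BY port, {config['name_column']}"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scan_cipher_suites (scan_id INTEGER, port INTEGER, "
    "tls_version TEXT, cipher_suite_name TEXT, accepted INTEGER)"
)
conn.executemany(
    "INSERT INTO scan_cipher_suites VALUES (?, ?, ?, ?, ?)",
    [
        # Same suite accepted under two TLS versions: counted once.
        (1, 443, "TLS 1.1", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA", 1),
        (1, 443, "TLS 1.2", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA", 1),
        # Rejected suite: excluded by the accepted = 1 filter.
        (1, 443, "TLS 1.2", "TLS_RSA_WITH_RC4_128_SHA", 0),
    ],
)
items = unique_items(conn, CIPHER_SUITE_CONFIG, 1)
```

With the sample rows, items holds a single entry for port 443 even though the suite was recorded twice.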

Validation Logic

Function check_compliance_generic() in db/generic_compliance.py:

  1. Query unique items from scan tables using DISTINCT
  2. Join with IANA and BSI reference tables
  3. Check BSI approval first (higher priority)
  4. Verify validity period if BSI-approved
  5. Fall back to IANA recommendation if not BSI-approved
  6. Assign severity: info (passed), warning (deprecated), critical (failed)
  7. Store one result per unique item in scan_compliance_status table

SSH Duplicate Prevention: The SSH scanner (ssh_scanner.py) uses set() to track unique encryption and MAC algorithms. Only the client_to_server lists are populated and stored. The database writer (scan_data_types.py) does not concatenate the client_to_server and server_to_client lists, which would otherwise store each algorithm twice.

Certificate Validation

Function check_certificate_compliance() validates against BSI TR-02102-1:

  1. Extract key type (RSA, ECDSA, DSA)
  2. Query bsi_tr_02102_1_key_requirements for minimum key size
  3. Verify key size and algorithm validity period
  4. Validate signature hash algorithm
  5. Store result in scan_compliance_status table

Standards

BSI TR-02102-1 - Certificates

| Algorithm | Min Bits | Valid Until |
| --- | --- | --- |
| RSA | 3000 | - |
| ECDSA | 250 | - |
| DSA | 3072 | 2029 (deprecated) |

BSI TR-02102-2 - TLS

Stored in bsi_compliance_rules:

  • category='cipher_suite' - Approved TLS cipher suites
  • category='dh_group' - Approved elliptic curves and DH groups
  • valid_until - End year (NULL = no expiration)

BSI TR-02102-4 - SSH

Stored in bsi_compliance_rules:

  • category='ssh_kex' - Approved key exchange methods
  • category='ssh_encryption' - Approved encryption algorithms
  • category='ssh_mac' - Approved MAC algorithms
  • category='ssh_host_key' - Approved host key types

IANA Recommendations

Values of the recommended column:

  • Y - Recommended
  • N - Not recommended
  • D - Discouraged

Testing

Test Structure

tests/
├── cli/                 # CLI parsing
├── scanner/             # TLS/SSH scan functions
├── db/                  # Database queries
├── compliance/          # BSI/IANA validation, duplicate detection
├── iana/                # IANA import/validation
├── reporter/            # CSV/MD/reST export
├── fixtures/            # Test data
└── conftest.py          # Shared fixtures

Compliance Tests:

  • test_no_duplicates.py - Verifies no duplicate compliance checks
  • test_compliance_with_realistic_data.py - Realistic scan scenarios
  • test_plausible_compliance.py - Plausibility checks
  • test_summary_ssh_duplicates.py - Verifies SSH algorithms counted once (no duplicates)

Run Tests

# All tests
poetry run pytest tests/ -v

# Specific category
poetry run pytest tests/scanner/ -v

# With coverage
poetry run pytest tests/ --cov=src/sslysze_scan

Development

Code Style

poetry run ruff check src/ tests/
poetry run ruff format src/ tests/

Requirements

  • Python 3.13+
  • SSLyze 6.0.0+
  • ssh-audit (external tool)
  • Poetry
  • Ruff

Module Sizes

| File | Lines | Purpose |
| --- | --- | --- |
| scanner.py | ~225 | TLS scanning logic |
| ssh_scanner.py | ~240 | SSH scanning logic |
| db/writer.py | ~172 | Core database operations |
| db/tls_writer.py | ~700 | TLS-specific write ops |
| reporter/query.py | ~850 | Database read operations |
| db/compliance.py | ~205 | Compliance checking |
| db/compliance_config.py | ~80 | Compliance configuration |
| db/generic_compliance.py | ~236 | Generic compliance logic |

Extending

Add Compliance Standard

  1. Insert data into bsi_compliance_rules with new category
  2. Add entry to COMPLIANCE_CONFIGS in db/compliance_config.py
  3. Create test in tests/compliance/
  4. Create database view for reporting if needed

Add Report Format

  1. Create reporter/format_export.py with generate_format_report() function
  2. Use fetch_scan_data() for data retrieval
  3. Register in reporter/generate.py dispatcher
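
A new format might be wired in along these lines. This is only a sketch: the JSON format, the REPORT_GENERATORS mapping, and the generate_report dispatcher are hypothetical stand-ins for whatever reporter/generate.py actually does, and the generator returns a string instead of writing a file so the example stays self-contained. In the real module, the data argument would come from fetch_scan_data().

```python
import json
from typing import Any, Callable


def generate_json_report(data: dict[str, Any]) -> str:
    """Render already-fetched scan data as JSON (hypothetical new format)."""
    return json.dumps(data, indent=2, sort_keys=True)


# Hypothetical dispatcher table: report type -> generator function.
REPORT_GENERATORS: dict[str, Callable[[dict[str, Any]], str]] = {
    "json": generate_json_report,
}


def generate_report(report_type: str, data: dict[str, Any]) -> str:
    """Look up the generator for `report_type` and run it."""
    try:
        generator = REPORT_GENERATORS[report_type]
    except KeyError:
        known = ", ".join(sorted(REPORT_GENERATORS))
        raise ValueError(
            f"unknown report type {report_type!r}; known types: {known}"
        ) from None
    return generator(data)


rendered = generate_report("json", {"scan_id": 5})
```

Keeping the lookup in one mapping means a new format only needs a generator function and a single registration line.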

Add Scanner Feature

  1. Extend scanner.py or ssh_scanner.py
  2. Update db/writer.py to persist new data
  3. Create database table and view
  4. Add compliance rules to bsi_compliance_rules if applicable

SSH Parser Notes:

  • SSH host key bits are parsed from ssh-audit output using regex (\d+)-?bit
  • SSH encryption/MAC algorithms use set() for duplicate detection
  • Only client_to_server lists are populated (not both directions)

Database Management

Template Database

File: src/sslysze_scan/data/crypto_standards.db

Contains reference data (IANA, BSI), schema definitions, and views. Schema changes are applied directly to the template. New scan databases are created by copying the template.
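
Creating a scan database from the template could look like the sketch below. The function create_scan_db is hypothetical, and the demo builds a throwaway template in a temporary directory with a single-column stand-in table rather than using the real crypto_standards.db.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def create_scan_db(template: Path, target: Path) -> Path:
    """Copy the template database to `target` unless it already exists."""
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(template, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in template with one reference table (illustration only).
    template = Path(tmp) / "crypto_standards.db"
    conn = sqlite3.connect(template)
    conn.execute("CREATE TABLE bsi_compliance_rules (algorithm_name TEXT)")
    conn.commit()
    conn.close()

    scan_db = create_scan_db(template, Path(tmp) / "scans" / "compliance_status.db")

    # The copy carries the schema, so scan code can use it immediately.
    conn = sqlite3.connect(scan_db)
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    conn.close()
```

Skipping the copy when the target exists keeps earlier scan results intact when the same database path is reused.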

CSV Headers

File: data/csv_headers.json

Defines CSV column headers for all compliance views. Headers are stored in the csv_export_metadata table in the database.

SSH Host Keys CSV: Columns are Algorithm, Type, Bits, BSI Approved, BSI Valid Until, Compliant (no fingerprint column).

Summary Statistics

Summary calculation in reporter/query.py via _calculate_summary():

TLS Metrics:

  • ports_with_tls - Ports with TLS/SSL support
  • total_cipher_suites - Accepted cipher suites checked
  • compliant_cipher_suites - Cipher suites passing BSI/IANA validation
  • total_groups - Supported groups discovered
  • compliant_groups - Groups passing BSI/IANA validation

SSH Metrics:

  • ports_with_ssh - Ports with SSH support
  • total_ssh_kex - KEX methods discovered (unique per port)
  • compliant_ssh_kex - KEX methods passing BSI/IANA validation
  • total_ssh_encryption - Encryption algorithms discovered (unique, no duplicates)
  • compliant_ssh_encryption - Encryption algorithms passing validation
  • total_ssh_mac - MAC algorithms discovered (unique, no duplicates)
  • compliant_ssh_mac - MAC algorithms passing validation
  • total_ssh_host_keys - Host keys discovered
  • compliant_ssh_host_keys - Host keys passing validation
  • total_ssh_items - Sum of all SSH items
  • ssh_overall_percentage - Overall SSH compliance rate
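
The SSH totals and overall rate can be illustrated with a short calculation. The key names match the metrics listed above, but the functions and the formula (compliant items divided by total items, rounded to one decimal) are assumptions; _calculate_summary() may compute the rate differently.

```python
def percentage(compliant: int, total: int) -> float:
    """Compliance rate in percent; 0.0 when nothing was discovered."""
    return round(100.0 * compliant / total, 1) if total else 0.0


def ssh_summary(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Build SSH summary metrics from {category: (total, compliant)}."""
    summary: dict[str, float] = {}
    total_items = 0
    compliant_items = 0
    for category, (total, compliant) in counts.items():
        summary[f"total_ssh_{category}"] = total
        summary[f"compliant_ssh_{category}"] = compliant
        total_items += total
        compliant_items += compliant
    summary["total_ssh_items"] = total_items
    # Assumed formula: all compliant SSH items over all SSH items.
    summary["ssh_overall_percentage"] = percentage(compliant_items, total_items)
    return summary


# Made-up counts for one scan.
summary = ssh_summary(
    {"kex": (4, 3), "encryption": (6, 6), "mac": (5, 2), "host_keys": (1, 1)}
)
```

With these sample counts, 12 of 16 items are compliant, so the overall rate is 75.0.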

Summary CSV Output (csv_export.py):

  • Includes both TLS and SSH metrics
  • Shows counts and percentages for each category
  • Reports critical vulnerabilities count

Query Optimization

Modular Design (reporter/query.py):

fetch_scan_data() delegates to 12 focused helper functions:

  • _fetch_tls_cipher_suites() - TLS cipher suites with version detection
  • _fetch_tls_supported_groups() - TLS elliptic curves and DH groups
  • _fetch_tls_certificates() - Certificate chain with compliance
  • _fetch_vulnerabilities() - Known vulnerabilities
  • _fetch_protocol_features() - Protocol-level features
  • _fetch_session_features() - Session resumption data
  • _fetch_http_headers() - HTTP security headers
  • _fetch_compliance_summary() - Per-port compliance stats
  • _fetch_ssh_kex_methods() - SSH key exchange algorithms
  • _fetch_ssh_encryption() - SSH encryption algorithms
  • _fetch_ssh_mac() - SSH MAC algorithms
  • _fetch_ssh_host_keys() - SSH host keys with compliance

Each helper function:

  • Has single responsibility
  • Returns structured data (dict/list)
  • Uses database views for compliance joins
  • Minimal coupling to main function

Benefits:

  • Main function reduced from 387 to ~35 lines
  • Easy to test individual data fetchers
  • Clear separation between TLS and SSH queries
  • Consistent error handling per data type
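
The delegation pattern can be shown with two of the helpers. This is a simplified sketch: the helpers here read the scan tables directly, whereas the text says the real ones use the compliance views, and fetch_scan_data_sketch takes an open connection instead of a database path.

```python
import sqlite3


def _fetch_ssh_kex_methods(conn: sqlite3.Connection, scan_id: int) -> list[dict]:
    """Return the KEX methods of one scan as a list of dicts."""
    rows = conn.execute(
        "SELECT port, kex_method_name FROM scan_ssh_kex_methods "
        "WHERE scan_id = ? ORDER BY port, kex_method_name",
        (scan_id,),
    )
    return [{"port": port, "name": name} for port, name in rows]


def _fetch_ssh_mac(conn: sqlite3.Connection, scan_id: int) -> list[dict]:
    """Return the MAC algorithms of one scan as a list of dicts."""
    rows = conn.execute(
        "SELECT port, mac_algorithm_name FROM scan_ssh_mac_algorithms "
        "WHERE scan_id = ? ORDER BY port, mac_algorithm_name",
        (scan_id,),
    )
    return [{"port": port, "name": name} for port, name in rows]


def fetch_scan_data_sketch(conn: sqlite3.Connection, scan_id: int) -> dict:
    """Thin coordinator: each section comes from one focused helper."""
    return {
        "ssh_kex_methods": _fetch_ssh_kex_methods(conn, scan_id),
        "ssh_mac_algorithms": _fetch_ssh_mac(conn, scan_id),
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_ssh_kex_methods (scan_id INTEGER, port INTEGER, kex_method_name TEXT)")
conn.execute("CREATE TABLE scan_ssh_mac_algorithms (scan_id INTEGER, port INTEGER, mac_algorithm_name TEXT)")
conn.execute("INSERT INTO scan_ssh_kex_methods VALUES (1, 22, 'curve25519-sha256')")
conn.execute("INSERT INTO scan_ssh_mac_algorithms VALUES (1, 22, 'hmac-sha2-256')")
data = fetch_scan_data_sketch(conn, 1)
```

Because each helper only needs a connection and a scan ID, it can be tested against a small in-memory database like the one above.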

Writer Modularization

Separation of Concerns (db/writer.py → db/writer.py + db/tls_writer.py):

Original writer.py (929 lines) split into:

db/writer.py (172 lines) - Core operations:

  • write_scan_results() - Main entry point
  • _insert_scan_record() - Scan metadata
  • _resolve_hostname() - DNS resolution
  • _save_host_info() - Host information
  • _save_ssh_scan_results() - SSH wrapper (delegates to generic_writer)

db/tls_writer.py (700 lines) - TLS-specific operations:

  • save_cipher_suites() - TLS cipher suite persistence
  • save_supported_groups() - Elliptic curves and DH groups
  • save_dhe_groups_from_cipher_suites() - DHE group extraction
  • save_certificates() - Certificate chain storage
  • save_vulnerabilities() - Heartbleed, ROBOT, CCS injection
  • save_protocol_features() - Compression, early data, fallback SCSV
  • save_session_features() - Renegotiation and resumption
  • save_http_headers() - Security headers (HSTS, HPKP, Expect-CT)
  • FFDHE helper functions (group name/IANA mapping)

Integration:

  • generic_writer.py imports from tls_writer instead of writer
  • Clean module boundaries: Core vs Protocol-specific
  • TLS functions now reusable across modules

Benefits:

  • 81% reduction in writer.py size (929 → 172 lines)
  • Clear separation: Core logic vs TLS logic vs SSH logic
  • Easier navigation and maintenance
  • Independent TLS module can be tested/modified separately