compliance-scan/docs/detailed-guide.md
2025-12-19 20:10:39 +01:00


# compliance-scan - Detailed Guide
Complete reference for developers and advanced users.
## Core Entry Points
| Component | Path | Purpose |
| --------------- | ------------------------------------------ | ------------------------------------- |
| CLI | `src/sslysze_scan/__main__.py` | Command-line interface entry |
| Scanner | `src/sslysze_scan/scanner.py` | SSLyze integration and scan execution |
| Database Writer | `src/sslysze_scan/db/writer.py` | Scan result persistence |
| Reporter | `src/sslysze_scan/reporter/` | Report generation (CSV/MD/reST) |
| Compliance | `src/sslysze_scan/db/compliance.py` | BSI/IANA validation logic |
| Query | `src/sslysze_scan/reporter/query.py` | Database queries using views |
| IANA Update | `src/sslysze_scan/commands/update_iana.py` | IANA registry updates from web |
| IANA Validator | `src/sslysze_scan/iana_validator.py` | IANA data validation |
| IANA Parser | `src/sslysze_scan/iana_parser.py` | IANA XML parsing utilities |
## Installation
```bash
poetry install
```
## Quick Reference
```bash
# Scan server on multiple ports
poetry run compliance-scan scan example.com:443,636
# Generate Markdown report
poetry run compliance-scan report -t md -o report.md
# Generate CSV reports
poetry run compliance-scan report -t csv --output-dir ./reports
# List all scans
poetry run compliance-scan report --list
```
## CLI Commands
### Scan Command
```
compliance-scan scan <hostname>:<port1>,<port2> [options]
```
Note: SSLyze outputs INFO-level log messages during scanning that cannot be suppressed. These messages are harmless and can be ignored.
| Argument | Required | Description |
| -------------------- | -------- | ---------------------------------------------------------------- |
| `<hostname>:<ports>` | Yes | Target with comma-separated ports. IPv6: `[2001:db8::1]:443,636` |
| `--print` | No | Display summary in console |
| `-db <path>` | No | Database file path (default: compliance_status.db) |
Examples:
```bash
compliance-scan scan example.com:443,636 --print
compliance-scan scan '[2001:db8::1]:443,636' -db custom.db
```
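A minimal sketch of how a target string in this format could be split into host and ports. The function name `split_target` and its error messages are illustrative assumptions, not the actual `parse_host_ports` in `cli.py`.

```python
def split_target(target: str) -> tuple[str, list[int]]:
    """Split 'host:443,636' or '[2001:db8::1]:443,636' into (host, ports)."""
    if target.startswith("["):
        # Bracketed IPv6 literal: the host ends at the closing bracket.
        host, bracket, rest = target[1:].partition("]")
        if not bracket or not rest.startswith(":"):
            raise ValueError(f"invalid IPv6 target: {target!r}")
        port_part = rest[1:]
    else:
        # Hostname or IPv4: the last colon separates host and port list.
        host, colon, port_part = target.rpartition(":")
        if not colon:
            raise ValueError(f"missing port list: {target!r}")
    if not host:
        raise ValueError(f"missing hostname: {target!r}")

    ports = []
    for item in port_part.split(","):
        port = int(item)  # raises ValueError for non-numeric input
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        ports.append(port)
    return host, ports
```

Usage: `split_target("[2001:db8::1]:443,636")` yields the host without brackets and the ports as integers.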
### Report Command
```
compliance-scan report [scan_id] -t <type> [options]
```
| Argument | Required | Description |
| -------------------- | -------- | ------------------------------ |
| `scan_id` | No | Scan ID (default: latest scan) |
| `-t <type>` | Yes | Report type: csv, md, rest |
| `-o <file>` | No | Output file (md/rest only) |
| `--output-dir <dir>` | No | Output directory |
| `--list` | No | List all available scans |
Examples:
```bash
compliance-scan report -t md -o report.md
compliance-scan report 5 -t csv --output-dir ./reports
compliance-scan report -t rest --output-dir ./docs
```
### Update IANA Command
```
compliance-scan update-iana [-db <path>]
```
| Argument | Required | Description |
| ------------ | -------- | ------------------------------------------------------- |
| `-db <path>` | No | Database file to update (default: compliance_status.db) |
Updates IANA registry data from official sources:
- TLS Parameters: https://www.iana.org/assignments/tls-parameters/tls-parameters.xml
- IKEv2 Parameters: https://www.iana.org/assignments/ikev2-parameters/ikev2-parameters.xml
The default database ships with IANA data as of December 2024.
Examples:
```bash
compliance-scan update-iana
compliance-scan update-iana -db custom.db
```
Update process:
1. Fetches XML from IANA URLs
2. Validates headers against database schema
3. Validates data integrity (value formats, minimum row counts)
4. Calculates diff (added/modified/deleted entries)
5. Updates database in transaction (rollback on error)
6. Logs all changes at INFO level
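Step 4 can be sketched as a keyed comparison of the old and new rows. This is an illustration only: the name `diff_rows` is hypothetical, and it assumes each row is a tuple whose first element is a unique key, which may differ from how `calculate_diff` identifies rows.

```python
def diff_rows(old_rows, new_rows):
    """Compare two row lists and return (added, modified, deleted).

    Assumption: row[0] is a unique key such as the registry value.
    """
    old = {row[0]: row for row in old_rows}
    new = {row[0]: row for row in new_rows}

    added = [row for key, row in new.items() if key not in old]
    deleted = [row for key, row in old.items() if key not in new]
    # A row is modified when the key exists on both sides but content differs.
    modified = [row for key, row in new.items() if key in old and old[key] != row]
    return added, modified, deleted
```

The three list lengths correspond to the counts reported in the update log.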
## Report Formats
### CSV
Generates granular files per port and category.
| File Pattern | Content |
| --------------------------------------------- | -------------------------------------- |
| `summary.csv` | Scan statistics and compliance summary |
| `<port>_cipher_suites_<version>_accepted.csv` | Accepted cipher suites per TLS version |
| `<port>_cipher_suites_<version>_rejected.csv` | Rejected cipher suites per TLS version |
| `<port>_supported_groups.csv` | Elliptic curves and DH groups |
| `<port>_missing_groups_bsi.csv` | BSI-approved groups not offered |
| `<port>_missing_groups_iana.csv` | IANA-recommended groups not offered |
| `<port>_certificates.csv` | Certificate chain with compliance |
| `<port>_vulnerabilities.csv` | Vulnerability scan results |
| `<port>_protocol_features.csv` | TLS protocol features |
| `<port>_session_features.csv` | Session handling features |
| `<port>_http_headers.csv` | HTTP security headers |
| `<port>_compliance_status.csv` | Aggregated compliance per check type |
Behavior: Ports without TLS support generate no files. Empty sections are omitted.
### Markdown
Single comprehensive report with:
1. Metadata: Scan ID, hostname, IPs, timestamp, duration, ports
2. Summary: Statistics table
3. Per-port sections (TLS-enabled ports only):
   - TLS configuration
   - Cipher suites (accepted/rejected by version)
   - Supported groups with compliance
   - Missing groups (collapsible details)
   - Certificates with key size and compliance
   - Vulnerabilities
   - Protocol features
   - Session features
   - HTTP security headers
### reStructuredText
Identical structure to Markdown but uses `.. csv-table::` directives for Sphinx integration.
Use case: Generate documentation that references CSV files for tabular data.
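For illustration, a small helper that renders one such directive. `:file:` and `:header-rows:` are standard docutils options for `csv-table`. The helper name and the example file name are hypothetical and not taken from `rst_export.py`.

```python
def csv_table_directive(title: str, csv_file: str, header_rows: int = 1) -> str:
    """Render a reST csv-table directive that pulls its rows from a CSV file."""
    lines = [
        f".. csv-table:: {title}",
        f"   :file: {csv_file}",
        f"   :header-rows: {header_rows}",
        "",  # a blank line terminates the directive block
    ]
    return "\n".join(lines)
```

Because the directive references the file by path, the CSV reports and the reST document need to be generated into locations that Sphinx can resolve relative to each other.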
## Database Structure
File: `compliance_status.db` (SQLite, Schema Version 5)
Template: `src/sslysze_scan/data/crypto_standards.db`
Full schema: [schema.sql](schema.sql)
### Scan Result Tables
| Table | Content |
| ------------------------ | ------------------------------------------------------------ |
| `scans` | Scan metadata: scan_id, hostname, ports, timestamp, duration |
| `scanned_hosts` | Resolved FQDN with IPv4/IPv6 addresses |
| `scan_cipher_suites` | Cipher suites per port and TLS version (accepted/rejected) |
| `scan_supported_groups` | Elliptic curves and DH groups per port |
| `scan_certificates` | Certificate chain with key type, size, validity |
| `scan_vulnerabilities` | Vulnerability test results per port |
| `scan_protocol_features` | TLS protocol features (compression, early data, etc.) |
| `scan_session_features` | Session renegotiation and resumption |
| `scan_http_headers` | HTTP security headers per port |
| `scan_compliance_status` | Compliance evaluation per item and port |
### Database Views (Schema v5)
Six views pre-join scan results with the BSI/IANA reference tables, so report queries do not have to repeat complex JOINs:
| View | Purpose |
| ------------------------------------ | --------------------------------------------- |
| `v_cipher_suites_with_compliance` | Cipher suites with BSI/IANA compliance flags |
| `v_supported_groups_with_compliance` | Groups with compliance status |
| `v_certificates_with_compliance` | Certificates with key size compliance |
| `v_port_compliance_summary` | Aggregated compliance statistics per port |
| `v_missing_bsi_groups` | BSI-approved groups not offered by server |
| `v_missing_iana_groups` | IANA-recommended groups not offered by server |
### Reference Data Tables
IANA TLS:
- `iana_tls_cipher_suites`: Cipher suite registry with recommendations
- `iana_tls_signature_schemes`: Signature algorithm registry
- `iana_tls_supported_groups`: Named groups registry
BSI TR-02102-1 (Certificates):
- `bsi_tr_02102_1_key_requirements`: Key length requirements
- `bsi_tr_02102_1_hash_requirements`: Hash algorithm requirements
BSI TR-02102-2 (TLS):
- `bsi_tr_02102_2_tls`: TLS cipher suites and groups with validity periods
BSI TR-02102-3 (IPsec/IKEv2):
- Encryption, integrity, DH groups
BSI TR-02102-4 (SSH):
- Key exchange, encryption, MAC
CSV Export Metadata:
- `csv_export_metadata`: Stores CSV headers as JSON for all export types
## Compliance Validation
### BSI TR-02102-1 (Certificates)
Key length requirements:
| Algorithm | Minimum Bits | Status |
| --------- | ------------ | ----------------------------- |
| RSA | 3000 | Required |
| ECDSA | 250 | Required |
| DSA | 3072 | Deprecated (valid until 2029) |
Hash algorithms:
- Allowed: SHA-256, SHA-384, SHA-512
- Deprecated: SHA-1, MD5
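The thresholds above can be expressed as a simple lookup. This is a sketch under assumptions: the function name, the return shape, and the normalized algorithm names are hypothetical, and the real logic in `db/compliance.py` reads its limits from the reference tables.

```python
# Values copied from the tables above; treated here as constants for illustration.
MIN_KEY_BITS = {"RSA": 3000, "ECDSA": 250, "DSA": 3072}
ALLOWED_HASHES = {"SHA-256", "SHA-384", "SHA-512"}


def check_certificate(key_type: str, key_bits: int, hash_name: str):
    """Return (compliant, reasons); reasons is empty when compliant."""
    reasons = []
    minimum = MIN_KEY_BITS.get(key_type.upper())
    if minimum is None:
        reasons.append(f"unknown key type {key_type}")
    elif key_bits < minimum:
        reasons.append(f"{key_type} key has {key_bits} bits, minimum is {minimum}")
    if hash_name.upper() not in ALLOWED_HASHES:
        reasons.append(f"hash {hash_name} is not allowed")
    return (not reasons, reasons)
```

A common RSA-2048 certificate fails this check because 2048 is below the 3000-bit minimum, while a 256-bit ECDSA key passes.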
### BSI TR-02102-2 (TLS)
Validates:
- Cipher suites against BSI-approved lists
- Supported groups against BSI requirements
- Validity periods (time-based expiration)
### IANA
Validates:
- Cipher suite recommendations (Y/N/D flags)
- Supported group recommendations (Y/N/D flags)
## Project Structure
```
src/sslysze_scan/
├── __main__.py # CLI entry point
├── cli.py # Argument parsing
├── scanner.py # SSLyze integration
├── protocol_loader.py # Port-protocol mapping
├── output.py # Console output
├── iana_parser.py # IANA XML parsing utilities
├── iana_validator.py # IANA data validation
├── commands/
│ ├── scan.py # Scan command handler
│ ├── report.py # Report command handler
│ └── update_iana.py # IANA update command handler
├── db/
│ ├── schema.py # Schema version management
│ ├── writer.py # Scan result storage
│ ├── compliance.py # Compliance validation
│ └── writers/ # Specialized writers
├── reporter/
│ ├── query.py # Database queries (uses views)
│ ├── csv_export.py # CSV generation
│ ├── csv_utils.py # CSV utilities (exporter class)
│ ├── markdown_export.py # Markdown generation
│ ├── rst_export.py # reST generation
│ └── template_utils.py # Shared utilities
├── templates/
│ ├── report.md.j2 # Markdown template
│ └── report.reST.j2 # reST template
└── data/
    ├── crypto_standards.db # Template DB (IANA/BSI + schema)
    ├── iana_parse.json # IANA XML source URLs and registry config
    └── protocols.csv # Port-protocol mapping
tests/
├── fixtures/
│ ├── iana_xml/ # Minimal XML test fixtures
│ └── test_scan.db # Test database
├── test_iana_validator.py # IANA validation tests (20 tests)
├── test_iana_parse.py # IANA XML parsing tests (14 tests)
└── test_iana_update.py # IANA update logic tests (13 tests)
```
## Key Functions
### CLI and Parsing
| Function | Module | Purpose |
| -------------------------- | -------- | ----------------------------------- |
| `parse_host_ports(target)` | `cli.py` | Parse `hostname:port1,port2` format |
| `parse_arguments()` | `cli.py` | Parse CLI arguments |
### Scanning
| Function | Module | Purpose |
| ------------------------------------------------------------ | ------------ | -------------------------------- |
| `perform_scan(hostname, port, start_time)` | `scanner.py` | Execute SSLyze scan for one port |
| `create_scan_request(hostname, port, use_opportunistic_tls)` | `scanner.py` | Create SSLyze scan request |
### Database Writing
| Function | Module | Purpose |
| ---------------------------------------------------------------------------- | ------------------ | --------------------------------------- |
| `save_scan_results(db_path, hostname, ports, results, start_time, duration)` | `db/writer.py` | Store all scan results, returns scan_id |
| `check_compliance(db_path, scan_id)` | `db/compliance.py` | Validate compliance, returns statistics |
| `check_schema_version(db_path)` | `db/schema.py` | Verify schema compatibility |
| `get_schema_version(db_path)` | `db/schema.py` | Get current schema version |
### Database Querying
| Function | Module | Purpose |
| ------------------------------------- | ------------------- | ---------------------------------- |
| `get_scan_data(db_path, scan_id)` | `reporter/query.py` | Get complete scan data using views |
| `get_scan_metadata(db_path, scan_id)` | `reporter/query.py` | Get scan metadata only |
| `list_scans(db_path)` | `reporter/query.py` | List all scans in database |
| `has_tls_support(port_data)` | `reporter/query.py` | Check if port has TLS support |
### Report Generation
| Function | Module | Purpose |
| ------------------------------------------------------------ | ----------------------------- | ------------------------------------ |
| `generate_csv_reports(db_path, scan_id, output_dir)` | `reporter/csv_export.py` | Generate all CSV files |
| `generate_markdown_report(db_path, scan_id, output)` | `reporter/markdown_export.py` | Generate Markdown report |
| `generate_rest_report(db_path, scan_id, output, output_dir)` | `reporter/rst_export.py` | Generate reStructuredText report |
| `build_template_context(data)` | `reporter/template_utils.py` | Prepare Jinja2 template context |
| `generate_report_id(metadata)` | `reporter/template_utils.py` | Generate report ID (YYYYMMDD_scanid) |
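As an illustration of the `YYYYMMDD_scanid` format, the sketch below builds such an ID from scan metadata. The dictionary keys `scan_id` and `timestamp`, the ISO 8601 timestamp format, and the function name are assumptions. The actual `generate_report_id` may read different fields.

```python
from datetime import datetime


def report_id_sketch(metadata: dict) -> str:
    """Build a report ID of the form YYYYMMDD_scanid."""
    # Assumption: the timestamp is an ISO 8601 string.
    scanned_at = datetime.fromisoformat(metadata["timestamp"])
    return f"{scanned_at:%Y%m%d}_{metadata['scan_id']}"
```

For a scan with ID 5 taken on 19 December 2025, this produces `20251219_5`.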
### IANA Update and Validation
| Function | Module | Purpose |
| ------------------------------------------------ | ------------------------- | ---------------------------------------- |
| `handle_update_iana_command(args)` | `commands/update_iana.py` | Main update command handler |
| `fetch_xml_from_url(url)` | `commands/update_iana.py` | Fetch XML from IANA URL |
| `calculate_diff(old_rows, new_rows)` | `commands/update_iana.py` | Calculate added/modified/deleted entries |
| `process_registry_with_validation(...)` | `commands/update_iana.py` | Process and validate single registry |
| `validate_headers(table_name, headers, db_conn)` | `iana_validator.py` | Validate headers match database schema |
| `validate_registry_data(table_name, rows)` | `iana_validator.py` | Validate complete registry data |
| `validate_cipher_suite_row(row)` | `iana_validator.py` | Validate single cipher suite record |
| `validate_supported_groups_row(row)` | `iana_validator.py` | Validate single supported group record |
| `normalize_header(header)` | `iana_validator.py` | Normalize header to DB column format |
| `get_min_rows(table_name)` | `iana_validator.py` | Get minimum expected rows for table |
| `extract_updated_date(xml_content)` | `iana_parser.py` | Extract date from XML `<updated>` tag |
| `parse_xml_with_namespace_support(xml_path)` | `iana_parser.py` | Parse XML with IANA namespace detection |
| `find_registry(root, registry_id, ns)` | `iana_parser.py` | Find registry element by ID |
| `extract_field_value(record, header, ns)` | `iana_parser.py` | Extract field value from XML record |
## IANA Data Update Process
Configuration file: `src/sslysze_scan/data/iana_parse.json`
Structure:
```json
{
  "https://www.iana.org/assignments/tls-parameters/tls-parameters.xml": [
    ["registry_id", "output_filename.csv", ["Header1", "Header2", "..."]]
  ]
}
```
Validation rules:
1. Headers must match database schema (case-insensitive, `/` → `_`)
2. Minimum row counts per table (50 for cipher suites, 10 for groups, 5 for small tables)
3. Value format validation (0x prefix for hex values, numeric for groups)
4. Recommended field must be Y, N, or D
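The rules above could be checked roughly as follows. All names here are hypothetical, and the sketch simplifies: it assumes cipher suite values are single two-byte pairs such as `0x13,0x01`, whereas the real `iana_validator.py` may accept other forms.

```python
import re

# Minimum row counts from rule 2; any other table falls back to 5.
MIN_ROWS = {"iana_tls_cipher_suites": 50, "iana_tls_supported_groups": 10}
HEX_PAIR = re.compile(r"0x[0-9A-Fa-f]{2},0x[0-9A-Fa-f]{2}")


def normalize(header: str) -> str:
    """Rule 1: compare headers case-insensitively, with '/' mapped to '_'."""
    return header.strip().lower().replace("/", "_")


def enough_rows(table: str, row_count: int) -> bool:
    """Rule 2: reject registries that are suspiciously small."""
    return row_count >= MIN_ROWS.get(table, 5)


def row_errors(table: str, value: str, recommended: str) -> list[str]:
    """Rules 3 and 4: check the value format and the Recommended flag."""
    errors = []
    if table == "iana_tls_supported_groups":
        if not value.isdigit():
            errors.append(f"group value must be numeric: {value!r}")
    elif not HEX_PAIR.fullmatch(value):
        errors.append(f"value must use 0x-prefixed hex: {value!r}")
    if recommended not in {"Y", "N", "D"}:
        errors.append(f"recommended must be Y, N, or D: {recommended!r}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets the caller report every issue in a registry at once.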
Error handling:
- Validation failure: Rollback transaction, display error with hint to open issue
- Network error: Abort with error message
- XML structure change: Validation catches and aborts
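The rollback behavior can be illustrated with the standard `sqlite3` connection context manager, which commits on success and rolls back when an exception escapes the block. The function name and the three-column table layout are assumptions made for this sketch.

```python
import sqlite3


def replace_rows(conn: sqlite3.Connection, table: str, rows) -> None:
    """Replace all rows of a table atomically.

    Assumptions: `table` is a trusted, hard-coded name (it is interpolated
    into SQL) and has three columns: value, description, recommended.
    """
    with conn:  # commit on success, rollback if an exception is raised
        conn.execute(f"DELETE FROM {table}")
        for row in rows:
            if row[2] not in {"Y", "N", "D"}:
                raise ValueError(f"invalid recommended flag: {row[2]!r}")
            conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", row)
```

If any row fails validation, the preceding `DELETE` is undone as well, so the table keeps its previous contents.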
Logging output:
```
INFO: Fetching https://www.iana.org/assignments/tls-parameters/tls-parameters.xml
INFO: XML data date: 2025-12-03
INFO: iana_tls_cipher_suites: 448 rows (2 added, 1 modified, 0 deleted)
INFO: Successfully updated 11 registries (1310 total rows)
```
## Version Management
Version is maintained in `pyproject.toml` only:
```toml
[project]
version = "0.1.0"
```
The package exposes this value at runtime as `__version__`, which is resolved through `importlib.metadata`:
```python
from sslysze_scan import __version__
print(__version__) # "0.1.0"
```
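One way `__version__` could be populated from the installed package metadata is sketched below. The distribution name `compliance-scan` and the fallback string are assumptions. The project's `__init__.py` may do this differently.

```python
from importlib.metadata import PackageNotFoundError, version


def read_version(dist_name: str = "compliance-scan") -> str:
    """Look up the installed distribution's version from its metadata."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Running from a source checkout that was never installed.
        return "0.0.0+unknown"
```

This keeps `pyproject.toml` as the single source of truth, since the metadata is generated from it at install time.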
## Development
Run tests:
```bash
poetry run pytest
poetry run pytest tests/test_iana_validator.py -v
```
Update IANA template database:
```bash
python3 -m sslysze_scan.iana_parser
```
Code style:
- PEP 8 compliant
- Max line length: 90 characters
- Ruff for linting and formatting
## SQL Query Examples
Most queries read from the views described above. The vulnerabilities query reads the `scan_vulnerabilities` table directly.
### Cipher Suites with Compliance
```sql
SELECT cipher_suite_name, iana_recommended_final, bsi_approved_final, compliant
FROM v_cipher_suites_with_compliance
WHERE scan_id = ? AND port = ? AND accepted = 1;
```
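From Python, the same query can be run with the standard `sqlite3` module and bound parameters. This is a sketch: the helper name is hypothetical, and it is not the code in `reporter/query.py`.

```python
import sqlite3

CIPHER_QUERY = """
SELECT cipher_suite_name, iana_recommended_final, bsi_approved_final, compliant
FROM v_cipher_suites_with_compliance
WHERE scan_id = ? AND port = ? AND accepted = 1
"""


def accepted_cipher_suites(conn: sqlite3.Connection, scan_id: int, port: int):
    """Run the query with bound parameters and return one dict per row."""
    cursor = conn.cursor()
    cursor.row_factory = sqlite3.Row  # allows access by column name
    cursor.execute(CIPHER_QUERY, (scan_id, port))
    return [dict(row) for row in cursor.fetchall()]
```

Usage: open the database with `sqlite3.connect("compliance_status.db")` and pass the connection together with a scan ID and port.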
### Port Compliance Summary
```sql
SELECT check_type, total, passed, percentage
FROM v_port_compliance_summary
WHERE scan_id = ? AND port = ?;
```
### Missing BSI Groups
```sql
SELECT group_name, tls_version, valid_until
FROM v_missing_bsi_groups
WHERE scan_id = ?;
```
### Non-Compliant Certificates
```sql
SELECT port, key_type, key_bits, compliant, compliance_details
FROM v_certificates_with_compliance
WHERE scan_id = ? AND compliant = 0;
```
### Vulnerabilities
```sql
SELECT port, vuln_type, vulnerable, details
FROM scan_vulnerabilities
WHERE scan_id = ? AND vulnerable = 1;
```
## Supported Protocols
### Opportunistic TLS (STARTTLS)
| Protocol | Ports |
| ---------- | ---------- |
| SMTP | 25, 587 |
| LDAP | 389 |
| IMAP | 143 |
| POP3 | 110 |
| FTP | 21 |
| XMPP | 5222, 5269 |
| RDP | 3389 |
| PostgreSQL | 5432 |
### Direct TLS
| Protocol | Port |
| -------- | ---- |
| HTTPS | 443 |
| LDAPS | 636 |
| SMTPS | 465 |
| IMAPS | 993 |
| POP3S | 995 |
### Not Supported
MySQL (proprietary TLS protocol)
Fallback behavior: Automatic retry with direct TLS if STARTTLS fails.
## Testing
```bash
poetry run pytest tests/ -v
```
**Test structure:**
- `tests/conftest.py`: Fixtures with test_db, test_db_path
- `tests/fixtures/test_scan.db`: Real scan data (Scan 1: dc.validation.lan:443,636)
- `tests/test_csv_export.py`: 11 CSV export tests
- `tests/test_template_utils.py`: 2 template utility tests
- `tests/test_cli.py`: 3 CLI parsing tests
- `tests/test_iana_validator.py`: 20 IANA validation tests
- `tests/test_iana_parse.py`: 14 IANA parsing tests
- `tests/test_iana_update.py`: 13 IANA update tests
**Total:** 63 tests
**Test database setup:**
- Loads `crypto_standards.db` (reference data + schema)
- Loads `test_scan.db` (scan data only)
- Creates views dynamically
- In-memory for speed
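Loading a database file into memory can be done with `sqlite3.Connection.backup`, as in the sketch below. The function name is hypothetical, and this shows only the copy of a single file. Merging the scan data and creating the views, as `tests/conftest.py` does, is not covered.

```python
import sqlite3


def load_into_memory(path: str) -> sqlite3.Connection:
    """Copy an on-disk SQLite database into a fresh in-memory connection."""
    source = sqlite3.connect(path)
    memory = sqlite3.connect(":memory:")
    try:
        source.backup(memory)  # copies all pages, including schema and data
    finally:
        source.close()
    return memory
```

Each test then works on its own private copy, so the files on disk are never modified.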
## Code Quality
**Linter:** Ruff
```bash
poetry run ruff check src/ tests/
poetry run ruff format src/ tests/
```
**Configuration:** `pyproject.toml`
- Line length: 90 characters
- Target: Python 3.13
- Rules: PEP 8, pyflakes, isort, naming, upgrades
## Requirements
- Python 3.13+
- SSLyze 6.0.0+
- Poetry (dependency management)
- Jinja2 3.1+
- pytest 9.0+ (development)
- ruff (development)
## Container Usage
```bash
./container-build.sh
podman run --rm compliance-scan:latest scan example.com:443
```
## Database Workflow
1. **First scan:** Copies `crypto_standards.db` to `compliance_status.db`
2. **Schema check:** Validates schema version (must be 5)
3. **Scan execution:** SSLyze performs TLS analysis
4. **Data storage:** Results written to scan tables
5. **Compliance check:** Validation against BSI/IANA via views
6. **Report generation:** Queries use views for optimized performance
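Steps 1 and 2 might be sketched as follows. This example assumes the schema version is stored in SQLite's `PRAGMA user_version`, which is a guess. The project may track it in a table instead, and `ensure_database` is a hypothetical name rather than the function in `db/schema.py`.

```python
import shutil
import sqlite3
from contextlib import closing
from pathlib import Path

EXPECTED_SCHEMA_VERSION = 5


def ensure_database(db_path, template_path) -> Path:
    """Create the database from the template if missing, then verify its schema."""
    db_path = Path(db_path)
    if not db_path.exists():
        # First scan: start from the template with reference data and schema.
        shutil.copyfile(template_path, db_path)

    with closing(sqlite3.connect(db_path)) as conn:
        # Assumption: the version lives in PRAGMA user_version.
        (found,) = conn.execute("PRAGMA user_version").fetchone()
    if found != EXPECTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"schema version {found} found, {EXPECTED_SCHEMA_VERSION} required"
        )
    return db_path
```

Failing early on a version mismatch avoids writing scan results into tables or views that later steps cannot read.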
## Architecture Notes
**Design principles:**
- Single database file contains everything (reference data + results)
- Views optimize complex queries (no N+1 queries)
- CSV headers in database (easy to modify)
- Template-based reports (Jinja2)
- Port-agnostic (one scan_id, multiple ports)
**Key decisions:**
- SQLite for simplicity and portability
- Views introduced in schema v5 for performance
- CSV export metadata centralized
- Test fixtures use real scan data
- Ruff for modern Python linting