Improved JSON parsing and error handling for tool modules#1952

Open
louiselalanne wants to merge 4 commits into smicallef:master from louiselalanne:master

Improved JSON parsing and error handling for tool modules

Summary

This PR improves the robustness and reliability of three tool integration modules (sfp_tool_nuclei, sfp_tool_wafw00f, and sfp_tool_whatweb) by fixing JSON parsing issues and enhancing error handling.

Changes

1. sfp_tool_nuclei.py

Problem: The module was failing to parse Nuclei's JSONL (JSON Lines) output format correctly, causing crashes when processing vulnerability scan results.

Improvements:

  • ✅ Fixed JSONL parsing to handle one JSON object per line instead of expecting a single JSON array
  • ✅ Added validation to skip non-JSON lines (lines not starting with {)
  • ✅ Enhanced CVE extraction logic to support both classification.cve-id field and template ID patterns
  • ✅ Improved host extraction from matched-at URLs using urlparse
  • ✅ Added proper exception handling for json.JSONDecodeError and KeyError
  • ✅ Better event creation for non-CVE findings with detailed descriptions
  • ✅ Added support for WEBSERVER_TECHNOLOGY events for informational findings
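
The parsing approach listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `parse_nuclei_jsonl`, the `CVE_RE` pattern, the `template-id` key, and the nesting of `classification` under `info` are assumptions about the shape of Nuclei's output, and event creation is left out entirely.

```python
import json
import re
from urllib.parse import urlparse

# Hypothetical pattern for CVE identifiers embedded in a template ID.
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def parse_nuclei_jsonl(output):
    """Parse JSONL text into a list of (host, cve_ids, record) tuples."""
    findings = []
    for line in output.splitlines():
        line = line.strip()
        # Skip banners, progress output, and blank lines.
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
            matched_at = record["matched-at"]
        except (json.JSONDecodeError, KeyError):
            # One malformed or incomplete line should not abort the run.
            continue

        # urlparse only fills in hostname when the value has a scheme,
        # so fall back to the raw value for bare host:port strings.
        host = urlparse(matched_at).hostname or matched_at.split(":")[0]

        # Prefer the classification field (assumed to sit under "info"),
        # then fall back to a CVE pattern in the template ID.
        classification = (record.get("info") or {}).get("classification") or {}
        cve_ids = classification.get("cve-id") or []
        if isinstance(cve_ids, str):
            cve_ids = [cve_ids]
        if not cve_ids:
            cve_ids = CVE_RE.findall(record.get("template-id", ""))

        findings.append((host, [c.upper() for c in cve_ids], record))
    return findings
```

Records with an empty CVE list would then be candidates for the non-CVE and informational event types mentioned above.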

2. sfp_tool_wafw00f.py

Problem: Temporary files were not being cleaned up properly in case of errors.

Improvements:

  • ✅ Added finally block to ensure temporary output files are always removed
  • ✅ Improved file existence check before attempting removal
  • ✅ Better error messages with stderr/stdout output for debugging
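
As a rough sketch of the cleanup pattern described above, assuming the tool is invoked through `subprocess` and writes its results to a temporary file: the helper name `run_with_temp_output`, the `build_cmd` callback, and the 60-second timeout are hypothetical and do not come from the module.

```python
import os
import subprocess
import sys
import tempfile


def run_with_temp_output(build_cmd):
    """Run a command that writes to a temp file, then always delete the file.

    build_cmd is a hypothetical callback that takes the temp file path and
    returns the argument list to execute.
    """
    fd, out_path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        proc = subprocess.run(
            build_cmd(out_path), capture_output=True, text=True, timeout=60
        )
        if proc.returncode != 0:
            # Include both streams so a failure can be diagnosed from logs.
            raise RuntimeError(
                f"tool failed (exit {proc.returncode}): "
                f"stderr={proc.stderr!r} stdout={proc.stdout!r}"
            )
        with open(out_path, encoding="utf-8", errors="ignore") as fh:
            return fh.read()
    finally:
        # Runs on success, on error, and on timeout alike.
        if os.path.exists(out_path):
            os.remove(out_path)
```

Putting the removal in `finally` rather than after the read means that an exception raised anywhere in the `try` body still leaves no file behind.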

3. sfp_tool_whatweb.py

Problem: The module was crashing with JSONDecodeError because WhatWeb outputs JSONL format (one JSON object per line), not a single JSON array.

Improvements:

  • ✅ Complete rewrite of JSON parsing to handle JSONL format correctly
  • ✅ Parse output line-by-line with individual error handling per line
  • ✅ Added proper encoding handling with decode('utf-8', errors='ignore')
  • ✅ Enhanced debug logging showing first 500 chars of output
  • ✅ Added type checking before accessing nested dictionary values
  • ✅ Graceful handling of malformed JSON lines without stopping execution
  • ✅ Better validation for HTTPServer and X-Powered-By plugin data structures
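
A minimal sketch of the line-by-line approach described above, under stated assumptions: the function name `parse_whatweb_output` is hypothetical, and the layout of each record (a `plugins` dictionary whose entries carry a `string` list) is an assumption about WhatWeb's JSON output rather than something taken from the PR.

```python
import json


def parse_whatweb_output(raw):
    """Return (servers, powered_by) lists from WhatWeb JSONL bytes."""
    # Drop undecodable bytes instead of raising on odd server banners.
    text = raw.decode("utf-8", errors="ignore")
    servers, powered_by = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip the bad line and keep processing the rest.
            continue
        if not isinstance(entry, dict):
            continue
        plugins = entry.get("plugins")
        if not isinstance(plugins, dict):
            continue
        for name, bucket in (("HTTPServer", servers), ("X-Powered-By", powered_by)):
            plugin = plugins.get(name)
            # Check each level's type before indexing into it.
            if not isinstance(plugin, dict):
                continue
            values = plugin.get("string")
            if isinstance(values, list):
                bucket.extend(v for v in values if isinstance(v, str))
    return servers, powered_by
```

Each `isinstance` check guards one level of nesting, so an unexpected value at any depth skips that record instead of raising.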

Testing

All three modules have been tested with their respective tools:

  • Nuclei: Successfully parses vulnerability scan results in JSONL format
  • WAFW00F: Properly cleans up temporary files in all scenarios
  • WhatWeb: Correctly processes multi-line JSON output without crashes

Backwards Compatibility

All changes are backwards compatible and do not affect the module configurations or options.


Related Issues: Fixes JSON parsing errors in tool integration modules

Type of change:

  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improvement to existing functionality)

cbxss added a commit to cbxss/spiderfoot that referenced this pull request Feb 7, 2026
…ken modules

- Migrate from requirements.txt to pyproject.toml with relaxed dependency bounds
- Switch to uv for package management across project and Docker
- Update Docker: Alpine 3.20, Python 3.12-bookworm, fix Node/Wappalyzer setup
- Fix secure library API for 1.x (was 0.3.x), PyPDF2 -> pypdf API
- Fix 14 type()==Y patterns to isinstance()
- Cherry-pick upstream bug fixes: WhatsMyName fields (smicallef#1894), nmap parsing (smicallef#1879),
  DNS for Family IP (smicallef#1872), nuclei/wafw00f/whatweb JSON parsing (smicallef#1952),
  db.py UnboundLocalError (smicallef#1787), dev port correlation (smicallef#1827),
  accounts strip_bad_char support (smicallef#1828)
- Add 5 new modules: InternetDB (Shodan free), LeakCheck (paid+free),
  WhoisFreaks, ip2location.io

238/238 modules load successfully on Python 3.12.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>