Describe the bug
When analyzing complex binaries using the Ghidra extractor, certain functions are prematurely truncated. This happens because the current implementation relies on strict control flow graph (CFG) traversal, which can miss disconnected or complex basic blocks.
Steps to Reproduce
- Analyze a heavily obfuscated or complex binary using Ghidra with capa.
- Observe that the number of identified basic blocks for certain functions is lower than expected.
- Note that features within the "missing" blocks are not extracted.
Expected behavior
The extractor should identify all basic blocks belonging to a function's scope to ensure 100% feature coverage.
Additional context
This issue will be addressed by implementing flow-insensitive block discovery within the Ghidra feature extractor. This ensures that even if the CFG is fragmented, all blocks assigned to the function by Ghidra's analysis are processed.
Describe the bug
When analyzing complex binaries using the Ghidra extractor, certain functions are prematurely truncated. This happens because the current implementation relies on strict control flow graph (CFG) traversal, which can miss disconnected or complex basic blocks.
Steps to Reproduce
Expected behavior
The extractor should identify all basic blocks belonging to a function's scope to ensure 100% feature coverage.
Additional context
This issue will be addressed by implementing flow-insensitive block discovery within the Ghidra feature extractor. This ensures that even if the CFG is fragmented, all blocks assigned to the function by Ghidra's analysis are processed.