feat(parser): Expand Tree-sitter language support for syntax validation #1883

@dariuszkowalski-com

Description

Current Behavior

Forge currently supports syntax validation for 10 programming languages through Tree-sitter:

  • Already Supported: Rust, Python, JavaScript, TypeScript, CSS, Java, Scala, Go, C++, Ruby

This represents approximately 5% of available Tree-sitter grammars, limiting Forge's ability to analyze and validate code in modern multi-language projects.

Proposed Changes

Implement comprehensive Tree-sitter language support by adding the most popular remaining grammars in phases:

Phase 1 - Critical Missing Languages:

  • C# (.NET ecosystem - TIOBE no. 5)
  • C (systems programming - TIOBE no. 2)
  • PHP (web development - enterprise staple)
  • Swift (iOS/macOS development)
  • Kotlin (Android development)
  • Dart (multiplatform mobile development)
  • YAML (configuration files)
  • TOML (configuration files)
  • Bash (scripting languages)
  • HTML (markup languages)
  • JSON (data interchange)
  • SQL (database queries)
  • Ruby (web development - enterprise staple; listed above as already supported, so it may only need verification)
  • Markdown (markup languages)
  • PowerShell (Windows scripting)

Phase 2 - Additional High Priority:

  • Dockerfile (DevOps)
  • Vue/Svelte (frontend frameworks)
  • HCL (Terraform)

Phase 3 - Specialized Languages:

  • Configuration languages (INI, XML)
  • Scientific languages (Julia, R, MATLAB)
  • Functional languages (Haskell, OCaml, F#)

Implementation Strategy:

  • Incremental rollout: Start with Phase 1 languages based on community feedback
  • Backward compatibility: Maintain existing functionality for the current 10 languages
  • Performance monitoring: Add benchmarks for memory usage and parsing speed
  • Documentation: Update API docs with new language support details
  • Competitive parity: Match the language coverage of tools like Neovim and VS Code
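
One way to picture the incremental rollout and the backward-compatibility requirement is a phase-tagged language table. The sketch below is only an illustration: the `Phase` enum, the `LANGUAGES` table, and `enabled_languages` are hypothetical names, not part of Forge, and the phase assignments simply mirror the lists in this issue.

```rust
/// Hypothetical rollout stages; variant order defines the comparison order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Phase {
    Existing, // languages this issue lists as already supported
    One,
    Two,
    Three,
}

/// Hypothetical table built from the lists in this issue.
/// Ruby appears in both the supported list and Phase 1; it is treated as existing here.
pub const LANGUAGES: &[(&str, Phase)] = &[
    ("rust", Phase::Existing), ("python", Phase::Existing),
    ("javascript", Phase::Existing), ("typescript", Phase::Existing),
    ("css", Phase::Existing), ("java", Phase::Existing),
    ("scala", Phase::Existing), ("go", Phase::Existing),
    ("cpp", Phase::Existing), ("ruby", Phase::Existing),
    ("c_sharp", Phase::One), ("c", Phase::One), ("php", Phase::One),
    ("swift", Phase::One), ("kotlin", Phase::One), ("dart", Phase::One),
    ("yaml", Phase::One), ("toml", Phase::One), ("bash", Phase::One),
    ("html", Phase::One), ("json", Phase::One), ("sql", Phase::One),
    ("markdown", Phase::One), ("powershell", Phase::One),
    ("dockerfile", Phase::Two), ("vue", Phase::Two),
    ("svelte", Phase::Two), ("hcl", Phase::Two),
    ("ini", Phase::Three), ("xml", Phase::Three), ("julia", Phase::Three),
    ("r", Phase::Three), ("matlab", Phase::Three), ("haskell", Phase::Three),
    ("ocaml", Phase::Three), ("fsharp", Phase::Three),
];

/// Returns every language enabled once the rollout has reached `up_to`.
/// Existing languages are always included, whatever the stage.
pub fn enabled_languages(up_to: Phase) -> Vec<&'static str> {
    LANGUAGES
        .iter()
        .filter(|(_, phase)| *phase <= up_to)
        .map(|(name, _)| *name)
        .collect()
}
```

Calling `enabled_languages(Phase::One)` would return the existing languages plus the Phase 1 set, so a regression test can assert that the original languages never drop out as later phases land.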

Implementation Notes

Technical Approach:

  • Leverage existing modular architecture in crates/forge_services/src/tool_services/syn/
  • Add language modules to mod.rs following current pattern
  • Update languages.rs with new language mappings
  • Implement tests for each new language parser
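
The language mapping step could look roughly like the sketch below. This is an assumption about shape only: `language_for_path` and the returned language identifiers are hypothetical, and the actual contents of languages.rs may be organized differently. Only a sample of the existing and Phase 1 languages is shown.

```rust
use std::path::Path;

/// Hypothetical lookup from a file path to a language identifier.
/// Returns `None` when no grammar is registered for the file.
pub fn language_for_path(path: &Path) -> Option<&'static str> {
    // Some files are identified by name rather than by extension.
    if path.file_name().and_then(|name| name.to_str()) == Some("Dockerfile") {
        return Some("dockerfile");
    }

    // Compare extensions case-insensitively.
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    let lang = match ext.as_str() {
        // Languages this issue lists as already supported (sample).
        "rs" => "rust",
        "py" => "python",
        "js" => "javascript",
        "ts" => "typescript",
        "go" => "go",
        // Phase 1 additions proposed in this issue (sample).
        "cs" => "c_sharp",
        "c" => "c",
        "php" => "php",
        "swift" => "swift",
        "kt" => "kotlin",
        "dart" => "dart",
        "yaml" | "yml" => "yaml",
        "toml" => "toml",
        "sh" => "bash",
        "json" => "json",
        _ => return None,
    };
    Some(lang)
}
```

A per-language test can then assert that a representative file name resolves to the expected identifier and that unknown extensions fall through to `None`.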

Dependencies:

  • Add Tree-sitter grammar crates to Cargo.toml
  • Utilize tree-sitter-language-pack for comprehensive grammar collection
  • Follow existing validation patterns in syntax_validator.rs
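
For context on what each new grammar has to support: Tree-sitter marks regions it cannot parse as error nodes, so syntax validation usually amounts to walking the tree and reporting them. The sketch below illustrates that walk on a simplified stand-in type. `SyntaxNode` and `collect_error_rows` are hypothetical and are not the tree-sitter `Node` API or the code in syntax_validator.rs.

```rust
/// Simplified stand-in for a parse-tree node (not the tree-sitter `Node` type).
pub struct SyntaxNode {
    pub kind: &'static str,
    pub row: usize,
    pub is_error: bool,
    pub children: Vec<SyntaxNode>,
}

impl SyntaxNode {
    /// Convenience constructor for a node without children.
    pub fn leaf(kind: &'static str, row: usize, is_error: bool) -> Self {
        SyntaxNode { kind, row, is_error, children: Vec::new() }
    }
}

/// Collects the rows of all error nodes using an explicit stack,
/// then sorts them so the output is stable regardless of traversal order.
pub fn collect_error_rows(root: &SyntaxNode) -> Vec<usize> {
    let mut rows = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if node.is_error {
            rows.push(node.row);
        }
        stack.extend(node.children.iter());
    }
    rows.sort_unstable();
    rows
}
```

Because the walk is grammar-agnostic, adding a language under this model would mostly mean registering its grammar rather than writing new validation logic, which matches the registry-focused scope of this request.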

Additional Context

The modular Tree-sitter architecture already exists in Forge - this request primarily focuses on expanding the language registry rather than architectural changes.

Metadata

Labels: type: feature (brand new functionality, features, pages, workflows, endpoints, etc.)
