Skip to content

Comments

feat: implement IR mapping functionality and integrate with formatter#473

Draft
Enter-tainer wants to merge 1 commit intomasterfrom
mgt/ir-mapping
Draft

feat: implement IR mapping functionality and integrate with formatter#473
Enter-tainer wants to merge 1 commit intomasterfrom
mgt/ir-mapping

Conversation

@Enter-tainer
Copy link
Collaborator

Summary

Changes

Checklist

Before submitting, please ensure you've done the following:

  • Updated CHANGELOG.md: Added your changes with examples to the changelog
  • Updated documentation: Updated relevant docs, examples, or README
  • Added tests: Added tests for new features or bug fixes

Testing

Additional Notes

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Enter-tainer, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive Intermediate Representation (IR) mapping capabilities to the formatter. This new functionality allows for precise tracking of how segments of the original source code correspond to parts of the generated pretty-printed IR. The changes span the core formatting logic, the WebAssembly interface, and the playground frontend, enabling users to visualize these mappings, which is crucial for debugging and understanding the formatter's behavior.

Highlights

  • Intermediate Representation (IR) Mapping: Introduced comprehensive IR mapping functionality within the typstyle-core crate, enabling the tracking of source code spans to their corresponding IR output spans.
  • Formatter Integration: Integrated IR mapping into the PrettyPrinter to record source spans during IR generation and added a new render_ir_with_mapping method to Formatter.
  • WebAssembly (WASM) Exposure: Exposed the IR mapping capability via a new format_ir_with_mapping function in the typstyle-wasm crate, facilitating its use in web environments.
  • Playground Visualization: Updated the playground frontend to visualize these IR mappings, enhancing the debugging and understanding of the formatting process by highlighting corresponding source and IR spans.
  • Dependency Management: Modified Cargo.toml and Cargo.lock to use a local path for the prettyless dependency, likely to support the new IR tagging features.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • Cargo.lock
    • Removed source and checksum for the prettyless package.
  • Cargo.toml
    • Changed prettyless dependency to a local path.
  • crates/typstyle-core/src/ir_mapping.rs
    • Added new module defining structures (IrAtomSource, IrAtomOutput, IrAtomRecorder) and functions (join_ir_mappings, remap_offsets_for_indent_change, build_indent_offset_map) for managing and processing IR span mappings.
  • crates/typstyle-core/src/lib.rs
    • Imported IR mapping types and functions.
    • Conditionally enabled IR mapping features.
    • Modified Formatter to support IR mapping, including a new render_ir_with_mapping method and a call to reset IR mapping state.
  • crates/typstyle-core/src/pretty/code_flow.rs
    • Updated PrettyPrinter methods to use emit_source_text_untyped for unary/binary operators and keywords, enabling IR mapping for these elements.
  • crates/typstyle-core/src/pretty/comment.rs
    • Modified convert_comment to use emit_source_text_untyped for line comments, tagging them for IR mapping.
  • crates/typstyle-core/src/pretty/markup.rs
    • Updated PrettyPrinter methods for headings, term items, and list items to use emit_source_text_untyped for markers and other elements, enabling IR mapping.
  • crates/typstyle-core/src/pretty/mod.rs
    • Added source and ir_atom_recorder fields to PrettyPrinter (feature-gated).
    • Introduced new_with_source constructor, reset_ir_mapping, snapshot_ir_atom_sources, and emit_source_text_untyped methods to manage and record IR atom sources.
  • crates/typstyle-core/src/pretty/text.rs
    • Updated convert_text and space handling to use emit_source_text_untyped, enabling IR mapping for text and spaces.
  • crates/typstyle-wasm/src/lib.rs
    • Added format_ir_with_mapping WASM function to expose IR mapping functionality, converting results to UTF-16 for JavaScript consumption.
  • playground/src/Playground.tsx
    • Updated the useMemo hook to check for and utilize formatter.irMapping when the active output is 'pir', enabling IR mapping visualization in the playground.
  • playground/src/hooks/use-cursor-sync.ts
    • Added setDecorationsMany function to handle multiple output ranges for a single source span.
    • Updated useCursorSync to use setDecorationsMany for IR mapping highlights.
  • playground/src/hooks/use-typst-formatter.ts
    • Added irMapping state to the Formatter interface and useTypstFormatter hook.
    • Integrated the new format_ir_with_mapping WASM function to populate the irMapping state.
  • playground/src/utils/offset-mapping.ts
    • Modified querySpanMapping, querySpanMappingReverse, findContainingSpan, and findContainingSpanReverse to provide a fallback to mapping[0] if no prev or next span is found, preventing potential undefined access.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request successfully implements IR mapping functionality, allowing for precise source-to-IR synchronization in the playground. The instrumentation of PrettyPrinter and the integration with prettyless's tagging system are well-executed. The remapping logic for indentation changes is clever and handles multi-byte characters correctly. However, there are a few critical and medium-level issues to address: a local path dependency in Cargo.toml that will break builds for others, a potential bug in line-ending handling for \r\n in the offset mapping logic, and an opportunity to optimize the joining of IR mappings by avoiding an unnecessary HashMap allocation.

[patch.crates-io]
# tinymist-world = { git = "https://github.com/Myriad-Dreamin/tinymist", rev = "0f9f4644567236bbfcd1b21baaa4b8bdf224ecb3" }
# prettyless = { git = "https://github.com/typstyle-rs/prettyless", rev = "9afa5bdece2dbf14c54ade8bef46cd0828cb952b" }
prettyless = { path = "../prettyless" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Using a local path dependency for prettyless will break the build for anyone who doesn't have the repository cloned in the exact same relative location. This should be changed to a version requirement (if the changes are published) or a git dependency with a specific revision.

Comment on lines +51 to +67
let src_by_id: HashMap<u32, (&usize, &usize)> = src_atoms
.iter()
.map(|s| (s.atom_id, (&s.src_start, &s.src_end)))
.collect();

let mut mapping: Vec<SpanMapping> = out_atoms
.iter()
.filter_map(|out| {
let (src_start, src_end) = src_by_id.get(&out.atom_id)?;
Some(SpanMapping {
src_start: **src_start,
src_end: **src_end,
out_start: out.out_start,
out_end: out.out_end,
})
})
.collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The HashMap here is unnecessary and adds overhead. Since IrAtomRecorder allocates atom_ids sequentially starting from 0 and pushes them into src_atoms, the atom_id is guaranteed to be the index of the atom in the src_atoms slice. You can directly look up the source atom by index, avoiding the map construction and lookups.

    let mut mapping: Vec<SpanMapping> = out_atoms
        .iter()
        .filter_map(|out| {
            let src = src_atoms.get(out.atom_id as usize)?;
            Some(SpanMapping {
                src_start: src.src_start,
                src_end: src.src_end,
                out_start: out.out_start,
                out_end: out.out_end,
            })
        })
        .collect();

Comment on lines +108 to +114
for line in text.lines() {
if !first {
// `lines()` splits on `\n`; preserve newline 1:1.
map[old_pos] = new_pos;
old_pos += 1;
new_pos += 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The build_indent_offset_map function assumes that line separators are always 1 byte long (i.e., \n). However, str::lines() also handles \r\n (2 bytes). If the input text contains \r\n, old_pos will drift and the mapping for subsequent lines will be incorrect. While the internal IR likely uses \n, it's safer to handle line endings robustly or explicitly normalize the input to ensure the mapping remains accurate across different environments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant