Integrate Copilot for Automated Language Translation #55917

Closed
jason810496 wants to merge 2 commits into apache:main from jason810496:feature/add-agent-translation-framework

Conversation

@jason810496
Member

@jason810496 jason810496 commented Sep 20, 2025

closes: #51975
related: #55604

Why

As noted in #55604 (review), integrating Copilot can help us automatically translate from the source language (English).

What

This change integrates Copilot with the dev/i18n/copilot_translations.py script.

  1. Added the --translate-with-copilot flag: This will translate only the TODO entries. Please ensure you run --add-missing beforehand.
uv run dev/i18n/check_translations_completeness.py --language zh-TW --translate-with-copilot
  2. Added the --with-copilot flag, which can be used together with the existing --add-missing flag: This will add TODO entries and translate them in the same CLI run.
uv run dev/i18n/check_translations_completeness.py --language zh-TW --add-missing --with-copilot

How

I used B00TK1D/copilot-api as a reference and refactored it to be class-based and more robust, including retry logic for API calls and improved error handling.

The authentication flow for the Copilot Translator is as follows:

  1. Obtain a Copilot Access Token:
    • This requires opening a browser and entering a device code.
    • The Copilot Access Token will be saved as .copilot_token.
    • Alternatively, users can copy their own Copilot Access Token and save it as .copilot_token.
  2. Obtain a Copilot Session Token (short-lived; it will auto-refresh if a 401 error is encountered).
  3. Use the Copilot Codex API for translation.
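
The refresh-on-401 behaviour in step 2 could be sketched roughly as below. This is not the code from the PR: `SessionTokenManager`, `UnauthorizedError`, and the injected `fetch_session_token` and `request` callables are hypothetical names used only for illustration, and no real Copilot endpoint is called.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class UnauthorizedError(Exception):
    """Raised by the (hypothetical) request callable when the server answers 401."""


@dataclass
class SessionTokenManager:
    # Exchanges the long-lived access token for a short-lived session token.
    fetch_session_token: Callable[[str], str]
    access_token: str
    _session_token: Optional[str] = None

    def token(self) -> str:
        # Fetch lazily and cache until invalidated.
        if self._session_token is None:
            self._session_token = self.fetch_session_token(self.access_token)
        return self._session_token

    def invalidate(self) -> None:
        self._session_token = None

    def call(self, request: Callable[[str], str], max_attempts: int = 2) -> str:
        """Run request(session_token); on 401, drop the token and try again."""
        for attempt in range(max_attempts):
            try:
                return request(self.token())
            except UnauthorizedError:
                if attempt == max_attempts - 1:
                    raise
                self.invalidate()
        raise ValueError("max_attempts must be at least 1")
```

Injecting the two callables keeps the retry logic testable without a browser, a device code, or a network connection.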

The prompts are structured as follows:

dev/i18n/prompts
├── global.jinja2
└── locales
    └── zh-TW.jinja2
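
One way a layout like this could be consumed is to always pick up the global template and add the locale template only when it exists. The helper name `prompt_files` is an assumption made for this sketch, not a function from the PR.

```python
from pathlib import Path
from typing import List


def prompt_files(prompts_dir: Path, language: str) -> List[Path]:
    """Return the global template, plus the locale template when it exists."""
    files = [prompts_dir / "global.jinja2"]
    locale_file = prompts_dir / "locales" / f"{language}.jinja2"
    if locale_file.is_file():
        files.append(locale_file)
    return files
```

For example, `prompt_files(Path("dev/i18n/prompts"), "zh-TW")` would return both files for the tree above, while a language without a locale file would get only the global template.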

The translation flow for a given language path is:

  1. Initialize the context: str variable as an empty string.
  2. Recursively traverse the JSON structure, updating the context with the current key (context=f"{context}.{key}").
  3. If a value starts with TODO: translate:, call Copilot to translate the value:
    1. Retrieve the language name.
    2. Load the global template as the prompt template.
    3. Render the prompt with language_name, value, key, and context.
    4. Append the language-specific prompt if available.
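
The traversal described above can be sketched as follows. This is a minimal illustration rather than the PR's implementation: `translate_todos` and the `translate` callable are hypothetical, and two details are assumptions (the TODO prefix is stripped before the text is handed over, and the context passed along already includes the current key).

```python
from typing import Any, Callable, Dict

TODO_PREFIX = "TODO: translate:"


def translate_todos(
    node: Dict[str, Any],
    translate: Callable[[str, str, str], str],
    context: str = "",
) -> Dict[str, Any]:
    """Return a copy of node with every TODO value replaced by translate(...)."""
    result: Dict[str, Any] = {}
    for key, value in node.items():
        # As in the description, the context grows with each key on the path.
        path = f"{context}.{key}"
        if isinstance(value, dict):
            result[key] = translate_todos(value, translate, path)
        elif isinstance(value, str) and value.startswith(TODO_PREFIX):
            # Assumption: only the English text after the prefix is translated.
            source_text = value[len(TODO_PREFIX):].strip()
            result[key] = translate(source_text, key, path)
        else:
            # Already-translated strings and non-string values pass through.
            result[key] = value
    return result
```

In this sketch the `translate` callable stands in for the prompt rendering and the Copilot call, so the traversal can be exercised with a stub.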

Demo

Screen.Recording.2025-09-20.at.2.44.12.PM.mov

Future Work

  • Documentation
  • Refine the global prompt for better translation quality.
  • Improve prompts for each language (depends on language reviewers).
  • Add a new flag or command to create a PR via script.
  • Support a --language all flag to create PRs for all languages at once.

@jason810496 jason810496 changed the title from "Integrate Copilot for auto Translation" to "Integrate Copilot for Automated Language Translation" Sep 20, 2025
@jason810496 jason810496 self-assigned this Sep 20, 2025
Contributor

@shahar1 shahar1 left a comment

Amazing job! Definitely my #PROTM :)
Got some comments - not something too drastic though.
If you could please add high-level instructions for using this ability in the tools section of the i18n policy, it would be great.

@@ -21,6 +21,8 @@
# dependencies = [
Contributor

@shahar1 shahar1 Sep 20, 2025

I think that we should start thinking about a new name for the script, as it does more than just "checking completeness" at this point (also, it's quite a long one) :)
Not urgent for now though - if it is acceptable, I'd prefer to do something about it after the upcoming Airflow Summit as I refer to this script in my talk.

Member Author

I was thinking about renaming it to just "tool.py" (or something simpler and more universal). Doing the rename in a follow-up PR will make the change easier to review.

Member

Not a fan of tool.py. It's as broad as utils.py. We should avoid it whenever possible.

complete_translations might be a bit better (?)

from jinja2 import Template


COPILOT_CLIENT_ID = "Iv1.b507a08c87ecfe98"
Contributor

@shahar1 shahar1 Sep 20, 2025

Where is this client ID taken from? I've managed to find references in Google, but not official ones.
It's worth documenting it here.

Comment on lines +321 to +322
"max_tokens": 2000,
"temperature": 0.1,
Contributor

We might want to make these parameters configurable
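
One possible shape for this, as a sketch only: keep the current values as defaults and allow overrides from the environment. The `CompletionSettings` class and the `I18N_COPILOT_*` variable names are made up for illustration and do not exist in the PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CompletionSettings:
    # Defaults mirror the values currently hard-coded in the request payload.
    max_tokens: int = 2000
    temperature: float = 0.1

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "CompletionSettings":
        # Hypothetical variable names, chosen only for this sketch.
        return cls(
            max_tokens=int(env.get("I18N_COPILOT_MAX_TOKENS", cls.max_tokens)),
            temperature=float(env.get("I18N_COPILOT_TEMPERATURE", cls.temperature)),
        )
```

CLI flags would work just as well; the point is only that the payload reads from one settings object instead of literals.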

@jason810496
Member Author

No problem! Thanks @shahar1 for review 🙌

@potiuk
Member

potiuk commented Oct 3, 2025

Few comments:

This PR is a bit too complex and makes (IMHO) the usability of copilot translation a bit less than the one embedded in IDEs / with interactivity. I am not sure if we want to do it - this might - paradoxically - make the translations less good because people will be too "lazy" to review them and it will not be as easy to review as with interactive copilot sessions.

I think the biggest value of the IDE-driven translation is the UI/UX where you see together the original (with TODO) and translation and you can individually approve blocks of translation or even modify them directly in the editor. This is why running "please translate all todo: phrases" in IntelliJ or VSCode integrated copilot is so nice, because you get HITL (Human-In-The-Loop) where you can review and look at the translations individually after they are done. This automated translation here has the drawback, that when you run automated translation like that and even try to review it in the commit, you do not see original English text.

This is what you see when you do translations in IntelliJ's copilot:

Screenshot 2025-10-03 at 02 24 52

I can individually review before/after as well as even manually modify the translations. And it's really good UX.

I think there is a case where we want to translate "all languages" - when we want to open single PR for all translators (I tried it before the release) - but I think this should be reserved only for "all languages" not for single language (there interactive approach is way better).

And even if we want to do it automatically - all the token/etc. is not needed. Recently GitHub released "copilot" cli, I tested it and it works very well. In interactive mode it can even provide similar green/red display in the terminal (without capability of correcting it manually yet - but this is likely something they will add). It's just enough to start copilot CLI with subprocess and it will do all the gh token handling thing.

copilot --allow-all-tools --add-dir . -p 'Please translate the remaining "TODO: translate" in airflow-core/src/airflow/ui/public/i18n/locales/pl/'

Also, there is no need to split the JSON and do individual translations: it's way faster and better (more relevant context) if you pass the whole file and ask the AI to translate all TODO: entries in it. It does not even have to work on individual strings; the AI is capable of finding the right strings on its own.
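
A minimal sketch of the subprocess approach, assuming the `copilot` CLI is installed and already authenticated. The flags and prompt wording are taken from the command above; the function names `build_copilot_command` and `run_copilot` are hypothetical.

```python
import subprocess
from typing import List


def build_copilot_command(locale_dir: str, repo_root: str = ".") -> List[str]:
    """Build the argument list for the Copilot CLI invocation shown above."""
    prompt = f'Please translate the remaining "TODO: translate" in {locale_dir}'
    return ["copilot", "--allow-all-tools", "--add-dir", repo_root, "-p", prompt]


def run_copilot(locale_dir: str) -> int:
    """Run the Copilot CLI for one locale directory and return its exit code."""
    completed = subprocess.run(build_copilot_command(locale_dir), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting issues with the prompt, and the CLI handles its own token management.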

@shahar1
Contributor

shahar1 commented Oct 3, 2025

This PR is a bit too complex and makes (IMHO) the usability of copilot translation a bit less than the one embedded in IDEs / with interactivity. I am not sure if we want to do it - this might - paradoxically - make the translations less good because people will be too "lazy" to review them and it will not be as easy to review as with interactive copilot sessions.

I think the biggest value of the IDE-driven translation is the UI/UX where you see together the original (with TODO) and translation and you can individually approve blocks of translation or even modify them directly in the editor. This is why running "please translate all todo: phrases" in IntelliJ or VSCode integrated copilot is so nice, because you get HITL (Human-In-The-Loop) where you can review and look at the translations individually after they are done. This automated translation here has the drawback, that when you run automated translation like that and even try to review it in the commit, you do not see original English text.

This is what you see when you do translations in IntelliJ's copilot:

Screenshot 2025-10-03 at 02 24 52

I can individually review before/after as well as even manually modify the translations. And it's really good UX.

I understand your concerns regarding the complexity of the PR in its current state, and I'm sure that Jason will do his best to simplify it. However, regarding the latter statement: eventually the responsibility for reviewing lies with the Translation Owners (or, if it's their own PR, their translation peers), so even if the entire PR was made by running a single CLI command, we do expect the reviewer(s) to approve the quality of each and every term in the translation (they could use tooling for that as well). Of course, after running the command, the author should do it as well, but I don't think that we need to limit them only to the method that you suggested. From my personal experience (even before introducing --add-missing), I did quite well by opening the English translation side by side with the Hebrew/Arabic one and translating line by line, like this:
image
Working with the Copilot plugin has its perks, but I have to say that it's often quite buggy, so mileage might differ regarding UX.

I think there is a case where we want to translate "all languages" - when we want to open single PR for all translators (I tried it before the release) - but I think this should be reserved only for "all languages" not for single language (there interactive approach is way better).

Agree about this one, although we need to think about improving the reviewing process in these cases*. For now it would be nice to have this ability as part of the script.

* - the more translations we have, the more reviewers we'll need in the same PR, which might create a bottleneck (reviewers' availability/GitHub limitations for max. num of reviewers)...but that's for another thread :)

And even if we want to do it automatically - all the token/etc. is not needed. Recently GitHub released "copilot" cli, I tested it and it works very well. In interactive mode it can even provide similar green/red display in the terminal (without capability of correcting it manually yet - but this is likely something they will add). It's just enough to start copilot CLI with subprocess and it will do all the gh token handling thing.

copilot --allow-all-tools --add-dir . -p 'Please translate the remaining "TODO: translate" in airflow-core/src/airflow/ui/public/i18n/locales/pl/'

It would be great to give this one a shot! (what are the odds that we'll also be able to use it in the CI at some point?)

@jason810496
Member Author

jason810496 commented Oct 5, 2025

Thanks for the feedback Jarek! If this is the case, I will replace the "CopilotTranslator" class with just a subprocess call to the Copilot CLI. And I agree: the previous implementation was really a workaround from before the Copilot CLI was released.

IMO, the prompts structure should still be useful even if we want to switch to Copilot CLI, because we could standardize the prompt and customize prompts for each language in a structured way.

@potiuk
Member

potiuk commented Oct 6, 2025

@jason810496 :

Thanks for the feedback Jarek! If this is the case, I will replace the "CopilotTranslator" class with just a subprocess call to the Copilot CLI. And I agree: the previous implementation was really a workaround from before the Copilot CLI was released.

Yeah. I think that's the most important part - it's simply way simpler to get the same result.

IMO, the prompts structure should still be useful even if we want to switch to Copilot CLI, because we could standardize the prompt and customize prompts for each language in a structured way.

I am not so sure we need anything else than "translate the TODO: following translations already present". I think we do not have to provide a lot of "manual/per language" context on how to translate each language. It's not needed IMHO, simply because AI is pretty good at finding the rules based on the context. Pretty much all the translations we are doing are incremental, based on hundreds of already made translations in the .json files.

There are basically two stages:

  • you do a first-time translation: you do not yet know the rules, you translate for the first time, and the rules are created as you do it. This happens exactly once per language.
  • you do an incremental translation: you already have hundreds of translations done and you want to add a few more that were added since last time.

This is quite a special case where the "solution space" is very limited and the task is very simple. To be honest, if we need to add any more context and prompt in this case than "follow translations already done", this means that the AI is not doing its job well: it should figure out all the rules that were already applied on its own and apply them well. This is precisely what AI models are supposed to do. They excel at it.

But of course maybe it's a good idea for the initial translation to add some prompts like "do it in a way that uses less space in the UI" etc., so maybe it makes sense for the first-time run (but I am not sure it should be different per language, or that people who add languages will know what to put in their "per language" prompt). But this can also be done interactively in the first translation, simply because once we have translated a few hundred of those strings, the AI should learn from the context and pick up the style and approach from the already translated messages without the need for additional context. Or so I think at least :)

@shahar1

I did quite well by opening the English translation side by side with the Hebrew/Arabic one and translating line by line, like this:

One potential issue I see with it is that while it's fine for bulk translation, it's not really good for incremental translations, especially when we skip the "TODO:" phase. What you compare then, when you compare the en and target-language files, are two files with completed translations, and you don't really see what has changed when you look at them. You simply have to mentally do two comparisons:

  • find which translations were added (easy to do when you compare the changed file with the previous version)
  • compare those new translations with the English ones

I don't think that the current IDEs or manual approaches can help with easily doing both comparisons at the same time. And the Copilot interactive "accept" view does exactly this: it shows you what changed and what the English phrase was (with TODO:), and allows you to approve/reject (or even correct) with a single click and move to the next change. Note that each incremental change will often contain several changes.
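
For a script-only workflow, the two comparisons could in principle be combined into one listing, as in the rough sketch below. It is not part of the PR and is not a replacement for the interactive view; `flatten` and `review_rows` are hypothetical helpers that operate on already-parsed JSON dictionaries.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten(node: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Flatten nested translation dicts into {"a.b.c": "text"}."""
    flat: Dict[str, str] = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


def review_rows(
    english: Dict[str, Any], before: Dict[str, Any], after: Dict[str, Any]
) -> Iterator[Tuple[str, str, str]]:
    """Yield (key path, English text, new translation) for each changed entry."""
    en, old, new = flatten(english), flatten(before), flatten(after)
    for path, translated in new.items():
        if old.get(path) != translated:
            yield path, en.get(path, ""), translated
```

Each yielded row pairs a changed translation with its English source, which is the information a reviewer would otherwise have to assemble from two separate diffs.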

Just to summarize: I am not against doing it, but I doubt it will make things easier :). But maybe it's just me. We can always add this option to auto-translate and ask translators if this is good.

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Nov 21, 2025
@jason810496 jason810496 removed the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Nov 21, 2025
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Jan 23, 2026
@jason810496 jason810496 removed the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Jan 23, 2026
@shahar1
Contributor

shahar1 commented Jan 24, 2026

Now that I have some more "hands-on" experience with AI agents, I would like to re-iterate on this one -
I think that instead of maintaining a full script, it might be better to store the prompts within .github/prompts, and maybe we could define a "skill" of translation (maybe per language) that agents will utilize to ensure consistent translation. Feel free to explore that area :)

@jason810496
Member Author

Now that I have some more "hands-on" experience with AI agents, I would like to re-iterate on this one - I think that instead of maintaining a full script, it might be better to store the prompts within .github/prompts, and maybe we could define a "skill" of translation (maybe per language) that agents will utilize to ensure consistent translation. Feel free to explore that area :)

Yes, I very much agree with that after the recent agentic evolution! I also feel that having "skills" is a more appropriate approach than maintaining an additional full script.

@potiuk
Member

potiuk commented Feb 14, 2026

Feel free to try :)

Development

Successfully merging this pull request may close these issues.

Implement an Agent Skill for UI translations

4 participants