feat: generate rich profile READMEs with badges, repos, and contributions #3962
marcusquinn merged 1 commit into main
…ions

cmd_init now calls _generate_rich_readme(), which fetches the user's GitHub profile data and auto-generates a README with:
- Language badges auto-detected from the user's repos (23 languages supported)
- Common tooling badges (Docker, Linux, Git)
- Bio line from GitHub profile
- Projects section (non-fork repos with descriptions)
- Contributions section (forks linked to upstream parent URLs)
- Connect section (blog, Twitter/X, GitHub follow badges)
- STATS markers for aidevops auto-update

Replaces the bare template that only had stat markers.
Summary of Changes (Gemini Code Assist)
This pull request significantly upgrades the profile README generation capability within the system. Instead of a minimal template, the …
Code Quality Report
Generated on: Mon Mar 9 05:36:59 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Code Review
This pull request introduces a feature to generate rich profile READMEs. However, it introduces a security vulnerability related to Markdown injection and potential Cross-Site Scripting (XSS) due to a lack of proper output encoding for user-controlled fields from the GitHub API embedded directly into the generated README.md file. Additionally, there are performance issues in how the GitHub API data is processed, specifically an N+1 API call problem when handling forked repositories and inefficient use of jq calls within loops.
    local contrib_repos=""
    while IFS= read -r row; do
        [[ -z "$row" ]] && continue
        local rname rdesc rurl
        rname=$(echo "$row" | jq -r '.name')
        rdesc=$(echo "$row" | jq -r '.description // "No description"')
        rurl=$(echo "$row" | jq -r '.html_url')
        # Try to get parent repo URL
        local parent_url
        parent_url=$(gh api "repos/${gh_user}/${rname}" --jq '.parent.html_url // empty' 2>/dev/null)
        if [[ -n "$parent_url" ]]; then
            contrib_repos="${contrib_repos}- **[${rname}](${parent_url})** -- ${rdesc}"$'\n'
        else
            contrib_repos="${contrib_repos}- **[${rname}](${rurl})** -- ${rdesc}"$'\n'
        fi
    done < <(echo "$repos_json" | jq -c '.[] | select(.fork == true)')
This loop is inefficient and will perform poorly for users with many forked repositories. It suffers from two issues:
- N+1 API calls: It makes a separate gh api call for every forked repository to fetch the parent URL, which is already available in the initial repos_json data.
- Multiple jq invocations: It calls jq multiple times inside the loop for each repository.

You can resolve both issues by using a single jq command to process the repos_json data, which is significantly more performant and simplifies the code.
Suggested change (replaces the loop above with a single jq pass):

    contrib_repos=$(echo "$repos_json" | jq -r '.[] | select(.fork == true) | "- **[\(.name)](\(.parent.html_url // .html_url))** -- \(.description // "No description")"')
References
- Consolidate multiple 'jq' calls into a single pass where possible to improve performance and script efficiency.
- Prefer using URLs provided directly in API responses (e.g., 'html_url', 'web_url') instead of manually constructing them via string concatenation to ensure robustness.
        own_repos="${own_repos}- **[${rname}](${rurl})** -- ${rdesc}"$'\n'
    done < <(echo "$repos_json" | jq -c ".[] | select(.fork == false and .name != \"${gh_user}\")")

    # Build contributions section (forks with description)
    local contrib_repos=""
    while IFS= read -r row; do
        [[ -z "$row" ]] && continue
        local rname rdesc rurl
        rname=$(echo "$row" | jq -r '.name')
        rdesc=$(echo "$row" | jq -r '.description // "No description"')
        rurl=$(echo "$row" | jq -r '.html_url')
        # Try to get parent repo URL
        local parent_url
        parent_url=$(gh api "repos/${gh_user}/${rname}" --jq '.parent.html_url // empty' 2>/dev/null)
        if [[ -n "$parent_url" ]]; then
            contrib_repos="${contrib_repos}- **[${rname}](${parent_url})** -- ${rdesc}"$'\n'
        else
            contrib_repos="${contrib_repos}- **[${rname}](${rurl})** -- ${rdesc}"$'\n'
This section embeds user-controlled repository names (rname), descriptions (rdesc), and URLs (rurl, parent_url) directly into the Markdown content without proper sanitization. This creates a risk of Markdown injection and potential Cross-Site Scripting (XSS) if an attacker provides malicious data. Additionally, the current while loop invokes jq multiple times for each repository, leading to inefficiency. The suggested code addresses the performance by consolidating jq calls to generate the markdown list directly, but further sanitization of user-controlled fields is recommended to mitigate the XSS vulnerability.
Suggested change:

    own_repos=$(echo "$repos_json" | jq -r ".[] | select(.fork == false and .name != \"${gh_user}\") | \"- **[\\(.name)](\\(.html_url))** -- \\(.description // \"No description\")\"")
References
- Consolidate multiple 'jq' calls into a single pass where possible to improve performance and script efficiency.
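The sanitization recommended in the comment above could take roughly the following shape. This is a minimal sketch, not the helper that was eventually merged: the function name sanitize_md_sketch and the exact set of escaped characters are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: backslash-escape characters that carry Markdown or HTML
# meaning so that API-supplied text renders literally inside a list item.
sanitize_md_sketch() {
    local text="$1" out="" ch i
    for ((i = 0; i < ${#text}; i++)); do
        ch="${text:i:1}"
        case "$ch" in
        # Keep the value on a single line.
        $'\n' | $'\r' | $'\t') out+=" " ;;
        # Escape link, emphasis, code, HTML, and table delimiters.
        '\' | '[' | ']' | '(' | ')' | '*' | '_' | '`' | '<' | '>' | '|') out+="\\${ch}" ;;
        *) out+="$ch" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Example: a repository description that tries to smuggle in a link.
sanitize_md_sketch 'a [x](javascript:alert(1)) *b*'
```

Escaping rather than deleting keeps the original text readable while preventing it from closing the surrounding link or starting a new one.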
    Haskell) echo '' ;;
    Vue) echo '' ;;
    Svelte) echo '' ;;
    *) echo "" ;;
        connect="${connect}[](${blog})"$'\n'
    fi
    if [[ -n "$twitter" ]]; then
        connect="${connect}[](https://twitter.com/${twitter})"$'\n'
The blog and twitter fields from the user's GitHub profile are used to construct Markdown links and badges without sanitization. A malicious user could set these fields to javascript: URIs or craft them to break the Markdown structure. While GitHub's own sanitization may prevent XSS, this still allows for Markdown injection and phishing.
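One way to guard against the issue described above is to accept only http(s) URLs and to reject characters that could terminate the Markdown link early. The sketch below is illustrative only: the names sanitize_url_sketch and sanitize_handle_sketch, and the rejection rules, are assumptions rather than the project's actual helpers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the URL only if it looks safe to embed in ](...).
# Prints nothing and returns 1 otherwise.
sanitize_url_sketch() {
    local url="$1"
    # Allow-list the scheme; this rejects javascript:, data:, and similar URIs.
    case "$url" in
    http://* | https://*) ;;
    *) return 1 ;;
    esac
    # Reject whitespace and characters that could close the link or open HTML.
    case "$url" in
    *[[:space:]]* | *'('* | *')'* | *'['* | *']'* | *'<'* | *'>'* | *'"'* | *'`'* | *'\'*) return 1 ;;
    esac
    printf '%s\n' "$url"
}

# Hypothetical helper: keep a Twitter/X handle only if it is 1 to 15
# letters, digits, or underscores.
sanitize_handle_sketch() {
    local handle="$1"
    [[ "$handle" =~ ^[A-Za-z0-9_]{1,15}$ ]] || return 1
    printf '%s\n' "$handle"
}

# Example: only emit the blog link when the URL passes validation.
if blog=$(sanitize_url_sketch 'https://example.com/blog'); then
    printf '[Blog](%s)\n' "$blog"
fi
```

Rejecting suspicious values outright, instead of trying to repair them, keeps the rules short and easy to audit; a profile without a blog link is a safer failure mode than a link to an unexpected target.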
    echo "# ${display_name}"
    echo ""
    if [[ -n "$bio" ]]; then
        echo "**${bio}**"
    display_name=$(echo "$user_json" | jq -r '.name // empty' 2>/dev/null)
    display_name="${display_name:-$gh_user}"
    bio=$(echo "$user_json" | jq -r '.bio // empty' 2>/dev/null)
    blog=$(echo "$user_json" | jq -r 'select(.blog != null and .blog != "") | .blog' 2>/dev/null)
    twitter=$(echo "$user_json" | jq -r 'select(.twitter_username != null and .twitter_username != "") | .twitter_username' 2>/dev/null)
This section has a couple of areas for improvement:
- Error suppression: The jq calls suppress stderr with 2>/dev/null. This can hide important errors and goes against the project's general rules.
- Inefficient parsing: There are four separate jq calls to parse different fields from user_json. This can be consolidated into a single, more efficient call.

Here is a suggested refactoring that addresses both points.
Suggested change:

    IFS=$'\n' read -r display_name bio blog twitter < <(echo "$user_json" | jq -r '(.name // ""), (.bio // ""), (.blog // ""), (.twitter_username // "")' || printf '\n\n\n\n')
    display_name="${display_name:-$gh_user}"
References
- Consolidate multiple 'jq' calls into a single pass where possible to improve performance and script efficiency.
- In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.
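A caveat when consolidating several fields into one read: read consumes a single line, so newline-separated jq output would populate only the first variable. The sketch below assumes that jq instead joins the four fields with the ASCII unit separator (0x1F); because that character is not IFS whitespace, empty fields are preserved rather than collapsed. The function name, the delimiter choice, and the jq filter shown in the comment are illustrative assumptions, not the code that was merged.

```shell
#!/usr/bin/env bash
# Hypothetical parser for a single line of profile fields joined by the
# ASCII unit separator (0x1F). Assumed producer, shown for context only:
#   jq -r '[.name, .bio, .blog, .twitter_username]
#          | map((. // "") | gsub("[\t\r\n]"; " ")) | join("\u001f")'
parse_profile_line() {
    # Assigns display_name, bio, blog, and twitter in the caller's scope.
    # 0x1F is not IFS whitespace, so adjacent separators yield empty fields.
    IFS=$'\x1f' read -r display_name bio blog twitter
}

gh_user="octocat" # example fallback value
# Process substitution keeps the assignments in the current shell;
# a pipeline would run the function in a subshell and lose the variables.
parse_profile_line < <(printf 'Mona\x1f\x1fhttps://example.com\x1f\n')
display_name="${display_name:-$gh_user}"
printf 'name=%s bio=%s blog=%s twitter=%s\n' "$display_name" "$bio" "$blog" "$twitter"
```

A tab delimiter would be simpler to emit, but tab counts as IFS whitespace, so consecutive tabs around an empty bio would shift the later fields into the wrong variables.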
…#3963)

* fix: address Gemini code review feedback on profile README generation
  - Consolidate 4 separate jq calls for user profile into single pass with tab-delimited output, remove stderr suppression (Gemini #6)
  - Consolidate own repos loop into single jq pass, eliminating per-row jq invocations (Gemini #2)
  - Replace sequential N+1 gh api calls for fork parent URLs with parallel xargs -P 6 batch fetch (Gemini #1)
  - Add _sanitize_md() and _sanitize_url() helpers to sanitize user-controlled fields (display_name, bio, blog, twitter) before embedding in markdown, preventing markdown injection and javascript: URI attacks (Gemini #4, #5)

  Ref: PR #3962 review comments from gemini-code-assist

* fix: address second round of Gemini review feedback
  - Remove 2>/dev/null from xargs fork fetch (|| true suffices)
  - Tighten _sanitize_url to reject markdown-breaking chars in URLs using glob patterns (bash regex [^...] with escaped parens is unreliable across bash versions)
  - Strip tabs/newlines from jq user profile output to prevent tab-delimiter injection in bio/description fields
  - Sanitize repo names and descriptions in both own repos (jq gsub) and fork repos (_sanitize_md) before markdown embedding
  - Keep printf '%s\n' for own_repos (bash $() strips trailing newlines, so the explicit \n is needed for section spacing)

  Ref: PR #3963 review comments from gemini-code-assist



Summary
cmd_init now generates a rich profile README by fetching the user's GitHub profile data via gh api, instead of seeding a bare template with only stat markers.

Changes
New functions in profile-readme-helper.sh:
- _lang_badge(): maps a language name to shields.io badge markdown (23 languages + generic fallback)
- _generate_rich_readme(): fetches the GitHub user profile and repos, then generates a full README with badges, projects, contributions, connect section, and STATS markers

Modified:
- cmd_init(): calls _generate_rich_readme() instead of writing a bare template

Testing
johnwaldo: correctly detected Ruby/Shell/TypeScript, listed 1 own project, 6 contributions with upstream parent URLs, GitHub follow badge (no blog/twitter)
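
For orientation, a language-to-badge mapping of the kind _lang_badge() is described as providing might look like the sketch below. The function name lang_badge_sketch, the three languages shown, the logo slugs, and the hex colours are placeholders chosen for illustration; only the general shields.io static badge URL shape (https://img.shields.io/badge/MESSAGE-COLOR) is relied upon.

```shell
#!/usr/bin/env bash
# Hypothetical mapping from a language name to shields.io badge Markdown.
lang_badge_sketch() {
    local lang="$1" color logo
    case "$lang" in
    Shell) color="4EAA25" logo="gnubash" ;;
    Ruby) color="CC342D" logo="ruby" ;;
    TypeScript) color="3178C6" logo="typescript" ;;
    # Generic fallback: neutral colour, no logo.
    *) color="555555" logo="" ;;
    esac

    # shields.io uses "-" as a field separator and "_" as a space, so both
    # are doubled to appear literally; "+" and "#" are percent-encoded.
    local label="$lang"
    label="${label//-/--}"
    label="${label//_/__}"
    label="${label// /_}"
    label="${label//+/%2B}"
    label="${label//\#/%23}"

    local url="https://img.shields.io/badge/${label}-${color}?style=flat"
    if [[ -n "$logo" ]]; then
        url="${url}&logo=${logo}&logoColor=white"
    fi
    printf '![%s](%s)\n' "$lang" "$url"
}

# Example: emit one badge per detected language.
for detected in Ruby Shell TypeScript; do
    lang_badge_sketch "$detected"
done
```

Keeping the mapping in a single case statement means that supporting another language is a one-line change, and unknown languages still receive a usable badge through the fallback branch.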