feat: decouple frontend to GitHub Pages and add auto-sleep/wake flow #123

Open
MostlyKIGuess wants to merge 1 commit into main from aws-optimizations
Conversation

@MostlyKIGuess
Member

Implements the plan in #90.

Frontend (docs/, served by GitHub Pages):

  • New static pages: index, dashboard, oauth-login, admin-login, request-key, admin. Each page loads config.js + auth.js + api.js + wake.js so the site works with no build step.
  • assets/js/config.js: runtime-editable API_BASE_URL, WAKE_LAMBDA_URL and WAKE_TOKEN so the same bundle deploys to staging/prod by swapping one file.
  • assets/js/auth.js: stores the API key in localStorage, captures ?api_key= from the OAuth redirect and strips it from the URL so it does not land in history/Referer.
  • assets/js/api.js: thin fetch wrapper that injects X-API-Key and hands network errors to the wake banner.
  • assets/js/wake.js: auto-shows a "Wake Up Server" banner on network failure, POSTs to the Lambda URL with X-Wake-Token, then polls /api/health every 5s until the backend responds and reloads the page.
  • assets/js/dashboard.js + admin.js: port the existing Jinja dashboard and admin-panel logic to pure JSON calls.

Backend (REST surface for the static frontend):

  • GET /api/health - cheap probe for the wake poller.
  • GET /api/user - returns name/email/api_key/can_change_model/picture/quota for the caller's X-API-Key so the dashboard can render without Jinja.
  • POST /api/request-key - JSON version of the key-request form.
  • GET /api/admin/keys - pending/approved/denied keys as JSON for the static admin panel. The existing POST /admin/approve|deny|toggle-* endpoints already accept X-API-Key via get_current_user.
  • OAuth /auth/github and /auth/google now accept a ?frontend_redirect= query param, validate its origin against ALLOWED_ORIGINS (open-redirect guard), and stash it in the session; the callback then hands the user back to that URL with ?api_key= appended. Falls back to settings.FRONTEND_URL when the param is not provided.
  • CORS switched from the broken allow_credentials + "*" combo to an explicit allow-list driven by ALLOWED_ORIGINS.
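For reviewers, the open-redirect guard described above can be sketched roughly as follows. This is a framework-free illustration, not the code in this PR: the helper names (`origin_of`, `resolve_redirect`, `with_api_key`) and the example origin are hypothetical, and falling back to FRONTEND_URL for a disallowed origin (rather than rejecting the request) is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values; the real ones come from ALLOWED_ORIGINS / FRONTEND_URL.
ALLOWED_ORIGINS = {"https://example.github.io"}
FRONTEND_URL = "https://example.github.io/dashboard.html"


def origin_of(url: str) -> str:
    """Return scheme://netloc for a URL, lowercased."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def resolve_redirect(frontend_redirect: Optional[str]) -> str:
    """Honor the requested redirect only if its origin is allow-listed."""
    if frontend_redirect and origin_of(frontend_redirect) in ALLOWED_ORIGINS:
        return frontend_redirect
    # Assumption: missing or disallowed values both fall back to FRONTEND_URL.
    return FRONTEND_URL


def with_api_key(url: str, api_key: str) -> str:
    """Append ?api_key= to the URL, preserving any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Comparing the whole origin (scheme plus netloc) rather than doing a prefix match on the URL string means inputs such as `https://example.github.io@evil.example/` do not pass the check.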

Config / layout:

  • New settings: FRONTEND_URL, ALLOWED_ORIGINS (both in .example.env).
  • Moved RAG PDFs out of docs/ (which is now the GH Pages root) into rag_docs/. DOC_PATHS default and Dockerfile COPY updated to match.
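As a rough illustration of how an ALLOWED_ORIGINS setting could drive the CORS allow-list, here is a minimal sketch. It assumes a comma-separated value in the env file, which this PR description does not confirm, and the function names are hypothetical rather than taken from the backend.

```python
from typing import Dict, List


def parse_allowed_origins(raw: str) -> List[str]:
    """Split a comma-separated setting into clean origins (assumed format)."""
    return [item.strip().rstrip("/") for item in raw.split(",") if item.strip()]


def cors_headers(request_origin: str, allowed: List[str]) -> Dict[str, str]:
    """Echo the origin back only when it is on the allow-list."""
    if request_origin in allowed:
        return {
            "Access-Control-Allow-Origin": request_origin,
            "Access-Control-Allow-Credentials": "true",
            "Vary": "Origin",
        }
    # Unknown origins get no CORS headers, so the browser blocks the response.
    return {}
```

Echoing a specific origin is what makes credentialed requests work: browsers reject a response that combines `Access-Control-Allow-Origin: *` with `Access-Control-Allow-Credentials: true`.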

AWS (aws/):

  • aws/lambda/wake_server.py: Lambda, handler=handler. Validates X-Wake-Token, describes the instance, calls ec2:StartInstances only when state=stopped. Returns CORS headers so GH Pages can call it from the browser.
  • aws/README.md: step-by-step for the CloudWatch "CPUUtilization<5% for 30 min -> Stop" alarm, the IAM role, Lambda creation, Function URL + reserved concurrency, and how to wire the frontend + backend envs together. Also captures the known limitation (public WAKE_TOKEN) and the zong0728 follow-up idea of per-IP throttling via DynamoDB.
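The wake flow described above (validate the token, describe the instance, start it only when stopped) might look roughly like the sketch below. It is not the contents of aws/lambda/wake_server.py: the `wake` function, its parameters, the response bodies, and the example origin are assumptions. The EC2 client is passed in so the logic can be exercised without AWS; in a real Lambda it would be `boto3.client("ec2")`, whose `describe_instances` and `start_instances` calls take `InstanceIds`.

```python
import hmac
import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "https://example.github.io",  # assumed origin
    "Access-Control-Allow-Headers": "X-Wake-Token",
    "Access-Control-Allow-Methods": "POST, OPTIONS",
}


def respond(status, body):
    """Build a Lambda proxy-style response that always carries CORS headers."""
    return {"statusCode": status, "headers": CORS_HEADERS, "body": json.dumps(body)}


def wake(event, ec2, instance_id, expected_token):
    """Start the instance only if the token matches and the state is 'stopped'."""
    # Header names are matched case-insensitively.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    token = headers.get("x-wake-token", "")
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return respond(403, {"error": "invalid token"})

    described = ec2.describe_instances(InstanceIds=[instance_id])
    state = described["Reservations"][0]["Instances"][0]["State"]["Name"]
    if state == "stopped":
        ec2.start_instances(InstanceIds=[instance_id])
        return respond(200, {"state": "starting"})
    # Already running, pending, or stopping: report the state and do nothing.
    return respond(200, {"state": state})
```

Because StartInstances is only called from the stopped state, repeated clicks on the wake banner while the instance is booting do not issue extra start calls.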

Closes #90 (pending the Phase 2 CloudWatch alarm, which is an AWS-console step documented in aws/README.md).

Development

Successfully merging this pull request may close these issues.

Eliminate Idle EC2 Costs along with Simplyfying the Architecture
