benoitdelorme/googlemeet-avatar

MeetAvatar

A voice bot that joins Google Meet calls, listens to participants, generates responses via LLM (Gemini), and speaks back with a synthesized voice (ElevenLabs).

How it works

  1. The bot joins a Google Meet call via headless Chromium (Playwright)
  2. It captures participant audio through WebRTC peer connections
  3. Voice activity detection (VAD) marks the end of an utterance after 1.2 s of silence
  4. Audio is transcribed via ElevenLabs STT (scribe_v1)
  5. The transcript is sent to Gemini to generate a reply
  6. The reply is synthesized to audio via ElevenLabs TTS
  7. The audio is injected back into the meeting via WebRTC track replacement

Participant speaks → WebRTC capture → VAD → ElevenLabs STT
    → Gemini LLM → ElevenLabs TTS → WebRTC playback
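
The end-of-speech step can be illustrated with a minimal energy-based detector. This is a sketch under assumptions, not the code in audio-pipeline.js: only the 1.2 s silence window comes from the steps above, while the RMS threshold, the frame format, and the names `createVad` and `rms` are hypothetical.

```javascript
// Hypothetical energy-based VAD. Only the 1.2 s silence window is taken
// from the README; the RMS threshold is an assumed value.
const SILENCE_MS = 1200;
const ENERGY_THRESHOLD = 0.01;

// Root-mean-square level of one frame of samples in [-1, 1].
function rms(frame) {
  let sum = 0;
  for (const sample of frame) sum += sample * sample;
  return frame.length ? Math.sqrt(sum / frame.length) : 0;
}

function createVad({ silenceMs = SILENCE_MS, threshold = ENERGY_THRESHOLD } = {}) {
  let speaking = false; // true once speech has been heard in this utterance
  let silentFor = 0;    // milliseconds of continuous silence after speech

  // Feed one frame and its duration in ms.
  // Returns true exactly once, when an utterance has just ended.
  return function push(frame, frameMs) {
    if (rms(frame) >= threshold) {
      speaking = true;
      silentFor = 0;
      return false;
    }
    if (!speaking) return false; // leading silence is ignored
    silentFor += frameMs;
    if (silentFor >= silenceMs) {
      speaking = false;
      silentFor = 0;
      return true;
    }
    return false;
  };
}
```

When `push` returns true, the frames buffered since speech began would be handed to the transcription step.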

Prerequisites

  • Node.js >= 18
  • A Google account (to bypass the waiting room on your own meetings)
  • API keys: ElevenLabs + Google Gemini

Installation

npm install
npx playwright install chromium

Configuration

Copy the example env file and fill in your keys:

cp .env.example .env

Variable              Description
ELEVENLABS_API_KEY    ElevenLabs API key
ELEVENLABS_VOICE_ID   Voice ID to use for TTS
ELEVENLABS_TTS_MODEL  TTS model (default: eleven_turbo_v2_5)
GEMINI_API_KEY        Google Gemini API key
GEMINI_MODEL          Gemini model (default: gemini-2.5-flash-lite)
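
As an illustration of how these variables might be read at startup, here is a small sketch. The two defaults are the ones listed above; the function name `loadConfig`, the returned field names, and the choice to treat the three variables without defaults as required are assumptions rather than the project's actual behavior.

```javascript
// Sketch only: `loadConfig` and its return shape are hypothetical.
// Defaults match the ones documented in the configuration table.
function loadConfig(env = process.env) {
  // Assumed to be required because the table lists no default for them.
  const required = ["ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID", "GEMINI_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    elevenLabsApiKey: env.ELEVENLABS_API_KEY,
    elevenLabsVoiceId: env.ELEVENLABS_VOICE_ID,
    elevenLabsTtsModel: env.ELEVENLABS_TTS_MODEL || "eleven_turbo_v2_5",
    geminiApiKey: env.GEMINI_API_KEY,
    geminiModel: env.GEMINI_MODEL || "gemini-2.5-flash-lite",
  };
}
```

Failing fast on a missing key surfaces configuration mistakes before the bot tries to join a call.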

Google login (one-time setup)

To let the bot join your meetings without getting stuck in the waiting room, log in to a Google account once:

npm run login

A browser window opens. Sign in to your Google account, then close the browser. The session is persisted in chrome-profile/ and reused by the headless bot.

Running

npm start

The server starts on http://localhost:3000.

API

Join a call

POST /join
Content-Type: application/json

{
  "meetId": "abc-defg-hij",
  "botName": "My Agent",
  "timeout": 60000
}

meetId accepts either a Google Meet ID or a full meeting URL. The endpoint returns immediately with status 202:

{ "sessionId": "uuid", "meetUrl": "https://...", "status": "joining" }
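
Accepting either an ID or a full URL implies some normalization step. The sketch below shows one way it could work; it is not taken from server.js. The helper name `toMeetUrl`, the assumed 3-4-3 letter code format, and the `meet.google.com` host check are all assumptions.

```javascript
// Assumed shape of a Meet code, e.g. "abc-defg-hij".
const MEET_CODE = /^[a-z]{3}-[a-z]{4}-[a-z]{3}$/;

// Hypothetical helper: turns a Meet ID or a full URL into a canonical URL.
function toMeetUrl(meetId) {
  const input = String(meetId).trim();
  let code = input;
  if (/^https?:\/\//i.test(input)) {
    const url = new URL(input);
    if (url.hostname !== "meet.google.com") {
      throw new Error(`Not a Google Meet URL: ${meetId}`);
    }
    // The first path segment carries the code; query strings are dropped.
    code = url.pathname.split("/").filter(Boolean)[0] || "";
  }
  code = code.toLowerCase();
  if (!MEET_CODE.test(code)) {
    throw new Error(`Not a Google Meet code: ${meetId}`);
  }
  return `https://meet.google.com/${code}`;
}
```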

Session status

GET /session/:sessionId

Possible statuses: idle | joining | waiting_room | in_meeting | error | closed
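
Because POST /join returns before the bot is actually in the call, a client usually has to poll this endpoint. A minimal polling sketch follows; the helper name `waitForSession`, the interval, and the attempt limit are assumptions, and it assumes the response body exposes a `status` field like the /join response does.

```javascript
// Statuses after which polling stops (taken from the list above).
const SETTLED = new Set(["in_meeting", "error", "closed"]);

// Hypothetical helper: calls `getStatus` until the session settles.
// `getStatus` is any async function that resolves to a status string.
async function waitForSession(getStatus, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (SETTLED.has(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the session to settle");
}

// Usage against the local server (assumes a `status` field in the JSON body):
// const status = await waitForSession(async () => {
//   const res = await fetch(`http://localhost:3000/session/${sessionId}`);
//   return (await res.json()).status;
// });
```

Passing the status lookup in as a function keeps the loop independent of the HTTP client in use.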

Leave

DELETE /session/:sessionId

List active sessions

GET /sessions

Full example

# Join
SESSION=$(curl -s -X POST http://localhost:3000/join \
  -H "Content-Type: application/json" \
  -d '{"meetId":"abc-defg-hij","botName":"TestBot"}' | jq -r .sessionId)

# Poll status
curl -s http://localhost:3000/session/$SESSION

# Leave
curl -X DELETE http://localhost:3000/session/$SESSION

Project structure

src/
  server.js          # Express server (REST API, session management)
  meet-bot.js        # Google Meet automation (Playwright, WebRTC)
  audio-pipeline.js  # VAD, STT, TTS, Gemini LLM
  login.js           # Google login helper (persistent Chrome profile)
avatar.png           # Reference avatar image
.env.example         # Environment variables template

Lip sync (previous iteration)

An earlier version of this project included real-time lip sync using MuseTalk. The setup was:

  • A separate Python FastAPI server (lipsync/server.py) running on port 8765
  • MuseTalk models (VAE, UNet, Whisper, face parsing) loaded on GPU/MPS/CPU
  • The Node.js bot would POST the TTS audio to /generate and receive back NDJSON-streamed mouth region crops (JPEG sprites at 25fps)
  • The browser would composite the static avatar image + animated mouth crops on a canvas, then inject the resulting video track into WebRTC

Pipeline with lip sync:

TTS audio → MuseTalk server → mouth crops (25fps NDJSON stream)
                                    ↓
Static avatar (fetched from /avatar) + mouth overlay → Canvas → WebRTC video track

This was removed to simplify the setup (no Python/GPU dependency). The current version is audio-only. The lip sync code can be found in the git history if needed.
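
For reference, consuming an NDJSON stream such as the one the lip sync server returned can be sketched as follows. This is a generic illustration, not the removed code: the name `createNdjsonParser` and the `frame` field used in the usage comment are hypothetical, and chunks are assumed to arrive as already-decoded strings.

```javascript
// Hypothetical incremental NDJSON parser: one JSON object per line,
// with lines possibly split across network chunks.
function createNdjsonParser(onObject) {
  let buffer = "";
  return {
    // Feed one decoded text chunk from the response stream.
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // keep the trailing partial line for the next chunk
      for (const line of lines) {
        if (line.trim() !== "") onObject(JSON.parse(line));
      }
    },
    // Call once the stream closes to flush a final line without a newline.
    end() {
      if (buffer.trim() !== "") onObject(JSON.parse(buffer));
      buffer = "";
    },
  };
}

// Usage:
// const parser = createNdjsonParser((crop) => console.log(crop.frame));
// parser.push('{"frame":0}\n{"fra');
// parser.push('me":1}\n');
// parser.end();
```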

About

A bot that connects to your Google account and that you can talk to.
