
Test #400

Open
gagan-a11y wants to merge 138 commits into Zackriya-Solutions:main from gagan-a11y:test

Conversation

@gagan-a11y

Description

[Provide a detailed description of your changes]

Related Issue

[Link to the issue this PR addresses (e.g., "Fixes #123")]

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please describe)

Testing

  • Unit tests added/updated
  • Manual testing performed
  • All tests pass

Documentation

  • Documentation updated
  • No documentation needed

Checklist

  • Code follows project style
  • Self-reviewed the code
  • Added comments for complex code
  • Updated README if needed
  • Branch is up to date with devtest
  • No merge conflicts

Screenshots (if applicable)

[Add screenshots here if your changes affect the UI]

Additional Notes

[Add any additional information that might be helpful for reviewers]

- Add PRD with timeline and architecture diagrams
- Add progress report tracking Phase 0 completion
- Add tech stack guide explaining 24 technologies
- Add Phase 1 implementation plan
- Add documentation index and navigation
- Browser audio capture via MediaRecorder API
- WebSocket streaming to backend
- ffmpeg WebM→WAV conversion
- Whisper integration with multilingual support
- Test UI at /test-audio
- Real-time transcription (~2-3s latency)
- Remove Tauri configuration and build files
- Delete Tauri-specific dependencies
- Clean up related scripts and references
- Add Groq Whisper API integration for low-latency transcription (~1-2s)
- Implement AudioWorklet for real-time 48kHz to 16kHz downsampling
- Add StreamingTranscriptionManager with VAD and rolling buffer
- Replace batch processing with continuous PCM streaming
- Remove old batch audio processing code and test pages
- Update documentation for Phase 1.5 completion
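The 48 kHz to 16 kHz downsampling above runs in a JavaScript AudioWorklet in the actual PR; as a minimal sketch of the core step, the 3:1 decimation can be expressed like this (function name and the naive averaging are illustrative, not the PR's implementation):

```python
def downsample_48k_to_16k(samples):
    # Naive 3:1 decimation: average each group of three 48 kHz samples into
    # one 16 kHz sample. A production worklet would apply a low-pass filter
    # first to avoid aliasing; this only shows the rate conversion.
    out = []
    limit = len(samples) - (len(samples) % 3)
    for i in range(0, limit, 3):
        out.append((samples[i] + samples[i + 1] + samples[i + 2]) / 3.0)
    return out
```

Whisper and the Groq endpoint both expect 16 kHz mono PCM, which is why the browser's native 48 kHz capture is reduced before streaming.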
FEATURES:
- Ask AI: Chat about meetings with cross-meeting context search
- Catch Me Up: Quick summary for late joiners
- Context Linking: Link meetings to share context across sessions
- Vector Store: ChromaDB-based semantic search across all meetings
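The PR uses ChromaDB for the cross-meeting semantic search; as a dependency-free sketch of the underlying ranking step, retrieval amounts to scoring stored chunk embeddings against a query embedding by cosine similarity (the data layout and function names here are illustrative, not ChromaDB's API):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunks, k=2):
    # chunks: list of (meeting_id, text, embedding) tuples spanning all
    # meetings. Returns the k most similar chunks, highest similarity first.
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return scored[:k]
```

Because chunks from every meeting live in one collection, a single query can surface context from linked or past sessions without per-meeting lookups.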

DOCS:
- Add meeting-copilot-docs/ with architecture and optimization guides
- Document 15 future optimizations (P0/P1/P2 prioritized)

COMPONENTS:
- ChatInterface: AI chat with streaming responses
- MeetingSelector: Link meetings for shared context
- Improved VAD and rolling buffer for transcription
This commit introduces Google Gemini as a primary LLM provider across the application and significantly enhances the 'Ask AI' capabilities.

Key Changes:

Backend:
- feat(db): Add `geminiApiKey` to the settings table with a simple migration.
- feat(api): Integrate Gemini for streaming responses in the 'Ask AI' chat endpoint.
- feat(api): Add Gemini as a supported provider for the 'Catch Up' summary feature.
- feat(ai): Implement intelligent context-linking and web search capabilities using Gemini Flash to determine when external information is needed.

Frontend:
- feat(settings): Add Gemini to the list of available providers in the Model Settings modal.
- feat(ui): The 'Ask AI' chat now dynamically uses the user's configured LLM, defaulting to Gemini.
- feat(ui): The 'Catch Up' feature now uses the configured model, with a fallback to Gemini for unsupported providers.

Docs:
- docs(optimizations): Add a proposal for a 'Custom Dictionary' feature to `FUTURE_OPTIMIZATIONS.md` to improve transcription accuracy for domain-specific terms.
- Add user context input for notes generation (fed to AI prompt)
- Implement AI-powered web search with Gemini (replaces DuckDuckGo)
- Add markdown rendering in chat interface
- Make chat panel resizable (350-800px drag handle)
- Fix /save-summary endpoint (was missing, causing saves to fail)
- Fix Sidebar delete/rename to use correct endpoints
- Fix get_transcript_data to query summary_processes directly
- Use gemini-2.0-flash model consistently across features
- Replace DuckDuckGo with SerpAPI Google Search (free tier)
- Crawl top results with httpx and extract content with trafilatura
- Gemini synthesizes findings with inline citations
- Filter non-English domains for better results
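The domain filter in the last bullet can be sketched as a simple TLD block list applied to search results before crawling (the TLD set below is an illustrative assumption; the PR does not specify its exact heuristic):

```python
from urllib.parse import urlparse

# Illustrative block list; the real filter's criteria are not shown in the PR.
NON_ENGLISH_TLDS = (".cn", ".jp", ".kr", ".ru")

def filter_english_results(urls):
    # Keep only results whose hostname does not end in a blocked TLD.
    kept = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if not any(host.endswith(tld) for tld in NON_ENGLISH_TLDS):
            kept.append(url)
    return kept
```

Filtering before the httpx/trafilatura crawl saves fetch time and keeps non-English pages out of the context handed to Gemini for synthesis.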
- Increase context prompt from 100 to 300 chars for better entity/term consistency
- Add prompt parameter to Groq translation mode (was missing)
- Add debug logging for context usage tracking

Benefits: names stay consistent (John not Jon), technical terms preserved (Kubernetes not Cube Netties)
- Revert context prompt from 300 to 100 chars
- Remove prompt from translation mode (caused garbled output)
- Keep original working configuration
…peline and remove Whisper dependency and related build steps
… context

- Add grounded system prompt to prevent hallucinations while being helpful
- Change context strategy: full transcripts for current + linked meetings
- Linked meetings: fetch full DB transcripts when triggered by keywords
- Global search: use vector search with 20 chunks when explicitly triggered
- Remove context truncation limits (Gemini 2.0 Flash has 1M token limit)
- Add keyword-only detection for linked meetings (no LLM classifier)
- Improve history handling: first 2 + last 8 messages for continuity
- Add debug logging for Gemini streaming
- Document new context flow with full transcript approach
- Add trigger keywords for each context type
- Include token estimation for extreme cases
- Document grounded prompt strategy
- Add usage examples
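The "first 2 + last 8 messages" history strategy above can be sketched in a few lines (function name and defaults are illustrative):

```python
def trim_history(messages, head=2, tail=8):
    # Keep the first `head` messages (conversation framing) and the last
    # `tail` (recent context); middle turns are dropped once history grows
    # past head + tail, preserving continuity without unbounded prompts.
    if len(messages) <= head + tail:
        return list(messages)
    return messages[:head] + messages[-tail:]
```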
- Added explicit instructions for handling speech-to-text transcription errors
- Gemini now uses context to infer correct spellings for technical terms, names, acronyms
- Added 'Transcription Corrections' section at end of generated notes
- Shows what corrections were made from original transcript for transparency
- Replaced fixed 8s timeout with multi-condition triggers
- Silence threshold: 1.2s (balances speed vs clean output)
- Punctuation trigger: sentence + 3s speech duration
- Max timeout: 12s (safety net)
- Improved deduplication: 3-grams, 0.35 threshold, 50-word window
- Speech duration tracking for intelligent finalization
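The multi-condition finalization and n-gram deduplication above can be sketched as follows, using the thresholds quoted in the commit (1.2 s silence, 3 s speech with sentence punctuation, 12 s safety net, 3-grams at 0.35 over a 50-word window); the function names and signatures are illustrative:

```python
def should_finalize(silence_s, speech_s, ends_sentence, elapsed_s,
                    silence_threshold=1.2, min_speech=3.0, max_timeout=12.0):
    # Replaces the fixed 8 s timeout with three triggers: a silence gap,
    # sentence-ending punctuation after enough speech, or the safety net.
    if silence_s >= silence_threshold:
        return True
    if ends_sentence and speech_s >= min_speech:
        return True
    return elapsed_s >= max_timeout

def is_duplicate(new_text, recent_text, threshold=0.35, window=50):
    # Flag a new segment as a duplicate when the fraction of its word
    # 3-grams already present in the recent window exceeds the threshold.
    def grams(words):
        return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}
    new_g = grams(new_text.split())
    old_g = grams(recent_text.split()[-window:])
    if not new_g:
        return False
    return len(new_g & old_g) / len(new_g) >= threshold
```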
meinhoongagan and others added 30 commits February 24, 2026 08:46
- Removed Celery chunking and GCS signed URL dependencies for diarization.
- Implemented direct parallel byte processing in DiarizationService.
- Fixed bug in _split_wav_for_parallel: calculate duration from PCM bytes instead of unreliable WAV header.
- Optimized alignment with intervaltree and Levenshtein (added to requirements.txt).
- Updated benchmark scripts and added debugging utilities.
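The alignment step pairs transcript words with diarized speaker turns by time overlap; the PR does this with intervaltree (plus Levenshtein for text matching), but the core idea can be shown with a linear scan (names and the maximum-overlap rule here are illustrative):

```python
def assign_speaker(word_start, word_end, speaker_turns):
    # speaker_turns: list of (start_s, end_s, speaker). Return the speaker
    # whose turn overlaps the word's time span the most. An interval tree
    # makes this O(log n) per lookup; a scan shows the same logic.
    best, best_overlap = None, 0.0
    for start, end, speaker in speaker_turns:
        overlap = min(word_end, end) - max(word_start, start)
        if overlap > best_overlap:
            best, best_overlap = speaker, overlap
    return best
```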
…te via Gemini

- Switch diarization from Groq 'translations' endpoint (unstable for Hinglish) to 'transcriptions' endpoint (high fidelity).
- Add prompt engineering to Groq client to explicitly handle Hindi/English switching.
- Implement post-alignment translation step using Gemini (via ) to ensure final output is English.
- Clean 'undefined' artifacts before translation.
- Add frontend support for displaying translated status.
- Restore missing 'align_with_transcripts' method in DiarizationService
- Add input/output cleaning for 'undefined' artifacts in translation
- Improve Groq prompt to handle Hindi/English code-switching explicitly
- Fix benchmark script arguments
- Updated GCP service account path in docker-compose files to point to ./app/gcp-service-account.json
- Updated meeting URL in reminder emails to use production domain
- Added user journey documentation
… emails

- Added Redis distributed locking to CalendarReminderScheduler
- Prevents multiple Gunicorn workers from sending duplicate emails
- Ensures single-worker execution for background tasks
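The distributed-lock pattern above is typically a Redis `SET` with `NX` and an expiry, so only one Gunicorn worker wins the lock per reminder cycle. A minimal sketch follows, with an in-memory stand-in for Redis so it runs without a server (the key name and TTL are illustrative, not the PR's values):

```python
import uuid

class FakeRedis:
    # In-memory stand-in for redis-py, enough to demonstrate SET NX.
    # TTL expiry is not simulated.
    def __init__(self):
        self.store = {}
    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.store:
            return None  # lock already held
        self.store[key] = value
        return True
    def get(self, key):
        return self.store.get(key)
    def delete(self, key):
        self.store.pop(key, None)

def try_acquire(client, key="calendar-reminder-lock", ttl=60):
    # SET NX EX: the first worker to set the key owns the cycle; the token
    # lets the owner release safely. Others get None and skip sending.
    token = str(uuid.uuid4())
    if client.set(key, token, nx=True, ex=ttl):
        return token
    return None

def release(client, key, token):
    # Only the holder (matching token) may delete the lock.
    if client.get(key) == token:
        client.delete(key)
```

With a real redis-py client the same `set(key, token, nx=True, ex=ttl)` call applies; the TTL guarantees the lock clears even if the owning worker dies mid-cycle.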
- Updated default credentials path logic to check common locations
- Fixes issue where credentials file was not found even when present
- Improves robustness of GCP authentication in different environments
- Added fallback to manual JSON loading if strict service_account_file parser fails
- Explicitly pass project_id to storage Client
- Resolves 'missing fields' error for valid service account JSONs
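The credentials fallback described above boils down to two steps: probe a list of common file locations, and, if the strict parser rejects a valid key file, load the JSON manually and read `project_id` directly. A sketch, with illustrative names (the candidate paths and helper names are assumptions, not the PR's exact code):

```python
import json
import os

def find_credentials_file(candidates):
    # Return the first existing path from a list of common locations,
    # e.g. $GOOGLE_APPLICATION_CREDENTIALS or ./app/gcp-service-account.json.
    for path in candidates:
        if path and os.path.isfile(path):
            return path
    return None

def parse_service_account(raw_json):
    # Manual fallback: parse the key file as plain JSON and extract
    # project_id so it can be passed explicitly to the storage Client,
    # sidestepping the strict parser's 'missing fields' error.
    info = json.loads(raw_json)
    return info, info.get("project_id")
```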
- feat(audio): restore dynamic prompting with meeting context
- fix(diarization): resolve DiarizationResult init error
- fix(docker): update Redis URL and healthcheck
- feat(Page.tsx): add setup reminders
- docs(gitignore): ignore txt files and gcp secrets
- Fixes Next.js build error 'useSearchParams() should be wrapped in a suspense boundary'
…ix transcript errors

- Prioritize environment variables for Gemini API keys in backend
- Fix NameError for GeminiModel in transcript.py
- Add dismissible warning prompt for diarized transcripts in UI
- Add dropdown menu for generating notes from diarized transcripts
- Fix backend Redis connection URL and nginx.conf mount


Development

Successfully merging this pull request may close these issues.

where id database data?why setting table has not data?