Commit 55e2cf0

kyteinsky authored and backportbot[bot] committed
feat(AI/LiveTranscription): add live translation docs (#14045)
Signed-off-by: Anupam Kumar <[email protected]>
1 parent 5923652 commit 55e2cf0

3 files changed: 42 additions and 16 deletions

admin_manual/ai/app_live_transcription.rst

Lines changed: 38 additions & 15 deletions
@@ -1,13 +1,14 @@
-==============================================================
-App: Live Transcription in Nextcloud Talk (live_transcription)
-==============================================================
+==============================================================================
+App: Live Transcription and Translation in Nextcloud Talk (live_transcription)
+==============================================================================
 
 .. _ai-live-transcription:
 
-This app provides live transcription of speech in Nextcloud Talk calls using open source AI models provided by `Vosk <https://alphacephei.com/vosk/>`_.
-The transcription is done on your own server, preserving your privacy and data sovereignty.
+| This app provides live transcription and translation of speech in Nextcloud Talk calls using open source AI models provided by `Vosk <https://alphacephei.com/vosk/>`_.
+| The transcription is done on your own server, preserving your privacy and data sovereignty, while the translation is done using a translation task processing provider like the :ref:`translate2 app <ai-app-translate2>`. `OpenAI and LocalAI integration <https://apps.nextcloud.com/apps/integration_openai>`_ and `DeepL integration <http://apps.nextcloud.com/apps/integration_deepl>`_ apps will soon also be supported for translation.
 
-A good set of language models are auto-downloaded. They include Arabic, Arabic (Tunisian), Breton, Catalan, Czech, German, English, Esperanto, Spanish, Persian (Farsi), French, Hindi, Italian, Japanese, Kazakh, Korean, Dutch, Polish, Portuguese (Brazilian), Russian, Telegu, Tajik, Turkish, Ukrainian, Uzbek, Vietnamese and Chinese.
+| A good set of language models for transcription are auto-downloaded. They include Arabic, Arabic (Tunisian), Breton, Catalan, Czech, German, English, Esperanto, Spanish, Persian (Farsi), French, Hindi, Italian, Japanese, Kazakh, Korean, Dutch, Polish, Portuguese (Brazilian), Russian, Telegu, Tajik, Turkish, Ukrainian, Uzbek, Vietnamese and Chinese.
+| The translation capabilities depend on the installed translation task processing provider app. A list of translation-capable apps can be found :ref:`here <mt-consumer-apps>` in the "Backend apps" section.
 
 Installation
 ------------
@@ -24,21 +25,42 @@ Installation
 --env LT_INTERNAL_SECRET=1234 \
 --wait-finish
 
+.. important::
 
-.. note::
+The environment variables ``LT_HPB_URL`` and ``LT_INTERNAL_SECRET`` must be set in the :ref:`Deploy Options <ai-app_api_deploy_options>` during installation,
+and the High-Performance Backend must be functionally configured in Nextcloud Talk settings for the app to work.
 
-Environment variables and mounts can be set during the app installation from the "Deploy Options" button.
-The models are stored in a persistent volume at ``/nc_app_live_transcription_data``.
-This volume is created automatically during the installation but you can also mount your own volume there.
-As the name suggests, this volume is persistent and will not be deleted when the app is updated or uninstalled
-(without removing data).
+Changing these environment variables after installation is possible through a re-installation of the app after uninstalling it first.
 
+5. Install a Text-to-text task processing provider app for translation capabilities from the "Backend apps" section :ref:`here <mt-consumer-apps>`.
 
-.. important::
+Requirements
+------------
 
-The environment variables ``LT_HPB_URL`` and ``LT_INTERNAL_SECRET`` must be set in the Deploy Options,
-and the High-Performance Backend must be functionally configured in Nextcloud Talk settings for the app to work.
+* Minimal Nextcloud version: 33
+* Nextcloud AIO is supported
+* We currently support NVIDIA GPUs and x86_64 CPUs. Only CPU-based transcription is also supported and works well on modern x86 CPUs.
+* CUDA >= v12.4.1 on your host system for GPU-based transcription
+* GPU Sizing
+
+* A NVIDIA GPU with at least 10 GB VRAM
+* 16 GB of system RAM should be enough for one or two concurrent calls
+
+* CPU Sizing
+
+* x86 CPU with 4 threads. Additional 2 threads per concurrent call.
+* 16 GB of RAM should be enough for one or two concurrent calls
+
+* Space usage
+* ~ 2.8 GB for the docker container
+* ~ 6.0 GB for the default models
+
+.. note::
 
+We currently have very little real-world experience running this software on production instances.
+The above sizing recommendations come from our estimates and are not real-world benchmarks.
+Actual requirements will vary based on factors such as the number of concurrent calls, audio quality, and selected languages.
+Please do thorough testing to confirm your hardware meets your needs.
 
 App store
 ---------
@@ -59,3 +81,4 @@ Limitations
 * The app currently supports only a limited number of languages. More languages may be added in the future.
 * The languages other than English may have lower accuracy mainly due to the shipped models being smaller.
 * The app currently does not support punctuation in the transcription.
+* `OpenAI and LocalAI integration <https://apps.nextcloud.com/apps/integration_openai>`_ and `DeepL integration <http://apps.nextcloud.com/apps/integration_deepl>`_ apps are not yet supported for translation.

admin_manual/ai/overview.rst

Lines changed: 2 additions & 1 deletion
@@ -137,13 +137,14 @@ Frontend apps
 * *Text* for offering the translation menu
 * `Assistant <https://apps.nextcloud.com/apps/assistant>`_ offering a graphical translation UI
 * `Analytics <https://apps.nextcloud.com/apps/analytics>`_ for translating graph labels
+* `Talk <https://apps.nextcloud.com/apps/spreed>`_ for translating messages and live translations in calls in conjunction with the :ref:`Live Transcription app <ai-live-transcription>`
 
 Backend apps
 ~~~~~~~~~~~~
 
 * :ref:`translate2 (ExApp)<ai-app-translate2>` - Runs open source AI translation models locally on your own server hardware (Customer support available upon request)
 * `OpenAI and LocalAI integration (via OpenAI API) <https://apps.nextcloud.com/apps/integration_openai>`_ - Integrates with the OpenAI API to provide AI functionality from OpenAI servers (Customer support available upon request; see :ref:`AI as a Service<ai-ai_as_a_service>`)
-* *integration_deepl* - Integrates with the deepl API to provide translation functionality from Deepl.com servers (Only community supported)
+* `DeepL integration <http://apps.nextcloud.com/apps/integration_deepl>`__ - Integrates with the deepl API to provide translation functionality from Deepl.com servers (Only community supported)
 
 Speech-To-Text
 ^^^^^^^^^^^^^^

admin_manual/exapps_management/AdvancedDeployOptions.rst

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
 Advanced Deploy Options
 =======================
 
+.. _ai-app_api_deploy_options:
+
 AppAPI allows optionally to configure environment variables and mounts for the ExApp container.
 
 It is available via "Deploy options" modal next to "Deploy and Enable" button in the sidebar of the ExApp page on the Apps management page:
