
Speech to Text

The SpeechToText component lets administrators configure speech-to-text for their chatbot. It is part of the chatflow configuration interface and converts users' spoken input into written text.

Purpose

The component lets administrators set up and manage a speech-to-text provider so that users can interact with the chatbot by voice.

Features

Provider Selection

  • Choose one of the supported speech-to-text providers:
    • OpenAI Whisper
    • Assembly AI
    • LocalAI STT
  • Select "None" to disable speech-to-text.

Provider-Specific Configuration

Each provider has its own set of configuration options, which may include:

  • API credentials
  • Language settings
  • Model selection
  • Advanced parameters (e.g., temperature, prompts)

How to Use

  1. Accessing the Settings:

    • Navigate to the chatflow configuration interface.
    • Locate the "Speech to Text" section.
  2. Selecting a Provider:

    • Use the dropdown menu to select a speech-to-text provider.
    • Options include "None" (to disable), "OpenAI Whisper", "Assembly AI", and "LocalAI STT".
  3. Configuring Provider Settings:

    • Once a provider is selected, its specific configuration options will appear.
    • Fill in the required fields and any optional parameters as needed.
  4. OpenAI Whisper Configuration (if selected):

    • Connect OpenAI API credentials.
    • Optionally set language, prompt, and temperature.
  5. Assembly AI Configuration (if selected):

    • Connect Assembly AI API credentials.
  6. LocalAI STT Configuration (if selected):

    • Connect LocalAI API credentials.
    • Set the base URL of the LocalAI server.
    • Optionally configure language, model, prompt, and temperature.
  7. Saving Changes:

    • After configuring the settings, click the "Save" button to apply the speech-to-text configuration.
    • A success message appears once the settings have been saved.
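
The provider-specific fields from steps 4 to 6 can be summarized as a small lookup table. The Python sketch below is illustrative only: the key names (`credential`, `base_url`, and so on) and the helper `missing_required_fields` are hypothetical and are not taken from the component. Treating the LocalAI base URL as required is also an assumption, based on the steps not marking it as optional.

```python
# Required and optional fields per provider, as listed in steps 4 to 6.
# All key names are hypothetical and chosen for illustration only.
PROVIDER_FIELDS = {
    "OpenAI Whisper": {
        "required": ["credential"],
        "optional": ["language", "prompt", "temperature"],
    },
    "Assembly AI": {
        "required": ["credential"],
        "optional": [],
    },
    "LocalAI STT": {
        # Assumption: the base URL is mandatory because it is not listed as optional.
        "required": ["credential", "base_url"],
        "optional": ["language", "model", "prompt", "temperature"],
    },
}


def missing_required_fields(provider: str, settings: dict) -> list[str]:
    """Return the required fields that are absent or empty for a provider."""
    if provider == "None":
        return []  # speech-to-text is disabled, so nothing needs to be filled in
    required = PROVIDER_FIELDS[provider]["required"]
    return [name for name in required if not settings.get(name)]
```

A check like this could run before saving to tell the administrator which fields are still empty.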

Important Notes

  • Only one speech-to-text provider can be active at a time.
  • Ensure that you have the necessary API credentials for the selected provider.
  • Some providers may require additional setup or have usage limits. Refer to the provider's documentation for more information.
  • The "Save" button will be disabled if a provider is selected but no credential is provided.
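
The first and last notes amount to two simple rules: selecting a provider deactivates the others, and saving is blocked while the active provider has no credential. The sketch below models that behavior in Python. It is not the component's code, and `select_provider`, `can_save`, `status`, and `credential_id` are assumed names.

```python
PROVIDERS = ["OpenAI Whisper", "Assembly AI", "LocalAI STT"]


def select_provider(config: dict, chosen: str) -> dict:
    """Return a copy of config in which only `chosen` is active.

    Passing "None" deactivates every provider.
    """
    updated = {}
    for name in PROVIDERS:
        entry = dict(config.get(name, {}))  # copy so the input is not mutated
        entry["status"] = (name == chosen)
        updated[name] = entry
    return updated


def can_save(config: dict) -> bool:
    """Mirror the Save button rule: blocked when the active provider lacks a credential."""
    for entry in config.values():
        if entry.get("status") and not entry.get("credential_id"):
            return False
    return True
```

With "None" selected no provider is active, so `can_save` returns `True` and the disabled state can be saved.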

Technical Details

  • The component uses Redux for state management and dispatching actions.
  • Speech-to-text settings are stored in the speechToText field of the chatflow data as a JSON string.
  • When saved, the configuration is updated via an API call to updateChatflow.
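
Because the speechToText field holds a JSON string rather than a nested object, the settings have to be serialized before saving and parsed after loading. A minimal Python sketch of that round trip follows. The shape of the settings object and the `chatflow` dictionary are assumptions for illustration, and the actual updateChatflow call is left out.

```python
import json


def pack_speech_to_text(chatflow: dict, settings: dict) -> dict:
    """Return a copy of chatflow with the settings stored as a JSON string."""
    updated = dict(chatflow)
    updated["speechToText"] = json.dumps(settings)
    return updated


def unpack_speech_to_text(chatflow: dict) -> dict:
    """Parse the stored JSON string; an absent or empty field means no settings."""
    raw = chatflow.get("speechToText")
    return json.loads(raw) if raw else {}
```

Handling the missing-field case explicitly keeps chatflows that were never configured for speech-to-text from raising a parse error.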

Error Handling

If an error occurs while saving the settings, an error message will be displayed with details about the failure.
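
One way to picture this behavior is a save wrapper that turns any failure into a message for the user. The Python sketch below is hypothetical: `save_settings`, the injected `update` callable, and the message wording are assumptions, not the component's implementation.

```python
from typing import Callable


def save_settings(update: Callable[[dict], None], payload: dict) -> tuple[bool, str]:
    """Call the supplied update function and return (ok, message) for display."""
    try:
        update(payload)
    except Exception as exc:
        # Include the failure details so the administrator can see what went wrong.
        return False, f"Failed to save Speech to Text configuration: {exc}"
    return True, "Speech to Text configuration saved"
```

Passing the update function in as a parameter keeps the sketch independent of any particular HTTP client or API wrapper.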

Security Implications

  • Ensure that API credentials are kept secure and not exposed to unauthorized parties.
  • Be aware of the data privacy implications of using cloud-based speech-to-text services, especially when handling sensitive information.
  • For LocalAI STT, ensure that the local server is properly secured and accessible only to authorized systems.

Customization

The speech-to-text functionality can be further customized through provider-specific parameters such as language, prompt, and temperature. For example, setting the language avoids relying on automatic language detection, a prompt can supply domain-specific vocabulary, and a lower temperature makes transcription more deterministic. Which parameters are available depends on the selected provider.