Whisper (OpenAI)

Pricing model
GitHub
Whisper is an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web. It is robust to accents, background noise, and technical language, and it can transcribe speech in multiple languages as well as translate speech from those languages into English. The architecture is a simple end-to-end encoder-decoder Transformer. It also performs language identification and produces phrase-level timestamps. The goal is ease of use and high accuracy, so that developers can add voice interfaces to a wider range of applications.
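As a rough illustration of the transcription, translation, and timestamp features described above, here is a minimal sketch using the `openai-whisper` Python package. The helper names `transcribe` and `format_segments`, the `"base"` model size, and the file name `audio.mp3` are assumptions chosen for this example, not part of the listing. It also assumes the package and `ffmpeg` are installed.

```python
from __future__ import annotations

from typing import Any


def format_segments(segments: list[dict[str, Any]]) -> list[str]:
    """Render Whisper-style segments as '[start -> end] text' lines."""
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        lines.append(f"[{seg['start']:.2f}s -> {seg['end']:.2f}s] {text}")
    return lines


def transcribe(path: str, model_name: str = "base", translate: bool = False) -> dict[str, Any]:
    """Run Whisper on an audio file (hypothetical wrapper for illustration)."""
    # Imported lazily so format_segments works without the package installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    # task="translate" asks for English output; "transcribe" keeps the source language.
    task = "translate" if translate else "transcribe"
    return model.transcribe(path, task=task)


if __name__ == "__main__":
    result = transcribe("audio.mp3")  # assumed local file
    print(result["language"])  # detected language code
    for line in format_segments(result["segments"]):
        print(line)
```

The result dictionary also carries the full transcript under `result["text"]`. Larger model sizes generally trade speed for accuracy, so the right choice depends on the hardware available.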

Similar neural networks:

Paid
Tolgee is an open-source localization platform that simplifies translating software applications. It offers in-context translation, translation memory, machine translation, and support for multiple file formats. Developers and teams use Tolgee to streamline their localization workflow, saving time and effort through automation and an intuitive interface. Its simplicity and broad toolset suit projects of any size that need multilingual support.
Freemium
Translate.Video is a video translation application for converting videos into other languages. It provides automated captioning, subtitle translation, dubbing, AI voice-overs, recording, and transcript creation on a single platform.
GitHub
TTS Voice Wizard is software that converts a user's speech to text and then back to speech using Microsoft Azure Voice Recognition and TTS. It also sends OSC messages to VRChat to display the text on an avatar. Customization options include over 100 voices, support for more than 20 languages, and the ability to display the current song's title, artist, and progress above the user.