Whisper (OpenAI)

Pricing model: GitHub
Whisper is an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web. It is designed to be robust to accents, background noise, and technical language. It transcribes speech in many languages and can translate speech from those languages into English. The architecture is a simple end-to-end approach implemented as an encoder-decoder Transformer. It can also identify the spoken language and produce phrase-level timestamps. The goal is ease of use and high accuracy, so that developers can add voice interfaces to a wider range of applications.
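To illustrate the phrase-level timestamps mentioned above, here is a minimal Python sketch. It assumes the open-source `openai-whisper` package, whose `transcribe` call returns a dictionary with a `segments` list carrying `start`, `end`, and `text` fields. The helper names (`format_timestamp`, `phrase_lines`, `transcribe_file`) and the sample data are illustrative and not part of Whisper itself.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def phrase_lines(result: dict) -> list[str]:
    """Build one '[start --> end] text' line per segment of a result dict."""
    return [
        f"[{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in result.get("segments", [])
    ]


def transcribe_file(path: str, model_name: str = "base", to_english: bool = False) -> dict:
    """Run Whisper on an audio file.

    Assumes `pip install openai-whisper` and ffmpeg are available.
    """
    import whisper  # imported lazily so the helpers above work without the package

    model = whisper.load_model(model_name)
    # task="translate" asks the model for English output instead of a
    # transcript in the spoken language.
    task = "translate" if to_english else "transcribe"
    return model.transcribe(path, task=task)


# Hand-written sample in the assumed result shape (not real model output).
sample = {
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Hello there."},
        {"start": 2.4, "end": 5.1, "text": " This is a test recording."},
    ],
}

for line in phrase_lines(sample):
    print(line)
```

With the package installed, `phrase_lines(transcribe_file("audio.mp3"))` would produce the same kind of listing for a real recording, where `audio.mp3` is a placeholder path. The detected language is available under the `language` key of the result.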

Similar AI tools:

Freemium
Descript is audio and video editing software with transcription, screen recording, publishing, and AI features such as lifelike voice cloning with Overdub. It offers free voice templates, privacy-centric options, mid-sentence editing of real recordings, multiple voices, sharing with trusted collaborators, and a premium stock voice library. It also provides a 44.1 kHz broadcast-quality speech synthesizer and live Overdub capabilities.
Paid
Sonix is a fast, accurate, and affordable platform for automated transcription, translation, and subtitling. It converts audio and video to text, translates transcripts with its automated translation engine, and generates subtitles automatically. The platform also offers granular multi-user permissions, AI-driven transcript summarization, transcript sharing and publishing, and integrations with web conferencing systems and video editing platforms. It also emphasizes security and privacy.
Free
LangGPT is a platform that allows users to interact with ChatGPT in a variety of languages, including English, Italian, Russian, Spanish, German, French, Hindi, Portuguese, Simplified Chinese, Traditional Chinese, and Czech.