How to record desktop audio and transcribe it to a VTT file

Glama

Why this server?
This server extracts audio and transcribes it to text, though it works on online sources (YouTube, Bilibili, TikTok) rather than locally recorded audio.
MCP Video & Audio Text Extraction Server
SealinGp
Security: - | License: F | Quality: -
An MCP server that downloads videos and extracts audio from platforms such as YouTube, Bilibili, and TikTok, then transcribes the audio to text using OpenAI's Whisper model.
Last updated 6 months ago
6
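None of the listed servers states that it writes VTT output, so the last step of the query may need a small conversion helper. The sketch below is a minimal, hedged example of turning timed transcript segments into WebVTT text. It assumes each segment is a mapping with `start`, `end` (seconds), and `text` keys, which resembles the segment list returned by Whisper's Python package but is an assumption here. The function names `format_timestamp` and `segments_to_vtt` are hypothetical, and recording desktop audio is not covered.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments: Iterable[Mapping]) -> str:
    """Build WebVTT text from segments (assumed keys: start, end, text)."""
    # A WebVTT file starts with the "WEBVTT" header followed by a blank line.
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")
        lines.append(seg["text"].strip())
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical segments standing in for real transcription output.
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 5.0, "text": " This is a test."},
    ]
    print(segments_to_vtt(demo))
```

The returned string can be written to a `.vtt` file with UTF-8 encoding, which is the encoding WebVTT requires.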
Why this server?
Although it focuses on video recognition, it also processes audio and video input using Google's Gemini AI.
MCP Video Recognition Server
mario-andreschak
Security: A | License: A | Quality: A
Provides tools for image, audio, and video recognition using Google's Gemini AI through the Model Context Protocol.
Last updated 7 months ago
3
9
MIT License
Why this server?
This server provides text-to-speech rather than transcription, but it supports multiple audio formats, so it may be indirectly useful.
TTS-MCP
nakamurau1
Security: - | License: A | Quality: -
A Model Context Protocol server that integrates high-quality text-to-speech capabilities with Claude Desktop and other MCP-compatible clients, supporting multiple voice options and audio formats.
Last updated 8 months ago
1
1
MIT License
Why this server?
This server focuses on invoice processing and OCR: it extracts text from invoice PDFs and images. For transcribing recorded audio it is useful only if the audio arrives as video, the video is converted to images, and those images are then transcribed.
MCP Invoice
nfshanq
Security: - | License: F | Quality: -
A Python MCP server for invoice and receipt processing that uses OCR technology to extract data from PDFs and images, offering AI assistants the ability to process, extract text from, and merge invoice documents.
Last updated 8 months ago
2
Why this server?
This server converts various file formats to Markdown, which can help with formatting the transcription output.
MarkItDown MCP Server
canlgz
Security: A | License: A | Quality: A
A Model Context Protocol server that converts various file formats (PDF, PowerPoint, Word, Excel, Images, etc.) to Markdown to make them accessible to LLMs.
Last updated 10 months ago
1
MIT License
Why this server?
This server extracts transcripts from YouTube videos, which relates to the transcription part of the prompt.
YouTube Transcript Extractor MCP
MalikElate
Security: A | License: F | Quality: A
A Model Context Protocol server that enables AI assistants to extract transcripts from YouTube videos, allowing AI to analyze and work with video content directly.
Last updated 9 months ago
1
10
2
