MarkItDown

Open-source pick · Content, Docs & Media

Convert PDFs, Office docs, images, HTML, audio, and more to clean Markdown using Microsoft's MarkItDown.

By RAPR AI · v1.0.0 · Package license: MIT · Free & Open Source

MarkItDown package details

markdown, conversion, pdf, documents, python

Open-source curation scope

This listing is a convenience wrapper: it curates public setup links, install commands, attribution, and agent guidance so the package is easy to discover. It does not bundle or relicense the upstream project unless explicitly stated. You can also go directly to the public upstream source linked on this page.

How to get started

Install RAPR AI

Download and install RAPR AI on your computer

Find in Marketplace

Open RAPR AI, go to Packages, and browse the marketplace

Install from Marketplace

Click Install. RAPR sets up the wrapper package, connector guidance, or skill instructions for this listing.

MarkItDown

Convert many common file formats, including PDFs, Office documents, images, HTML, and audio, to clean, usable Markdown using MarkItDown, Microsoft's open-source Python library for document-to-Markdown conversion.

What you can do

  • Convert PDFs: Extract text content from PDF files into structured Markdown.
  • Office documents: Convert DOCX, XLSX, and PPTX files preserving headings, tables, and structure.
  • Images with OCR: Extract text from scanned documents and photos via an LLM vision plugin.
  • Audio transcription: Convert MP3, WAV, and M4A recordings to Markdown transcripts.
  • Batch processing: Convert entire folders of mixed file types in a single pipeline run.
  • RAG pipelines: Feed the output directly into vector stores for LLM retrieval.
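
The batch-processing item above can be sketched as a small folder walk. This is a minimal illustration, not part of MarkItDown or RAPR AI: `convert_folder` and `default_converter` are hypothetical helper names, the suffix list is an assumption, and the only upstream calls used are `MarkItDown().convert(...)` and the result's `text_content` attribute.

```python
from pathlib import Path
from typing import Callable, Iterable


def default_converter() -> Callable[[Path], str]:
    """Build a converter backed by MarkItDown.

    The import is lazy so convert_folder() can be exercised with any
    callable, even where the package is not installed.
    """
    from markitdown import MarkItDown  # needs `pip install "markitdown[all]"`

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def convert_folder(
    src: Path,
    dst: Path,
    convert: Callable[[Path], str],
    suffixes: Iterable[str] = (".pdf", ".docx", ".xlsx", ".pptx"),  # assumed set
) -> list[Path]:
    """Convert every matching file under src and mirror the tree under dst."""
    wanted = {s.lower() for s in suffixes}
    written = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        # Keep the original extension in the output name so report.pdf and
        # report.docx do not overwrite each other.
        out = dst / path.relative_to(src).parent / (path.name + ".md")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(convert(path), encoding="utf-8")
        written.append(out)
    return written
```

With the package installed, a call such as `convert_folder(Path("docs"), Path("docs_md"), default_converter())` would write one `.md` file per source document. Passing the converter in as a callable keeps the folder logic independent of any one library.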

Prerequisites

pip install markitdown
# or with all optional format dependencies (PDF, Office, audio, and others):
pip install "markitdown[all]"
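
Once the package is installed, a single-file conversion from Python might look like the sketch below. `convert_file` is a hypothetical wrapper written for this listing; the upstream pieces it relies on are the `MarkItDown` class, its `convert()` method, and the `text_content` attribute of the result.

```python
def convert_file(path, converter=None) -> str:
    """Return the Markdown text for one file.

    `converter` is any object exposing `.convert(path)` whose result has a
    `.text_content` attribute. When omitted, a MarkItDown instance is created.
    """
    if converter is None:
        from markitdown import MarkItDown  # lazy import: needs the package installed

        converter = MarkItDown()
    return converter.convert(str(path)).text_content
```

For example, `print(convert_file("report.pdf"))` would print the converted Markdown, where `report.pdf` is a placeholder file name. Reusing one converter instance across calls avoids rebuilding it for every file.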

Example Requests

  • "Convert all PDFs in the docs/ folder to Markdown for RAG indexing."
  • "Extract the text from this Excel spreadsheet as a Markdown table."
  • "Transcribe this MP3 meeting recording to Markdown."
  • "Convert this PPTX presentation to Markdown so I can summarize it."
  • "Pull the text from this scanned PDF using the LLM plugin."
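
For requests that mention RAG indexing, the converted Markdown usually has to be split into chunks before it goes into a vector store. The sketch below shows one simple approach, splitting on Markdown headings. It is an assumption about how a pipeline might be built, not something MarkItDown or RAPR AI provides, and `split_by_heading` is a hypothetical name.

```python
import re

# An ATX heading: one to six '#' characters followed by a space.
HEADING = re.compile(r"^#{1,6} ")


def split_by_heading(markdown: str) -> list[dict[str, str]]:
    """Split Markdown into chunks, one per heading section.

    Text before the first heading becomes a chunk with an empty title.
    Limitation: a line starting with '# ' inside a fenced code block is
    also treated as a heading.
    """
    sections: list[tuple[str, list[str]]] = [("", [])]
    for line in markdown.splitlines():
        if HEADING.match(line):
            sections.append((line.lstrip("#").strip(), []))
        else:
            sections[-1][1].append(line)

    chunks = []
    for title, lines in sections:
        text = "\n".join(lines).strip()
        if title or text:  # drop the empty leading section
            chunks.append({"title": title, "text": text})
    return chunks
```

Each chunk keeps its heading as `title`, which can be stored as metadata alongside the embedding. Real pipelines often add size limits or overlap on top of a split like this.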

Ready to try MarkItDown?

Download RAPR AI and connect MarkItDown in seconds.