Markitdown — Document Conversion

Kaji can extract and convert content from files and web pages into clean text using Markitdown. This lets you point Kaji at a document — a PDF report, an Excel spreadsheet, a Word doc, a web page — and ask questions about its contents.


What Kaji can do

  • Convert PDFs to readable text
  • Extract data from Excel and CSV files
  • Read Word documents and PowerPoint presentations
  • Convert web pages to clean text
  • Transcribe audio files (requires the optional Azure Speech configuration)
  • Extract content from ZIP archives
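
For readers curious about what a conversion looks like in code, here is a minimal sketch. It assumes the open-source `markitdown` Python package, whose `MarkItDown().convert(...)` call returns a result with a `text_content` attribute. How Kaji invokes Markitdown internally is not described here, so treat the helper name `convert_to_text` and its `converter` parameter as illustrative only.

```python
from typing import Any, Optional


def convert_to_text(source: str, converter: Optional[Any] = None) -> str:
    """Convert a file path or URL to text.

    `converter` is any object with a `convert(source)` method whose result
    exposes a `text_content` attribute. When it is omitted, a MarkItDown
    instance from the `markitdown` package is created (assumed installed).
    """
    if converter is None:
        # Imported lazily so the function can be exercised without the package.
        from markitdown import MarkItDown

        converter = MarkItDown()
    result = converter.convert(source)
    return result.text_content
```

With the package installed, `convert_to_text("/path/to/report.pdf")` would return the extracted text of that file as a string.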

Supported formats

Format        Examples
Documents     PDF, Word (.docx), PowerPoint (.pptx)
Spreadsheets  Excel (.xlsx), CSV
Web           Any public URL, HTML files
Audio         MP3, WAV (via Azure Speech, if configured)
Archives      ZIP
Images        PNG, JPG (extracts embedded text)
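
If you want to check up front which category a file or link falls into, the formats listed above can be sketched as a simple lookup. This is an illustration, not how Kaji detects formats: the function name `format_category` is hypothetical, and the `.htm` and `.jpeg` extensions are assumptions added alongside the ones named above.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extension-to-category lookup based on the supported formats list.
# ".htm" and ".jpeg" are assumed aliases, not taken from the list itself.
FORMAT_BY_EXTENSION = {
    ".pdf": "Documents",
    ".docx": "Documents",
    ".pptx": "Documents",
    ".xlsx": "Spreadsheets",
    ".csv": "Spreadsheets",
    ".html": "Web",
    ".htm": "Web",
    ".mp3": "Audio",
    ".wav": "Audio",
    ".zip": "Archives",
    ".png": "Images",
    ".jpg": "Images",
    ".jpeg": "Images",
}


def format_category(source: str) -> Optional[str]:
    """Return the format category for a path or URL, or None if unknown."""
    if urlparse(source).scheme in ("http", "https"):
        return "Web"
    suffix = PurePosixPath(source).suffix.lower()
    return FORMAT_BY_EXTENSION.get(suffix)
```

A `None` result means the extension is not in the list, which is a hint that conversion may not work for that file.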

Example prompts

Read this PDF and summarize the key findings: /path/to/report.pdf
Extract all the data from this Excel file and show me the totals: /path/to/budget.xlsx
Summarize the content of this web page: https://example.com/article
What are the action items from this Word document?
Convert this PowerPoint to a text outline

Tips for business users

  • Markitdown works best with text-based documents. It reads a PDF's text, not the images embedded in it, so scanned or image-only PDFs may yield little or no text
  • For web pages, provide the full URL including https://
  • For files on your machine, provide the full file path — files in your Kaji session drive are accessible at /root/
  • This is a great way to feed external documents into Kaji's context for analysis or comparison
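
The two input tips above (a full URL including the scheme, or a full file path) can be expressed as a quick check. This is a sketch under assumptions: `check_source` is a hypothetical helper, and it treats paths as POSIX-style because session files are described as living under /root/.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def check_source(source: str) -> str:
    """Return "url" or "file", or raise ValueError for ambiguous input."""
    parsed = urlparse(source)
    # A usable URL needs both a scheme (https://) and a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # A usable file reference is an absolute POSIX path such as /root/...
    if PurePosixPath(source).is_absolute():
        return "file"
    raise ValueError(
        f"{source!r} is neither a full URL (with https://) nor a full file path"
    )
```

For example, `check_source("example.com/article")` raises `ValueError` because the scheme is missing, which is the mistake the URL tip is meant to prevent.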

Configuration

Markitdown works out of the box in Kaji sessions. No additional credentials are needed for basic document and URL conversion.

Optional configuration for enhanced features:

Variable              Description
AZURE_SPEECH_API_KEY  Enables audio transcription
AZURE_SPEECH_REGION   Azure region for speech service
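
To confirm whether the optional audio settings are present before asking for a transcription, a small check along these lines may help. It assumes both variables must be set for transcription to work, which the table above implies but does not state outright. The function name `missing_audio_config` is hypothetical.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the configuration table.
AUDIO_VARS = ("AZURE_SPEECH_API_KEY", "AZURE_SPEECH_REGION")


def missing_audio_config(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the audio-related variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in AUDIO_VARS if not env.get(name)]
```

An empty list means both variables are set. Any names returned are the ones still to be configured.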