Markitdown — Document Conversion
Kaji can extract and convert content from files and web pages into clean text using Markitdown. This lets you point Kaji at a document — a PDF report, an Excel spreadsheet, a Word doc, a web page — and ask questions about its contents.
What Kaji can do
- Convert PDFs to readable text
- Extract data from Excel and CSV files
- Read Word documents and PowerPoint presentations
- Convert web pages to clean text
- Transcribe audio files (when Azure Speech is configured)
- Extract content from ZIP archives
Supported formats
| Format | Examples |
|---|---|
| Documents | PDF, Word (.docx), PowerPoint (.pptx) |
| Spreadsheets | Excel (.xlsx), CSV |
| Web | Any public URL, HTML files |
| Audio | MP3, WAV (via Azure Speech, if configured) |
| Archives | ZIP |
| Images | PNG, JPG (extracts embedded text) |
Example prompts
Read this PDF and summarize the key findings: /path/to/report.pdf
Extract all the data from this Excel file and show me the totals: /path/to/budget.xlsx
Summarize the content of this web page: https://example.com/article
What are the action items from this Word document?
Convert this PowerPoint to a text outline
Tips for business users
- Markitdown works best with text-based documents (it reads text, not images within PDFs)
- For web pages, provide the full URL, including https://
- For files on your machine, provide the full file path; files in your Kaji session drive are accessible at /root/
- This is a great way to feed external documents into Kaji's context for analysis or comparison
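The URL and file-path tips above can be checked before a reference goes into a prompt. The following is a minimal sketch using only the Python standard library. `classify_source` is a hypothetical helper written for illustration and is not part of Kaji or Markitdown. Accepting `http` alongside `https` is also an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return 'url', 'file', or 'invalid' for a document reference.

    Hypothetical helper for illustration; not part of Kaji or Markitdown.
    """
    parsed = urlparse(source)
    # A usable web reference carries an explicit scheme and a host name.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # A usable file reference is an absolute path, for example under /root/.
    if PurePosixPath(source).is_absolute():
        return "file"
    # Anything else (bare host names, relative paths) is likely incomplete.
    return "invalid"


if __name__ == "__main__":
    for ref in ("https://example.com/article", "/root/report.pdf", "report.pdf"):
        print(ref, "->", classify_source(ref))
```

A reference classified as `invalid` usually means the scheme or the leading directory is missing, which are the two cases the tips warn about.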
Configuration
Markitdown works out of the box in Kaji sessions. No additional credentials are needed for basic document and URL conversion.
Optional configuration for enhanced features:
| Variable | Description |
|---|---|
| AZURE_SPEECH_API_KEY | Enables audio transcription |
| AZURE_SPEECH_REGION | Azure region for speech service |
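To confirm that the optional audio settings are present in an environment, a small check along these lines may help. It is a sketch under one assumption that the table does not state explicitly: that both variables must be set together for transcription to work. `audio_transcription_configured` is a hypothetical name, not a Kaji or Markitdown function.

```python
import os


def audio_transcription_configured(env=None) -> bool:
    """True when both optional Azure Speech variables are set and non-empty.

    Hypothetical helper; assumes the key and the region are both required.
    """
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    required = ("AZURE_SPEECH_API_KEY", "AZURE_SPEECH_REGION")
    # Treat whitespace-only values as unset.
    return all(env.get(name, "").strip() for name in required)


if __name__ == "__main__":
    if audio_transcription_configured():
        print("Audio transcription settings found.")
    else:
        print("Audio transcription settings missing; other conversions are unaffected.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching real credentials.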