A fast, memory-safe library for text extraction from Office documents. Rust core with first-class bindings for Python, Go, C#/.NET, Node.js (native and WASM), and a stable C FFI. Handles DOCX, XLSX, ...
python-docx is a Python library for reading, creating, and updating Microsoft Word 2007+ (.docx) files.
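For orientation, a .docx file is a ZIP package whose main body is stored as XML in `word/document.xml`. The sketch below uses only the Python standard library to pull paragraph text out of such a package. It is not the API of python-docx or of the Rust library described above. The helper names `extract_paragraphs` and `make_sample_docx` are hypothetical, and the sample package is a stripped-down stand-in rather than a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx_file):
    """Return the text of each paragraph (w:p) in a .docx package.

    `docx_file` is a path or a binary file-like object. Hypothetical helper,
    not part of python-docx or any other library.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # Concatenate every text run (w:t) inside the paragraph.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


def make_sample_docx(lines):
    """Build an in-memory package holding only word/document.xml.

    This is a fixture for the example, not a valid Word file: a real .docx
    also carries [Content_Types].xml and relationship parts.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(line)}</w:t></w:r></w:p>" for line in lines
    )
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


sample = make_sample_docx(["Hello", "World"])
print(extract_paragraphs(sample))  # ['Hello', 'World']
```

This approach ignores headers, footers, footnotes, and formatting, which is the kind of detail a dedicated library takes care of.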