Doc2X CLI User Guide
doc2x-cli is a command-line tool provided by Doc2X for parsing and converting PDF or image files into Markdown, Word, LaTeX, HTML, or PDF formats, with support for multi-language document translation and batch processing.
Command Overview
| Command | Description |
|---|---|
| doc2x login | Log in to Doc2X via browser (OAuth) |
| doc2x logout | Clear stored OAuth credentials |
| doc2x parse <input> | Parse a PDF or image and export to a structured document format |
| doc2x translate <input> | Parse and translate a PDF file |
| doc2x batch <action> [inputs...] | Batch parse or translate multiple files or directories |
| doc2x models list | List translation models available for your account |
| doc2x term list | List glossaries and their IDs configured for your account |
Requirements & Installation
- Runtime: Node.js 22 or higher
- Network: Access to Doc2X services is required
- File limits: PDF up to 300 MB, images up to 3 MB
- Supported image formats: png, jpg, jpeg, gif, bmp (webp, tiff, and tif are not supported)
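The limits above can be checked locally before a file is submitted. The following is a minimal sketch, not part of doc2x-cli: the function name `check_doc2x_input` is hypothetical, it assumes 1 MB means 1024 * 1024 bytes, and it assumes extensions are matched case-insensitively, neither of which this guide confirms.

```shell
# Pre-flight check against the documented limits (hypothetical helper).
# Assumption: 1 MB = 1024 * 1024 bytes; the service may count differently.
check_doc2x_input() {
  file=$1
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # Lowercase the extension so "PNG" and "png" are treated alike (assumption).
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case $ext in
    pdf) limit_mb=300 ;;
    png|jpg|jpeg|gif|bmp) limit_mb=3 ;;
    *) echo "unsupported format: .$ext" >&2; return 1 ;;
  esac

  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt $((limit_mb * 1024 * 1024)) ]; then
    echo "too large: $file exceeds $limit_mb MB" >&2
    return 1
  fi
}

# Usage (requires doc2x to be installed):
#   check_doc2x_input ./paper.pdf && doc2x parse ./paper.pdf
```

Running the check first avoids uploading a file that would be rejected anyway.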
Prerequisites
Configure the GitHub Packages registry for the @noedgeai scope (one-time setup):
npm config set @noedgeai:registry=https://npm.pkg.github.com
This package is hosted on GitHub Packages and is publicly accessible; no login or token is required.
Installation
Install the package globally via npm:
npm i -g @noedgeai/doc2x-cli@latest
# Verify installation
doc2x --version
Tip: If you get "command not found: doc2x" after installation, make sure npm's global bin directory is in your system PATH.
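To diagnose the PATH problem described in the tip, a small check like the one below may help. It is a sketch for Unix-like shells: `path_contains` is a hypothetical helper, and it relies on npm installing global executables under `<prefix>/bin`, which holds on Linux and macOS but not on Windows.

```shell
# Returns 0 when directory $1 is one of the entries of PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# On Unix-like systems, npm places global executables in "<prefix>/bin".
if command -v npm >/dev/null 2>&1; then
  npm_bin="$(npm config get prefix)/bin"
  if path_contains "$npm_bin"; then
    echo "OK: $npm_bin is on PATH"
  else
    echo "Add to your shell profile: export PATH=\"$npm_bin:\$PATH\""
  fi
fi
```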
Authentication
doc2x-cli supports two authentication modes, switchable via the --auth-mode flag.
Client Mode (Default)
Reads the login state from the locally running Doc2X desktop client. Best suited for personal development machines. Simply ensure the desktop client is logged in and running:
doc2x parse ./paper.pdf
OAuth Mode
Authenticates via browser-based OAuth login. Suited for servers, CI/CD environments, or machines without the desktop client:
# Log in (opens browser for OAuth authorization)
doc2x login
# After login, run commands with OAuth mode
doc2x --auth-mode oauth parse ./paper.pdf
# If you cannot open a browser, use --no-browser to get the URL manually
doc2x login --no-browser
# Log out (clear credentials)
doc2x logout
OAuth credentials are stored in the system application data directory (macOS: ~/Library/Application Support/doc2x/, Linux: ~/.config/doc2x/, Windows: %APPDATA%\doc2x\), and tokens are refreshed automatically.
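When scripting around OAuth mode (for example, to cache or clean up credentials in CI), it can help to resolve the storage directory per platform. The sketch below only encodes the three paths listed above; the function name `doc2x_credentials_dir` is hypothetical, and mapping `uname` values such as MINGW or CYGWIN to the Windows path is an assumption about Unix-style shells on Windows.

```shell
# Prints the documented credential directory for the current platform.
# An optional first argument overrides the detected OS name (useful for tests).
doc2x_credentials_dir() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/doc2x" ;;
    Linux) printf '%s\n' "$HOME/.config/doc2x" ;;
    MINGW*|MSYS*|CYGWIN*) printf '%s\n' "${APPDATA}\\doc2x" ;;
    *) echo "unknown platform: $os" >&2; return 1 ;;
  esac
}

# Usage: show what is stored, if anything.
#   ls -la "$(doc2x_credentials_dir)"
```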
Global Options
The following options apply to all commands:
| Option | Description | Default |
|---|---|---|
| --config <path> | Path to config file (YAML/JSON) | - |
| --auth-mode <mode> | Authentication mode: client or oauth | client |
| --timeout <ms> | API request timeout in milliseconds | 60000 |
| --retry <n> | Retry count for downloads/exports | 2 |
| --json | Output results as JSON | false |
| --quiet | Only output errors and final results | false |
| --verbose | Output debug information | false |
| --no-color | Disable colored output (auto-detected in non-TTY) | Auto |
Quick Start
The four most common scenarios:
# 1. Parse a PDF and export as Markdown (default)
doc2x parse ./paper.pdf
# 2. Parse a PDF and export as Word
doc2x parse ./paper.pdf --to docx
# 3. Translate a PDF to Chinese and export as Word
doc2x translate ./paper.pdf --target-language zh --to docx
# 4. Batch parse all PDFs in a directory to Markdown
doc2x batch parse ./docs --glob "**/*.pdf"
By default, output files are saved in the output folder under the current directory.
Document Parsing (parse)
The parse command processes PDF and image files. Use the --to flag to specify the output format: md (default), docx, tex, html, pdf, none (parse only, no export).
doc2x parse ./paper.pdf --to tex
doc2x parse ./image.png
Parse Options
| Option | Description | Default |
|---|---|---|
| --to <fmt> | Output format: none, md, tex, docx, html, pdf | md |
| --image-models <models...> | Image OCR models (doc2x is mandatory): doc2x, mathpix | doc2x |
| --vision-models <models...> | LLM vision models for image parsing (use IDs from doc2x models list) | - |
| --image-hosting <mode> | Image source: local or online | local |
| --formula-mode <mode> | Formula delimiter: normal or dollar | normal |
| --formula-level <level> | Formula downgrade level: normal, onlyLine, processAll | normal |
| --merge-cross-page-forms | Attempt to merge tables that span across pages | false |
| --remove-comments | Remove HTML comments from output | false |
| --avoid-indented-code-blocks | Code indentation compatibility | false |
| --out <path> | Output directory or file path | ./output |
| --name <pattern> | Filename template: {basename}, {date}, {lang} | {basename} |
| --overwrite | Overwrite existing output files | false |
Note on formula dollar mode: --formula-mode dollar is only effective when exporting to Markdown. It has no effect with other export formats.
Note on HTML export: HTML output relies on the MathJax CDN for formula rendering. For fully offline use, consider exporting to a format such as docx instead.
Note on online image hosting: When using --image-hosting online, images are stored for only 30 days.
Document Translation (translate)
The translate command translates PDF files. Set the target language with --target-language (the options table below lists zh as the default).
Supported Target Languages
zh (Chinese), en (English), ja (Japanese), fr (French), ru (Russian), pt (Portuguese), pt-BR (Brazilian Portuguese), es (Spanish), de (German), ko (Korean), ar (Arabic)
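A script can validate the language code against this list before starting a translation job. This is a minimal sketch; `is_supported_target_language` is a hypothetical helper, and it assumes the codes are case-sensitive exactly as written above.

```shell
# Returns 0 if $1 is one of the documented target language codes.
is_supported_target_language() {
  case $1 in
    zh|en|ja|fr|ru|pt|pt-BR|es|de|ko|ar) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires doc2x to be installed):
#   lang=ja
#   if is_supported_target_language "$lang"; then
#     doc2x translate ./paper.pdf --target-language "$lang"
#   else
#     echo "unsupported target language: $lang" >&2
#   fi
```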
Translation Modes
Two modes are available, controlled by --translate-type:
- Document translation (default: md): Extracts and translates content, then exports to Markdown, Word, HTML, etc. Supports bilingual output.
- Fixed-layout translation (pdf): Preserves the original document layout and outputs a translated PDF. The output format is fixed to .pdf.
# Document translation: translate to Chinese, export as Word
doc2x translate ./paper.pdf --target-language zh --to docx
# Fixed-layout translation: translate to Chinese, export as layout-preserved PDF
doc2x translate ./paper.pdf --target-language zh --translate-type pdf
Translate Options
The following options are specific to the translate command (all parse options are also inherited):
| Option | Description | Default |
|---|---|---|
| --translate-type <t> | Translation mode: md or pdf | md |
| --target-language <lang> | Target language code | zh |
| --target-model <model> | Translation LLM model | Default model |
| --term-id <id> | Glossary ID (get IDs via doc2x term list) | - |
| --font-color-extraction | Extract font color | false |
| --ignore-translate-types <list...> | Element types to skip: table, code, figure, reference | - |
| --convert-trans <t> | Export content: both (bilingual), origin (original only), translate (translated only) | both |
| --contextual-translation | Enable contextual translation enhancement | false |
Batch Processing (batch)
The batch command processes multiple files or entire directories at once, and automatically generates a processing report.
# Batch parse specific files in a directory
doc2x batch parse ./docs --glob "**/*.pdf" --to docx
# Batch translate, continue on errors
doc2x batch translate ./papers --glob "**/*.pdf" --target-language zh --continue-on-error
Batch Options
| Option | Description | Default |
|---|---|---|
| --glob <pattern> | Glob pattern for directory inputs | **/*.{pdf,png,jpg,jpeg} |
| --continue-on-error | Continue when a file fails | false |
| --skip-existing | Skip files with existing output | true |
| --report <path> | Report output path | ./doc2x-report.json |
| --dry-run | Print the list of files to be processed without executing | false |
The batch command also supports all parse and translate options.
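To get a rough idea of which files the default glob would pick up without calling the CLI at all, `find` can approximate it. This is only a sketch: `preview_batch_inputs` is a hypothetical helper, and the CLI's own matching (case sensitivity, hidden files, symlinks) may differ, so `--dry-run` remains the authoritative preview.

```shell
# Lists files under directory $1 whose extension matches the default
# batch glob **/*.{pdf,png,jpg,jpeg} (approximation using find).
preview_batch_inputs() {
  find "$1" -type f \( \
    -name '*.pdf' -o -name '*.png' -o -name '*.jpg' -o -name '*.jpeg' \
  \) | sort
}

# Usage:
#   preview_batch_inputs ./docs
```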
Model Query (models)
List available translation models for your account, including model IDs, names, and subscription requirements. Use the model ID from this list for the --target-model option in the translate command.
doc2x models list
# JSON format output
doc2x models list --json
Glossary Management (term)
Glossaries help unify terminology translation across documents.
Term Subcommands
| Subcommand | Description | Required Options |
|---|---|---|
| doc2x term list | List all glossaries | - |
| doc2x term create | Create a new glossary | --name <name> |
| doc2x term items | List items in a glossary | --term-id <id> |
| doc2x term import | Import terms from a CSV file | --term-id <id> --file <path> |
# List all glossaries
doc2x term list
# Create a glossary
doc2x term create --name "Technical Dictionary"
# List items in a glossary
doc2x term items --term-id <id>
# Import terms from a CSV file
doc2x term import --term-id <id> --file terms.csv
CSV Import Format
The CSV file supports the following columns (RFC 4180 compliant):
| Column | Description | Default |
|---|---|---|
| origin | Source term | (required) |
| translate | Translated term | (required) |
| originLang | Source language code | en |
| translateLang | Target language code | zh |
Supported language codes: zh, en, ja, fr, ru, pt, pt-BR, es, de, ko, ar.
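Because the file follows RFC 4180, a term that contains a comma or a double quote has to be wrapped in double quotes, with embedded quotes doubled. The helpers below are a hypothetical sketch for generating such rows from a script; they do not handle fields containing line breaks, and the third sample term is invented for illustration.

```shell
# Quote one CSV field per RFC 4180 when it contains a comma or a double quote.
csv_field() {
  case $1 in
    *\"* | *,*) printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')" ;;
    *) printf '%s' "$1" ;;
  esac
}

# Print all arguments as one comma-separated CSV row.
csv_row() {
  sep=''
  for field in "$@"; do
    printf '%s' "$sep"
    csv_field "$field"
    sep=','
  done
  printf '\n'
}

# Sample rows; redirect the output to a file such as terms.csv to import it.
csv_row origin translate originLang translateLang
csv_row 'machine learning' '机器学习' en zh
csv_row 'encoder, decoder' '编码器、解码器' en zh
```

Only the third row needs quoting here, since its source term contains a comma.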
Example CSV file:
origin,translate,originLang,translateLang
machine learning,机器学习,en,zh
neural network,神经网络,en,zh
Configuration & Output Settings
Output Control
Use --out to specify the output directory and --name to configure the filename template. Supported placeholders: {basename}, {date}, {lang}.
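To see what a template would produce, the placeholder substitution can be mimicked locally. This sketch is not the CLI's implementation: `expand_name` and `sed_escape` are hypothetical helpers, and the format of {date} is not specified in this guide, so the date is passed in explicitly.

```shell
# Escape characters that are special in a sed replacement: \, & and the | delimiter.
sed_escape() {
  printf '%s' "$1" | sed 's/[\\&|]/\\&/g'
}

# expand_name PATTERN BASENAME LANG DATE
# Replaces {basename}, {lang} and {date} in PATTERN with the given values.
expand_name() {
  printf '%s\n' "$1" | sed \
    -e "s|{basename}|$(sed_escape "$2")|g" \
    -e "s|{lang}|$(sed_escape "$3")|g" \
    -e "s|{date}|$(sed_escape "$4")|g"
}

# Usage: derive the basename from the input file, then preview the name.
#   base=$(basename ./paper.pdf .pdf)
#   expand_name '{basename}_{lang}' "$base" zh "$(date +%F)"
```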
# Specify output directory and filename with language suffix
doc2x translate ./paper.pdf --target-language zh --out ./results --name "{basename}_{lang}" --to docx
Configuration File
For frequently used parameter combinations, create a YAML or JSON configuration file and load it with --config.
Configuration priority (highest to lowest): CLI arguments > config file > built-in defaults.
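The precedence rule can be pictured as picking the first value that is set. The function below is an illustration only, using the hypothetical name `resolve_option` and treating an empty string as "not set"; it says nothing about how the CLI implements the lookup.

```shell
# resolve_option CLI_VALUE CONFIG_VALUE DEFAULT
# Prints the first non-empty value, mirroring CLI > config file > default.
resolve_option() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$3"
  fi
}

# Example for --to, whose built-in default is md:
resolve_option tex docx md   # prints "tex"  (CLI argument wins)
resolve_option ''  docx md   # prints "docx" (config file wins over default)
resolve_option ''  ''   md   # prints "md"   (built-in default)
```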
# doc2x.yaml
authMode: client
timeout: 60000
retry: 2
defaults:
  parse:
    to: docx
    imageModels: ["doc2x"]
    visionModels: []
    imageHosting: local
    formulaMode: normal
    formulaLevel: normal
    mergeCrossPageForms: false
    removeComments: false
    avoidIndentedCodeBlocks: false
    out: ./output
    name: "{basename}"
    overwrite: false
  translate:
    translateType: md
    targetLanguage: zh
    targetModel: "72"
    termId: ""
    fontColorExtraction: false
    ignoreTranslateTypes: []
    convertTrans: both
    contextualTranslation: false
  batch:
    glob: "**/*.{pdf,png,jpg,jpeg}"
    continueOnError: false
    skipExisting: true
    report: ./doc2x-report.json
    dryRun: false

doc2x --config ./doc2x.yaml parse ./paper.pdf
JSON Output
Use the --json flag to format command output as JSON, useful for scripting and automation:
doc2x parse ./paper.pdf --json
doc2x batch parse ./docs --json
Exit Codes
| Exit Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Argument error |
| 2 | Authentication failed |
| 3 | Input file error |
| 4 | Task processing failed |
| 5 | Export failed |
| 6 | Batch partial failure |
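In scripts, these codes can be turned into readable messages right after a command finishes. The mapping below copies the table above; the function name `describe_doc2x_exit` is hypothetical.

```shell
# Translate a doc2x exit code into the meaning listed in the table above.
describe_doc2x_exit() {
  case $1 in
    0) echo "Success" ;;
    1) echo "Argument error" ;;
    2) echo "Authentication failed" ;;
    3) echo "Input file error" ;;
    4) echo "Task processing failed" ;;
    5) echo "Export failed" ;;
    6) echo "Batch partial failure" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# Usage (requires doc2x to be installed):
#   doc2x batch parse ./docs --continue-on-error
#   status=$?
#   [ "$status" -eq 0 ] || echo "doc2x: $(describe_doc2x_exit "$status")" >&2
```

A CI job might treat code 6 as a warning and inspect the batch report, while failing outright on the other non-zero codes.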
Troubleshooting
| Issue / Error | Suggested Fix |
|---|---|
| Cannot connect to desktop client | Confirm the desktop client is running and logged in. On servers or headless environments, run doc2x login and switch to --auth-mode oauth |
| OAuth login failed | Ensure network access to Doc2X services, or try doc2x login --no-browser and copy the URL manually |
| Output file not generated or unchanged | The CLI does not overwrite existing files by default. Add --overwrite to force update |
| Image upload rejected or failed | Check image format and size. Max 3 MB per image; webp, tiff, and tif are not supported — convert to png or jpg |
| Insufficient quota / model requires subscription | Check your account quota and subscription status. Some features (e.g., mathpix model) may require a premium account |