Doc2X CLI User Guide

doc2x-cli is a command-line tool provided by Doc2X for parsing and converting PDF or image files into Markdown, Word, LaTeX, HTML, or PDF formats, with support for multi-language document translation and batch processing.

Command Overview

| Command | Description |
| --- | --- |
| doc2x login | Log in to Doc2X via browser (OAuth) |
| doc2x logout | Clear stored OAuth credentials |
| doc2x parse <input> | Parse a PDF or image and export to a structured document format |
| doc2x translate <input> | Parse and translate a PDF file |
| doc2x batch <action> [inputs...] | Batch parse or translate multiple files or directories |
| doc2x models list | List translation models available for your account |
| doc2x term list | List glossaries and their IDs configured for your account |

Requirements & Installation

  • Runtime: Node.js 22 or higher
  • Network: Access to Doc2X services is required
  • File limits: PDF up to 300 MB, images up to 3 MB
  • Supported image formats: png, jpg, jpeg, gif, bmp (webp, tiff, and tif are not supported)
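The limits above can be checked locally before a file is submitted. The following is a minimal sketch, not part of doc2x-cli: doc2x_preflight is a hypothetical helper, "MB" is assumed to mean 1024 × 1024 bytes, and extensions are compared case-insensitively as a convenience.

```shell
# Limits from the requirements list; "MB" is ASSUMED to mean 1024*1024 bytes.
PDF_MAX_BYTES=$((300 * 1024 * 1024))
IMG_MAX_BYTES=$((3 * 1024 * 1024))

# doc2x_preflight FILE (hypothetical helper)
# Succeeds if FILE has a supported extension and is within the size limit.
doc2x_preflight() {
  file=$1
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }
  # Lower-case the extension so that .PDF and .pdf are treated alike.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  size=$(wc -c < "$file" | tr -d ' ')
  case $ext in
    pdf) max=$PDF_MAX_BYTES ;;
    png|jpg|jpeg|gif|bmp) max=$IMG_MAX_BYTES ;;
    *) echo "unsupported format: .$ext" >&2; return 1 ;;
  esac
  if [ "$size" -gt "$max" ]; then
    echo "too large: $file ($size bytes, limit $max)" >&2
    return 1
  fi
  return 0
}
```

A script could call doc2x_preflight ./paper.pdf and only invoke doc2x parse when it succeeds.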

Prerequisites

Configure the GitHub Packages registry for the @noedgeai scope (one-time setup):

```bash
npm config set @noedgeai:registry=https://npm.pkg.github.com
```

This package is hosted on GitHub Packages and is publicly accessible — no login or token is required.

Installation

Install the package globally via npm:

```bash
npm i -g @noedgeai/doc2x-cli@latest

# Verify installation
doc2x --version
```

Tip: If you get command not found: doc2x after installation, make sure npm's global bin directory is in your system PATH.
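One way to check the PATH is sketched below for Unix-like shells. It relies on npm linking global executables into the bin directory under its prefix (on Windows they go directly into the prefix, which this sketch does not cover). path_contains and check_doc2x_path are hypothetical helper names.

```shell
# path_contains DIR PATH_STRING (hypothetical helper)
# Succeeds if DIR is one of the colon-separated entries of PATH_STRING.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_doc2x_path (hypothetical helper)
# Reports whether npm's global bin directory is on PATH (Unix-like systems).
check_doc2x_path() {
  bin_dir="$(npm config get prefix)/bin"
  if path_contains "$bin_dir" "$PATH"; then
    echo "$bin_dir is on PATH"
  else
    echo "Add to your shell profile: export PATH=\"$bin_dir:\$PATH\""
  fi
}
```

Run check_doc2x_path after installation; if it prints the export line, add that line to your shell profile and open a new terminal.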

Authentication

doc2x-cli supports two authentication modes, switchable via the --auth-mode flag.

Client Mode (Default)

Reads the login state from the locally running Doc2X desktop client. Best suited for personal development machines. Simply ensure the desktop client is logged in and running:

```bash
doc2x parse ./paper.pdf
```

OAuth Mode

Authenticates via browser-based OAuth login. Suited for servers, CI/CD environments, or machines without the desktop client:

```bash
# Log in (opens browser for OAuth authorization)
doc2x login

# After login, run commands with OAuth mode
doc2x --auth-mode oauth parse ./paper.pdf

# If you cannot open a browser, use --no-browser to get the URL manually
doc2x login --no-browser

# Log out (clear credentials)
doc2x logout
```

OAuth credentials are stored in the system application data directory (macOS: ~/Library/Application Support/doc2x/, Linux: ~/.config/doc2x/, Windows: %APPDATA%\doc2x\) and tokens are refreshed automatically.
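To locate that directory from a script, for example when backing it up or inspecting it, a sketch along these lines may help. doc2x_credentials_dir is a hypothetical helper; it simply maps the platform name reported by uname to the paths listed above, and the Windows branch assumes a Git Bash, MSYS, or Cygwin shell where APPDATA is set.

```shell
# doc2x_credentials_dir [OS_NAME] (hypothetical helper)
# Prints the documented credential directory for the given platform
# (defaults to the output of `uname -s`).
doc2x_credentials_dir() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/doc2x" ;;
    Linux) printf '%s\n' "$HOME/.config/doc2x" ;;
    # ASSUMPTION: Windows is reached through a POSIX-style shell.
    MINGW*|MSYS*|CYGWIN*) printf '%s\n' "${APPDATA}\\doc2x" ;;
    *) echo "unknown platform: $os" >&2; return 1 ;;
  esac
}
```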

Global Options

The following options apply to all commands:

| Option | Description | Default |
| --- | --- | --- |
| --config <path> | Path to config file (YAML/JSON) | - |
| --auth-mode <mode> | Authentication mode: client or oauth | client |
| --timeout <ms> | API request timeout in milliseconds | 60000 |
| --retry <n> | Retry count for downloads/exports | 2 |
| --json | Output results as JSON | false |
| --quiet | Only output errors and final results | false |
| --verbose | Output debug information | false |
| --no-color | Disable colored output (auto-detected in non-TTY) | Auto |

Quick Start

The four most common scenarios:

```bash
# 1. Parse a PDF and export as Markdown (default)
doc2x parse ./paper.pdf

# 2. Parse a PDF and export as Word
doc2x parse ./paper.pdf --to docx

# 3. Translate a PDF to Chinese and export as Word
doc2x translate ./paper.pdf --target-language zh --to docx

# 4. Batch parse all PDFs in a directory to Markdown
doc2x batch parse ./docs --glob "**/*.pdf"
```

By default, output files are saved in the output folder under the current directory.

Document Parsing (parse)

The parse command processes PDF and image files. Use the --to flag to specify the output format: md (default), docx, tex, html, pdf, or none (parse only, no export).

```bash
doc2x parse ./paper.pdf --to tex
doc2x parse ./image.png
```

Parse Options

| Option | Description | Default |
| --- | --- | --- |
| --to <fmt> | Output format: none, md, tex, docx, html, pdf | md |
| --image-models <models...> | Image OCR models (doc2x is mandatory): doc2x, mathpix | doc2x |
| --vision-models <models...> | LLM vision models for image parsing (use IDs from doc2x models list) | - |
| --image-hosting <mode> | Image source: local or online | local |
| --formula-mode <mode> | Formula delimiter: normal or dollar | normal |
| --formula-level <level> | Formula downgrade level: normal, onlyLine, processAll | normal |
| --merge-cross-page-forms | Attempt to merge tables that span pages | false |
| --remove-comments | Remove HTML comments from output | false |
| --avoid-indented-code-blocks | Code indentation compatibility | false |
| --out <path> | Output directory or file path | ./output |
| --name <pattern> | Filename template: {basename}, {date}, {lang} | {basename} |
| --overwrite | Overwrite existing output files | false |

Note on formula dollar mode: --formula-mode dollar takes effect only when exporting to Markdown; other export formats ignore it.

Note on HTML export: HTML output relies on MathJax CDN for formula rendering. For fully offline use, consider exporting to docx or pdf instead.

Note on online image hosting: When using --image-hosting online, images are only stored for 30 days.

Document Translation (translate)

The translate command translates PDF files. Set the target language with --target-language; the options table below lists zh as the default.

Supported Target Languages

zh (Chinese), en (English), ja (Japanese), fr (French), ru (Russian), pt (Portuguese), pt-BR (Brazilian Portuguese), es (Spanish), de (German), ko (Korean), ar (Arabic)
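A script that accepts a language code from user input can validate it against this list before calling the CLI. The following is a small sketch; is_supported_language is a hypothetical helper, and the match is case-sensitive on the assumption that codes must be written exactly as listed.

```shell
# is_supported_language CODE (hypothetical helper)
# Succeeds if CODE is one of the documented target language codes.
is_supported_language() {
  case $1 in
    zh|en|ja|fr|ru|pt|pt-BR|es|de|ko|ar) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_language "$lang" || { echo "unsupported language: $lang" >&2; exit 1; }` stops a script early instead of waiting for the CLI to reject the request.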

Translation Modes

Two modes are available, controlled by --translate-type:

  • Document translation (md, the default): Extracts and translates content, then exports to Markdown, Word, HTML, etc. Supports bilingual output.
  • Fixed-layout translation (pdf): Preserves the original document layout and outputs a translated PDF. The output format is fixed to .pdf.

```bash
# Document translation: translate to Chinese, export as Word
doc2x translate ./paper.pdf --target-language zh --to docx

# Fixed-layout translation: translate to Chinese, export as layout-preserved PDF
doc2x translate ./paper.pdf --target-language zh --translate-type pdf
```

Translate Options

The following options are specific to the translate command (all parse options are also inherited):

| Option | Description | Default |
| --- | --- | --- |
| --translate-type <t> | Translation mode: md or pdf | md |
| --target-language <lang> | Target language code | zh |
| --target-model <model> | Translation LLM model | Default model |
| --term-id <id> | Glossary ID (get IDs via doc2x term list) | - |
| --font-color-extraction | Extract font color | false |
| --ignore-translate-types <list...> | Element types to skip: table, code, figure, reference | - |
| --convert-trans <t> | Export content: both (bilingual), origin (original only), translate (translated only) | both |
| --contextual-translation | Enable contextual translation enhancement | false |

Batch Processing (batch)

The batch command processes multiple files or entire directories at once, and automatically generates a processing report.

```bash
# Batch parse every PDF under a directory and export as Word
doc2x batch parse ./docs --glob "**/*.pdf" --to docx

# Batch translate, continue on errors
doc2x batch translate ./papers --glob "**/*.pdf" --target-language zh --continue-on-error
```

Batch Options

| Option | Description | Default |
| --- | --- | --- |
| --glob <pattern> | Glob pattern for directory inputs | **/*.{pdf,png,jpg,jpeg} |
| --continue-on-error | Continue when a file fails | false |
| --skip-existing | Skip files with existing output | true |
| --report <path> | Report output path | ./doc2x-report.json |
| --dry-run | Print the list of files to be processed without executing | false |

The batch command also supports all parse and translate options.
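A cautious workflow is to preview the file list with --dry-run and start the real run only if the preview succeeds. The sketch below assumes that --dry-run can be appended after the inputs, as the other options are in the examples above, and that a successful dry run exits with 0. batch_with_preview is a hypothetical helper, not a doc2x command.

```shell
# batch_with_preview ACTION [INPUTS_AND_OPTIONS...] (hypothetical helper)
# Runs `doc2x batch` once with --dry-run, then again for real if the preview succeeded.
batch_with_preview() {
  # ASSUMPTION: --dry-run is accepted after the inputs and exits 0 on success.
  doc2x batch "$@" --dry-run || return $?
  doc2x batch "$@"
}
```

For example, `batch_with_preview parse ./docs --glob "**/*.pdf" --to docx` would first print the matched files and then process them.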

Model Query (models)

List available translation models for your account, including model IDs, names, and subscription requirements. Use the model ID from this list for the --target-model option in the translate command.

```bash
doc2x models list

# JSON format output
doc2x models list --json
```

Glossary Management (term)

Glossaries help unify terminology translation across documents.

Term Subcommands

| Subcommand | Description | Required Options |
| --- | --- | --- |
| doc2x term list | List all glossaries | - |
| doc2x term create | Create a new glossary | --name <name> |
| doc2x term items | List items in a glossary | --term-id <id> |
| doc2x term import | Import terms from a CSV file | --term-id <id> --file <path> |

```bash
# List all glossaries
doc2x term list

# Create a glossary
doc2x term create --name "Technical Dictionary"

# List items in a glossary
doc2x term items --term-id <id>

# Import terms from a CSV file
doc2x term import --term-id <id> --file terms.csv
```

CSV Import Format

The CSV file supports the following columns (RFC 4180 compliant):

| Column | Description | Default |
| --- | --- | --- |
| origin | Source term | (required) |
| translate | Translated term | (required) |
| originLang | Source language code | en |
| translateLang | Target language code | zh |

Supported language codes: zh, en, ja, fr, ru, pt, pt-BR, es, de, ko, ar.

Example CSV file:

```text
origin,translate,originLang,translateLang
machine learning,机器学习,en,zh
neural network,神经网络,en,zh
```
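Terms that contain a comma or a double quote must be quoted under RFC 4180, with embedded quotes doubled. The sketch below generates such rows from a shell script. csv_field and term_row are hypothetical helpers; fields with embedded line breaks are not handled, and the language columns are written unquoted on the assumption that they are plain codes.

```shell
# csv_field VALUE (hypothetical helper)
# Prints VALUE as an RFC 4180 field: quoted if it contains a comma or
# double quote, with embedded double quotes doubled.
csv_field() {
  case $1 in
    *'"'*|*,*) printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')" ;;
    *) printf '%s' "$1" ;;
  esac
}

# term_row ORIGIN TRANSLATE [ORIGIN_LANG] [TRANSLATE_LANG] (hypothetical helper)
# Prints one CSV row, using the documented defaults en and zh.
term_row() {
  printf '%s,%s,%s,%s\n' "$(csv_field "$1")" "$(csv_field "$2")" "${3:-en}" "${4:-zh}"
}

# Example usage (commented out so that sourcing this file writes nothing):
# {
#   echo 'origin,translate,originLang,translateLang'
#   term_row 'machine learning' '机器学习'
#   term_row 'loss, training' '训练损失'
# } > terms.csv
```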

Configuration & Output Settings

Output Control

Use --out to specify the output directory and --name to configure the filename template. Supported placeholders: {basename}, {date}, {lang}.

```bash
# Specify output directory and filename with language suffix
doc2x translate ./paper.pdf --target-language zh --out ./results --name "{basename}_{lang}" --to docx
```

Configuration File

For frequently used parameter combinations, create a YAML or JSON configuration file and load it with --config.

Configuration priority (highest to lowest): CLI arguments > config file > built-in defaults.

```yaml
# doc2x.yaml
authMode: client
timeout: 60000
retry: 2
defaults:
  parse:
    to: docx
    imageModels: ["doc2x"]
    visionModels: []
    imageHosting: local
    formulaMode: normal
    formulaLevel: normal
    mergeCrossPageForms: false
    removeComments: false
    avoidIndentedCodeBlocks: false
    out: ./output
    name: "{basename}"
    overwrite: false
  translate:
    translateType: md
    targetLanguage: zh
    targetModel: "72"
    termId: ""
    fontColorExtraction: false
    ignoreTranslateTypes: []
    convertTrans: both
    contextualTranslation: false
  batch:
    glob: "**/*.{pdf,png,jpg,jpeg}"
    continueOnError: false
    skipExisting: true
    report: ./doc2x-report.json
    dryRun: false
```

```bash
doc2x --config ./doc2x.yaml parse ./paper.pdf
```
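The stated priority order (CLI arguments, then config file, then built-in defaults) can be illustrated with a small shell helper. This only sketches the rule as documented; it is not how doc2x-cli resolves settings internally, and resolve_setting is a hypothetical name.

```shell
# resolve_setting CLI_VALUE CONFIG_VALUE DEFAULT_VALUE (hypothetical helper)
# Prints the first non-empty value, mirroring the documented order:
# CLI arguments > config file > built-in defaults.
resolve_setting() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$3"
  fi
}
```

By that rule, with the doc2x.yaml shown earlier (parse output set to docx), adding --to tex on the command line would be expected to produce LaTeX, while omitting --to would produce Word instead of the built-in default md.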

JSON Output

Use the --json flag to format command output as JSON, useful for scripting and automation:

```bash
doc2x parse ./paper.pdf --json
doc2x batch parse ./docs --json
```

Exit Codes

| Exit Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Argument error |
| 2 | Authentication failed |
| 3 | Input file error |
| 4 | Task processing failed |
| 5 | Export failed |
| 6 | Batch partial failure |
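Scripts can branch on these codes. The sketch below maps each documented code to its meaning and wraps a doc2x call so that failures are reported on stderr with a readable message. doc2x_exit_meaning and run_doc2x are hypothetical helper names, not part of the CLI.

```shell
# doc2x_exit_meaning CODE (hypothetical helper)
# Prints the documented meaning of a doc2x exit code.
doc2x_exit_meaning() {
  case $1 in
    0) echo "Success" ;;
    1) echo "Argument error" ;;
    2) echo "Authentication failed" ;;
    3) echo "Input file error" ;;
    4) echo "Task processing failed" ;;
    5) echo "Export failed" ;;
    6) echo "Batch partial failure" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# run_doc2x ARGS... (hypothetical helper)
# Runs doc2x with the given arguments, explains any failure on stderr,
# and returns the original exit code to the caller.
run_doc2x() {
  status=0
  doc2x "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "doc2x failed ($status): $(doc2x_exit_meaning "$status")" >&2
  fi
  return "$status"
}
```

A CI job could, for instance, call `run_doc2x batch parse ./docs` and decide for itself whether code 6 should fail the job or only raise a warning.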

Troubleshooting

| Issue / Error | Suggested Fix |
| --- | --- |
| Cannot connect to desktop client | Confirm the desktop client is running and logged in. On servers or in headless environments, run doc2x login and switch to --auth-mode oauth |
| OAuth login failed | Ensure network access to Doc2X services, or run doc2x login --no-browser and open the URL manually |
| Output file not generated or unchanged | The CLI does not overwrite existing files by default. Add --overwrite to force an update |
| Image upload rejected or failed | Check image format and size. The limit is 3 MB per image; webp, tiff, and tif are not supported, so convert them to png or jpg |
| Insufficient quota / model requires subscription | Check your account quota and subscription status. Some features (e.g., the mathpix model) may require a premium account |