Doc2X Common Questions - FAQ

Doc2X is a powerful document parsing and translation tool that supports recognition, conversion, and translation of multiple document formats. This guide will help you quickly understand and use Doc2X's various features.

Translation - Points Rules

Doc2X has updated the translation points rules for web and client versions. For details, please refer to Doc2X Points Rules.

Currently Supported Features

Core Recognition Features

  • Multi-element Recognition: Supports recognition of text, formulas, tables, and images
  • Multi-column Recognition: Supports recognition of multi-column documents and restores correct reading order
  • Multi-language Support: Supports Chinese (Simplified/Traditional), English, Western European languages, Japanese, Korean, etc.
    • Support for more languages is being added

Advanced Features (In Development)

The following features are supported but their effectiveness is continuously being optimized:

Current Limitations

  • Extra-long or extra-wide images are not supported: split them manually into normal page sizes first
  • Documents with excessive blank margins are not supported: crop the blank margins manually first
  • Rotated PDFs are not supported: rotate the PDF to the correct orientation before recognition
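
As a rough illustration of the manual split for extra-long images, the sketch below computes crop boxes that cut a tall image into page-sized tiles. It is not part of Doc2X: the function name page_crop_boxes and the A4 portrait ratio used for a "normal" page are assumptions made for this example, and extra-wide images would need the same logic applied along the width.

```python
import math


def page_crop_boxes(width, height, page_ratio=297 / 210):
    """Return (left, top, right, bottom) boxes that cut a tall image into tiles.

    page_ratio is the assumed height/width of a "normal" page (A4 portrait here).
    The last tile may be shorter than the others.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    tile_height = max(1, round(width * page_ratio))
    count = math.ceil(height / tile_height)
    return [
        (0, i * tile_height, width, min((i + 1) * tile_height, height))
        for i in range(count)
    ]


if __name__ == "__main__":
    # A 1000 x 5000 pixel screenshot becomes four tiles.
    for box in page_crop_boxes(1000, 5000):
        print(box)
```

The tuples use the (left, upper, right, lower) order that image libraries such as Pillow accept for cropping, so each box can be passed to a crop call and saved as its own page before upload.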

Processing Speed and Concurrency

Processing Speed

  • Web and API: Average speed for a single PDF is approximately 10 pages/second
  • Actual Speed: Depends on document complexity
  • API Acceleration: Contact us for faster processing speeds
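
For planning purposes, the average figure above translates into a simple estimate. The helper below is only a sketch: estimated_seconds is a hypothetical name, and the result is a rough guide because actual speed depends on document complexity.

```python
def estimated_seconds(page_count, pages_per_second=10.0):
    """Rough processing time for one PDF, based on the quoted average speed."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return page_count / pages_per_second


if __name__ == "__main__":
    # A 250-page report at the average speed takes about 25 seconds.
    print(estimated_seconds(250))
```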

Concurrency Limits

  • Default API Concurrency: 5 PDFs processed simultaneously
  • Increase Concurrency: Contact us for higher concurrency needs
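
A client that submits many PDFs can stay within the default limit by capping its own in-flight requests. The sketch below uses a standard asyncio semaphore for that. The submit_pdf function is a hypothetical placeholder, not a Doc2X API call, and should be replaced with your real upload-and-parse request.

```python
import asyncio

MAX_CONCURRENCY = 5  # default API concurrency stated in this FAQ


async def submit_pdf(path):
    """Hypothetical placeholder: replace with your real Doc2X API request."""
    await asyncio.sleep(0.01)  # stands in for network and processing time
    return f"parsed:{path}"


async def process_all(paths, limit=MAX_CONCURRENCY, worker=submit_pdf):
    """Run worker over all paths with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(path):
        async with semaphore:  # waits here while `limit` workers are busy
            return await worker(path)

    # gather returns results in the same order as the input paths
    return await asyncio.gather(*(guarded(p) for p in paths))


if __name__ == "__main__":
    results = asyncio.run(process_all([f"doc{i}.pdf" for i in range(12)]))
    print(len(results))
```

If your account has been granted higher concurrency, pass a larger limit instead of changing the worker.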

Large Batch Data Processing

Enterprise Services

  • Large Volume Processing: Contact us for discount pricing on large PDF processing volumes
  • Clients Served: Several well-known large language model companies, as well as financial and educational institutions
  • Infrastructure: Self-built compute pools with hundreds of GPUs and multi-data-center redundancy for stability
  • Processing Capacity: Handles tens of millions of document pages per day, with billions of pages processed to date

Why Choose Doc2X

Core Advantages

  1. Leading Formula Recognition: Formula recognition, especially for inline and complex formulas, is a common weak point of similar products on the market, while Doc2X performs at a leading level
  2. Excellent Table Recognition: Supports advanced features such as recognizing images within tables and merging tables that span pages
  3. Precise Multi-column Recognition: Accurately restores reading order in multi-column documents
  4. Wide Adaptation Range: Supports various document types including financial research reports, academic papers, educational materials, patents, etc.

Data Security

  • Web Storage: Files are retained for 30 days (including hosted images)
  • API Storage: Files expire after 24 hours
  • Automatic Deletion: Files are deleted automatically once they expire, so you can use the service with confidence
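
If you track results on your side, the retention windows above can be encoded so that files are downloaded before they expire. This is a minimal sketch with hypothetical names (RETENTION, expires_at, is_expired). It assumes the clock starts when the file is created, which this FAQ does not state.

```python
from datetime import datetime, timedelta, timezone

# Retention windows quoted in this FAQ.
RETENTION = {
    "web": timedelta(days=30),
    "api": timedelta(hours=24),
}


def expires_at(created_at, channel):
    """Expiry time, assuming retention is counted from creation time."""
    return created_at + RETENTION[channel]


def is_expired(created_at, channel, now=None):
    """True once the retention window for the given channel has passed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at(created_at, channel)


if __name__ == "__main__":
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(expires_at(created, "api"))
    print(expires_at(created, "web"))
```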