Doc2X FAQ
Doc2X is a document parsing and translation tool that recognizes, converts, and translates multiple document formats. This guide summarizes what Doc2X supports and how to use its features.
Quick Links
- Website: doc2x.noedgeai.com
- API Access: open.noedgeai.com (quotas are not shared with web version)
- API v2 Documentation: Doc2X API v2 Interface Documentation
- Zotero Plugin Tutorial: Doc2X Zotero Plugin Usage Guide
- Feature Introduction: https://noedgeai.com
Translation - Points Rules
Doc2X has updated the translation points rules for web and client versions. For details, please refer to Doc2X Points Rules.
Currently Supported Features
Core Recognition Features
- ✅ Multi-element Recognition: Supports recognition of text, formulas, tables, and images
- ✅ Multi-column Recognition: Supports recognition of multi-column documents and restores correct reading order
- ✅ Multi-language Support: Supports Chinese (Simplified/Traditional), English, Western European languages, Japanese, Korean, etc.
- Support for more languages is being added
Advanced Features (In Development)
The following features are supported, but their accuracy is still being improved:
- 🔄 Cross-page Table Merging (API)
- 🔄 Handwriting Recognition
- 🔄 Vertical Text
- 🔄 Multi-level Title Support
Current Limitations
- ❌ Extra-long or extra-wide images are not supported: split them manually into normal page sizes before recognition
- ❌ Documents with excessive blank margins are not supported: crop the blank margins manually before recognition
- ❌ Rotated PDFs are not supported: rotate the PDF to the correct orientation manually before recognition
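For the extra-long image case, the manual split can be planned with a few lines of arithmetic. The sketch below is illustrative only and is not part of Doc2X: the function name `page_boxes` and the default aspect ratio of 1.414 (roughly A4 portrait) are assumptions. It returns crop boxes in the `(left, upper, right, lower)` order that common imaging libraries such as Pillow accept, and it cuts at fixed heights, so a cut may land in the middle of a text line and should be checked by eye.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower), in pixels


def page_boxes(width: int, height: int, page_ratio: float = 1.414) -> List[Box]:
    """Plan vertical slices of a tall image, each at most width * page_ratio high.

    page_ratio is a hypothetical default (about A4 portrait); adjust as needed.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    page_height = max(1, round(width * page_ratio))
    boxes: List[Box] = []
    top = 0
    while top < height:
        bottom = min(top + page_height, height)  # last slice may be shorter
        boxes.append((0, top, width, bottom))
        top = bottom
    return boxes


if __name__ == "__main__":
    for box in page_boxes(1000, 3000):
        print(box)
```

Each box can then be passed to an image library's crop routine, and the resulting slices uploaded as separate pages.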
Processing Speed and Concurrency
Processing Speed
- Web and API: Average speed for a single PDF is approximately 10 pages/second
- Actual Speed: Depends on document complexity
- API Acceleration: Contact us for faster processing speeds
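As a rough planning aid, the average figure above translates into a simple estimate. This is a back-of-the-envelope sketch, not a guarantee: the function name `estimated_seconds` is hypothetical, and real throughput depends on document complexity.

```python
def estimated_seconds(pages: int, pages_per_second: float = 10.0) -> float:
    """Estimate processing time from the stated average of about 10 pages/second."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second


if __name__ == "__main__":
    # A 250-page PDF at the average rate
    print(estimated_seconds(250))
```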
Concurrency Limits
- Default API Concurrency: 5 PDFs processed simultaneously
- Increase Concurrency: Contact us for higher concurrency needs
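On the client side, one way to stay within the default limit is to cap the number of PDFs in flight. The sketch below assumes a caller-supplied function `process_one` (hypothetical; it stands in for whatever code uploads a PDF and waits for its result) and uses Python's standard thread pool. It does not call any Doc2X endpoint itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")

MAX_CONCURRENT_PDFS = 5  # default API concurrency stated above


def process_all(
    paths: Iterable[str],
    process_one: Callable[[str], T],
    max_concurrent: int = MAX_CONCURRENT_PDFS,
) -> List[T]:
    """Run process_one over paths with at most max_concurrent calls in flight.

    Results are returned in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(process_one, paths))
```

If a higher limit has been arranged, `max_concurrent` can be raised to match it.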
Large Batch Data Processing
Enterprise Services
- Large Volume Processing: Contact us for discount pricing on large PDF processing volumes
- Served Clients: Multiple well-known large language model companies, financial and educational institutions
- Infrastructure: A self-built compute pool of hundreds of GPUs, with multi-data-center redundancy for stability
- Processing Capacity: Can process tens of millions of document pages per day; billions of pages processed to date
Why Choose Doc2X
Core Advantages
- Leading Formula Recognition: Doc2X performs at a leading level in formula recognition, especially for inline and complex formulas, where similar products on the market perform poorly
- Excellent Table Recognition: Supports advanced features like recognizing images within tables and cross-page table merging
- Precise Multi-column Recognition: Outstanding performance in restoring reading order for multi-column documents
- Wide Adaptation Range: Supports various document types including financial research reports, academic papers, educational materials, patents, etc.
Data Security
- Web Storage: Files are kept for 30 days (including image hosting)
- API Storage: Files expire after 24 hours
- Automatic Deletion: Files are deleted automatically once they expire, so you can use the service with confidence
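To plan downloads around these retention windows, the expiry time can be computed locally. This is a minimal sketch under one labeled assumption: the retention clock is taken to start at upload time, which the rules above do not state explicitly. The names `RETENTION`, `expires_at`, and `is_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention windows as stated above
RETENTION = {
    "web": timedelta(days=30),
    "api": timedelta(hours=24),
}


def expires_at(uploaded_at: datetime, channel: str) -> datetime:
    """Return the expiry time, assuming the clock starts at upload (an assumption)."""
    return uploaded_at + RETENTION[channel]


def is_expired(uploaded_at: datetime, channel: str,
               now: Optional[datetime] = None) -> bool:
    """True once the retention window for the given channel has elapsed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at(uploaded_at, channel)
```

Because API results expire after 24 hours, it is safest to download and store them as soon as processing finishes.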