❤️ myPDF.AI API Documentation
Extract data from PDFs, images, and URLs using our powerful REST API. Our artificial intelligence processes documents of any complexity with unmatched accuracy.
Complete Postman Collection
Access the full Postman collection with every endpoint, example, and testing script:
View Postman Documentation
Quick Start
Begin extracting data in under five minutes:
- Retrieve your access token
- Send your first request
- Receive structured data
Base URL
https://pdf.mypdf-ai.com
Supported Formats
- PDFs: All PDF files, including scanned documents
- Images: PNG, JPG, JPEG, GIF, BMP, TIFF
- URLs: Direct links to PDFs or images
Authentication
Access Token
Every request must include your access token in the access_token header. You receive this token after purchasing credits.
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"'
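One way to avoid repeating the token in every command is to read it from an environment variable. The sketch below assumes a variable named `MYPDF_TOKEN` and a helper named `mypdf_upload`; both names are invented for this example and are not part of the API.

```shell
# Hypothetical helper: MYPDF_TOKEN, MYPDF_BASE_URL and mypdf_upload are
# names chosen for this sketch, not part of the myPDF.AI API.
MYPDF_BASE_URL="${MYPDF_BASE_URL:-https://pdf.mypdf-ai.com}"

# Usage: mypdf_upload <endpoint> <form-field> <file>
mypdf_upload() {
  # Refuse to send a request without a token.
  if [ -z "${MYPDF_TOKEN:-}" ]; then
    echo "MYPDF_TOKEN is not set" >&2
    return 1
  fi
  # The token travels in the access_token header, as in the example above.
  curl --silent --show-error --location "${MYPDF_BASE_URL}$1" \
    --header "access_token: ${MYPDF_TOKEN}" \
    --form "$2=@\"$3\""
}
```

With the token exported once (`export MYPDF_TOKEN=...`), a call such as `mypdf_upload /api/pdf pdf document.pdf` sends the same request as the cURL command above.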
PDF Extraction
/api/pdf
Extracts structured data from PDF files, including text, tables, and metadata.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| pdf | file | Required | PDF file to process |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"'
Image Extraction
/api/image
Uses AI-assisted OCR to extract text and data from images.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| image | file | Required | Image file (PNG, JPG, JPEG, GIF, BMP, TIFF) |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/image' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'image=@"document.png"'
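Because PDFs and images go to different endpoints, a script that handles mixed input has to choose between them. A minimal sketch, assuming the file extension is a reliable indicator of the format; `endpoint_for_file` is a hypothetical helper name, and only the extensions listed under Supported Formats are recognized.

```shell
# endpoint_for_file is a hypothetical helper name for this sketch.
# Prints the endpoint path for a local file, based on its extension.
endpoint_for_file() {
  # Lower-case the name so "SCAN.PNG" and "scan.png" are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf) echo "/api/pdf" ;;
    *.png|*.jpg|*.jpeg|*.gif|*.bmp|*.tiff) echo "/api/image" ;;
    *) echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}
```

The form field name differs as well: `/api/pdf` expects `pdf` and `/api/image` expects `image`, so a caller would switch both together.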
URL Extraction
/api/url
Extracts data from files hosted on public URLs. Supports both PDFs and images.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Required | Public URL of the file to process |
| type | string | Required | File type: "pdf" or "image" |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/url' \
--header 'Content-Type: application/json' \
--header 'access_token: YOUR_TOKEN_HERE' \
--data '{
"url": "https://example.com/document.pdf",
"type": "pdf"
}'
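When the URL comes from a variable rather than being typed by hand, building the JSON body in a small function helps keep the quoting correct. This is a sketch only: `url_payload` is a hypothetical helper name, and it escapes just backslashes and double quotes, which is assumed to be enough for ordinary URLs.

```shell
# url_payload is a hypothetical helper name for this sketch.
# Usage: url_payload <url> <pdf|image>
url_payload() {
  # The type parameter accepts only the two documented values.
  case "$2" in
    pdf|image) ;;
    *) echo "type must be \"pdf\" or \"image\"" >&2; return 1 ;;
  esac
  # Escape backslashes and double quotes so the URL stays a valid JSON string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"url": "%s", "type": "%s"}\n' "$escaped" "$2"
}
```

The output can be passed to the request above with `--data "$(url_payload "$FILE_URL" pdf)"`, where `FILE_URL` is whatever variable holds the link.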
OCR Parse JSON
/api/parse-json
Processes PDFs with OCR and returns structured JSON with metadata, coordinates, and detected tables.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| file | file | Required | PDF file to process with positional data |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/parse-json' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'file=@"document.pdf"'
OCR Parse Markdown
/api/parse-markdown
Converts PDFs into formatted Markdown and detailed JSON with layout analysis.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| file | file | Required | PDF file used to generate Markdown and enriched JSON |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/parse-markdown' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'file=@"document.pdf"'
RAG Setup PDF
/api/setup-pdf
Processes a PDF with OCR and sends the extracted data to configure a RAG knowledge base in the customer's Supabase project.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| pdf | file | Required | PDF document that will be processed and forwarded to the RAG flow |
| supabase_url | string | Required | Destination Supabase project URL |
| supabase_key | string | Required | Supabase key with the necessary permissions (service_role recommended) |
| role_type | string | Optional | Defines the type of key provided (default: service_role) |
| access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"' \
  --form 'supabase_url=https://your-project.supabase.co' \
  --form 'supabase_key=SERVICE_ROLE_KEY' \
  --form 'role_type=service_role'
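Since a service_role key grants broad access to a Supabase project, it may be preferable to keep it out of script files. The sketch below reads the credentials from environment variables; `setup_rag_pdf`, `MYPDF_TOKEN`, `SUPABASE_URL`, and `SUPABASE_KEY` are names assumed for this example. It uses curl's `--form-string` option so the text values are sent literally, and it omits `role_type` to rely on the documented default.

```shell
# setup_rag_pdf is a hypothetical helper; MYPDF_TOKEN, SUPABASE_URL and
# SUPABASE_KEY are environment variable names chosen for this sketch.
# Usage: setup_rag_pdf <pdf-file>
setup_rag_pdf() {
  if [ -z "${MYPDF_TOKEN:-}" ] || [ -z "${SUPABASE_URL:-}" ] || [ -z "${SUPABASE_KEY:-}" ]; then
    echo "set MYPDF_TOKEN, SUPABASE_URL and SUPABASE_KEY first" >&2
    return 1
  fi
  # role_type is left out so the documented default (service_role) applies.
  # --form-string sends the value as-is; a leading @ is not read as a file.
  curl --silent --show-error --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
    --header "access_token: $MYPDF_TOKEN" \
    --form "pdf=@\"$1\"" \
    --form-string "supabase_url=$SUPABASE_URL" \
    --form-string "supabase_key=$SUPABASE_KEY"
}
```

Note that the key still appears in curl's argument list while the request runs; the sketch only keeps it out of the script itself.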
Response Format
All responses are returned in JSON with the following structure:
✔️ Success Response (200)
{
  "success": true,
  "data": {
    "text": "Extracted text from document...",
    "pages": 5,
    "metadata": {
      "title": "Document Title",
      "author": "Author",
      "creation_date": "2024-01-15"
    },
    "tables": [
      {
        "page": 1,
        "data": [
          ["Column 1", "Column 2"],
          ["Value 1", "Value 2"]
        ]
      }
    ]
  },
  "processing_time": "2.3s",
  "credits_used": 5
}
⚠️ Error Response (400/401/402/429/500)
{
  "success": false,
  "error": {
    "code": "INVALID_TOKEN",
    "message": "Invalid or expired access token"
  }
}
Error Codes
| HTTP Code | Error Code | Description |
|---|---|---|
| 400 | INVALID_FILE | Invalid or corrupted file |
| 400 | FILE_TOO_LARGE | File exceeds the maximum allowed size |
| 400 | INVALID_URL | URL is invalid or inaccessible |
| 401 | INVALID_TOKEN | Access token is invalid or expired |
| 402 | INSUFFICIENT_CREDITS | Not enough credits to process the request |
| 429 | RATE_LIMIT_EXCEEDED | Rate limit exceeded |
| 500 | PROCESSING_ERROR | Internal processing error |
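A script can branch on the error code from the table above by pulling the `code` field out of the error body. The following is a rough sketch using only `tr` and `sed`; `json_string_field` is a hypothetical helper that handles simple string values without escaped quotes. A real JSON parser such as `jq` is the more robust choice where it is available.

```shell
# json_string_field is a hypothetical helper for this sketch. It reads JSON
# from stdin and prints the value of a string field, or nothing if absent.
# Limitation: values containing escaped quotes are not handled.
# Usage: json_string_field <field-name> < response.json
json_string_field() {
  # Join the input into one line, then capture the text between the quotes
  # that follow "<field-name>":
  tr -d '\n' | sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, `json_string_field code < error.json` prints `INVALID_TOKEN` for the error response shown earlier, assuming it was saved as `error.json`.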
Practical Examples
Invoice Processing
Sample workflow to process an invoice PDF and extract structured information:
# Invoice processing using cURL
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: your_token_here' \
--form 'pdf=@"invoice.pdf"'
# Expected response:
# {
# "success": true,
# "data": {
# "text": "Extracted text from invoice...",
# "tables": [...],
# "pages": 2,
# "metadata": {...}
# },
# "processing_time": "1.2s",
# "credits_used": 2
# }
Batch Processing
Example of how to process multiple documents efficiently:
#!/bin/bash
# Batch processing using cURL in bash
TOKEN="your_token_here"
PDF_DIR="./pdfs_to_process"

echo "Processing PDFs from directory: $PDF_DIR"

# Success and failure counters
SUCCESS_COUNT=0
FAIL_COUNT=0

# Loop through all PDFs
for pdf_file in "$PDF_DIR"/*.pdf; do
  # Skip the unexpanded pattern when the directory contains no PDFs
  if [ -f "$pdf_file" ]; then
    echo "Processing: $(basename "$pdf_file")"

    # Make cURL request; -w appends the HTTP status code to the response body
    response=$(curl -s -w "%{http_code}" --location 'https://pdf.mypdf-ai.com/api/pdf' \
      --header "access_token: $TOKEN" \
      --form "pdf=@\"$pdf_file\"")

    # Split the HTTP code (last 3 characters) from the response body
    http_code="${response: -3}"
    response_body="${response%???}"

    if [ "$http_code" = "200" ]; then
      # Save the extracted data next to the source PDF
      printf '%s\n' "$response_body" > "${pdf_file%.pdf}.json"
      echo "✔️ $(basename "$pdf_file") - processed successfully"
      SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
    else
      echo "⚠️ $(basename "$pdf_file") - HTTP Error: $http_code"
      FAIL_COUNT=$((FAIL_COUNT + 1))
    fi

    # Pause to respect rate limits
    sleep 0.5
  fi
done

echo ""
echo "Summary:"
echo "Total: $((SUCCESS_COUNT + FAIL_COUNT))"
echo "Successes: $SUCCESS_COUNT"
echo "Failures: $FAIL_COUNT"
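A fixed pause may not be enough if the API answers with 429 (RATE_LIMIT_EXCEEDED) partway through a batch. The sketch below retries a single upload with a doubling delay. Treating 429 and 500 as retryable, the limit of four attempts, and the one-second starting delay are all assumptions of this example, since the actual rate limits are not stated here; `upload_with_retry` is a hypothetical helper name.

```shell
# Hypothetical helper. Retrying on 429 and 500, the attempt limit and the
# doubling delay are assumptions of this sketch, not documented API behaviour.
TOKEN="${TOKEN:-your_token_here}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-4}"
INITIAL_DELAY="${INITIAL_DELAY:-1}"

# Usage: upload_with_retry <pdf-file> <output-file>
upload_with_retry() {
  attempt=1
  delay=$INITIAL_DELAY
  while :; do
    # The body goes to the output file; only the status code is captured.
    status=$(curl --silent --location 'https://pdf.mypdf-ai.com/api/pdf' \
      --header "access_token: $TOKEN" \
      --form "pdf=@\"$1\"" \
      --output "$2" --write-out '%{http_code}')
    case "$status" in
      200) return 0 ;;
      429|500) ;;  # possibly transient: continue to the retry logic below
      *) echo "HTTP $status, not retrying" >&2; return 1 ;;
    esac
    if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
      echo "giving up after $attempt attempts (HTTP $status)" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Inside the batch loop, the cURL call could be replaced with something like `upload_with_retry "$pdf_file" "${pdf_file%.pdf}.json"`, counting a non-zero return as a failure.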
Need Help?
Our team is ready to help you integrate the myPDF.AI API.
Technical Support
More Examples