myPDF.AI API Documentation
Extract data from PDFs, images, and URLs using our powerful REST API. Our artificial intelligence processes documents of any complexity with unmatched accuracy.
Complete Postman Collection
Access the full Postman collection with every endpoint, example, and testing script:
View Postman Documentation
Quick Start
Begin extracting data in under five minutes:
- Retrieve your access token
- Send your first request
- Receive structured data
Base URL
https://pdf.mypdf-ai.com
Supported Formats
- PDFs: All PDF types, including scanned documents
- Images: PNG, JPG, JPEG, GIF, BMP, TIFF
- URLs: Direct links to PDFs or images
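The format list above maps naturally onto the three extraction endpoints described later in this document. The following is a minimal sketch of that mapping; `endpoint_for` is a hypothetical helper name, and the extension list simply mirrors the formats listed above rather than anything the API itself exposes.

```shell
# Hypothetical helper: choose an endpoint path from a file name or URL.
# The extension list mirrors the "Supported Formats" section; it is an
# illustration, not something provided by the API.
endpoint_for() {
  # Lower-case the input so "SCAN.PDF" and "scan.pdf" are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    http://*|https://*) echo "/api/url" ;;
    *.pdf) echo "/api/pdf" ;;
    *.png|*.jpg|*.jpeg|*.gif|*.bmp|*.tiff) echo "/api/image" ;;
    *) echo "unsupported"; return 1 ;;
  esac
}

endpoint_for "invoice.pdf"   # prints /api/pdf
endpoint_for "scan.TIFF"     # prints /api/image
```

Checking the URL prefix first means a link ending in `.pdf` is still routed to `/api/url` rather than treated as a local file.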
Authentication
Access Token
Every request must include your access token in the access_token header. You receive this token after purchasing credits.
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"'
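To avoid pasting the token into every command, one option is to read it from an environment variable. The sketch below assumes a variable named `MYPDF_TOKEN` and a helper named `auth_header`; both names are illustrative and are not defined by the API.

```shell
# Hypothetical helper: build the authentication header from an environment
# variable. MYPDF_TOKEN is an assumed name, not one defined by the API.
auth_header() {
  if [ -z "${MYPDF_TOKEN:-}" ]; then
    echo "MYPDF_TOKEN is not set" >&2
    return 1
  fi
  printf 'access_token: %s\n' "$MYPDF_TOKEN"
}

# Usage (commented out because it performs a real request):
# curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
#   --header "$(auth_header)" \
#   --form 'pdf=@"document.pdf"'
```

Failing early when the variable is empty gives a clearer message than waiting for the server to reject the request.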
PDF Extraction
/api/pdf
Extracts structured data from PDF files, including text, tables, and metadata.
Parameters
Name | Type | Required | Description |
---|---|---|---|
pdf | file | Required | PDF file to process |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"'
Image Extraction
/api/image
Uses AI-assisted OCR to extract text and data from images.
Parameters
Name | Type | Required | Description |
---|---|---|---|
image | file | Required | Image file (PNG, JPG, JPEG, GIF, BMP, TIFF) |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/image' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'image=@"document.png"'
URL Extraction
/api/url
Extracts data from files hosted on public URLs. Supports both PDFs and images.
Parameters
Name | Type | Required | Description |
---|---|---|---|
url | string | Required | Public URL of the file to process |
type | string | Required | File type: "pdf" or "image" |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/url' \
  --header 'Content-Type: application/json' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --data '{
    "url": "https://example.com/document.pdf",
    "type": "pdf"
  }'
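When the URL comes from a variable rather than a literal, the JSON body has to be assembled with some care. The sketch below shows one way to do that; `url_payload` is a hypothetical helper. It validates the `type` value against the two options in the parameter table and escapes only backslashes and double quotes, so it is not a general JSON encoder.

```shell
# Hypothetical helper: build the JSON body for /api/url.
# Escapes backslashes and double quotes so the URL cannot break the JSON.
# Control characters are not handled; this is a sketch, not a full encoder.
url_payload() {
  case "$2" in
    pdf|image) ;;
    *) echo 'type must be "pdf" or "image"' >&2; return 1 ;;
  esac
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{ "url": "%s", "type": "%s" }\n' "$escaped" "$2"
}

# Usage (commented out because it performs a real request):
# curl --location 'https://pdf.mypdf-ai.com/api/url' \
#   --header 'Content-Type: application/json' \
#   --header 'access_token: YOUR_TOKEN_HERE' \
#   --data "$(url_payload 'https://example.com/document.pdf' pdf)"
```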
OCR Parse JSON
/api/parse-json
Processes PDFs with OCR and returns structured JSON with metadata, coordinates, and detected tables.
Parameters
Name | Type | Required | Description |
---|---|---|---|
file | file | Required | PDF file to process with positional data |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/parse-json' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'file=@"document.pdf"'
OCR Parse Markdown
/api/parse-markdown
Converts PDFs into formatted Markdown and detailed JSON with layout analysis.
Parameters
Name | Type | Required | Description |
---|---|---|---|
file | file | Required | PDF file used to generate Markdown and enriched JSON |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/parse-markdown' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'file=@"document.pdf"'
RAG Setup PDF
/api/setup-pdf
Processes a PDF with OCR and sends the extracted data to configure a RAG knowledge base in the customer's Supabase project.
Parameters
Name | Type | Required | Description |
---|---|---|---|
pdf | file | Required | PDF document that will be processed and forwarded to the RAG flow |
supabase_url | string | Required | Destination Supabase project URL |
supabase_key | string | Required | Supabase key with the necessary permissions (service_role recommended) |
role_type | string | Optional | Defines the type of key provided (default: service_role) |
access_token | string | Required | Authentication token provided in the header |
curl --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
  --header 'access_token: YOUR_TOKEN_HERE' \
  --form 'pdf=@"document.pdf"' \
  --form 'supabase_url=https://your-project.supabase.co' \
  --form 'supabase_key=SERVICE_ROLE_KEY' \
  --form 'role_type=service_role'
Response Format
All responses are returned in JSON with the following structure:
✔️ Success Response (200)
{
  "success": true,
  "data": {
    "text": "Extracted text from document...",
    "pages": 5,
    "metadata": {
      "title": "Document Title",
      "author": "Author",
      "creation_date": "2024-01-15"
    },
    "tables": [
      {
        "page": 1,
        "data": [
          ["Column 1", "Column 2"],
          ["Value 1", "Value 2"]
        ]
      }
    ]
  },
  "processing_time": "2.3s",
  "credits_used": 5
}
⚠️ Error Response (400/401/402/429/500)
{
  "success": false,
  "error": {
    "code": "INVALID_TOKEN",
    "message": "Invalid or expired access token"
  }
}
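A script usually needs to know only whether a call succeeded and, if not, which error code came back. The sketch below reads those two values from the response shapes shown above using pattern matching; `is_success` and `error_code` are hypothetical helper names. This is not a JSON parser: extracted text that happens to contain the same pattern can fool it, so a real JSON tool such as jq is the more robust choice where available.

```shell
# Hypothetical helper: succeed when a response body reports "success": true.
# This is a pattern match, not a JSON parser; it only looks for the flag.
is_success() {
  printf '%s' "$1" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: print the error code from an error response, if any.
error_code() {
  printf '%s' "$1" |
    sed -n 's/.*"code"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p'
}

# Usage with a response body captured from curl:
# body=$(curl -s ... )
# if is_success "$body"; then echo "done"; else error_code "$body"; fi
```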
Error Codes
HTTP Code | Error Code | Description |
---|---|---|
400 | INVALID_FILE | Invalid or corrupted file |
400 | FILE_TOO_LARGE | File exceeds the maximum allowed size |
400 | INVALID_URL | URL is invalid or inaccessible |
401 | INVALID_TOKEN | Access token is invalid or expired |
402 | INSUFFICIENT_CREDITS | Not enough credits to process the request |
429 | RATE_LIMIT_EXCEEDED | Rate limit exceeded |
500 | PROCESSING_ERROR | Internal processing error |
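One way to act on the table above is to sort status codes into "done", "worth retrying", and "give up". The sketch below does that with a hypothetical `action_for_status` helper. Treating 429 and 500 as retryable is an assumption of this example, not documented API behavior; the remaining codes are treated as failures because they point to a problem with the request, token, or credits.

```shell
# Hypothetical policy: map an HTTP status from the error table to an action.
# Prints "ok", "retry", or "fail". Retrying on 429 and 500 is an assumption
# of this sketch, not something the API documentation states.
action_for_status() {
  case "$1" in
    200) echo "ok" ;;
    429|500) echo "retry" ;;
    *) echo "fail" ;;
  esac
}

action_for_status 200   # prints ok
action_for_status 429   # prints retry
action_for_status 402   # prints fail
```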
Practical Examples
Invoice Processing
Sample workflow to process an invoice PDF and extract structured information:
# Invoice processing using cURL
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
  --header 'access_token: your_token_here' \
  --form 'pdf=@"invoice.pdf"'

# Expected response:
# {
#   "success": true,
#   "data": {
#     "text": "Extracted text from invoice...",
#     "tables": [...],
#     "pages": 2,
#     "metadata": {...}
#   },
#   "processing_time": "1.2s",
#   "credits_used": 2
# }
Batch Processing
Example of how to process multiple documents efficiently:
#!/bin/bash
# Batch processing using cURL in bash

TOKEN="your_token_here"
PDF_DIR="./pdfs_to_process"

echo "Processing PDFs from directory: $PDF_DIR"

# Success and failure counters
SUCCESS_COUNT=0
FAIL_COUNT=0

# Loop through all PDFs
for pdf_file in "$PDF_DIR"/*.pdf; do
  if [ -f "$pdf_file" ]; then
    echo "Processing: $(basename "$pdf_file")"

    # Make cURL request; -w appends the HTTP status code to the body
    response=$(curl -s -w "%{http_code}" --location 'https://pdf.mypdf-ai.com/api/pdf' \
      --header "access_token: $TOKEN" \
      --form "pdf=@\"$pdf_file\"")

    # Extract HTTP code (last 3 characters)
    http_code="${response: -3}"
    response_body="${response%???}"

    if [ "$http_code" -eq 200 ]; then
      echo "✔️ $(basename "$pdf_file") - processed successfully"
      SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
    else
      echo "⚠️ $(basename "$pdf_file") - HTTP Error: $http_code"
      FAIL_COUNT=$((FAIL_COUNT + 1))
    fi

    # Pause to respect rate limits
    sleep 0.5
  fi
done

echo ""
echo "Summary:"
echo "Total: $((SUCCESS_COUNT + FAIL_COUNT))"
echo "Successes: $SUCCESS_COUNT"
echo "Failures: $FAIL_COUNT"
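A fixed pause between requests may not be enough if the API answers with RATE_LIMIT_EXCEEDED. A common complement is to retry a failed call with a pause that doubles each time. The sketch below shows that pattern; `retry` and the `RETRY_BASE_DELAY` variable are hypothetical names, and the delays are illustrative because this document does not state the actual rate limits.

```shell
# Hypothetical helper: run a command up to MAX_TRIES times, doubling the
# pause after each failure (1s, 2s, 4s, ... by default).
# Usage: retry MAX_TRIES command [args...]
retry() {
  max_tries=$1
  shift
  delay=${RETRY_BASE_DELAY:-1}   # assumed starting delay in seconds
  try=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$try" -ge "$max_tries" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    try=$((try + 1))
  done
}

# Usage, where upload_pdf is a hypothetical function that wraps the curl
# call and returns non-zero when the HTTP status is 429 or 500:
# retry 4 upload_pdf "invoice.pdf"
```

The wrapped command decides what counts as a failure, so the same helper works for any endpoint in this document.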
Need Help?
Our team is ready to help you integrate the myPDF.AI API.
- Technical Support
- More Examples