myPDF.AI API Documentation

Extract data from PDFs, images, and URLs through a REST API. The service processes both digital and scanned documents and returns structured JSON.

Complete Postman Collection

The full Postman collection, with every endpoint, example, and testing script, is available in the Postman documentation.

Quick Start

Begin extracting data in under five minutes:

  1. Retrieve your access token
  2. Send your first request
  3. Receive structured data

Base URL

https://pdf.mypdf-ai.com

Supported Formats

  • PDFs: All PDF files, including scanned documents
  • Images: PNG, JPG, JPEG, GIF, BMP, TIFF
  • URLs: Direct links to PDFs or images

Authentication

Access Token

Every request must include your access token in the access_token header. You receive this token after purchasing credits.

curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"'
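To keep the token out of source code, a client could read it from the environment and build the header once. The following is a minimal Python sketch under that assumption; the variable name MYPDF_ACCESS_TOKEN and the helper auth_headers are illustrative and not part of the API.

```python
import os


def auth_headers(token=None):
    """Return the header dict the API expects for authentication.

    MYPDF_ACCESS_TOKEN is a hypothetical environment variable name chosen
    for this example; the API itself only defines the access_token header.
    """
    token = token or os.environ.get("MYPDF_ACCESS_TOKEN", "")
    token = token.strip()  # drop stray whitespace or a trailing newline
    if not token:
        raise ValueError("no access token provided")
    return {"access_token": token}
```

The returned dict can be passed as the headers argument of whichever HTTP client you use.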

PDF Extraction

POST /api/pdf

Extracts structured data from PDF files, including text, tables, and metadata.

Parameters

Name          Type    Required  Description
pdf           file    Yes       PDF file to process
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"'
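For clients that cannot shell out to cURL, the same upload can be built with the Python standard library. The sketch below encodes the file as multipart/form-data, which is what curl --form does, and builds the request without sending it. The helper names encode_multipart and build_pdf_request are illustrative, and the code assumes the filename contains no double quotes.

```python
import urllib.request
import uuid

BASE_URL = "https://pdf.mypdf-ai.com"


def encode_multipart(field_name, filename, content, content_type):
    """Encode a single file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex  # random separator, unlikely to occur in the file
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_pdf_request(pdf_bytes, filename, token):
    """Build (but do not send) a POST /api/pdf request."""
    body, content_type = encode_multipart(
        "pdf", filename, pdf_bytes, "application/pdf"
    )
    return urllib.request.Request(
        BASE_URL + "/api/pdf",
        data=body,
        method="POST",
        headers={"access_token": token, "Content-Type": content_type},
    )
```

To send it, pass the request to urllib.request.urlopen and decode the response body as JSON.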

Image Extraction

POST /api/image

Uses OCR to extract text and data from images.

Parameters

Name          Type    Required  Description
image         file    Yes       Image file (PNG, JPG, JPEG, GIF, BMP, TIFF)
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/image' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'image=@"document.png"'
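Because PDFs and images go to different endpoints with different form field names, a client handling mixed files needs to choose between them. One possible approach, sketched in Python, is to decide by file extension. The helper endpoint_for is hypothetical, and the extension set mirrors only the formats listed under Supported Formats.

```python
from pathlib import PurePosixPath

# Image extensions taken from the Supported Formats list.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff"}


def endpoint_for(filename):
    """Return (endpoint_path, form_field_name) for a local file."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix == ".pdf":
        return "/api/pdf", "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "/api/image", "image"
    raise ValueError(f"unsupported file type: {suffix or filename}")
```

Extension checks are only a heuristic; the API may still reject a mislabeled file with INVALID_FILE.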

URL Extraction

POST /api/url

Extracts data from files hosted on public URLs. Supports both PDFs and images.

Parameters

Name          Type    Required  Description
url           string  Yes       Public URL of the file to process
type          string  Yes       File type: "pdf" or "image"
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/url' \
--header 'Content-Type: application/json' \
--header 'access_token: YOUR_TOKEN_HERE' \
--data '{
    "url": "https://example.com/document.pdf",
    "type": "pdf"
}'
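Unlike the upload endpoints, this one takes a JSON body. The Python sketch below builds the request and validates the two parameters on the client before any credits are spent. The helper name build_url_request and the client-side checks are assumptions made for illustration; the server performs its own validation.

```python
import json
import urllib.request
from urllib.parse import urlparse

BASE_URL = "https://pdf.mypdf-ai.com"


def build_url_request(file_url, file_type, token):
    """Build (but do not send) a POST /api/url request with a JSON body."""
    if file_type not in ("pdf", "image"):
        raise ValueError('type must be "pdf" or "image"')
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be an absolute http(s) URL")
    payload = json.dumps({"url": file_url, "type": file_type}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/url",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "access_token": token},
    )
```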

OCR Parse JSON

POST /api/parse-json

Processes PDFs with OCR and returns structured JSON with metadata, coordinates, and detected tables.

Parameters

Name          Type    Required  Description
file          file    Yes       PDF file to process with positional data
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/parse-json' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'file=@"document.pdf"'

OCR Parse Markdown

POST /api/parse-markdown

Converts PDFs into formatted Markdown and detailed JSON with layout analysis.

Parameters

Name          Type    Required  Description
file          file    Yes       PDF file used to generate Markdown and enriched JSON
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/parse-markdown' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'file=@"document.pdf"'

RAG Setup PDF

POST /api/setup-pdf

Processes a PDF with OCR and sends the extracted data to configure a RAG knowledge base in the customer's Supabase project.

Parameters

Name          Type    Required  Description
pdf           file    Yes       PDF document that will be processed and forwarded to the RAG flow
supabase_url  string  Yes       Destination Supabase project URL
supabase_key  string  Yes       Supabase key with the necessary permissions (service_role recommended)
role_type     string  No        Defines the type of key provided (default: service_role)
access_token  string  Yes       Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"' \
--form 'supabase_url=https://your-project.supabase.co' \
--form 'supabase_key=SERVICE_ROLE_KEY' \
--form 'role_type=service_role'
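A service_role key grants broad access to a Supabase project, so it is worth loading it from the environment rather than hard-coding it. The Python sketch below assembles the text form fields for this endpoint; the PDF itself is attached separately as the pdf file field. The helper setup_pdf_fields, the variable name SUPABASE_SERVICE_ROLE_KEY, and the https:// check are assumptions for illustration only.

```python
import os


def setup_pdf_fields(supabase_url, supabase_key=None, role_type="service_role"):
    """Return the non-file form fields for POST /api/setup-pdf."""
    # SUPABASE_SERVICE_ROLE_KEY is a hypothetical variable name for this example.
    key = supabase_key or os.environ.get("SUPABASE_SERVICE_ROLE_KEY", "")
    if not supabase_url.startswith("https://"):
        raise ValueError("supabase_url should be an https:// project URL")
    if not key:
        raise ValueError("supabase_key is required")
    return {
        "supabase_url": supabase_url,
        "supabase_key": key,
        "role_type": role_type,  # documented default is service_role
    }
```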

Response Format

All responses are returned in JSON with the following structure:

✔️ Success Response (200)
{
    "success": true,
    "data": {
        "text": "Extracted text from document...",
        "pages": 5,
        "metadata": {
            "title": "Document Title",
            "author": "Author",
            "creation_date": "2024-01-15"
        },
        "tables": [
            {
                "page": 1,
                "data": [
                    ["Column 1", "Column 2"],
                    ["Value 1", "Value 2"]
                ]
            }
        ]
    },
    "processing_time": "2.3s",
    "credits_used": 5
}

⚠️ Error Response (400/401/402/429/500)
{
    "success": false,
    "error": {
        "code": "INVALID_TOKEN",
        "message": "Invalid or expired access token"
    }
}
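Since both shapes share the success flag, a client can unwrap the envelope in one place. The Python sketch below returns the data object on success and raises an exception carrying the error code otherwise. The names ApiError and unwrap are illustrative, and the fallback code UNKNOWN is an assumption for bodies that lack an error object.

```python
import json


class ApiError(Exception):
    """Raised when the response envelope reports success: false."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(response_text):
    """Return the "data" object of a successful response, or raise ApiError."""
    body = json.loads(response_text)
    if body.get("success"):
        return body["data"]
    error = body.get("error", {})
    raise ApiError(error.get("code", "UNKNOWN"), error.get("message", ""))
```

Callers can then branch on exc.code, for example to refresh credentials on INVALID_TOKEN.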

Error Codes

HTTP Code  Error Code            Description
400        INVALID_FILE          Invalid or corrupted file
400        FILE_TOO_LARGE        File exceeds the maximum allowed size
400        INVALID_URL           URL is invalid or inaccessible
401        INVALID_TOKEN         Access token is invalid or expired
402        INSUFFICIENT_CREDITS  Not enough credits to process the request
429        RATE_LIMIT_EXCEEDED   Rate limit exceeded
500        PROCESSING_ERROR      Internal processing error
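Not every error in the table is worth retrying. A reasonable client policy, sketched below in Python, retries only 429 and 500 with exponential backoff and treats the other codes as permanent until the request or account changes. Which codes are transient, and the delay values, are assumptions of this example rather than documented API behavior.

```python
# Assumption: rate-limit and internal errors may succeed on a later attempt;
# 400, 401, and 402 will not succeed until the request or account is fixed.
RETRYABLE_STATUS = {429, 500}


def retry_delay(http_status, attempt, base=0.5, cap=30.0):
    """Return seconds to wait before retry number `attempt` (1-based).

    Returns None when the status should not be retried.
    """
    if http_status not in RETRYABLE_STATUS:
        return None
    # Exponential backoff: base, 2*base, 4*base, ... limited to `cap` seconds.
    return min(cap, base * 2 ** (attempt - 1))
```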

Practical Examples

Invoice Processing

Sample workflow to process an invoice PDF and extract structured information:

# Invoice processing using cURL
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: your_token_here' \
--form 'pdf=@"invoice.pdf"'

# Expected response:
# {
#   "success": true,
#   "data": {
#     "text": "Extracted text from invoice...",
#     "tables": [...],
#     "pages": 2,
#     "metadata": {...}
#   },
#   "processing_time": "1.2s",
#   "credits_used": 2
# }
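For invoices, the tables array usually holds the line items. The Python sketch below turns one table from the response into a list of dictionaries keyed by column name. It assumes the first row of each table is a header row, as in the sample success response; the helper table_to_records is illustrative.

```python
def table_to_records(table):
    """Convert {"page": n, "data": [[header...], [row...], ...]} to dicts.

    Assumes the first row contains the column names.
    """
    rows = table.get("data", [])
    if not rows:
        return []
    header, *body = rows
    # zip stops at the shorter sequence, so ragged rows lose extra cells.
    return [dict(zip(header, row)) for row in body]
```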

Batch Processing

Example of how to process multiple documents efficiently:

#!/bin/bash
# Batch processing using cURL in bash

TOKEN="your_token_here"
PDF_DIR="./pdfs_to_process"

echo "Processing PDFs from directory: $PDF_DIR"

# Success and failure counters
SUCCESS_COUNT=0
FAIL_COUNT=0

# Loop through all PDFs
for pdf_file in "$PDF_DIR"/*.pdf; do
    if [ -f "$pdf_file" ]; then
        echo "Processing: $(basename "$pdf_file")"
        
        # Make cURL request
        response=$(curl -s -w "%{http_code}" --location 'https://pdf.mypdf-ai.com/api/pdf' \
            --header "access_token: $TOKEN" \
            --form "pdf=@\"$pdf_file\"")
        
        # Extract HTTP code (last 3 characters)
        http_code="${response: -3}"
        response_body="${response%???}"
        
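        if [ "$http_code" = "200" ]; then
            echo "✔️ $(basename "$pdf_file") - processed successfully"
            # Keep the JSON response next to the source file
            printf '%s\n' "$response_body" > "${pdf_file%.pdf}.json"
            SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
        else
            echo "⚠️ $(basename "$pdf_file") - HTTP Error: $http_code"
            # Show the error body returned by the API, if any
            echo "   $response_body"
            FAIL_COUNT=$((FAIL_COUNT + 1))
        fi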
        
        # Pause to respect rate limits
        sleep 0.5
    fi
done

echo ""
echo "Summary:"
echo "Total: $((SUCCESS_COUNT + FAIL_COUNT))"
echo "Successes: $SUCCESS_COUNT"
echo "Failures: $FAIL_COUNT"
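A batch summary can also report credit consumption, since each successful response carries a credits_used field. The Python sketch below tallies a list of results; the summarize helper and the (status, body) tuple shape are assumptions made for this example.

```python
def summarize(results):
    """Tally a batch run.

    `results` is a list of (http_status, parsed_json_body_or_None) tuples.
    """
    summary = {"total": len(results), "successes": 0, "failures": 0, "credits_used": 0}
    for status, body in results:
        if status == 200 and body and body.get("success"):
            summary["successes"] += 1
            summary["credits_used"] += body.get("credits_used", 0)
        else:
            summary["failures"] += 1
    return summary
```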

Need Help?

Our team is ready to help you integrate the myPDF.AI API.
