❤️ myPDF.AI API Documentation

Extract data from PDFs, images, and URLs using our REST API. The service processes both digital and scanned documents and returns text, tables, and metadata as JSON.

Complete Postman Collection

Access the full Postman collection with every endpoint, example, and testing script:

View Postman Documentation

Quick Start

Begin extracting data in under five minutes:

1. Retrieve your access token
2. Send your first request
3. Receive structured data

Base URL

https://pdf.mypdf-ai.com

Supported Formats

PDFs: All PDF files, including scanned documents
Images: PNG, JPG, JPEG, GIF, BMP, TIFF
URLs: Direct links to PDFs or images
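The format list above maps onto two upload endpoints, /api/pdf and /api/image, described later in this document. A minimal sketch of picking the endpoint from a file extension follows. The function name endpoint_for_file is hypothetical, and matching on the extension alone is an assumption; the API may inspect the file contents instead.

```shell
# Map a local file name to the endpoint path that accepts it.
# Prints the path and returns 0, or returns 1 for an unsupported extension.
# Hypothetical helper: the extension list mirrors "Supported Formats" above.
endpoint_for_file() {
    # Lower-case the extension so "SCAN.PDF" and "scan.pdf" behave the same.
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        pdf) echo "/api/pdf" ;;
        png|jpg|jpeg|gif|bmp|tiff) echo "/api/image" ;;
        *) return 1 ;;
    esac
}
```

For example, endpoint_for_file invoice.pdf prints /api/pdf, while a .docx file makes the function return a non-zero status so the caller can skip it.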

Authentication

Access Token

Every request must include your access token in the access_token header. You receive this token after purchasing credits.

curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"'
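Rather than pasting the token into every command, it can be read from an environment variable and checked before any request is sent. The sketch below assumes a variable named MYPDF_ACCESS_TOKEN; that name and the require_token helper are illustrative choices, not part of the API.

```shell
# Fail early with a clear message when the token is missing,
# instead of sending a request that the API would reject.
# MYPDF_ACCESS_TOKEN is a hypothetical variable name chosen for this example.
require_token() {
    if [ -z "${MYPDF_ACCESS_TOKEN:-}" ]; then
        echo "MYPDF_ACCESS_TOKEN is not set" >&2
        return 1
    fi
}

# Example call (commented out so that sourcing this file sends nothing):
# require_token && curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
#     --header "access_token: $MYPDF_ACCESS_TOKEN" \
#     --form 'pdf=@"document.pdf"'
```

Note the double quotes around the header value in the example call, which let the shell expand the variable.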

PDF Extraction

POST /api/pdf

Extracts structured data from PDF files, including text, tables, and metadata.

Parameters

Name	Type	Required	Description
pdf	file	Required	PDF file to process
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"'

Image Extraction

POST /api/image

Uses AI-assisted OCR to extract text and data from images.

Parameters

Name	Type	Required	Description
image	file	Required	Image file (PNG, JPG, JPEG, GIF, BMP, TIFF)
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/image' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'image=@"document.png"'

URL Extraction

POST /api/url

Extracts data from files hosted on public URLs. Supports both PDFs and images.

Parameters

Name	Type	Required	Description
url	string	Required	Public URL of the file to process
type	string	Required	File type: "pdf" or "image"
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/url' \
--header 'Content-Type: application/json' \
--header 'access_token: YOUR_TOKEN_HERE' \
--data '{
    "url": "https://example.com/document.pdf",
    "type": "pdf"
}'
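When the URL comes from a variable, building the JSON body by hand is easy to get wrong. The following sketch validates the type value against the two documented options and escapes the URL before embedding it. The helper name build_url_payload is hypothetical, and the escaping covers only backslashes and double quotes, which is an assumption about what typical URLs contain.

```shell
# Print the JSON body for POST /api/url, or return 1 for an unknown type.
# Hypothetical helper; $1 is the public URL, $2 is "pdf" or "image".
build_url_payload() {
    case "$2" in
        pdf|image) ;;
        *) echo 'type must be "pdf" or "image"' >&2; return 1 ;;
    esac
    # Escape backslashes first, then double quotes, so the URL stays a valid JSON string.
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"url": "%s", "type": "%s"}\n' "$escaped" "$2"
}

# Example call (commented out so that sourcing this file sends nothing):
# curl --location 'https://pdf.mypdf-ai.com/api/url' \
#     --header 'Content-Type: application/json' \
#     --header 'access_token: YOUR_TOKEN_HERE' \
#     --data "$(build_url_payload 'https://example.com/document.pdf' pdf)"
```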

OCR Parse JSON

POST /api/parse-json

Processes PDFs with OCR and returns structured JSON with metadata, coordinates, and detected tables.

Parameters

Name	Type	Required	Description
file	file	Required	PDF file to process with positional data
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/parse-json' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'file=@"document.pdf"'

OCR Parse Markdown

POST /api/parse-markdown

Converts PDFs into formatted Markdown and detailed JSON with layout analysis.

Parameters

Name	Type	Required	Description
file	file	Required	PDF file used to generate Markdown and enriched JSON
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/parse-markdown' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'file=@"document.pdf"'

RAG Setup PDF

POST /api/setup-pdf

Processes a PDF with OCR and sends the extracted data to configure a RAG knowledge base in the customer's Supabase project.

Parameters

Name	Type	Required	Description
pdf	file	Required	PDF document that will be processed and forwarded to the RAG flow
supabase_url	string	Required	Destination Supabase project URL
supabase_key	string	Required	Supabase key with the necessary permissions (service_role recommended)
role_type	string	Optional	Defines the type of key provided (default: service_role)
access_token	string	Required	Authentication token provided in the header

curl --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"document.pdf"' \
--form 'supabase_url=https://your-project.supabase.co' \
--form 'supabase_key=SERVICE_ROLE_KEY' \
--form 'role_type=service_role'
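Because a service_role key grants broad access to a Supabase project, it may be preferable to keep it out of scripts and shell history. The sketch below wraps the same request in a function that reads its secrets from the environment. The names setup_pdf, MYPDF_ACCESS_TOKEN, SUPABASE_URL, SUPABASE_KEY, ROLE_TYPE, and DRY_RUN are all hypothetical conventions for this example; only the endpoint and form fields come from the documentation above.

```shell
# Hypothetical wrapper around POST /api/setup-pdf; $1 is the PDF path.
# Secrets come from the environment. With DRY_RUN=1 the function prints
# the request summary (key masked) and sends nothing.
setup_pdf() {
    missing=""
    [ -n "${MYPDF_ACCESS_TOKEN:-}" ] || missing="$missing MYPDF_ACCESS_TOKEN"
    [ -n "${SUPABASE_URL:-}" ] || missing="$missing SUPABASE_URL"
    [ -n "${SUPABASE_KEY:-}" ] || missing="$missing SUPABASE_KEY"
    if [ -n "$missing" ]; then
        echo "Missing environment variables:$missing" >&2
        return 1
    fi
    role="${ROLE_TYPE:-service_role}"   # documented default
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "POST https://pdf.mypdf-ai.com/api/setup-pdf"
        echo "pdf=@$1 supabase_url=$SUPABASE_URL supabase_key=*** role_type=$role"
        return 0
    fi
    curl --location 'https://pdf.mypdf-ai.com/api/setup-pdf' \
        --header "access_token: $MYPDF_ACCESS_TOKEN" \
        --form "pdf=@\"$1\"" \
        --form "supabase_url=$SUPABASE_URL" \
        --form "supabase_key=$SUPABASE_KEY" \
        --form "role_type=$role"
}
```

Running it once with DRY_RUN=1 is a cheap way to confirm the destination project before any document is forwarded.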

Response Format

All responses are returned in JSON with the following structure:

✔️ Success Response (200)

{
    "success": true,
    "data": {
        "text": "Extracted text from document...",
        "pages": 5,
        "metadata": {
            "title": "Document Title",
            "author": "Author",
            "creation_date": "2024-01-15"
        },
        "tables": [
            {
                "page": 1,
                "data": [
                    ["Column 1", "Column 2"],
                    ["Value 1", "Value 2"]
                ]
            }
        ]
    },
    "processing_time": "2.3s",
    "credits_used": 5
}
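A script usually needs only a couple of fields from this response, such as success and credits_used. The sketch below pulls them out with grep and sed, assuming the response keeps the field layout shown above. The helper names response_ok and numeric_field are hypothetical, and text matching like this is fragile; a JSON-aware tool such as jq is the more robust choice where it is available.

```shell
# Return 0 when the body contains "success": true.
response_ok() {
    printf '%s\n' "$1" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Print the integer value of the first numeric field named $2,
# for example credits_used or pages. Prints nothing if the field is absent.
numeric_field() {
    printf '%s\n' "$1" |
        sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}
```

With the response body stored in a variable named body, numeric_field "$body" credits_used prints the number of credits the request consumed.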

⚠️ Error Response (400/401/402/429/500)

{
    "success": false,
    "error": {
        "code": "INVALID_TOKEN",
        "message": "Invalid or expired access token"
    }
}

Error Codes

HTTP Code	Error Code	Description
400	INVALID_FILE	Invalid or corrupted file
400	FILE_TOO_LARGE	File exceeds the maximum allowed size
400	INVALID_URL	URL is invalid or inaccessible
401	INVALID_TOKEN	Access token is invalid or expired
402	INSUFFICIENT_CREDITS	Not enough credits to process the request
429	RATE_LIMIT_EXCEEDED	Rate limit exceeded
500	PROCESSING_ERROR	Internal processing error
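The table above can be turned into a simple decision in client scripts: some statuses call for a retry, others for a change on the caller's side. The grouping below is an assumption made for illustration (for instance, treating 500 as retryable); the API documentation does not prescribe it, and next_step_for_status is a hypothetical helper name.

```shell
# Suggest a next step for an HTTP status code from the error table.
# The grouping is an assumption for illustration, not documented API behavior.
next_step_for_status() {
    case "$1" in
        200) echo "ok" ;;
        429|500) echo "retry" ;;        # rate limit or internal error: may succeed later
        401) echo "check-token" ;;      # INVALID_TOKEN
        402) echo "add-credits" ;;      # INSUFFICIENT_CREDITS
        400) echo "fix-input" ;;        # INVALID_FILE, FILE_TOO_LARGE, INVALID_URL
        *) echo "unknown" ;;
    esac
}
```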

Practical Examples

Invoice Processing

Sample workflow to process an invoice PDF and extract structured information:

# Invoice processing using cURL
curl --location 'https://pdf.mypdf-ai.com/api/pdf' \
--header 'access_token: YOUR_TOKEN_HERE' \
--form 'pdf=@"invoice.pdf"'

# Expected response:
# {
#   "success": true,
#   "data": {
#     "text": "Extracted text from invoice...",
#     "tables": [...],
#     "pages": 2,
#     "metadata": {...}
#   },
#   "processing_time": "1.2s",
#   "credits_used": 2
# }

Batch Processing

Example of how to process multiple documents efficiently:

#!/bin/bash
# Batch processing using cURL in bash

TOKEN="your_token_here"
PDF_DIR="./pdfs_to_process"

echo "Processing PDFs from directory: $PDF_DIR"

# Success and failure counters
SUCCESS_COUNT=0
FAIL_COUNT=0

# Loop through all PDFs
for pdf_file in "$PDF_DIR"/*.pdf; do
    if [ -f "$pdf_file" ]; then
        echo "Processing: $(basename "$pdf_file")"
        
        # Make cURL request
        response=$(curl -s -w "%{http_code}" --location 'https://pdf.mypdf-ai.com/api/pdf' \
            --header "access_token: $TOKEN" \
            --form "pdf=@\"$pdf_file\"")
        
        # Split the output: -w appended the 3-digit HTTP status code after the body
        http_code="${response: -3}"
        response_body="${response%???}"
        
        if [ "$http_code" -eq 200 ]; then
            echo "✔️ $(basename "$pdf_file") - processed successfully"
            SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
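        else
            echo "⚠️ $(basename "$pdf_file") - HTTP Error: $http_code"
            # Show the API's error body to make the failure easier to diagnose
            echo "   Response: $response_body"
            FAIL_COUNT=$((FAIL_COUNT + 1))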
        fi
        
        # Pause to respect rate limits
        sleep 0.5
    fi
done

echo ""
echo "Summary:"
echo "Total: $((SUCCESS_COUNT + FAIL_COUNT))"
echo "Successes: $SUCCESS_COUNT"
echo "Failures: $FAIL_COUNT"
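The script above pauses a fixed half second between files and counts any non-200 response as a failure. One possible refinement is to retry a failed upload with an exponentially growing delay, which can help when the failure is a 429 rate limit. The sketch below is one way to do that; the function names, the 1-second base, and the 30-second cap are assumptions, since the documentation does not state the actual rate limits.

```shell
# Delay in seconds before retry number $1 (1-based): 1, 2, 4, 8, ... capped at 30.
# Base and cap are illustrative values, not documented API limits.
backoff_delay() {
    delay=1
    i=1
    while [ "$i" -lt "$1" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt 30 ]; then
        delay=30
    fi
    echo "$delay"
}

# Run a command up to $1 times, sleeping between failed attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    max=$1
    shift
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        sleep "$(backoff_delay "$attempt")"
        attempt=$((attempt + 1))
    done
}
```

To use it, wrap the cURL call in a function that returns a non-zero status for retryable responses (a hypothetical upload_pdf, say) and call retry 4 upload_pdf "$pdf_file" inside the loop.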

Need Help?

Our team is ready to help you integrate the myPDF.AI API.
