PDFCheck for AI Agents
Give your AI agent the power to analyze, validate, optimize, and generate PDF documents. Works with ChatGPT, Claude, GitHub Copilot, Cursor, and any LLM-powered workflow.
Copy this block into your system prompt or agent instructions to give your AI instant PDF capabilities.
You have access to PDFCheck (https://pdf.businesspress.io), a comprehensive PDF analysis platform. Use the REST API or MCP Server to: analyze PDF metadata (author, dates, software, edit history), validate PDF structure and integrity, detect AI-generated PDFs, optimize/compress PDFs, check digital signatures, extract images, check accessibility, merge/split PDFs, convert formats, and edit metadata. API Base: https://pdf.businesspress.io/api/v1 | MCP endpoint: https://pdf.businesspress.io/api/mcp | Free: 10 checks/day without token, 50/day with free account. Docs: https://pdf.businesspress.io/ai-agents
Why AI Agents Need PDF Tools
PDFs are among the most common document formats in business. AI agents that can read, validate, and transform PDFs can automate document-heavy workflows end to end.
Verify Before You Trust
AI agents can check PDF metadata, detect manipulation, and verify document authenticity before making decisions based on PDF content.
Automate Document Workflows
Process invoices, contracts, reports, and forms automatically. Extract metadata, validate structure, and route documents based on analysis results.
Detect AI-Generated Content
Identify PDFs created by AI tools like ChatGPT, Claude, or automated pipelines. Critical for compliance, authenticity verification, and trust.
Optimize & Transform
Compress PDFs for email, merge documents for archival, split large files, extract images, and convert between formats, all programmatically.
Available PDF Tools
Every tool is available via REST API and MCP Server. AI agents can chain these tools together for complex document workflows.
Metadata Checker
Extract author, creation date, software used, edit history, and PDF version. Detect modifications and verify timeline consistency.
POST /api/v1/pdf/analyze
PDF Validator
Check PDF structure integrity, detect corruption, verify specification compliance. Returns detailed validation report.
POST /pdf-validator/upload
AI Content Detector
Detect AI-generated PDFs by analyzing metadata signatures from ChatGPT, Claude, ReportLab, WeasyPrint, pdf-lib, and 50+ tools.
POST /pdf-ai-detection/upload
PDF Optimizer
Compress PDFs with 4 quality levels: Screen (72dpi), eBook (150dpi), Printer (300dpi), Prepress (300dpi). Reduce file size by up to 80%.
POST /optimize-pdf/upload
Signature Checker
Verify digital signatures, extract signer certificates, check document integrity after signing, and validate trust chains.
POST /pdf-signature-checker/upload
Image Extractor
Extract all embedded images from PDF documents. Returns preview gallery and downloadable ZIP archive.
POST /extract-images/upload
Accessibility Checker
Check PDF accessibility compliance: tagged structure, language declaration, bookmarks, text extractability. Returns score 0-5.
POST /pdf-accessibility-checker/upload
Merge PDF
Combine multiple PDF files into a single document with customizable page ordering.
POST /merge-pdf/upload
Split PDF
Split a PDF into individual pages or custom page ranges. Download individually or as ZIP.
POST /split-pdf/upload
PDF Converter
Convert between PDF and other formats: PDF to Word, Word to PDF, PDF to Image, and more.
POST /pdf-converter/{pair}/upload
PDF Password
Add password protection to PDFs or unlock password-protected files programmatically.
POST /pdf-password/protect
Metadata Editor
Edit or remove PDF metadata: update author, title, subject, keywords, or strip all metadata for privacy.
POST /pdf-metadata-editor/edit
Integration Methods
Choose the integration method that fits your AI agent setup. All methods provide access to the same PDF tools.
MCP Server (Recommended)
Easiest setup. The Model Context Protocol lets your AI assistant call PDFCheck tools directly using natural language. No code needed: just configure and go.
Works with: Claude Desktop, GitHub Copilot, Cursor, Windsurf, and any MCP-compatible client.
{
  "mcpServers": {
    "pdfcheck": {
      "url": "https://pdf.businesspress.io/api/mcp"
    }
  }
}
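Under the hood, an MCP client talks to the server with JSON-RPC 2.0 messages. The sketch below only builds the `tools/call` message a client would send to invoke `analyze_pdf`; it makes no network request. The tool name and its `filePath` argument are taken from the agent rules on this page, but the exact argument schema is an assumption and is best confirmed with a `tools/list` call. The helper name `build_tool_call` is hypothetical.

```python
import json

MCP_ENDPOINT = "https://pdf.businesspress.io/api/mcp"  # from the config above


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Return a JSON-RPC 2.0 `tools/call` message as a JSON string."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Assumption: analyze_pdf accepts a `filePath` argument.
payload = build_tool_call(1, "analyze_pdf", {"filePath": "./contract.pdf"})
print(payload)
```

In practice an MCP-compatible client builds and sends these messages for you; the sketch is only meant to show what the one-line config enables.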
REST API
Most flexible. Standard HTTP API for programmatic access. Well suited to custom agent frameworks, LangChain, AutoGPT, CrewAI, and custom code.
curl -X POST https://pdf.businesspress.io/api/v1/pdf/analyze \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@document.pdf"
# Response:
# {
#   "success": true,
#   "data": {
#     "id": "abc123",
#     "filename": "document.pdf",
#     "metadata": {
#       "author": "John Doe",
#       "creator": "Microsoft Word",
#       "producer": "Adobe PDF Library",
#       "page_count": 10,
#       "pdf_version": "1.7",
#       "dates": { "created": "2025-01-15T10:30:00Z", "modified": "2025-01-16T14:22:00Z" }
#     }
#   },
#   "remaining_checks": 49
# }
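For agents written in Python, the same upload can be sketched with the standard library alone. The function below builds the multipart request but does not send it. The endpoint, the `file` field name, and the Bearer header come from the curl example; the helper name `build_analyze_request` is hypothetical.

```python
import urllib.request
import uuid

API_BASE = "https://pdf.businesspress.io/api/v1"


def build_analyze_request(pdf_bytes: bytes, filename: str, token: str = "") -> urllib.request.Request:
    """Build (but do not send) a multipart POST for /pdf/analyze."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if token:  # the token is optional on the free tier
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{API_BASE}/pdf/analyze",
        data=head + pdf_bytes + tail,
        headers=headers,
        method="POST",
    )


req = build_analyze_request(b"%PDF-1.7\n", "document.pdf", "YOUR_API_TOKEN")
print(req.get_method(), req.full_url)
# Sending is left out here; urllib.request.urlopen(req) would perform the upload.
```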
CLI Tool
For shell agents. Command-line interface for terminal-based AI agents and shell scripts. Wrap the API in a simple bash function.
#!/bin/bash
# PDFCheck CLI wrapper: add to your .bashrc or .zshrc
pdfcheck() {
  local TOKEN="${PDFCHECK_API_TOKEN}"
  local BASE="https://pdf.businesspress.io/api/v1"
  case "$1" in
    analyze)
      curl -s -X POST "$BASE/pdf/analyze" \
        -H "Authorization: Bearer $TOKEN" \
        -F "file=@$2" | jq .
      ;;
    list)
      curl -s "$BASE/pdf" \
        -H "Authorization: Bearer $TOKEN" | jq .
      ;;
    get)
      curl -s "$BASE/pdf/$2" \
        -H "Authorization: Bearer $TOKEN" | jq .
      ;;
    usage)
      curl -s "$BASE/user" \
        -H "Authorization: Bearer $TOKEN" | jq .
      ;;
    *)
      echo "Usage: pdfcheck {analyze FILE|list|get TOKEN|usage}"
      ;;
  esac
}
# Examples:
# pdfcheck analyze ./contract.pdf
# pdfcheck list
# pdfcheck get abc123xyz789
# pdfcheck usage
System Prompts for AI Agents
Ready-to-use system prompts for popular AI platforms. Copy and paste to give your agent PDF analysis capabilities.
Add to your Custom GPT instructions or system prompt for actions:
You are a PDF analysis assistant. When the user asks you to analyze, validate, or check a PDF document, use the PDFCheck API to process it.
API Base URL: https://pdf.businesspress.io/api/v1
Authentication: Bearer token in Authorization header
Available actions:
- POST /pdf/analyze: Upload and analyze a PDF (multipart/form-data, field: file)
- GET /pdf/{token}: Retrieve analysis results by share token
- GET /pdf: List all previous analyses
- GET /user: Check remaining daily quota
When analyzing a PDF, report: file name, page count, creation date, modification date, author, software used (Creator and Producer), PDF version, and any suspicious findings. If the Producer or Creator field contains tools like ReportLab, WeasyPrint, pdf-lib, PDFKit, or Puppeteer, note that these are commonly used by AI systems to generate PDFs.
For Claude Desktop with MCP, add the server config and use it naturally:
You have access to PDFCheck via the MCP server. When the user mentions PDF files, asks you to analyze them, or asks about document metadata:
1. Use analyze_pdf with the file path to upload and analyze the PDF
2. Report key findings: creation date, author, software, page count, modification status
3. Use check_usage to monitor remaining daily quota
4. Use list_analyses to reference previous analyses
5. If the user asks about AI detection, look for AI-associated tools in the Producer/Creator fields
Always explain what each metadata field means and flag anything unusual (mismatched dates, AI-associated tools, missing metadata).
Add to .github/copilot-instructions.md or .copilot-instructions.md in your project:
When working with PDF files in this project, use the PDFCheck MCP server to analyze them.
The pdfcheck MCP tools are available:
- analyze_pdf: Upload a PDF by file path and get full metadata analysis
- get_analysis: Retrieve a previous analysis by share token
- list_analyses: List all previous PDF analyses
- check_usage: Check remaining daily API quota
Use these tools when:
- The user asks to verify a PDF document
- A PDF is mentioned in a code review or issue
- You need to check metadata, signatures, or AI-generation markers
- The user wants to validate PDF output from code
Add to your .cursorrules or agent rules file:
You have access to PDFCheck MCP tools for PDF analysis. Use them when the user asks about PDF files.
Available tools:
- analyze_pdf(filePath): Full metadata analysis including author, dates, software, page count
- get_analysis(token): Retrieve previous analysis by share token
- list_analyses(): List all previous analyses
- check_usage(): Check remaining daily quota
Report findings clearly: creation date, modification status, software used, page count, and any AI-generation indicators.
PDF Skills for AI Agents
Downloadable skill files that teach AI agents how to work with PDFs expertly. Drop these into your agent configuration.
PDF Analysis Skill
Teaches the agent to extract and interpret PDF metadata, detect anomalies, and report findings in a structured format.
PDF Generate Skill
Teaches the agent to generate well-structured PDF documents using common libraries (ReportLab, pdf-lib, Puppeteer, WeasyPrint, etc.).
PDF Validation Skill
Teaches the agent to validate PDF documents for structural integrity, compliance standards (PDF/A, PDF/UA), and accessibility.
PDF Workflow Skill
Teaches the agent to chain multiple PDF operations: analyze → validate → optimize → merge → deliver.
---
name: pdf-analysis
description: Analyze PDF documents using PDFCheck API
triggers: ["analyze pdf", "check pdf", "pdf metadata", "verify document"]
---
# PDF Analysis Skill
When the user asks you to analyze, check, or verify a PDF document, use this skill.
## API Endpoint
POST https://pdf.businesspress.io/api/v1/pdf/analyze
Content-Type: multipart/form-data
Authorization: Bearer YOUR_TOKEN (optional; 10 free checks/day without a token)
## Steps
1. Upload the PDF file to the analyze endpoint
2. Parse the JSON response
3. Report these key fields:
   - filename, file_size, page_count
   - dates.created, dates.modified, dates.was_modified
   - metadata.author, metadata.creator, metadata.producer
   - metadata.pdf_version
4. Flag anomalies:
   - Creation date in the future
   - Modification date before creation date
   - Producer/Creator naming programmatic generators commonly used by AI systems (ReportLab, WeasyPrint, pdf-lib, PDFKit, Puppeteer, Playwright)
   - Missing metadata fields (no author, no title)
5. Provide the shareable analysis link: https://pdf.businesspress.io/analysis/{share_token}
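The anomaly checks in step 4 can be sketched in Python as follows. This is a minimal sketch that assumes the `metadata` object looks like the REST example on this page (`author`, `creator`, `producer`, and a nested `dates` object), plus a `title` key that is an assumption. The function names are hypothetical, and a generator match is a hint rather than proof of AI generation.

```python
from datetime import datetime, timezone

# Generators named in step 4; their presence is a hint, not proof.
GENERATOR_HINTS = ("reportlab", "weasyprint", "pdf-lib", "pdfkit", "puppeteer", "playwright")


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2025-01-15T10:30:00Z, or return None."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def flag_anomalies(metadata: dict, now: datetime) -> list:
    """Return human-readable flags for the anomalies listed in step 4."""
    flags = []
    dates = metadata.get("dates") or {}
    created = parse_ts(dates.get("created"))
    modified = parse_ts(dates.get("modified"))
    if created and created > now:
        flags.append("creation date is in the future")
    if created and modified and modified < created:
        flags.append("modified before created")
    software = f"{metadata.get('creator') or ''} {metadata.get('producer') or ''}".lower()
    for hint in GENERATOR_HINTS:
        if hint in software:
            flags.append(f"programmatic generator: {hint}")
    for field in ("author", "title"):
        if not metadata.get(field):
            flags.append(f"missing {field}")
    return flags


sample = {
    "author": "John Doe",
    "creator": "Microsoft Word",
    "producer": "Adobe PDF Library",
    "dates": {"created": "2025-01-15T10:30:00Z", "modified": "2025-01-16T14:22:00Z"},
}
print(flag_anomalies(sample, datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test.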
## Output Format
Present findings as a structured summary with clear sections:
**Document:** filename (page_count pages, file_size)
**Created:** date | **Modified:** date | **Was Modified:** yes/no
**Software:** creator / producer
**Flags:** list any anomalies or concerns
**Link:** shareable analysis URL
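As an illustration of this output format, the sketch below renders the five lines from a flat dictionary of already-extracted values. The dictionary keys and the function name `format_summary` are hypothetical; only the line layout and the link pattern come from this skill.

```python
def format_summary(fields: dict) -> str:
    """Render the five-line summary from already-extracted fields."""
    flags = ", ".join(fields["flags"]) or "none"
    modified = "yes" if fields["was_modified"] else "no"
    lines = [
        f"**Document:** {fields['filename']} ({fields['page_count']} pages, {fields['file_size']})",
        f"**Created:** {fields['created']} | **Modified:** {fields['modified']} | **Was Modified:** {modified}",
        f"**Software:** {fields['creator']} / {fields['producer']}",
        f"**Flags:** {flags}",
        f"**Link:** https://pdf.businesspress.io/analysis/{fields['share_token']}",
    ]
    return "\n".join(lines)


example = {
    "filename": "document.pdf",
    "page_count": 10,
    "file_size": "245 KB",  # illustrative value
    "created": "2025-01-15",
    "modified": "2025-01-16",
    "was_modified": True,
    "creator": "Microsoft Word",
    "producer": "Adobe PDF Library",
    "flags": [],
    "share_token": "abc123xyz789",
}
print(format_summary(example))
```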
---
name: pdf-generate
description: Generate well-structured PDF documents
triggers: ["generate pdf", "create pdf", "make pdf", "build pdf document"]
---
# PDF Generate Skill
When the user asks you to generate or create a PDF document, use this skill.
## Recommended Libraries (by language)
### Python (Best for AI agents)
```python
# Using ReportLab (most control)
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas
c = canvas.Canvas("output.pdf", pagesize=A4)
c.setFont("Helvetica", 12)
c.drawString(72, 750, "Title")
c.save()
# Using WeasyPrint (HTML to PDF; easiest for styled docs)
from weasyprint import HTML
HTML(string="<h1>Title</h1><p>Content</p>").write_pdf("output.pdf")
# Using fpdf2 (lightweight)
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
pdf.set_font("Helvetica", size=12)
pdf.cell(200, 10, text="Title", new_x="LMARGIN", new_y="NEXT")
pdf.output("output.pdf")
```
### JavaScript / Node.js
```javascript
// Using pdf-lib (most portable)
import { PDFDocument, StandardFonts } from 'pdf-lib';
import puppeteer from 'puppeteer';

const doc = await PDFDocument.create();
const page = doc.addPage([595, 842]); // A4 in points
const font = await doc.embedFont(StandardFonts.Helvetica);
page.drawText('Title', { x: 50, y: 750, size: 18, font });
const bytes = await doc.save(); // Uint8Array holding the PDF contents

// Using Puppeteer (HTML to PDF)
const browser = await puppeteer.launch();
const htmlPage = await browser.newPage();
await htmlPage.setContent('<h1>Title</h1>');
await htmlPage.pdf({ path: 'output.pdf', format: 'A4' });
await browser.close(); // release the headless browser process
```
### PHP
```php
// Using DOMPDF (HTML to PDF)
require 'vendor/autoload.php'; // Composer autoloader, assuming Dompdf was installed via Composer
use Dompdf\Dompdf;
$dompdf = new Dompdf();
$dompdf->loadHtml('<h1>Title</h1>');
$dompdf->setPaper('A4');
$dompdf->render();
file_put_contents('output.pdf', $dompdf->output());
```
## Best Practices
1. Always set meaningful metadata (title, author, subject)
2. Use standard fonts for maximum compatibility
3. Set proper page size (A4 = 595x842pt, Letter = 612x792pt)
4. Include page numbers for multi-page documents
5. Use PDF/A format for archival documents
6. After generation, validate with PDFCheck: POST /api/v1/pdf/analyze
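The page sizes in item 3 follow from the PDF unit of 1 point = 1/72 inch. A short worked conversion, with hypothetical helper names:

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def mm_to_pt(mm: float) -> float:
    """Convert millimetres to PDF points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def inch_to_pt(inches: float) -> float:
    """Convert inches to PDF points."""
    return inches * POINTS_PER_INCH


a4 = (round(mm_to_pt(210)), round(mm_to_pt(297)))         # A4 is 210 x 297 mm
letter = (round(inch_to_pt(8.5)), round(inch_to_pt(11)))  # Letter is 8.5 x 11 in
print(a4, letter)  # (595, 842) (612, 792)
```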
## Post-Generation Verification
After creating a PDF, analyze it with PDFCheck to verify:
- Structure is valid
- Metadata is set correctly
- No corruption detected
- Accessibility basics are met
---
name: pdf-validation
description: Validate PDF documents for integrity and compliance
triggers: ["validate pdf", "check pdf integrity", "pdf compliance", "is pdf valid"]
---
# PDF Validation Skill
When the user asks you to validate a PDF or check its integrity, use this skill.
## Available Validation Tools
### 1. Structure Validation
POST https://pdf.businesspress.io/pdf-validator/upload
- Checks PDF structure and specification compliance
- Detects corruption and malformed content
- Validates cross-references and object streams
### 2. PDF/A Validation
POST https://pdf.businesspress.io/pdf-a-validator/upload
- Validates against PDF/A archival standards
- Checks font embedding, color profiles, metadata
- Reports specific compliance failures
### 3. Accessibility Check
POST https://pdf.businesspress.io/pdf-accessibility-checker/upload
- Checks tagged structure for screen readers
- Validates language declaration
- Checks for text extractability
- Provides accessibility score (0-5)
### 4. Signature Verification
POST https://pdf.businesspress.io/pdf-signature-checker/upload
- Verifies digital signatures are valid
- Checks if document was modified after signing
- Validates certificate trust chain
## Validation Workflow
1. Start with structure validation to catch corruption
2. Check PDF/A compliance if archival is needed
3. Run accessibility check for public-facing documents
4. Verify signatures if document was digitally signed
5. Run metadata analysis for completeness check
## Interpreting Results
- **PASS**: Document meets all checks
- **WARNING**: Non-critical issues found (e.g., missing optional metadata)
- **FAIL**: Critical issues (corruption, broken structure, invalid signatures)
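When several checks run on one document, the three result levels can be combined by taking the most severe one. The sketch below assumes each check has already been reduced to one of the labels PASS, WARNING, or FAIL; how each endpoint's response maps to a label is not specified here, and the names `SEVERITY` and `overall_status` are hypothetical.

```python
SEVERITY = {"PASS": 0, "WARNING": 1, "FAIL": 2}


def overall_status(results: dict) -> str:
    """Return the most severe status among the individual check results."""
    if not results:
        raise ValueError("no check results to combine")
    return max(results.values(), key=lambda status: SEVERITY[status])


checks = {
    "structure": "PASS",
    "pdf_a": "WARNING",
    "accessibility": "PASS",
}
print(overall_status(checks))  # WARNING
```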
---
name: pdf-workflow
description: Chain multiple PDF operations for complex document workflows
triggers: ["pdf workflow", "process documents", "batch pdf", "document pipeline"]
---
# PDF Workflow Skill
Chain multiple PDFCheck tools together for complex document processing.
## Common Workflows
### Invoice Processing
1. analyze_pdf → Extract metadata (date, author, software)
2. Validate structure → Ensure PDF is not corrupted
3. Check AI detection → Flag AI-generated invoices
4. Extract text → Parse invoice amounts and dates
5. Decision: route for approval or flag for review
### Document Compliance Audit
1. Analyze metadata → Check author, dates, software
2. Validate PDF/A → Check archival compliance
3. Check accessibility → Verify PDF/UA compliance
4. Verify signatures → Validate digital signatures
5. Generate report → Summarize all findings
### Pre-Archive Processing
1. Batch analyze all PDFs in folder
2. Validate each for structural integrity
3. Optimize file sizes (eBook level for archives)
4. Strip personal metadata for privacy
5. Merge related documents
6. Generate index with metadata summary
### PDF Output QA (after generation)
1. Generate PDF using pdf-generate skill
2. Analyze with PDFCheck → Verify metadata is set
3. Validate structure → Ensure no corruption
4. Check accessibility → Verify tagged structure
5. Compare metadata against expected values
6. Flag any discrepancies
## Rate Limit Awareness
- Free: 10 checks/day (no account)
- Registered: 50 checks/day (free account)
- Check remaining quota: GET /api/v1/user
- Plan batch workflows within daily limits
- Use check_usage before starting large batches
## Error Handling
- 429 (rate limited): Wait until tomorrow or use a different token
- 422 (analysis failed): PDF is corrupted or invalid; report to user
- 401 (unauthorized): Token is invalid or expired; regenerate
- 5xx (server error): Retry after 30 seconds, max 3 retries
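The rate-limit and error-handling rules above can be expressed as two small decision helpers. This is a sketch, not part of the PDFCheck API: the function names and return strings are hypothetical, and `attempt` is assumed to count retries already made.

```python
FREE_LIMIT = 10         # checks/day without an account
REGISTERED_LIMIT = 50   # checks/day with a free account
MAX_RETRIES = 3
RETRY_DELAY_SECONDS = 30


def plan_batch(files: list, remaining: int):
    """Split files into those that fit today's quota and those to defer."""
    remaining = max(remaining, 0)
    return files[:remaining], files[remaining:]


def next_action(status: int, attempt: int) -> str:
    """Map an HTTP status code to the action listed under Error Handling."""
    if status == 429:
        return "stop: daily quota exhausted"
    if status == 422:
        return "report: PDF is corrupted or invalid"
    if status == 401:
        return "stop: regenerate the API token"
    if 500 <= status <= 599:
        if attempt < MAX_RETRIES:
            return f"retry in {RETRY_DELAY_SECONDS}s"
        return "give up: server error"
    return "ok" if 200 <= status < 300 else "report: unexpected status"


today, deferred = plan_batch(["a.pdf", "b.pdf", "c.pdf"], remaining=2)
print(today, deferred)      # ['a.pdf', 'b.pdf'] ['c.pdf']
print(next_action(503, 0))  # retry in 30s
```

The `remaining` value would come from `check_usage` or `GET /api/v1/user` before the batch starts.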
Advanced Agent Workflows
Example multi-step workflows that AI agents can execute using PDFCheck tools.
Invoice Processing Pipeline
Upload invoice PDF → Extract metadata → Validate structure → Check for AI generation → Extract text → Route for approval.
Document Compliance Check
Analyze PDF → Check accessibility (PDF/UA) → Validate signatures → Verify metadata completeness → Generate compliance report.
Archive Preparation
Batch analyze PDFs → Validate each → Optimize file size → Merge into archive → Strip personal metadata → Generate index.
PDF Output QA
Generate PDF → Validate structure → Check accessibility → Verify metadata → Compare with template → Report discrepancies.
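Each workflow above is an ordered chain in which a failed step should stop the run. One way to sketch that pattern in Python is a small runner that takes named steps and halts on the first failure. The step functions here are hypothetical local stand-ins, not PDFCheck calls; in a real agent each would wrap the corresponding API or MCP tool.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[dict], dict]]


def run_pipeline(document: dict, steps: List[Step]) -> Tuple[dict, List[str], Optional[str]]:
    """Run steps in order; stop at the first one that raises ValueError."""
    completed = []
    for name, step in steps:
        try:
            document = step(document)
        except ValueError as exc:
            return document, completed, f"{name}: {exc}"
        completed.append(name)
    return document, completed, None


# Hypothetical stand-ins for real PDFCheck calls.
def validate_structure(doc: dict) -> dict:
    if not doc.get("bytes", b"").startswith(b"%PDF-"):
        raise ValueError("not a PDF")
    return {**doc, "valid": True}


def verify_metadata(doc: dict) -> dict:
    if not doc.get("title"):
        raise ValueError("missing title")
    return {**doc, "metadata_ok": True}


steps = [("validate structure", validate_structure), ("verify metadata", verify_metadata)]
result, done, error = run_pipeline({"bytes": b"%PDF-1.7\n", "title": ""}, steps)
print(done, error)  # ['validate structure'] verify metadata: missing title
```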
LLM-Optimized Documentation
Machine-readable resources designed specifically for AI consumption.
llms.txt
Lightweight overview of PDFCheck capabilities, tools, and API endpoints. Optimized for LLM context windows.
pdf.businesspress.io/llms.txt
llms-full.txt
Comprehensive technical documentation with all endpoints, parameters, response schemas, and examples.
pdf.businesspress.io/llms-full.txt
MCP Endpoint
Streamable HTTP MCP server at https://pdf.businesspress.io/api/mcp. No downloads, no Node.js needed.
pdf.businesspress.io/api/mcp
API Docs Page
Full interactive API documentation with request/response examples in cURL, PHP, Python, and JavaScript.
pdf.businesspress.io/api-docs
Start Building PDF-Powered AI Agents
Free to start. No credit card required. 10 checks/day without account, 50/day with free registration.