Overview

API Deals provides access to Anthropic's Claude models through a fully compatible REST API at 20% below official pricing. Your existing code that targets the Anthropic Messages API works with API Deals out of the box — just swap the base URL and API key.

Base URL:  https://api.apideals.org
Endpoint:  POST /v1/messages

Every request is routed through AWS Bedrock, giving you the same model quality and latency guarantees as the official API while we handle billing, usage tracking, and cost optimization on your behalf.

Authentication

All API calls are authenticated with a bearer token. You can find and rotate your API key from the Dashboard → API Key page.

curl https://api.apideals.org/v1/messages \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{ "model": "claude-sonnet-4-5-20250514", "max_tokens": 1024, "messages": [...] }'

Keep your API key secret. If you suspect a key has been exposed, rotate it immediately from the dashboard. The old key is invalidated the moment a new one is issued.
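
The same request can be assembled with Python's standard library. This is a minimal sketch rather than an official client: the helper names build_messages_request and send are hypothetical, and the 600-second client timeout is an assumption chosen to match the gateway read timeout described in this document.

```python
import json
import urllib.request

BASE_URL = "https://api.apideals.org"


def build_messages_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/messages request.

    Hypothetical helper: it only mirrors the headers shown in the curl example.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> dict:
    """Send the request and decode the JSON response body.

    The timeout value is an assumption, not a documented client setting.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


example_payload = {
    "model": "claude-sonnet-4-5-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}
request = build_messages_request("sk-your-api-key", example_payload)
print(request.get_method(), request.full_url)
```

Calling send(request) performs the actual network call. In real code, load the key from an environment variable or a secret store rather than hard-coding it.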

Sending Messages

The request and response format is identical to the Anthropic Messages API. Here is a minimal example:

POST /v1/messages
Content-Type: application/json
Authorization: Bearer sk-your-api-key
anthropic-version: 2023-06-01

{
  "model": "claude-sonnet-4-5-20250514",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ]
}

Response

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "The capital of France is Paris." }
  ],
  "model": "claude-sonnet-4-5-20250514",
  "usage": {
    "input_tokens": 14,
    "output_tokens": 10
  }
}

Both streaming (stream: true) and non-streaming responses are supported. When streaming, the API returns server-sent events (SSE) in the same format as the official Anthropic API.
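
To illustrate consuming a streamed response, the sketch below extracts text from SSE lines. It assumes the lines have already been decoded to strings and that each data payload sits on a single line, as in the Anthropic streaming format. The function name iter_text_deltas and the sample events are illustrative, not captured gateway output.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from decoded SSE lines of a streamed response."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank event separators
        event = json.loads(line[len("data:"):])
        if event.get("type") != "content_block_delta":
            continue  # message_start, ping, message_stop, etc.
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta["text"]


# Illustrative events in the Anthropic SSE shape (not real gateway output).
sample = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "The capital of France"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": " is Paris."}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]
print("".join(iter_text_deltas(sample)))
```

Because the parser is a generator, a client can print or forward each fragment as soon as it arrives instead of waiting for the full message.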

Long prompts: for requests with large context (more than roughly 50K input tokens) we strongly recommend setting stream: true. AWS Bedrock can take tens of seconds to emit its first token on very long inputs; streaming forwards each chunk immediately, so your client sees progress and stays within the gateway's 600-second read timeout. Non-streaming requests whose time-to-first-byte exceeds 600 seconds return 504 Gateway Timeout.

Request size limits: Claude models on Bedrock cap context length at 200,000 tokens (input + previous messages). The gateway additionally rejects requests whose raw body exceeds 32 MB with 413 Payload Too Large. This is generous enough for a full 200K-token context even with base64 images and PDFs; requests beyond it are almost always a client-side bug (e.g. looped message history).
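
Both recommendations can be enforced on the client before a request is sent. The sketch below is one possible pre-flight check under stated assumptions: prepare_body is a hypothetical helper, the token estimate must be supplied by the caller, and "32 MB" is interpreted as 32 × 1024 × 1024 bytes, which this document does not confirm.

```python
import json

MAX_BODY_BYTES = 32 * 1024 * 1024   # assumption: "32 MB" means 32 MiB
STREAM_THRESHOLD_TOKENS = 50_000    # the ~50K guideline for long prompts


def prepare_body(payload: dict, estimated_input_tokens: int,
                 max_body_bytes: int = MAX_BODY_BYTES) -> bytes:
    """Serialize a request body, enabling streaming for long prompts.

    Raises ValueError locally instead of letting the gateway answer 413.
    """
    body = dict(payload)  # shallow copy so the caller's dict is untouched
    if estimated_input_tokens > STREAM_THRESHOLD_TOKENS:
        # setdefault keeps an explicit "stream" value chosen by the caller.
        body.setdefault("stream", True)
    encoded = json.dumps(body).encode("utf-8")
    if len(encoded) > max_body_bytes:
        raise ValueError(
            f"request body is {len(encoded)} bytes, over the "
            f"{max_body_bytes}-byte limit; check for runaway message history"
        )
    return encoded


long_prompt = {"model": "claude-sonnet-4-5-20250514", "max_tokens": 1024, "messages": []}
print(json.loads(prepare_body(long_prompt, estimated_input_tokens=60_000))["stream"])
```

Failing fast on the client saves the upload of an oversized body and surfaces the likely bug (for example, message history appended in a loop) closer to where it happens.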

Multimodal Input: Text, Images & PDFs

Claude models are multimodal. You can send text, images, and PDF documents in a single request. Each message's content field accepts an array of content blocks:

Text

{ "type": "text", "text": "Describe this image." }

Images

Send images as base64-encoded data. Supported formats: image/jpeg, image/png, image/gif, and image/webp.

{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/jpeg",
    "data": "<base64-encoded-image>"
  }
}

PDF Documents

Upload PDFs as base64 content blocks. Claude will extract and reason over the document's text and visual content.

{
  "type": "document",
  "source": {
    "type": "base64",
    "media_type": "application/pdf",
    "data": "<base64-encoded-pdf>"
  }
}

Combined Example

{
  "model": "claude-sonnet-4-5-20250514",
  "max_tokens": 2048,
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Compare the chart in this image with the data in the PDF." },
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": "<base64-chart>"
          }
        },
        {
          "type": "document",
          "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": "<base64-report>"
          }
        }
      ]
    }
  ]
}

All input types (text, images, and PDFs) are counted as input tokens for billing purposes. Images and PDFs are converted to tokens by the model automatically.
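
Content blocks like the ones above can be built from raw bytes with a few small helpers. This is a sketch: the function names text_block, image_block, and pdf_block are hypothetical, and the media-type check reflects only the formats listed in this document.

```python
import base64

SUPPORTED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def text_block(text: str) -> dict:
    """Build a text content block."""
    return {"type": "text", "text": text}


def _base64_source(data: bytes, media_type: str) -> dict:
    """Encode raw bytes as the base64 source object used by images and PDFs."""
    return {
        "type": "base64",
        "media_type": media_type,
        "data": base64.b64encode(data).decode("ascii"),
    }


def image_block(data: bytes, media_type: str) -> dict:
    """Build an image content block, rejecting unlisted formats early."""
    if media_type not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {media_type}")
    return {"type": "image", "source": _base64_source(data, media_type)}


def pdf_block(data: bytes) -> dict:
    """Build a PDF document content block."""
    return {"type": "document", "source": _base64_source(data, "application/pdf")}


# Placeholder bytes stand in for real file contents read with open(path, "rb").
content = [
    text_block("Compare the chart in this image with the data in the PDF."),
    image_block(b"<png bytes>", "image/png"),
    pdf_block(b"<pdf bytes>"),
]
print([block["type"] for block in content])
```

The resulting list can be used directly as the content value of a user message.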

Pricing & Cost Calculation

Costs are calculated per request based on the number of tokens consumed. Each response includes a usage object with exact token counts.

How Costs Are Calculated

input_cost  = (input_tokens  / 1,000,000) × input_price_per_1M
output_cost = (output_tokens / 1,000,000) × output_price_per_1M
total_cost  = input_cost + output_cost

Claude Sonnet 4.5 Pricing

Token Type                     Official (per 1M)   Our Price (per 1M)
Input                          $3.00               $2.40
Output                         $15.00              $12.00
Long Context Input (>200K)     $6.00               $4.80
Long Context Output (>200K)    $22.50              $18.00
Cache Read                     $0.30               $0.24
Cache Write                    $3.75               $3.00
Batch Input                    $1.50               $1.20
Batch Output                   $7.50               $6.00

Long Context pricing applies when total context exceeds 200K tokens. Cache Read / Write pricing applies when using prompt caching. Batch pricing applies to asynchronous batch requests.
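
As a worked example of the cost formulas, the sketch below prices a single request at the standard Claude Sonnet 4.5 rates listed above ($2.40 input, $12.00 output per 1M tokens). The function name request_cost is hypothetical, and the sketch deliberately ignores long-context, cache, and batch rates. Decimal is used so that very small amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Standard Claude Sonnet 4.5 rates from the pricing table, in USD per 1M tokens.
PRICES_PER_MILLION = {"input": Decimal("2.40"), "output": Decimal("12.00")}
ONE_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 prices: dict = PRICES_PER_MILLION) -> Decimal:
    """Apply input_cost + output_cost from the formulas above.

    Simplification: standard rates only (no long-context, cache, or batch).
    """
    input_cost = Decimal(input_tokens) / ONE_MILLION * prices["input"]
    output_cost = Decimal(output_tokens) / ONE_MILLION * prices["output"]
    return input_cost + output_cost


# Token counts taken from the usage object of the sample response.
usage = {"input_tokens": 14, "output_tokens": 10}
print(request_cost(usage["input_tokens"], usage["output_tokens"]))
```

For the sample response (14 input tokens, 10 output tokens) this works out to $0.0000336 + $0.00012 = $0.0001536.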

You can monitor your spending in real time from Dashboard → Usage. Each request is logged with its token count, cost, latency, and status.

Supported Models

The following Claude models are available through API Deals. Pass the model ID in the model field of your request.

Model ID                      Description
claude-sonnet-4-5-20250514    Best balance of speed, cost, and capability (recommended)
claude-haiku-3-5-20241022     Fastest responses, lowest cost
claude-3-opus-20240229        Highest capability for complex tasks

Use GET /v1/models to retrieve the up-to-date list of available models and their current pricing tiers.

Wallet & Billing

API Deals uses a prepaid wallet system. You deposit funds into your wallet, and each API request automatically deducts the computed cost from your balance.

Wallet Balance

Your wallet has three components:

  • Available Balance: funds available for API usage
  • Locked Balance: funds reserved for pending transactions
  • Bonus Credit: promotional or sign-up credits

Deposits

Fund your wallet via cryptocurrency deposits (BTC, ETH, USDT, and more). Deposits are confirmed on-chain and credited to your wallet automatically.

Billing Periods

Usage is aggregated into monthly billing periods. You can view per-period breakdowns including total requests, tokens consumed, and costs from the Dashboard → Usage page.

Rate Limits

Rate limits are applied per API key to ensure fair usage across all customers. When you exceed the limit, the API returns a 429 status code with a RATE_LIMITED error.

{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests. Please retry after a short delay.",
    "status": 429
  }
}

Best practice: implement exponential backoff with jitter in your client. Most official Anthropic SDKs handle this automatically.
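
For clients that do not use an SDK, exponential backoff with jitter might look like the sketch below. It uses the "full jitter" variant, where each delay is a random value between zero and an exponentially growing ceiling. The RateLimited exception, the send callable, and the default base, cap, and retry count are all assumptions for illustration, not documented gateway values.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception your HTTP layer raises on a 429 RATE_LIMITED."""


def backoff_delays(max_retries: int, base: float, cap: float) -> Iterator[float]:
    """Yield one full-jitter delay per retry: uniform in [0, min(cap, base * 2**n))."""
    for attempt in range(max_retries):
        yield random.random() * min(cap, base * 2 ** attempt)


def call_with_backoff(send: Callable[[], T], max_retries: int = 5,
                      base: float = 1.0, cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send(), retrying on RateLimited up to max_retries times."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return send()
        except RateLimited:
            sleep(delay)  # wait before the next attempt
    return send()  # final attempt; a RateLimited here propagates to the caller
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same instant. The sleep parameter is injectable so the function can be exercised in tests without real waiting.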

Error Handling

All errors follow a consistent envelope format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description.",
    "status": 400
  }
}

Status  Code                  Meaning
401     INVALID_API_KEY       API key is missing, invalid, or revoked
402     INSUFFICIENT_BALANCE  Wallet balance (including bonus) has run out; top up at /wallet
403     USER_INACTIVE         Account has been deactivated
413     PAYLOAD_TOO_LARGE     Request body exceeds 32 MB; check for runaway message history or unencoded attachments
429     RATE_LIMITED          Too many requests
502     BEDROCK_ERROR         Upstream error from AWS Bedrock
503     SERVICE_UNAVAILABLE   Service is temporarily down
504     GATEWAY_TIMEOUT       Upstream did not emit a response within 600s; retry with stream: true for long prompts
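
A client can decode the error envelope into a small typed object and decide whether a retry makes sense. The sketch below is illustrative: ApiError and parse_error are hypothetical names, and the choice of retryable codes is an assumption. GATEWAY_TIMEOUT is left out on purpose, since the guidance above is to change the request (set stream: true) rather than repeat it unchanged.

```python
import json
from dataclasses import dataclass

# Assumption: these codes describe transient conditions worth retrying as-is.
RETRYABLE_CODES = frozenset({"RATE_LIMITED", "BEDROCK_ERROR", "SERVICE_UNAVAILABLE"})


@dataclass(frozen=True)
class ApiError:
    code: str
    message: str
    status: int

    @property
    def retryable(self) -> bool:
        """True when repeating the identical request may succeed."""
        return self.code in RETRYABLE_CODES


def parse_error(body: str) -> ApiError:
    """Decode the error envelope from a non-2xx response body."""
    err = json.loads(body)["error"]
    return ApiError(code=err["code"], message=err["message"], status=int(err["status"]))


rate_limited_body = json.dumps({
    "error": {
        "code": "RATE_LIMITED",
        "message": "Too many requests. Please retry after a short delay.",
        "status": 429,
    }
})
error = parse_error(rate_limited_body)
print(error.code, error.status, error.retryable)
```

Errors such as INVALID_API_KEY or INSUFFICIENT_BALANCE need user action (rotating the key, topping up the wallet), so retrying them automatically only adds load.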