Developer | 10 min read | June 2, 2026

Hooking BigShield Into Your LLM Pipeline: Save Tokens, Block Fraud

Step-by-step tutorial for integrating BigShield email validation into LLM and AI application pipelines. Includes Express middleware, async batch processing, and cost savings calculations.

Why LLM Pipelines Need Email Validation

If you are running an AI product with a free tier (or even a generous paid tier), you have a fraud problem. We have seen it across hundreds of LLM platforms: fraudsters sign up with fake emails, burn through free credits, and move on. Rinse and repeat with another fake email.

The math is brutal. A single GPT-4-class API call costs $0.01-0.06 depending on context length. A fraudster who creates 100 accounts and burns the free tier on each might consume $500-2,000 in compute before you notice. Scale that to a fraud ring operating thousands of accounts, and you are looking at tens of thousands of dollars per month in wasted tokens.

The solution is simple in concept: validate the email before you spend a single token. In practice, the integration needs to be fast (you do not want to add seconds to your signup flow), reliable (downtime means blocked signups), and smart (you need to catch fraud without blocking real users). This tutorial shows you how to wire BigShield into your LLM pipeline at every level.

Architecture Overview

There are three main integration points for email validation in an LLM pipeline:

Signup gate: Validate the email when the user creates an account, before they get any API keys or credits
Request middleware: Re-validate on each API request (using cached results) to catch accounts that were initially clean but later flagged
Batch processing: Async validation for bulk user imports, waitlist processing, or periodic re-evaluation of your user base

Let's implement each one. If you have not set up BigShield yet, our zero-to-hero implementation guide covers the basics.

Step 1: Signup Gate Validation

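This is the most important integration point. You want to validate the email before provisioning any resources. Here is an Express route handler; createUser, storeValidationResult, provisionApiKey, and flagForManualReview stand in for your own application code: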

import express from 'express';

const app = express();
app.use(express.json());

const BIGSHIELD_API_KEY = process.env.BIGSHIELD_API_KEY;
const BIGSHIELD_URL = 'https://bigshield.app/api/v1/validate';

interface BigShieldResponse {
  email: string;
  score: number;           // 0-100, higher = more trustworthy
  verdict: 'pass' | 'warn' | 'fail';
  signals: Array<{
    name: string;
    score_impact: number;
    confidence: number;
    details: string;
  }>;
  cached: boolean;
  latency_ms: number;
}

async function validateEmail(email: string): Promise<BigShieldResponse> {
  const response = await fetch(BIGSHIELD_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${BIGSHIELD_API_KEY}`,
    },
    body: JSON.stringify({ email }),
  });

  if (!response.ok) {
    throw new Error(`BigShield API error: ${response.status}`);
  }

  return response.json() as Promise<BigShieldResponse>;
}

app.post('/api/signup', async (req, res) => {
  const { email, password, name } = req.body;

  // Validate email BEFORE creating the account.
  // Only the BigShield call sits inside try/catch, so a failure in
  // createUser or provisionApiKey never triggers a second
  // account-creation attempt.
  let validation: BigShieldResponse | null = null;
  try {
    validation = await validateEmail(email);
  } catch (error) {
    // If BigShield is unreachable, fail open but flag for review
    console.error('BigShield validation failed:', error);
  }

  if (validation?.verdict === 'fail') {
    // Score below 30: almost certainly fraudulent
    return res.status(422).json({
      error: 'This email address cannot be used for signup.',
      // Don't reveal the specific reason to avoid helping fraudsters
    });
  }

  if (validation?.verdict === 'warn') {
    // Score between 30-85: suspicious but not definitive
    // Require additional verification
    return res.status(200).json({
      requiresVerification: true,
      message: 'Please verify your email to continue.',
    });
  }

  // Verdict is 'pass' (score above 85), or validation was unavailable
  const user = await createUser({ email, password, name });

  if (validation) {
    // Store the BigShield score for future reference
    await storeValidationResult(user.id, validation);
  } else {
    await flagForManualReview(user.id, 'validation_unavailable');
  }

  // Provision API keys and free-tier credits
  const apiKey = await provisionApiKey(user.id);

  return res.status(201).json({
    user: { id: user.id, email },
    apiKey,
  });
});

A few important design decisions here:

Fail open: If BigShield is unreachable, we still create the account but flag it for review. This prevents an outage from blocking all signups.
Vague error messages: We never tell the user why their email was rejected. Specific error messages help fraudsters tune their approach.
Three-tier response: Pass, warn, and fail create different user experiences. Warned users get a chance to verify rather than being blocked outright.
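
The fail-open and three-tier decisions described above can also be isolated in one pure function, which makes them easy to unit test. This is a minimal sketch rather than anything BigShield provides: SignupAction and decideSignupAction are hypothetical names, and null is assumed to mean that validation was unavailable.

```typescript
// Hypothetical helper: maps a validation outcome to a signup action.
type Verdict = 'pass' | 'warn' | 'fail';

type SignupAction =
  | { kind: 'reject' } // respond 422 with a vague message
  | { kind: 'verify' } // ask for additional verification
  | { kind: 'create'; flagForReview: boolean };

function decideSignupAction(
  validation: { verdict: Verdict } | null // null = BigShield unreachable
): SignupAction {
  if (validation === null) {
    // Fail open, but mark the account for manual review
    return { kind: 'create', flagForReview: true };
  }
  if (validation.verdict === 'fail') {
    return { kind: 'reject' };
  }
  if (validation.verdict === 'warn') {
    return { kind: 'verify' };
  }
  // Remaining verdict is 'pass'
  return { kind: 'create', flagForReview: false };
}
```

The route handler then only has to translate each action into an HTTP response, and the policy itself can be tested without mocking Express or the network.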

Step 2: Express Middleware for Per-Request Validation

Once a user has an API key, you want to continue monitoring. An account that was clean at signup might get flagged later (for example, if the email domain starts being used for spam). Here is middleware that checks a cached validation score on every request:

import { Redis } from '@upstash/redis';

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL!,
  token: process.env.UPSTASH_REDIS_TOKEN!,
});

interface CachedValidation {
  score: number;
  verdict: string;
  validatedAt: string;
}

// Middleware: check email validation on every LLM request
function bigshieldMiddleware(options: {
  blockThreshold?: number;
  cacheHours?: number;
  revalidateHours?: number;
} = {}) {
  const {
    blockThreshold = 30,
    cacheHours = 24,
    revalidateHours = 168,  // Re-validate weekly
  } = options;

  return async (
    req: express.Request,
    res: express.Response,
    next: express.NextFunction
  ) => {
    const apiKey = req.headers['x-api-key'] as string;
    if (!apiKey) {
      return res.status(401).json({ error: 'Missing API key' });
    }

    const user = await getUserByApiKey(apiKey);
    if (!user) {
      return res.status(401).json({ error: 'Invalid API key' });
    }
    // Attach the user so downstream handlers can read it from the request
    (req as any).user = user;

    // Check cached validation result
    const cacheKey = `bigshield:validation:${user.email}`;
    let cached = await redis.get<CachedValidation>(cacheKey);

    if (!cached) {
      // No cache, validate now
      try {
        const result = await validateEmail(user.email);
        cached = {
          score: result.score,
          verdict: result.verdict,
          validatedAt: new Date().toISOString(),
        };
        await redis.set(cacheKey, cached, { ex: cacheHours * 3600 });
      } catch {
        // If validation fails, allow the request but log it
        console.warn(`BigShield validation failed for ${user.email}`);
        return next();
      }
    }

    // Check if revalidation is needed
    const validatedAt = new Date(cached.validatedAt);
    const hoursSinceValidation =
      (Date.now() - validatedAt.getTime()) / (1000 * 60 * 60);

    if (hoursSinceValidation > revalidateHours) {
      // Trigger async revalidation (don't block the request)
      revalidateAsync(user.email, cacheKey).catch(console.error);
    }

    // Block if score is too low
    if (cached.score < blockThreshold) {
      return res.status(403).json({
        error: 'Account suspended. Contact support.',
      });
    }

    // Attach score to request for downstream use
    (req as any).emailScore = cached.score;
    next();
  };
}

async function revalidateAsync(email: string, cacheKey: string) {
  const result = await validateEmail(email);
  const cached: CachedValidation = {
    score: result.score,
    verdict: result.verdict,
    validatedAt: new Date().toISOString(),
  };
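  // Note: the TTL here and the alert threshold below are hard-coded;
  // keep them in sync with the middleware's cacheHours and blockThreshold.
  await redis.set(cacheKey, cached, { ex: 24 * 3600 });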

  // Alert if a previously good account is now flagged
  if (result.score < 30) {
    await alertTeam(`Account ${email} dropped to score ${result.score}`);
  }
}

// Apply the middleware to your LLM endpoints
app.use('/api/v1/completions', bigshieldMiddleware({ blockThreshold: 25 }));
app.use('/api/v1/embeddings', bigshieldMiddleware({ blockThreshold: 25 }));

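This middleware adds near-zero latency for cached results (a single Redis lookup). On a cache miss, which includes the first request after an entry expires, it validates inline, so that one request pays for a BigShield round trip. Background revalidation only fires for entries that are older than revalidateHours but still in the cache, so it requires cacheHours to be larger than revalidateHours; with the defaults above (24 and 168), entries expire before they are old enough to trigger it.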
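
To make the block-or-refresh rule concrete, the decision the middleware takes for a cached entry can be written as a pure function. This is a minimal sketch under stated assumptions: checkCachedValidation, CachedEntry, and CacheDecision are hypothetical names, and the current time is passed in so the behavior is deterministic in tests.

```typescript
interface CachedEntry {
  score: number;
  validatedAt: string; // ISO 8601 timestamp
}

interface CacheDecision {
  allow: boolean;      // false when the score is below the block threshold
  revalidate: boolean; // true when a background refresh should be triggered
}

// Hypothetical helper mirroring the middleware's checks on a cached entry.
function checkCachedValidation(
  entry: CachedEntry,
  nowMs: number,
  blockThreshold = 30,
  revalidateHours = 168
): CacheDecision {
  const msPerHour = 3600 * 1000;
  const ageHours = (nowMs - new Date(entry.validatedAt).getTime()) / msPerHour;
  return {
    allow: entry.score >= blockThreshold,
    revalidate: ageHours > revalidateHours,
  };
}
```

Inside the middleware this would be called as checkCachedValidation(cached, Date.now(), blockThreshold, revalidateHours), with the two flags driving the 403 response and the revalidateAsync call respectively.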

Step 3: Async Batch Processing

Sometimes you need to validate emails in bulk. Maybe you are importing users from another platform, processing a waitlist, or doing a periodic audit of your user base. Here is an efficient batch processing implementation using a simple queue:

interface BatchJob {
  emails: string[];
  onComplete: (results: Map<string, BigShieldResponse>) => void;
}

async function validateBatch(
  emails: string[],
  options: {
    concurrency?: number;
    delayMs?: number;
    onProgress?: (completed: number, total: number) => void;
  } = {}
): Promise<Map<string, BigShieldResponse>> {
  const {
    concurrency = 10,
    delayMs = 50,
    onProgress,
  } = options;

  const results = new Map<string, BigShieldResponse>();
  const queue = [...emails];
  let completed = 0;

  async function processOne(): Promise<void> {
    while (queue.length > 0) {
      const email = queue.shift();
      if (!email) break;

      try {
        const result = await validateEmail(email);
        results.set(email, result);
      } catch (error) {
        console.error(`Failed to validate ${email}:`, error);
        // Retry once after a delay
        await new Promise(r => setTimeout(r, 1000));
        try {
          const result = await validateEmail(email);
          results.set(email, result);
        } catch {
          // Store a failure result
          results.set(email, {
            email,
            score: -1,
            verdict: 'fail' as const,
            signals: [],
            cached: false,
            latency_ms: 0,
          });
        }
      }

      completed++;
      onProgress?.(completed, emails.length);

      // Rate limit to stay within BigShield API limits
      if (delayMs > 0) {
        await new Promise(r => setTimeout(r, delayMs));
      }
    }
  }

  // Run workers in parallel
  const workers = Array.from(
    { length: Math.min(concurrency, emails.length) },
    () => processOne()
  );
  await Promise.all(workers);

  return results;
}

// Example: Audit all users who signed up in the last 30 days
async function auditRecentSignups() {
  const recentUsers = await db.query(
    "SELECT email FROM users WHERE created_at > NOW() - INTERVAL '30 days'"
  );

  const emails = recentUsers.rows.map(r => r.email);
  console.log(`Auditing ${emails.length} recent signups...`);

  const results = await validateBatch(emails, {
    concurrency: 5,
    delayMs: 100,
    onProgress: (done, total) => {
      if (done % 100 === 0) {
        console.log(`Progress: ${done}/${total}`);
      }
    },
  });

  // Flag accounts that score poorly
  let flagged = 0;
  for (const [email, result] of results) {
    if (result.score >= 0 && result.score < 30) {
      await flagForReview(email, result);
      flagged++;
    }
  }

  console.log(`Audit complete. Flagged ${flagged} accounts for review.`);
}
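
After a batch run it is often useful to see how the results break down before acting on them. The following is a minimal sketch, assuming the result shape used above, including the -1 score that validateBatch stores when both attempts fail; summarizeBatch and ValidationSummary are hypothetical names.

```typescript
interface ValidationSummary {
  pass: number;
  warn: number;
  fail: number;
  errored: number; // entries stored with the -1 sentinel score
}

// Hypothetical helper: counts batch results by verdict, keeping
// failed lookups separate from genuine 'fail' verdicts.
function summarizeBatch(
  results: Map<string, { score: number; verdict: 'pass' | 'warn' | 'fail' }>
): ValidationSummary {
  const summary: ValidationSummary = { pass: 0, warn: 0, fail: 0, errored: 0 };
  results.forEach((result) => {
    if (result.score < 0) {
      summary.errored++;
    } else {
      summary[result.verdict]++;
    }
  });
  return summary;
}
```

Counting errored entries separately matters because the sentinel result carries a 'fail' verdict; treating it as real fraud would flag users whose emails were never actually scored.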

Cost Savings Calculations

Let's do the math on what this integration actually saves you. These numbers are based on real data from our case study on token waste savings.

Scenario: Mid-size AI startup, 10,000 signups/month

Without BigShield:

14% fraudulent signups = 1,400 fake accounts per month
Average free-tier usage per fraudulent account: $18 in tokens
Monthly token waste: $25,200

With BigShield:

10,000 validations at $0.005/each = $50/month (volumes under 1,000/month fit in the free tier)
Catch rate: ~92% of fraudulent signups blocked
Remaining fraud: 112 accounts x $18 = $2,016
Monthly token waste: $2,016
Monthly savings: $23,134 (after the $50 validation cost)
ROI: 463x

Scenario: Larger platform, 100,000 signups/month

Without BigShield:

14% fraud = 14,000 fake accounts
Monthly token waste: $252,000

With BigShield:

100,000 validations at $0.003/each (volume pricing) = $300/month
Remaining fraud: 1,120 accounts x $18 = $20,160
Monthly savings: $231,540 (after the $300 validation cost)

Even conservative estimates show 100x+ ROI. The validation cost is negligible compared to the compute costs of serving fraudulent accounts.
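
To rerun this arithmetic with your own numbers, the calculation can be captured in a small function. This is a sketch of the formula used in the scenarios above, not a BigShield tool; estimateSavings and SavingsInput are hypothetical names, and savings are reported net of the validation cost.

```typescript
interface SavingsInput {
  signups: number;             // signups per month
  fraudRate: number;           // fraction of signups that are fraudulent, e.g. 0.14
  costPerFraudAccount: number; // dollars of tokens burned per fake account
  catchRate: number;           // fraction of fraud blocked, e.g. 0.92
  pricePerValidation: number;  // dollars per validation call
}

// Hypothetical helper reproducing the scenario arithmetic.
function estimateSavings(input: SavingsInput) {
  // Round account counts so floating-point noise does not leak into totals
  const fraudAccounts = Math.round(input.signups * input.fraudRate);
  const missed = Math.round(fraudAccounts * (1 - input.catchRate));
  const wasteWithout = fraudAccounts * input.costPerFraudAccount;
  const wasteWith = missed * input.costPerFraudAccount;
  const validationCost = input.signups * input.pricePerValidation;
  const netSavings = wasteWithout - wasteWith - validationCost;
  return {
    fraudAccounts,
    missed,
    wasteWithout,
    wasteWith,
    validationCost,
    netSavings,
    roi: netSavings / validationCost,
  };
}

const midSize = estimateSavings({
  signups: 10_000,
  fraudRate: 0.14,
  costPerFraudAccount: 18,
  catchRate: 0.92,
  pricePerValidation: 0.005,
});
console.log(midSize.netSavings.toFixed(0), midSize.roi.toFixed(0));
```

For the mid-size scenario this gives 1,400 fraudulent accounts, 112 missed, and net savings of about $23,134, or roughly 463 times the $50 spent on validation.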

Advanced: Score-Based Token Budgets

Here is a pattern we love: instead of binary allow/block, use the BigShield score to dynamically set token budgets. Higher-trust accounts get more generous limits:

function getTokenBudget(emailScore: number, plan: string): number {
  const baseBudget: Record<string, number> = {
    free: 10_000,
    starter: 100_000,
    pro: 1_000_000,
  };

  const base = baseBudget[plan] || baseBudget.free;

  // Score 90-100: full budget
  // Score 70-89: 75% budget
  // Score 50-69: 50% budget
  // Below 50 (or no score available): 25% budget. Scores under the
  // middleware's blockThreshold are rejected before reaching this point.
  if (emailScore >= 90) return base;
  if (emailScore >= 70) return Math.floor(base * 0.75);
  if (emailScore >= 50) return Math.floor(base * 0.5);
  return Math.floor(base * 0.25);
}

// Use in your completion endpoint
app.post('/api/v1/completions', bigshieldMiddleware(), async (req, res) => {
  const emailScore = (req as any).emailScore;
  const user = (req as any).user;

  const tokenBudget = getTokenBudget(emailScore, user.plan);
  const tokensUsed = await getMonthlyTokenUsage(user.id);

  if (tokensUsed >= tokenBudget) {
    return res.status(429).json({
      error: 'Monthly token limit reached.',
      limit: tokenBudget,
      used: tokensUsed,
    });
  }

  // Proceed with LLM call, passing remaining budget
  const remainingTokens = tokenBudget - tokensUsed;
  const result = await generateCompletion(req.body, {
    maxTokens: Math.min(req.body.max_tokens || 4096, remainingTokens),
  });

  return res.json(result);
});

This approach is elegant because, above the middleware's block threshold, it does not create a hard barrier at any score. Legitimate users who happen to have a slightly suspicious email (maybe they use a privacy relay) still get access, just with a more conservative budget until they build trust.

Error Handling and Resilience

Production integrations need solid error handling. Here are the key patterns:

Circuit breaker: If BigShield returns errors on 3+ consecutive calls, disable validation for 60 seconds and fail open. Do not let a transient API issue block all signups.
Timeout: Set a 2-second timeout on validation calls. BigShield typically responds in under 200ms, so if a request takes longer than 2 seconds, something is wrong.
Idempotency: Cache validation results by email for at least an hour. There is no reason to re-validate the same email on every page load during a signup flow.
Graceful degradation: If you cannot validate, let the user in but apply the minimum token budget and flag for async review.
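
The circuit breaker pattern from the list above can be sketched as a small class. This is one possible implementation under stated assumptions, not something shipped with BigShield: CircuitBreaker and its method names are hypothetical, the defaults follow the 3-failure and 60-second figures above, and the clock is injectable so the behavior can be tested without waiting.

```typescript
// Hypothetical circuit breaker: opens after N consecutive failures,
// then allows calls again once the cooldown has elapsed.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold = 3,
    cooldownMs = 60000,
    now: () => number = () => Date.now()
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when a validation call should be attempted
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      // Cooldown is over: close the circuit and start counting again
      this.openedAt = null;
      this.consecutiveFailures = 0;
      return true;
    }
    return false;
  }

  // Any success resets the consecutive-failure count
  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  // Opens the circuit once the threshold is reached
  recordFailure(): void {
    this.consecutiveFailures++;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}
```

In the signup handler you would check canRequest() before calling validateEmail, call recordSuccess() or recordFailure() depending on the outcome, and take the fail-open path (create the account, flag it for review) whenever canRequest() returns false.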

Next Steps

That covers the main integration patterns for LLM pipelines. The key insight is that email validation should happen as early as possible in the pipeline, before you allocate any compute resources, and the results should be cached and used throughout the user lifecycle.

BigShield's API is designed for exactly this use case: sub-200ms response times, simple REST API, and a scoring model that works across industries. Get started with the free tier (1,000 validations/month) at bigshield.app and see how it fits into your stack.