Schema-Constrained AI Generation

The hardest problem in AI-powered content generation isn't making the output creative. It's making the output valid. When you ask an LLM to generate a WordPress page, it needs to produce JSON that WordPress can actually parse, render, and save. Here's how we solved this with schema-constrained generation.

The Hallucination Problem

Large language models hallucinate. They invent API endpoints that don't exist, reference functions with wrong signatures, and generate data structures with invalid fields. For code generation, this is annoying. For page generation, it's a showstopper.

WordPress blocks have strict schemas. A heading block needs a level attribute between 1 and 6. A columns block needs an innerBlocks array where each child has a specific structure. A button block needs a url, text, and optional className from a valid set.

If the AI generates a heading with level: 7, the page breaks. If it invents a block type called core/fancy-card, WordPress has no registered block to render and the content fails silently. The AI doesn't catch these failures, because it has no feedback loop to verify its own output.

Structured Output with Gemini

Our approach uses Gemini's structured output mode, which constrains the model's response to match a predefined JSON schema. This isn't post-processing or validation: the constraint is applied during generation, so the model cannot emit tokens that break the structure the schema describes. Rules the schema can't express still have to be checked after generation.

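// Illustrative subset only; the full list has 59 entries.
const VALID_BLOCK_TYPES = [
  "core/heading", "core/paragraph", "core/image",
  "core/columns", "core/column", "core/button"
];

const schema = {
  type: "object",
  properties: {
    blocks: {
      type: "array",
      items: {
        type: "object",
        properties: {
          // Only names from the allow-list can be generated.
          blockName: { type: "string", enum: VALID_BLOCK_TYPES },
          attrs: { /* per-block schema */ },
          innerBlocks: { /* recursive */ }
        },
        required: ["blockName"]
      }
    }
  }
};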
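
As a rough sketch of how a schema like the one above might be attached to a request: Gemini's REST API takes a generationConfig object with responseMimeType and responseSchema fields. The helper name buildRequest, the prompt, and the trimmed demoSchema are assumptions made for illustration, not Gallop's actual code, and the supported schema keywords and type casing should be checked against the current Gemini documentation.

```javascript
// Hypothetical helper: builds a generateContent request body that asks
// Gemini for JSON constrained by `schema`. It only builds the payload;
// sending it (endpoint, model name, API key) is deliberately left out.
function buildRequest(prompt, schema) {
  return {
    contents: [{ role: "user", parts: [{ text: prompt }] }],
    generationConfig: {
      responseMimeType: "application/json", // ask for JSON, not prose
      responseSchema: schema,               // constrain the JSON shape
    },
  };
}

// Trimmed stand-in for the page schema (two block types only).
const demoSchema = {
  type: "object",
  properties: {
    blocks: {
      type: "array",
      items: {
        type: "object",
        properties: {
          blockName: { type: "string", enum: ["core/heading", "core/paragraph"] },
        },
        required: ["blockName"],
      },
    },
  },
};

const request = buildRequest("Generate a landing page for a bakery.", demoSchema);
console.log(JSON.stringify(request.generationConfig, null, 2));
```

Keeping the payload builder separate from the network call makes the schema wiring easy to unit-test without touching the API.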

59 element types

Our schema defines 59 valid block types, each with its own attribute constraints. Headings take a level from 1 to 6. Images require url and alt. Columns must contain column children. Every constraint is encoded in the schema, not in the prompt.
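
To make "encoded in the schema" concrete, here is a minimal sketch of what two per-block attribute schemas could look like, along with a tiny checker for them. The ATTR_SCHEMAS map, the content field, and the checkAttrs function are hypothetical; the real registry is not shown in this post, and the checker handles only the few keywords used here.

```javascript
// Hypothetical per-block attribute schemas; names and fields are
// illustrative, not the real 59-entry registry.
const ATTR_SCHEMAS = {
  "core/heading": {
    type: "object",
    properties: {
      level: { type: "integer", minimum: 1, maximum: 6 },
      content: { type: "string" },
    },
    required: ["level", "content"],
  },
  "core/image": {
    type: "object",
    properties: { url: { type: "string" }, alt: { type: "string" } },
    required: ["url", "alt"],
  },
};

// Minimal checker covering only the keywords used above
// (required, type, minimum, maximum). Not a full JSON Schema validator.
function checkAttrs(blockName, attrs) {
  const schema = ATTR_SCHEMAS[blockName];
  if (!schema) return [`unknown block type: ${blockName}`];
  const errors = [];
  for (const key of schema.required) {
    if (!(key in attrs)) errors.push(`${blockName}: missing "${key}"`);
  }
  for (const [key, rule] of Object.entries(schema.properties)) {
    if (!(key in attrs)) continue;
    const value = attrs[key];
    if (rule.type === "integer" && !Number.isInteger(value)) {
      errors.push(`${blockName}: "${key}" must be an integer`);
    } else if (rule.type === "string" && typeof value !== "string") {
      errors.push(`${blockName}: "${key}" must be a string`);
    } else if (rule.type === "integer") {
      if (value < rule.minimum || value > rule.maximum) {
        errors.push(`${blockName}: "${key}" must be ${rule.minimum}..${rule.maximum}`);
      }
    }
  }
  return errors;
}

console.log(checkAttrs("core/heading", { level: 7, content: "Hi" }));
// → [ 'core/heading: "level" must be 1..6' ]
console.log(checkAttrs("core/image", { url: "/loaf.jpg" }));
// → [ 'core/image: missing "alt"' ]
```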

32 CSS class presets

The className attribute on each block is restricted to a predefined set of Stride CSS classes. The AI can't invent classes. It can only select from the validated set. This guarantees every class it applies actually exists in the stylesheet.
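
One way to keep the schema and the stylesheet in step is to generate the className enum from a single class registry. The sketch below assumes that approach; the class names, STRIDE_CLASSES, classNameSchema, and isKnownClass are all made up for illustration, since the 32 real presets are not listed here.

```javascript
// Hypothetical subset of the class registry; the real preset names are
// not given in this post.
const STRIDE_CLASSES = ["stride-hero", "stride-card", "stride-cta"];

// Build the schema fragment for className from the registry, so the
// schema's allowed values always come from the same list.
function classNameSchema(registry) {
  return { type: "string", enum: [...registry] };
}

// The same registry can back a check on already-generated output.
function isKnownClass(className, registry = STRIDE_CLASSES) {
  return registry.includes(className);
}

console.log(classNameSchema(STRIDE_CLASSES));
console.log(isKnownClass("stride-card"));  // → true
console.log(isKnownClass("stride-glow"));  // → false
```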

The Validation Pipeline

Even with structured output, we run a post-generation validation pipeline:

  1. Schema validation: confirms the JSON matches the expected structure
  2. Reference integrity: checks that all internal links point to valid blocks
  3. CSS class validation: verifies every className against the Stride class registry
  4. Hierarchy validation: ensures heading levels don't skip (no jump from h1 straight to h4)
  5. Accessibility audit: checks for alt text, link text, and semantic structure
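
The steps above can be sketched as a list of independent checks that each return error messages. This is a simplified illustration, not the production pipeline: the registries, check names, and validatePage are hypothetical, the accessibility check covers only image alt text, and reference integrity is omitted because the post doesn't describe how internal links are represented.

```javascript
// Hypothetical registries; the real lists are not given in the post.
const KNOWN_BLOCKS = new Set(["core/heading", "core/paragraph", "core/image"]);
const KNOWN_CLASSES = new Set(["stride-hero", "stride-card"]);

// Depth-first walk over blocks and their innerBlocks.
function* walk(blocks) {
  for (const block of blocks) {
    yield block;
    yield* walk(block.innerBlocks || []);
  }
}

// Each check takes the page's block list and returns error strings.
const checks = {
  schema: (blocks) =>
    [...walk(blocks)]
      .filter((b) => !KNOWN_BLOCKS.has(b.blockName))
      .map((b) => `unknown block type: ${b.blockName}`),

  cssClasses: (blocks) =>
    [...walk(blocks)]
      .filter((b) => b.attrs?.className && !KNOWN_CLASSES.has(b.attrs.className))
      .map((b) => `unknown class: ${b.attrs.className}`),

  hierarchy: (blocks) => {
    const errors = [];
    let previous = 0; // 0 means no heading seen yet
    for (const b of walk(blocks)) {
      if (b.blockName !== "core/heading") continue;
      const level = b.attrs?.level;
      if (!Number.isInteger(level)) continue; // left to the schema check
      // Headings may go one level deeper at a time; going back up is fine.
      if (previous !== 0 && level > previous + 1) {
        errors.push(`heading jumps from h${previous} to h${level}`);
      }
      previous = level;
    }
    return errors;
  },

  accessibility: (blocks) =>
    [...walk(blocks)]
      .filter((b) => b.blockName === "core/image" && !b.attrs?.alt)
      .map(() => "image is missing alt text"),
};

// Run every check and collect failures by check name.
function validatePage(page) {
  const failures = {};
  for (const [name, check] of Object.entries(checks)) {
    const errors = check(page.blocks);
    if (errors.length > 0) failures[name] = errors;
  }
  return { valid: Object.keys(failures).length === 0, failures };
}

const samplePage = {
  blocks: [
    { blockName: "core/heading", attrs: { level: 1, content: "Menu" } },
    { blockName: "core/heading", attrs: { level: 4, content: "Bread" } },
    { blockName: "core/image", attrs: { url: "/loaf.jpg", className: "stride-glow" } },
  ],
};
console.log(validatePage(samplePage));
```

Running every check rather than stopping at the first failure gives a complete error list for a single page.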

Pages that fail any check are regenerated with corrective context. In practice, the structured output mode eliminates 98% of schema violations at generation time, and the pipeline catches the remaining edge cases.
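
Regeneration with corrective context might look roughly like the loop below. It is a sketch under assumptions: generateWithRetries, the retry limit of three, and the wording of the corrective note are invented for illustration, and the model call is replaced by a stub so the example runs on its own.

```javascript
// `generate` and `validate` are injected so the loop stays independent of
// any particular model client or rule set. Both are hypothetical here.
async function generateWithRetries(prompt, generate, validate, maxAttempts = 3) {
  let context = ""; // corrective notes appended to the prompt on retries
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const page = await generate(prompt + context);
    const errors = validate(page);
    if (errors.length === 0) return { page, attempts: attempt };
    context =
      "\n\nThe previous attempt failed validation. Fix these issues:\n" +
      errors.map((e) => `- ${e}`).join("\n");
  }
  throw new Error(`no valid page after ${maxAttempts} attempts`);
}

// Stub model for demonstration: returns an invalid heading first, then
// a valid one once the prompt carries corrective context.
const fakeGenerate = async (fullPrompt) => ({
  blocks: [{
    blockName: "core/heading",
    attrs: { level: fullPrompt.includes("failed validation") ? 2 : 7 },
  }],
});

// Stub validator: flags heading levels outside 1..6.
const levelCheck = (page) =>
  page.blocks
    .filter((b) => b.attrs.level < 1 || b.attrs.level > 6)
    .map((b) => `heading level ${b.attrs.level} is out of range 1..6`);

generateWithRetries("Build an About page.", fakeGenerate, levelCheck)
  .then(({ attempts }) => console.log(`valid after ${attempts} attempt(s)`));
// prints "valid after 2 attempt(s)"
```

Capping the attempts keeps a stubbornly invalid page from looping forever; what happens after the cap is reached is a product decision this sketch leaves open.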

Multi-Language Considerations

Schema constraints become even more important for multi-language generation. When generating a page in Arabic, the AI needs to produce valid RTL markup. In Japanese, it needs to handle line-breaking correctly. The schema encodes these language-specific constraints so the model can't produce structurally invalid content regardless of the target language.
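
As one possible illustration of a language-specific constraint, text direction could be pinned in the schema per target locale. Everything in this sketch is an assumption: the dir attribute, the short list of right-to-left languages, and the textDirection and withDirection helpers are hypothetical and are not taken from Gallop's schema.

```javascript
// Illustrative list of right-to-left language codes, not exhaustive.
const RTL_LANGUAGES = new Set(["ar", "he", "fa", "ur"]);

// Map a locale such as "ar-EG" to a text direction.
function textDirection(locale) {
  const language = locale.toLowerCase().split(/[-_]/)[0];
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

// Hypothetical: add a required `dir` attribute with a single allowed
// value, so the wrong direction is not a valid output for this locale.
function withDirection(attrSchema, locale) {
  return {
    ...attrSchema,
    properties: {
      ...attrSchema.properties,
      dir: { type: "string", enum: [textDirection(locale)] },
    },
    required: [...(attrSchema.required || []), "dir"],
  };
}

const paragraphAttrs = {
  type: "object",
  properties: { content: { type: "string" } },
  required: ["content"],
};
console.log(withDirection(paragraphAttrs, "ar-EG").properties.dir);
// → { type: 'string', enum: [ 'rtl' ] }
```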

The key insight: AI-generated content is only useful if it's guaranteed valid. Schema constraints move validation from hope to engineering.

Results

With this approach, Gallop produces valid, publish-ready WordPress pages on the first attempt in 99.6% of cases. The remaining 0.4% are caught by the validation pipeline and auto-corrected. Zero manual intervention required.


