Case Study · 01

AI Infrastructure for Content Auditing

Building content intelligence infrastructure at a global scale.

Role

Sole designer & builder, independent initiative, no engineering resources

Scope

Up to 12 specialized agents searching dozens of surfaces simultaneously

Impact

Weeks of cross-team work → 30 minutes for one person; skill published company-wide

The Problem

Content audits are notoriously slow, fragmented, and heavily reliant on institutional knowledge. Tracking down a specific term or concept across multiple product surfaces usually means weeks of coordination and chasing down the owner of each surface, and even then the final results are rarely complete.

The issue isn’t just the timeline. It’s the blind spots. Traditional manual audits only uncover what you already know to look for. If a surface is undocumented or a query isn’t formatted perfectly, it slips through the cracks. In regulated content work, a missed term is a compliance and legal liability. For our teams, this reality consistently led to late-stage project delays, unexpected post-launch content updates, and a constant worry that something critical had been missed.

It’s tempting to assume AI alone would close these blind spots. But AI tools have the same gaps as manual audits, just in a different form. A query that comes back empty reads as “nothing exists” rather than “try a different angle,” and a query that comes back with a handful of results reads as “that’s everything” rather than “that’s what surfaced first.” Without expert training, AI frequently presents incomplete results as complete, producing the same false confidence and the same persistent gaps as a manual audit, only faster.

Manual & AI-assisted audits continually miss content.

The breaking point arrived during a massive update to a complex privacy and data use initiative. The sheer volume of surfaces containing the concept made it obvious that our current approach was broken, unscalable, and risky. We needed a systematic fix.

My Role

I designed and built the solution from scratch as an independent initiative, operating without engineering resources or an official project mandate. Using Claude Code, I built and iterated on a multi-agent content auditing skill with built-in knowledge and persistence for uncovering terms and concepts across dozens of surfaces. I published the skill as an organization-wide installable tool.

One audit query
12 specialized agents, each fluent in one surface’s search logic
Consumer Product UIs · Advertiser Product UIs · Help Centers · Downloads · Educational Resources · Marketing Sites
dozens of surfaces, searched simultaneously
evaluation loop
One query in, every surface covered, with evaluation agents checking the work.

The Approach

The core technical challenge was that search logic isn’t uniform across product surfaces; querying one system the wrong way yields zero results. Because a single generalist tool would fail, I built a modular architecture of up to 12 specialized agents, each tailored to navigate and extract data from a specific ecosystem.
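As a rough illustration of that architecture, the sketch below registers one search function per surface and fans a single query out to all of them in parallel. It is not the skill's actual implementation: the surface names, the sample strings, and the per-surface quirks are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical in-memory "surfaces"; real ones would each sit behind their own search system.
HELP_CENTER = ["How we use data consent settings", "Reset your password"]
PRODUCT_UI = ["label.data_consent = Data consent", "label.save = Save"]


def search_help_center(term: str) -> list[str]:
    # Assumed quirk: this surface matches plain, case-insensitive phrases.
    return [doc for doc in HELP_CENTER if term.lower() in doc.lower()]


def search_product_ui(term: str) -> list[str]:
    # Assumed quirk: UI strings are keyed with underscores instead of spaces.
    key = term.lower().replace(" ", "_")
    return [s for s in PRODUCT_UI if key in s]


# One specialized agent per surface, looked up by name.
AGENTS: dict[str, Callable[[str], list[str]]] = {
    "help_center": search_help_center,
    "product_ui": search_product_ui,
}


def run_audit(term: str) -> dict[str, list[str]]:
    """Send one query to every surface agent at the same time."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(agent, term) for name, agent in AGENTS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Calling `run_audit("data consent")` returns one list of hits per surface, and adding a surface means adding one entry to `AGENTS`.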

Persistence mattered as much as the architecture. AI tools without the skill tended to treat a zero-result query as a dead end; the skill’s logic and evaluation layer instead instructed the system to treat it as a signal to reformulate and try again, using each agent’s surface-specific expertise rather than a generic retry.
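The reformulate-on-empty behavior can be sketched as a small loop. This is a minimal illustration under assumptions, not the skill's real logic: `search_with_reformulation`, the sample page paths, and the rewrite rules are all hypothetical.

```python
from typing import Callable, Iterable


def search_with_reformulation(
    search: Callable[[str], list[str]],
    query: str,
    reformulate: Callable[[str], Iterable[str]],
) -> tuple[list[str], list[str]]:
    """Try the query, then surface-specific rewrites, until something is found.

    Returns (hits, attempts) so that an empty result still records what was tried.
    """
    attempts: list[str] = []
    for candidate in [query, *reformulate(query)]:
        attempts.append(candidate)
        hits = search(candidate)
        if hits:
            return hits, attempts
    return [], attempts


# Hypothetical surface: a marketing site whose pages are found by URL slug.
PAGES = ["/privacy/data-consent-overview", "/pricing"]


def site_search(query: str) -> list[str]:
    return [page for page in PAGES if query in page]


def site_reformulate(query: str) -> Iterable[str]:
    # Assumed surface knowledge: identifiers use underscores or hyphens, never spaces.
    yield query.lower().replace(" ", "_")
    yield query.lower().replace(" ", "-")
```

Here `search_with_reformulation(site_search, "data consent", site_reformulate)` finds the page on the third attempt. Returning the list of attempts alongside the hits means an empty result reads as "these angles were tried" rather than "nothing exists."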

These agents searched dozens of surfaces simultaneously, including user-facing applications, advertiser portals, help centers, and marketing sites. Instead of relying on rigid, literal keyword matching, the system analyzed concepts in context, allowing it to flag variations and related terms that standard search queries missed entirely.
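To show the difference between literal keyword matching and matching on a concept, the sketch below expands one concept into related phrasings and tolerates different separators and casing. A hand-written synonym list stands in here for the contextual judgment the agents applied; the `CONCEPTS` map and its entries are hypothetical.

```python
import re

# Hypothetical concept map: one audited concept and the phrasings that count as a match.
CONCEPTS = {
    "data consent": ["data consent", "consent to data use", "data permission"],
}


def concept_pattern(concept: str) -> re.Pattern:
    """Build a case-insensitive pattern covering every variant of a concept."""
    variants = CONCEPTS.get(concept, [concept])
    # Allow spaces, hyphens, or underscores between the words of each variant.
    parts = [r"[\s_-]+".join(re.escape(word) for word in v.split()) for v in variants]
    return re.compile("|".join(parts), re.IGNORECASE)


def find_mentions(concept: str, texts: list[str]) -> list[str]:
    """Return the texts that mention the concept in any known form."""
    pattern = concept_pattern(concept)
    return [text for text in texts if pattern.search(text)]
```

A literal search for "data consent" would miss strings such as "Data-Consent" or "consent to data use", while `find_mentions` returns both.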

Audit run: “data consent”
12 agents · dozens of surfaces · related concepts included
Product UI · 47 instances
Help Center · 31
Ads surfaces · 19
Marketing · 12
⚑ 6 legacy terms flagged
⚑ 2 parallel workstreams
Representative mock of an audit run: inventory plus analysis, delivered in ~30 minutes.

To keep the system accurate, I built specialized evaluation agents into the pipeline. This created an internal quality-control loop that refined outputs automatically, rather than relying on manual post-audit cleanups. Before rolling it out broadly, I ran a pilot program with content and design peers, letting them stress-test the tool against surfaces they knew intimately to validate accuracy and build internal trust.
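One way to picture the quality-control loop is an evaluation pass that drops duplicates and false positives, then names the surfaces that still need another attempt. The checks below are simplified assumptions for illustration; the `Finding` type and the `evaluate` function are hypothetical, and the real evaluation agents may apply different criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    surface: str
    location: str
    snippet: str


def evaluate(
    findings: list[Finding], term: str, expected_surfaces: set[str]
) -> tuple[list[Finding], list[str]]:
    """Return (accepted findings, surfaces to re-run)."""
    accepted: list[Finding] = []
    seen: set[tuple[str, str]] = set()
    for finding in findings:
        key = (finding.surface, finding.location)
        if key in seen:
            continue  # duplicate of a finding already accepted
        if term.lower() not in finding.snippet.lower():
            continue  # false positive: the snippet does not contain the term
        seen.add(key)
        accepted.append(finding)
    covered = {finding.surface for finding in accepted}
    # Surfaces with no accepted findings go back to their agents for another attempt.
    return accepted, sorted(expected_surfaces - covered)
```

The second return value is what closes the loop: a surface with nothing accepted is sent back for another search instead of being reported as empty.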

Generic AI search
Attempt 1: 0 results
Gives up at the first dead end
Partial coverage, false confidence
Specialized agents
Attempt 1: 0 results
↻ Reformulate, using surface expertise
Attempt 2: 47 instances found
Persistence built in, full coverage
Off-the-shelf AI stops at the first empty result. The agents keep going until the answer is real.

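The final system delivers two core outputs: an inventory of every instance found on each surface, and an analysis that flags legacy terms and parallel workstreams touching the same content.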

The Outcome

A process that used to require weeks of cross-team coordination now takes a single person about 30 minutes. Publishing the tool as a company-wide skill made thorough, reliable content auditing available to content designers and other roles alike, and it spares every future audit the same weeks of coordination.

Before
Weeks, multiple people across teams
After
30 minutes, one person, complete results

For our privacy and compliance operations, the skill added a strong safety net to help ensure audits are complete from the start. It also gave teams visibility into legacy language, inconsistent terminology, and overlapping projects affecting the same content.

The skill’s flexibility was proven shortly after launch, when a legal team needed to find every remaining link to a deprecated site in order to mitigate an immediate compliance risk. The framework adapted easily to searching for URL patterns rather than just content, and I identified every instance across our surfaces in minutes. The new use case showed that the tool could serve as versatile infrastructure.
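Searching for URL patterns instead of content can be sketched with the standard library alone. This is an illustrative example rather than the skill's code: `old.example.com` is a placeholder for the deprecated site, and `find_deprecated_links` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace, quotes, angle brackets, and closing parentheses.
URL_RE = re.compile(r"https?://[^\s\"'<>)]+")


def find_deprecated_links(text: str, deprecated_host: str) -> list[str]:
    """Return every URL in the text that points at the deprecated host or a subdomain of it."""
    hits = []
    for raw in URL_RE.findall(text):
        url = raw.rstrip(".,;:")  # drop sentence punctuation captured after the URL
        host = urlparse(url).hostname or ""
        if host == deprecated_host or host.endswith("." + deprecated_host):
            hits.append(url)
    return hits
```

Comparing parsed hostnames rather than raw substrings keeps look-alike domains, such as a host that merely ends in the same letters, out of the results.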