From Zero to 33,115 Zones: A Scientist's Look at 80 Days of TrashAlert Coverage Data

By Hudson Taylor

Eighty days ago, TrashAlert had zero zones of coverage data. Today it has 33,115. This isn't a story about AI disrupting an industry or moving fast and breaking things. It's the observation log of a scientist who built a dataset the way you build a lab assay: incrementally, with validation checkpoints, and an honest record of what worked and what didn't.

I started this experiment in late March 2026 with a simple question: Can I build a real-estate coverage database from public sources that's accurate enough to be useful? That question came from a very specific pain point. I own rental property in El Centro, California. Property managers spend enormous amounts of time manually checking whether their properties fall within service areas—trash collection zones, utility boundaries, permit districts. I wanted to automate that.

The experiment began with 0 zones. No data. Just an architecture and a lot of hypotheses about what public GIS sources would actually give us.

The Method

In lab work, you start with a control. You run n=1. You measure. You iterate based on what you find, not what you expected to find.

Same approach here.

I built extraction pipelines for three data sources: Census Bureau boundary files, OpenStreetMap, and county assessor records. Each pipeline was a small experiment. Does Census have the granularity we need? (Mostly yes, some gaps.) Does OSM contain service boundaries? (Sometimes, but incomplete.) Do county records export cleanly? (Rarely.)

By mid-April, I'd extracted initial datasets from all three sources. The numbers were growing, but single sources had limits. I'd hit a clear ceiling on what Census, OSM, or assessor records alone could deliver.

That's when I adjusted the method. Instead of waiting for perfect sources, I started combining them. Where Census data existed, I used it. Where it didn't, I filled gaps with OSM. Where both failed, I fell back on assessor records. This is a pattern I learned in 15 years of LIMS work: when you have incomplete data streams, you layer them and let redundancy do the validation work for you.
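
A minimal sketch of that layering, in Python. This is an illustration under stated assumptions, not the production pipeline: the function names, the dictionary stand-ins for the three sources, and the zone IDs are all hypothetical. It shows the priority order (Census, then OSM, then assessor) and one way redundancy can act as a validation signal.

```python
from typing import Callable, Optional

# Hypothetical shape: a source lookup takes an address, returns a zone ID or None.
SourceLookup = Callable[[str], Optional[str]]


def resolve_zone(address: str, sources: list[tuple[str, SourceLookup]]):
    """Return (source_name, zone_id) from the first source that has an answer."""
    for name, lookup in sources:
        zone = lookup(address)
        if zone is not None:
            return name, zone
    return None  # no source covers this address


def cross_check(address: str, sources: list[tuple[str, SourceLookup]]):
    """Ask every source; report the answers found and whether they agree."""
    found = {}
    for name, lookup in sources:
        zone = lookup(address)
        if zone is not None:
            found[name] = zone
    agrees = len(set(found.values())) <= 1
    return found, agrees


# Toy in-memory stand-ins for the three pipelines (illustrative data only).
census = {"100 Main St": "Z-1"}
osm = {"100 Main St": "Z-1", "200 Oak Ave": "Z-2"}
assessor = {"300 Pine Rd": "Z-3"}

# Order encodes priority: Census first, OSM second, assessor records last.
SOURCES = [
    ("census", census.get),
    ("osm", osm.get),
    ("assessor", assessor.get),
]
```

With this toy data, resolve_zone("200 Oak Ave", SOURCES) falls through Census and is answered by OSM, while cross_check("100 Main St", SOURCES) reports that Census and OSM agree. A disagreement would be a candidate for manual review.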

By early May, coverage was expanding across multiple cities.

By May 22, 33,115 zones across 21 cities.

The growth looked exponential on the chart, but it wasn't magic. It was method adjustment plus more compute cycles.

What the Data Showed

One thing I noticed: coverage density varies dramatically across jurisdictions. Some cities, like those with well-maintained county GIS systems, have robust public boundary data. Others bury their zone information in PDFs or legacy systems, or don't publish it at all.

Why the variance? Not fragmentation. Data infrastructure. The jurisdictions with clean, queryable public GIS have richer zone coverage. The ones with scattered or proprietary systems... less so.

That tells me something important about the business problem I'm trying to solve. The real friction isn't "cities don't have this data." It's "cities have this data in 47 different formats, and you have to know the specific quirks of each one to extract it."

This is a domain depth problem, not a data-availability problem. Which means the moat isn't "we have the data." It's "we've done the boring work of normalizing it across jurisdictions."
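
To make "normalizing it across jurisdictions" concrete, here is a small Python sketch of one common way to do it: a per-jurisdiction adapter that maps each raw export into a shared record. Everything here is an assumption for illustration. The Zone fields, the jurisdiction keys, and the raw field names (ZONE_NO, SVC_TYPE, properties) are invented and do not describe any real city's export.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Zone:
    """Shared schema that every jurisdiction is normalized into (hypothetical)."""
    jurisdiction: str
    zone_id: str
    service: str


def from_city_a(raw: dict) -> Zone:
    # Hypothetical flat export with upper-case keys and padded strings.
    return Zone("city_a", str(raw["ZONE_NO"]), raw["SVC_TYPE"].strip().lower())


def from_city_b(raw: dict) -> Zone:
    # Hypothetical GeoJSON-style feature with nested properties.
    props = raw["properties"]
    return Zone("city_b", str(props["id"]), props["service"].strip().lower())


# One adapter per jurisdiction; the per-city quirks live here and nowhere else.
ADAPTERS: dict[str, Callable[[dict], Zone]] = {
    "city_a": from_city_a,
    "city_b": from_city_b,
}


def normalize(jurisdiction: str, raw: dict) -> Zone:
    """Dispatch a raw record to its jurisdiction's adapter."""
    return ADAPTERS[jurisdiction](raw)
```

The design choice is that downstream code only ever sees Zone, so adding a new city means writing one more adapter rather than touching the lookup logic.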

The Limit Test

At 80 days, I stopped and asked: Is this data good enough to use?

Not "perfect." Not "complete." Usable?

I tested it against properties I know. My rental in El Centro. Friends' homes. A random sample of addresses from the dataset. Every zone result was manually spot-checked against the jurisdiction's official GIS data. The ones that diverged fell into two clear categories: (1) boundary edge cases where a property sat exactly on a zone line, and (2) addresses that simply don't exist in the source data.
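
A spot-check like this can be tallied with a few lines of Python. The sketch below is hypothetical: the SpotCheck fields, the 5-meter edge tolerance, and the sample records are assumptions chosen for illustration, not the checks or results from this experiment. It sorts each comparison into a match, one of the two divergence categories described above, or an "unexplained" bucket for anything else.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

EDGE_TOLERANCE_M = 5.0  # assumed: how close to a zone line counts as "on the line"


@dataclass
class SpotCheck:
    address: str
    predicted_zone: Optional[str]  # None means the address is absent from source data
    official_zone: str             # value read from the jurisdiction's official GIS
    boundary_distance_m: float     # distance from the property to the nearest zone line


def classify(check: SpotCheck) -> str:
    """Label one spot-check result."""
    if check.predicted_zone is None:
        return "missing_from_source"
    if check.predicted_zone == check.official_zone:
        return "match"
    if check.boundary_distance_m <= EDGE_TOLERANCE_M:
        return "boundary_edge_case"
    return "unexplained"  # anything here deserves a closer look


def summarize(checks: list[SpotCheck]) -> Counter:
    """Count how many checks fall into each label."""
    return Counter(classify(c) for c in checks)


# Illustrative records only; not real validation results.
sample = [
    SpotCheck("100 Main St", "Z-1", "Z-1", 40.0),
    SpotCheck("200 Oak Ave", "Z-2", "Z-3", 1.5),
    SpotCheck("300 Pine Rd", None, "Z-4", 60.0),
]
```

Calling summarize(sample) gives one count per label, which makes it easy to see whether divergences stay confined to the expected categories as the sample grows.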

That's not "we've solved the problem." That's "we've reached the point where the data is useful enough to validate against real user feedback."

So I opened the API. Real property managers started using it. That's when you find out if your lab-bench assay actually works in the field.

What This Taught Me

Building this dataset felt like growing a bacterial culture. You don't decide on day 1 how big it will be. You create the conditions—growth medium, temperature, measurement frequency—and you watch what happens. You adjust based on what you observe, not on your initial projection.

I started with a hypothesis that public data could be layered into something useful. Eighty days later, I had a dataset that was useful enough for its original purpose. Not because I was brilliant. Because I was methodical.

The other thing: there's no shortcut past the domain work. I could have hired someone to "build a real estate database." Instead, I spent 80 days learning which counties publish clean GIS files, which ones hide them in PDFs, which ones don't publish anything at all. That knowledge is now baked into the system. That knowledge is also what makes this hard to replicate.

This is the part that doesn't get written about in startup playbooks. The boring, detailed, domain-specific work that takes time and can't be outsourced to junior developers.

80 Days Forward

The experiment isn't over. I'm at 33,115 zones across 21 cities. There are more than 3,000 counties in the US alone, and many more cities. There are international jurisdictions. Coverage could theoretically go to 100,000+ zones.

But I'm not going to force it. I'm going to watch what users do with the current dataset. I'm going to measure where they're asking for coverage and where they're not. I'm going to let the real data—not my hypothesis—tell me where to focus next.
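
One simple way to measure that demand, sketched in Python: count the lookups that land outside current coverage and rank cities by how often they are requested. The function name and the idea that each request carries a city name are assumptions; the city names other than El Centro are placeholders.

```python
from collections import Counter


def uncovered_demand(requested_cities: list[str], covered: set[str]) -> list[tuple[str, int]]:
    """Rank cities users asked about that are not yet covered, most requested first."""
    misses = Counter(city for city in requested_cities if city not in covered)
    return misses.most_common()


# Illustrative request log; not real usage data.
requests = ["El Centro", "Yuma", "Yuma", "Brawley"]
ranking = uncovered_demand(requests, covered={"El Centro"})
```

Here ranking lists Yuma ahead of Brawley, because it was requested twice and neither city is covered yet.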

That's how science works. Build the apparatus. Run the experiment. Measure what actually happens. Iterate based on evidence, not expectation.

Eighty days in, TrashAlert has gone from an idea to a dataset that property managers are actually using. Not because it's perfect. Because it's useful and I haven't stopped refining it.

The experiment continues.
