Notes on defensibility in the age of AI

Much of the advice given by founders and VCs on how to build a moat has aged incredibly quickly since the release of ChatGPT (built on GPT-3.5) in November 2022. Silicon Valley has lectured us about "building in bits" since the dawn of Web 2.0 - and almost overnight the consensus has flipped. You either integrate hardware into your value chain - or get ousted by LLMs that help others build software faster and cheaper.

I've been spending significant time thinking about this. While hardware increases defensibility, it clearly does so at the expense of distribution power, ability to iterate, and capital efficiency. Those are exactly the properties that make pure software startups so attractive to investors and builders.

There are other strategies for building a moat outside of hardware. Network-effect businesses like Facebook are relatively insulated from immediate AI disruption because the value comes from the billions of people using the product - not just the software.

Aside from network effects, most software company moats come from one or more of the following:

Switching costs typically increase the more tightly coupled a company's technology or infrastructure is with a particular vendor. Once your company has logged millions of messages in Slack, it's difficult and often not worth the effort to migrate for small feature improvements.
Examples: AWS, Slack, Shopify

Patents are for boomers, right? They work! Big tech companies hold thousands of patents that protect various aspects of their core technology: PageRank in the case of Google's search business, and touchscreen heuristics in the case of Apple's iPhone and iPad products.
Examples: iPhone touch screen, PageRank algorithm

Brand represents reputation and trust. At the time of writing, Apple recently announced its Vision Pro product to huge enthusiasm from audiences previously unexcited about Meta's Oculus devices - even though the two share many of the same features.
Examples: Google, Apple

Technology moats are exactly what they sound like. These software companies have large and extremely complex tech stacks that would be difficult for competitors to replicate even with significant funding.
Examples: Cruise, OpenAI

Vertical integration is a type of moat that exists when a company controls multiple stages of a product's value chain or production. I like to think of the "base" of this value chain (in the case of software companies) as the "atom" layer - the part that interfaces with humans. In this sense, Android phones are the base layer: they ship with Google Chrome preinstalled, and Chrome defaults to Google's search engine. This is why Google can afford to release and maintain Chrome for free. It's part of a broader strategy to protect search.
Examples: Google Search, Salesforce

But in an era of rapidly improving AI code generation, where future consumers can build their own hyper-customized solutions for a fraction of the engineering effort, it's important to think about defensibility a little bit differently.

The model architecture itself (assuming it's not an industry-altering discovery) will not be your moat. Fine-tuning existing models with publicly available data - no matter how uniquely it's segmented or how hard it was to collect - will also not be your moat.

But designing your product such that collecting private data is intrinsic to its core value can be a long-term strategy. This is traditionally called a data moat (or data network effects) - and it's particularly useful in the age of AI given the importance of high-quality data in training runs. Let's run through a few examples, both real and hypothetical:

Rewind.AI is a new AI startup that records your screen and makes it semantically searchable via the GPT-4 API. The model they're using (GPT-4) isn't their own - but they're pairing it with data no one else can access. Even better, the longer a customer uses Rewind, the higher their switching costs - since Rewind continues to build its treasure trove of private data just from daily use.

CheggMate is a new AI homework assistant being developed by the popular education platform Chegg. Its private library of billions of pieces of homework content is used to fine-tune GPT-4 and other models. Competitors without access to this library would likely suffer from data quality issues - impairing the performance of their models.

Tesla's autonomous driving program has substantial advantages over competitors like Cruise due to the millions of hours of training data Tesla's fleet generates daily. That fleet regularly encounters new, real-world scenarios and edge cases that are hard to simulate.

Current AI model architectures are limited by their dependence on enormous quantities of high-quality data. Unlike humans, who develop "real world models" from everyday experiences, they can't yet learn to drive a car in 20 hours or pick up doing the dishes. This limitation is what makes designing around private data collection so important for your startup's defensibility.

I'll continue to update this blog post as novel model architectures emerge that may place fewer constraints on the volume of high-quality data required.