No means no, even for Perplexity

Perplexity’s response to the Cloudflare report about their crawling practices is terrible, and honestly, I’m not sure what’s more concerning: that they completely missed the point about consent, or that they seem genuinely confused about why people are upset.

Let me break this down in plain terms. When a website publishes a robots.txt file, it’s basically putting a “No Solicitors” sign on its digital front door. It doesn’t matter if you’re selling Girl Scout cookies, collecting for charity, or claiming you’re there to help the homeowner. The sign says no, and that should mean no. Period.
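For anyone who hasn’t looked at one, here’s roughly what reading that sign involves. This is only a sketch using Python’s standard-library `urllib.robotparser`; the robots.txt contents, the `ExampleAIBot` and `OtherBot` names, and the `is_allowed` helper are all made up for illustration and have nothing to do with how Perplexity or anyone else actually builds their fetchers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is shut out entirely,
# everyone else is only asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleAIBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/private/notes"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

The check is a few lines and costs next to nothing. Whether a fetcher honors the answer is a choice, not a technical hurdle.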

Perplexity’s entire defense boils down to “we’re different because we’re helping users.” They keep insisting that because their AI only fetches content when a real person asks a question, somehow that makes ignoring robots.txt okay. They even compare themselves to Google’s user-triggered fetchers, as if that settles the matter in their favor.

Here’s the thing though: intent doesn’t override consent. If I tell solicitors not to knock on my door, I don’t care if they’re selling something I might actually want, or if they only knock when someone specifically asked them to check my neighborhood. I said no. That should always be enough.

The restaurant analogy they use actually proves my point perfectly. When they say their AI goes to websites to find “the latest reviews for that new restaurant,” they’re essentially walking into someone’s digital restaurant, reading their menu, taking notes, and walking out without asking permission or paying for anything. The fact that they’re doing it for a user doesn’t change the fundamental transaction.

What really gets me is how they frame this as some noble fight for the health of the open web. They warn about a two-tiered internet where access depends on having the right tools blessed by infrastructure controllers, but they’ve got it all backwards. Respecting robots.txt isn’t creating barriers; it’s respecting the basic property rights that make the web work in the first place.

Website owners aren’t being unreasonable gatekeepers when they use robots.txt. They’re making informed decisions about how their content gets used. Maybe they’re concerned about server load. Maybe they have licensing agreements that restrict automated access. Maybe they just don’t want their content feeding someone else’s AI without their permission. These are all valid reasons, and they don’t need to justify them to anyone.

The most frustrating part of Perplexity’s response is how they keep talking about user-driven access as if it’s fundamentally different from traditional crawling. They insist they don’t store information or use it for training, just to answer immediate questions. But that’s still using someone’s content without permission. The fact that you delete it afterward doesn’t make the initial taking okay.

Think about it this way: if someone broke into your house, read your diary to answer their friend’s question about you, then left without taking anything physical, would you be okay with that because it was “user-driven” and they didn’t keep copies? Of course not. The violation is in the access without permission, not necessarily in what happens afterward.

Then they have the nerve to attack Cloudflare’s competence. Maybe Cloudflare did get some technical details wrong and misattribute some traffic. But instead of addressing the core ethical issue, Perplexity spent most of their response playing gotcha with traffic attribution and questioning Cloudflare’s expertise. That looks like pretty obvious deflection.

The technical arguments about whether specific requests came from Perplexity directly or from third-party services they use miss the forest for the trees. If you’re using Browserbase to access websites that have asked not to be crawled, you’re still responsible for that access. You can’t outsource your way out of ethical obligations.

I find it really troubling how Perplexity talks about websites that deny them access as somehow harming users. They paint a picture of people losing access to valuable information because their chosen tools got blocked. But this frames the entire debate wrong. Users aren’t entitled to have AI tools access every website on their behalf. Website owners get to decide how their content is used, and users need to respect those decisions.

I keep coming back to the consent issue because it’s really that simple. When someone says no, you don’t get to redefine their “no” as unreasonable, outdated, or harmful to the greater good. You don’t get to decide that your use case is special enough to override their clearly stated preferences.

The web works because of mutual respect and agreed-upon standards. Robots.txt might be an old standard, but it’s still the primary way website owners communicate their preferences about automated access. When companies like Perplexity decide those preferences don’t apply to them, they’re not advancing the open web. They’re undermining the trust that makes it possible in the first place.

If Perplexity really believes their approach benefits users and website owners, they should be working to establish new standards that balance everyone’s interests. They should be advocating for better, more fine-grained ways to signal consent for AI access. Instead, they’re arguing that consent is irrelevant as long as their intentions are good.

That’s not how consent works in any other context, and it shouldn’t be how it works on the web either. No means no, even when it’s written in a robots.txt file, and even when you think you know better. Until Perplexity understands that, there’s not much more to say.