How a new dev tool gets cited by AI when the community is fading
The advice to seed Reddit is built on a developer community that is both smaller than the charts suggest and shrinking in real time. Reddit is about 2.5% of what LLMs cite, and its pipe to the models is narrowing. A brand-new tool wins by writing for the agent, not the forum.

A founder asked me last week how his three-month-old startup could show up in ChatGPT and Claude the way the incumbents do. He had a product, a website, a GitHub repo, and one quarter to make something happen. The honest answer has two halves, and the second half has changed since the last time I wrote about this.
The first half: he can't win pre-training. When a model recommends a Postgres backend with built-in auth and vector search, it learned that association from years of developer-written artifacts that existed before the training cut-off. Show HN posts, Stack Overflow answers, GitHub issues that became reference threads, hundreds of tutorials on personal blogs. A company launching today can't reverse-engineer a decade of public goodwill in six months. Pre-training updates on 12-to-18-month cycles, and the corpus for the next one closed months ago.
The second half is the part that's shifted. The standard advice for the retrieval game, meaning the live web content LLMs pull at query time, is "seed Reddit, get on dev.to, collect a few G2 reviews." I've given a version of that advice myself. It rests on a developer community that is both smaller in the citation data than the charts suggest and shrinking in real time. For a brand-new tool, betting your visibility on that community is betting on the wrong horse.
This post is the corrected playbook.
The Reddit number everyone quotes is the wrong number
You've seen the stat: Reddit shows up in 40% of LLM answers, ahead of Wikipedia, YouTube, and LinkedIn. I quoted it myself in an earlier post. It's misleading, and it's worth seeing why.
That figure comes from a pie chart of the top ten cited domains. Inside that top ten, Reddit looks enormous. But the top ten is not the whole pie. Across the full set of citations, Reddit is about 2.5% and Wikipedia about half a percent. Roughly 95% of citations go to domains outside the top ten entirely. Reddit is the single most-cited domain, which is real and worth something. A strategy built only on Reddit still ignores more than 95% of what models actually cite.
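The arithmetic behind that gap fits in a few lines. The two inputs are the figures quoted above, treated as approximate. The assumption here is mine, not something the chart states: that the 40% figure is Reddit's share of top-ten citations, which lets you back out how big the whole top ten can be.

```python
# Figures quoted in the text; both are approximate.
reddit_share_of_top_ten = 0.40  # assumed reading: Reddit's slice of the top-ten chart
reddit_share_of_all = 0.025     # Reddit's slice of all citations

# If Reddit is 40% of the top ten but only 2.5% of everything,
# the whole top ten can only be this fraction of all citations.
implied_top_ten_share = reddit_share_of_all / reddit_share_of_top_ten
outside_top_ten = 1 - implied_top_ten_share

print(f"Top ten as a share of all citations: {implied_top_ten_share:.1%}")
print(f"Citations outside the top ten: {outside_top_ten:.1%}")
```

Under that reading the top ten works out to about 6% of all citations, which is in the same range as the roughly 95% that sits outside it.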
Then there's the pipe between Reddit and the models, which is narrowing. Anthropic has reportedly lost access to Reddit after a scraping dispute, and Reddit's own robots.txt now tells crawlers to stay out. I wrote a few days ago about how Reddit is walling off the last honest channel. The forum is as honest as ever. What changed is access. Reddit's data-licensing revenue is a rounding error next to its ad business, so it gives up little by shutting unlicensed crawlers out, and it would rather sell that data than let it get scraped for free. The result is a thinner connection between the forum and the models.
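If you want to see what "tells crawlers to stay out" means mechanically, here is a minimal sketch using Python's standard-library robots.txt parser. The rules and the bot name are illustrative assumptions, not a copy of Reddit's actual file, and robots.txt is advisory: it only stops crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only; this is NOT a copy of any site's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# "ExampleBot" and the URL are hypothetical placeholders.
allowed = can_crawl(SAMPLE_ROBOTS_TXT, "ExampleBot", "https://forum.example/r/devtools/")
print("Allowed to crawl:", allowed)
```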
The community layer that recommended tools is thinning out
Reddit is one symptom. Stack Overflow is effectively dead. Independent YouTube tutorials are losing to AI-generated content and to viewers who'd rather ask a model than sit through a fifteen-minute walkthrough. Tim Ferriss, looking at his own book sales down 46% in 2025, made the case that prescriptive nonfiction is next, that anything whose value is transferring instructions from one mind to another now competes with a chatbot that does it instantly and for free.
Two kinds of source taught the models almost everything they know about dev tools: the community, meaning Stack Overflow, Reddit, blogs, and YouTube, and commercial entities, meaning vendor docs and official guides. The community wrote for an audience. Now that the audience asks the model directly instead of reading the forum, the community's reason to keep writing erodes. The vendor's reason does the opposite. When the reader is an agent deciding what to recommend, a vendor's incentive to publish only goes up.
So the community layer thins while the vendor layer grows. For a startup that never had a community presence to begin with, that isn't bad news. It means the game is moving toward content you can actually produce yourself.
Move one: write the comparison no one has written yet
Comparison content is wildly overrepresented in what LLMs cite, especially for dev tools. Ask a model what to use for a database, a vector store, or a backend platform, and the pages it leans on are overwhelmingly "X vs Y" and "best X for Z" write-ups. That's because a comparison page matches the shape of the question a buyer actually types.
The opening for a new tool is the long tail. About 60% of ChatGPT prompts run longer than ten words, while the average Google search is roughly three and a half words. People bring their whole situation to the model: "Postgres backend for a Flutter app with row-level security," not "Postgres backend." A model asked "best database for a Next.js and Netlify project that already uses Prisma" returns something different than it does for "best database." Most of those specific intersections have no dedicated content yet. That's your ground. Find the intents where you genuinely fit and where no competitor or neutral source has written the page, and write it.
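A rough way to separate long-tail prompts from head terms is to filter by length. This is a sketch, not a method from any tool: the eight-word cutoff is an assumption you should tune, and the list stands in for whatever you export from tickets or call notes.

```python
# Sample questions taken from the examples in the text.
questions = [
    "best database",
    "Postgres backend",
    "Postgres backend for a Flutter app with row-level security",
    "best database for a Next.js and Netlify project that already uses Prisma",
]

MIN_WORDS = 8  # assumed cutoff for "long tail"; tune against your own data

def is_long_tail(question: str, min_words: int = MIN_WORDS) -> bool:
    """Treat a question as long-tail when it has at least min_words words."""
    return len(question.split()) >= min_words

long_tail = [q for q in questions if is_long_tail(q)]
for q in long_tail:
    print(q)
```

Length is only a first pass. The questions that survive still need a human to decide whether the product genuinely fits.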
Don't guess at the questions. Mine them. Reddit threads, support tickets, and sales-call recordings are where the real long-tail prompts live, the ones keyword tools miss because they only track short head terms. Webflow did exactly this. They scraped Reddit for the feature questions people kept asking, then answered those questions as structured FAQs on their own feature pages. Those FAQs drove 57% of their new LLM citations within a few weeks. The community surfaced the questions, and Webflow owned the answers. That's the move a startup should copy, and I wrote more about structuring that kind of content in making your content AI-friendly in 2026.
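One common way to make an FAQ "structured" is schema.org FAQPage markup. I don't know what markup Webflow actually used, so treat this as one possible sketch: the function name and the sample question are hypothetical, and the output is JSON-LD you would embed in the page.

```python
import json

def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render question/answer pairs as schema.org FAQPage JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(doc, indent=2)

# Hypothetical question mined from a forum thread, answered on your own page.
markup = faq_jsonld([
    ("Does ExampleDB support row-level security?",
     "Yes. Policies are defined per table and enforced on every query."),
])
print(markup)
```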
Move two: build product-bot fit
Getting cited is half the job. The other half is what happens after the agent picks you. If a developer's AI agent chooses your tool and then hits friction (a confusing SDK, docs it can't act on, a setup step that needs a human), the agent backs out and tries the next option. Task completion is becoming a stronger signal than any blog post you could write.
This is a different surface from the chatbot. When Cursor or Claude Code makes a decision inside a repo, it's conditioned on the framework, the deploy target, and the dependencies already in the lockfile, and the switching cost once your package is in there is high. Build for that agent first: SDK names that encode the use case rather than just the brand, docs written so an agent can execute them step by step, and an honest llms.txt that describes what your product does. I made the broader argument that your docs are for AI now. For a startup, the docs are also the thing that decides whether the agent finishes the job in your tool or someone else's.
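For the llms.txt piece, here is a minimal sketch that renders a plain, descriptive file. It follows the layout of the public llms.txt proposal as I understand it (a title, a one-line summary in a blockquote, then sections of links). The product name, URLs, and helper function are all hypothetical.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Build an llms.txt body: H1 title, blockquote summary, then link lists."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical product and URLs, for illustration only.
llms_txt = render_llms_txt(
    "ExampleDB",
    "Hosted Postgres with built-in auth and vector search.",
    {
        "Docs": [
            ("Quickstart", "https://example.com/docs/quickstart.md",
             "Create a project and run a first query"),
            ("Auth", "https://example.com/docs/auth.md",
             "Row-level security and session handling"),
        ],
    },
)
print(llms_txt)
```

The summary line says what the product is and nothing about how good it is. That is the whole design choice.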
Move three: get humans to vouch for you by name
The agent narrows the list. A human still makes the final call, especially on a tool they've never heard of. This is where partners, named customers, and case studies earn their keep. Someone has to be willing to risk a young product, and the thing that makes that risk feel survivable is another real company's name attached to it.
Case studies may also do quiet work on the model itself. I'm less sure about this one, so take it as a hypothesis: if a model has absorbed a repeated story that your category "doesn't scale," a first-party case study with real numbers gives it a counterweight to cite. The human-trust value is the reliable part. The model-perception value is a bet. Build the case study for the human first, and treat any LLM benefit as a bonus.
What still backfires
The shortcuts are the same traps they always were. Fake Reddit accounts, paid G2 reviews, AI-generated tutorials, and keyword-stuffed llms.txt files all hurt you, because models are increasingly trained to recognize content that was engineered to manipulate them. AI-generated spam gets treated more harshly than old-fashioned search spam. And once a model ties your brand to low-trust signals, that association outlasts whatever short-term lift the trick bought.
The llms.txt file deserves its own warning, because startups keep trying to use it as a billboard. Stuff it with superlatives and you get ignored at best and flagged at worst. Write it as an honest description of your product's structure, nothing more.
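A crude pre-publish check can catch the billboard habit before it ships. This is a heuristic sketch with an assumed word list, not a model of how any LLM actually scores content, and the sample line is hypothetical.

```python
import re

# Assumed word list; extend it with whatever your own marketing copy overuses.
HYPE_WORDS = {"best", "fastest", "leading", "revolutionary", "unmatched", "world-class"}

def find_hype(text: str) -> list[str]:
    """Return hype words found in text, lowercased, in order of appearance."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [w for w in words if w in HYPE_WORDS]

# Hypothetical llms.txt summary line that reads like an ad.
sample = "> ExampleDB is the fastest, world-class Postgres platform."
print(find_hype(sample))
```

An empty result doesn't prove the file is honest. It only means the obvious superlatives are gone.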
The audience is the agent now
The pre-training race is over for anyone launching today, and the community race was one a new brand was never going to win. What's left is the race the incumbents haven't finished either: producing the specific, current, citable content that agents retrieve, and shipping a product an agent can actually succeed with. The most durable version of this is original research, the kind that turns your startup into a source a model cites by name instead of only a product it lists. I made that case in proprietary research is the only content moat left.
I've written before that the next era of developer marketing treats AI agents as a real audience with its own reading habits. For a startup, that's the whole opportunity. You don't have a community yet, and you don't need to fake one. Write the page the agent retrieves and ship a tool the agent can finish a real task in. Then go find the humans willing to put their name next to yours.
For more on AI-era marketing for developer tools, visit the AI Marketing Hub.

Developer marketing expert with 30+ years of experience at Sun Microsystems, Microsoft, AWS, Meta, Twitter, and Supabase. Author of Picks and Shovels, the Amazon #1 bestseller on developer marketing.
