Recommender's note

Shi Yunfeng, Executive Director, 5Y Capital

When I began focusing on AI investing five years ago, I hoped to find genius product managers who could connect "the limited capabilities of AI" with "an elegant, complete user experience." Two years after the release of ChatGPT, the genius product managers of the hoped-for AGI era have not appeared; no Steve Jobs, Zhang Xiaolong, or Evan Spiegel has burst onto the scene to teach us how to build AI products.
Granola may not have a viable niche as an independent business in China, but in AI note-taking, a space that is extremely crowded even in the US, it has earned the best reputation for product experience. I believe the product methodology Chris has worked out so far is especially instructive for anyone building 2C or prosumer products. If the product you're building meets three of the four criteria below, feel free to reach out: stevenshi@5ycap.com
How to Build a Truly Useful AI Product
Generative AI breaks the old startup playbook
By Chris Pedregal
Co-founder and CEO of Granola. He previously co-founded the edtech company Socratic, which was acquired by Google.
First published: December 2024
If building a startup is like playing a tough video game, building a startup in generative AI is like playing that game at 2x speed.
When you build at the application layer, meaning your startup relies on AI models provided by companies like OpenAI or Anthropic, you are building on technology that is improving at an unprecedented and unpredictable rate, with major model releases happening at least twice a year. If you're not careful, you might spend weeks on a feature only to find that the next model release automates it. And because everyone has access to great APIs and frontier large language models (LLMs), your incredible product idea can be built by anyone.
AI has certainly unlocked many new opportunities: LLMs have made product capabilities like code generation and research assistance possible for the first time. But you need to make sure you are surfing the wave of AI progress, not getting tumbled by it.
That's why we need a new playbook.
I've spent the last two years building Granola, a notepad that takes your meeting notes and enhances them with transcription and AI. The experience has convinced me that generative AI is a unique space, and that the traditional laws of "startup physics", such as solving the biggest pain point first or support getting cheaper per user at scale, don't fully apply here.
If your intuitions were trained on regular startup physics, you'll need to develop new ones for AI. After two years of developing those intuitions, I've distilled four principles that I believe every founder building at the application layer needs to know.
1. Don't solve problems that won't be problems soon
LLMs are undergoing one of the fastest technical developments in history. Two years ago, ChatGPT couldn't process images, handle complex math, or generate sophisticated code, all tasks that are easy for today's LLMs. Two years from now, the picture will look very different again.
If you're building at the app layer, it's easy to spend time on the wrong problems: the ones that will disappear when the next version of GPT comes out. Don't spend any time on problems that are about to go away.
That's easier said than done, because predicting the future is now part of your job (uncomfortable, right?). You need to predict what GPT-X-plus-one will be capable of, and then base your product roadmap and strategy on those predictions.
For example, the first version of Granola didn't work for meetings longer than 30 minutes. The best model at the time, OpenAI's DaVinci, had only a 4,000-token context window, which limited how long meetings could be. Normally, fixing this would have been our top priority: how can you expect people to use a notetaker that only works for short meetings? But we had a hypothesis that LLMs would quickly get smarter, faster, and cheaper, with longer context windows. So we decided not to spend any time working around the context window limitation, and instead spent our time improving note quality.
For a while, we even had to actively ignore users who complained about the duration limit. But our hypothesis was right: within a couple of months, context windows got big enough to handle longer meetings. Any work we had done on the limit would have been wasted. Meanwhile, the work we did on note quality remains one of the main reasons users say they love Granola today.
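To make the constraint concrete, here is a rough back-of-the-envelope sketch (not Granola's code; the words-per-minute and tokens-per-word figures are common rule-of-thumb assumptions, not from the article) of why a 4,000-token window caps a transcript well under an hour:

```python
# Rough estimate: how many minutes of meeting fit in a given context window.
# Assumptions (illustrative, not from the article): ~150 spoken words per
# minute, ~1.3 tokens per English word.

WORDS_PER_MINUTE = 150
TOKENS_PER_WORD = 1.3

def max_meeting_minutes(context_window_tokens: int,
                        reserved_for_output: int = 1000) -> float:
    """Minutes of transcript that fit, leaving room for the model's reply."""
    budget = context_window_tokens - reserved_for_output
    return budget / (WORDS_PER_MINUTE * TOKENS_PER_WORD)

print(round(max_meeting_minutes(4_000)))    # prints 15
print(round(max_meeting_minutes(128_000)))  # prints 651
```

Under these assumptions a DaVinci-era window runs out in roughly a quarter of an hour, while a modern 128k window comfortably holds a full day of meetings, which is exactly the kind of shift the hypothesis bet on.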
2. Your marginal cost is my opportunity
Historically, a defining characteristic of software was that the marginal cost of supporting an additional user was close to zero. If your product worked for 10,000 users, it wouldn't cost much more to support 1 million.
This is no longer true with AI. Every additional user carries a real marginal cost, and cutting-edge AI models are expensive to run. For example, sending the audio of a half-hour meeting to OpenAI's flagship GPT-4o audio model costs about $4. Imagine that cost scaled across thousands of users, every day.
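A quick sketch of how that scales (the $4 per-call figure is the article's; the user counts and one-meeting-per-day rate are illustrative assumptions):

```python
# Back-of-the-envelope inference bill at the article's $4 per half-hour call.
COST_PER_MEETING_USD = 4.0

def monthly_cost(users: int, meetings_per_user_per_day: int = 1,
                 days: int = 30) -> float:
    """Monthly LLM spend if every meeting goes to the frontier model."""
    return users * meetings_per_user_per_day * days * COST_PER_MEETING_USD

print(monthly_cost(1_000))    # prints 120000.0  (startup scale)
print(monthly_cost(100_000))  # prints 12000000.0 (big-company scale)
```

The two orders of magnitude between those bills are the whole point of this section: the small player can afford the frontier model per user; the incumbent cannot.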
There's also a limit to how many users your startup can onboard. Even with all the money in the world, OpenAI and Anthropic (the maker of Claude) don't have enough compute to serve cutting-edge models to millions of users. So for the first time in history, it can be more feasible to provide a better product experience for a small number of users than to serve tens of millions.
This isn't an obstacle; it's a huge opportunity for startups. Big companies literally can't compete with you, because there isn't enough compute in the world to give millions of users a cutting-edge AI experience.
As a startup, you can give every user a Ferrari-level product experience:
Use the most expensive, cutting-edge models.
Don't prioritize cost optimization.
If five extra API calls (requests to your LLM provider) make the experience noticeably better, make them.
It may be expensive per user, but you won't have many users at first. And remember: at best, companies like Google can only offer their users a Honda-level experience.
You might wonder what happens when users come flocking to your Ferrari product. Won't you end up in the same position as today's big tech companies, unable to serve everyone at high quality?
The beauty is that even if your user base grows exponentially, the cost of AI inference is falling exponentially.
Today's cutting-edge models will be affordable commodities in a year or two. Today's Ferraris are tomorrow's Hondas. Be a Ferrari while you can.
3. Context is king
When we first started writing prompts for Granola to generate meeting notes, we quickly realized that a list of step-by-step instructions doesn't work well in practice.
The real world is messy, and it's nearly impossible to anticipate and write rules for every situation an LLM might encounter. Even if you could cover every scenario, the rules would inevitably conflict with one another.
Then we had an insight: instead of treating an AI model as something that just follows instructions, treat it like an intern on their first day. An intern is smart but lacks context about what to do and how to do it. The key to an intern's success is giving them the context they need to think like you.
That's how we approach prompting at Granola now: rather than only telling the model how to do the task, we provide it with curated context that guides its thinking.
For Granola, the task is writing great notes from a meeting, and the context is the answers to questions like: Who is in the meeting? What is it about? What are you trying to get out of it? What are your long-term goals, and how does this meeting serve them?
Our job is to find that information, from the web and other sources, get the model to think like you, and put only the relevant material into the notes. The art is in selecting which context to provide and how to frame it.
No matter how good models get, the context you give them will always matter.
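As a rough illustration of the difference between a rulebook prompt and a context-rich one (this is not Granola's actual prompt or code; the field names and wording are hypothetical):

```python
# Sketch: brief the model like an intern on day one, then hand it the raw
# material. The MeetingContext fields are invented examples of the kinds of
# information the article says to curate.
from dataclasses import dataclass

@dataclass
class MeetingContext:
    attendees: list[str]   # who is in the meeting
    purpose: str           # what the meeting is about
    user_goal: str         # what you want out of it
    long_term_goal: str    # the bigger objective it serves

def build_prompt(ctx: MeetingContext, transcript: str) -> str:
    """Assemble a context-first prompt instead of a list of rules."""
    return (
        "You are taking meeting notes for me.\n"
        f"Attendees: {', '.join(ctx.attendees)}\n"
        f"Meeting purpose: {ctx.purpose}\n"
        f"What I want from this meeting: {ctx.user_goal}\n"
        f"My long-term goal: {ctx.long_term_goal}\n"
        "Write concise notes containing only what matters to my goals.\n\n"
        f"Transcript:\n{transcript}"
    )

ctx = MeetingContext(
    attendees=["Alice (candidate)", "me (hiring manager)"],
    purpose="Final-round interview",
    user_goal="Decide whether to make an offer",
    long_term_goal="Hire a senior engineer this quarter",
)
prompt = build_prompt(ctx, "…transcript text…")
```

The instructions here fit in two lines; everything else is context. The bet is that a capable model given the right briefing filters the transcript better than any hand-written rulebook would.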
I believe "context window selection" will be one of the defining ideas of our time, with implications far beyond AI. During the Industrial Revolution, people described the brain in mechanical terms ("blowing off steam", for example). When computers emerged, we started using words like "bandwidth" and "storage capacity." Next, I think we will start describing how the brain works in terms of context window selection. The idea will permeate well beyond tech.
4. Go narrow, go deep
One fascinating challenge of building AI products today is that you're competing with general-purpose AI assistants like ChatGPT and Claude, which are already pretty good at most things. How do you build something good enough that users will choose you over these Swiss Army knives?
The only answer is to go narrow, really narrow. Pick a very specific use case and become exceptional at it.
The cardinal rule of startups, building something people want, still holds in AI, but the bar is higher.
Here's the plot twist: exceptional experiences for narrow use cases often have little to do with AI.
We spend endless hours on note quality at Granola, but just as much time on non-AI features, like seamless meeting notifications and great echo cancellation (so the tool works whether or not you're wearing headphones). The "wrapper" around the AI is often the difference between a delightful experience and a great demo that disappoints in actual use.
Going narrow also makes it easier to improve the AI part of your product.
When the AI gets a response right, it feels magical. When it gets it wrong, the failure can feel weird and disconcerting: it becomes obvious you're talking to an algorithm, not a human. Product experiences that fall into that uncanny valley can push users away for good.
But when you go narrow, it's much easier to identify the most common AI failure modes, and either mitigate them or at least fail gracefully.
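One way "fail gracefully" can look in code (a hypothetical sketch, not Granola's implementation; the validation rules and the fallback message are invented for illustration):

```python
# Sketch: sanity-check narrow-use-case AI output, and degrade to something
# still useful instead of showing a broken result. The checks below are
# examples of failure modes you can only enumerate once the use case is
# narrow enough to know what "wrong" looks like.

def looks_valid(notes: str, transcript: str) -> bool:
    """Cheap sanity checks on generated meeting notes."""
    if not notes.strip():              # empty or whitespace-only response
        return False
    if len(notes) > len(transcript):   # "notes" longer than the meeting
        return False
    return True

def notes_or_fallback(notes: str, transcript: str) -> str:
    # Graceful failure: the raw transcript is still useful; garbage notes
    # that look authoritative are not.
    if looks_valid(notes, transcript):
        return notes
    return "Automatic notes unavailable. Raw transcript:\n" + transcript
```

The design choice worth noting is that the fallback is honest about the failure rather than papering over it, which keeps the product out of the uncanny valley the section describes.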
The fundamentals are the same
Building in generative AI is like running on a treadmill while traditional tech moves at walking speed. That speed affects everything from the technical problems you tackle to your timeline for reaching scale.
While the acceleration should change your strategy, it doesn't change the fundamentals: you still have to build something people want. There are no shortcuts.
You still have to sweat the details and patiently polish the product. And the most clarifying question remains deceptively simple:
"How does this product make me feel when I use it?"

5Y Capital seeks out, supports, and motivates solitary founders, backing them in everything from conviction to day-to-day operations. We believe that when someone the world sees as crazy starts to be believed in, the world opens up.
BEIJING·SHANGHAI·SHENZHEN·HONGKONG
WWW.5YCAP.COM
(Reposted from: 5Y Capital)