Interviewing Adrien Treuille, Founder and CEO of Streamlit
Table of Contents
- Given Streamlit is an open-source product, what are the most important metrics you watch for while you build this product? Why?
- Have you considered freemium or free trial? What makes open source a better fit for Streamlit?
- Streamlit uses an Apache2 license. Have you considered MongoDB and Elastic’s licensing model? Why not?
- Is it possible to do both PLG and sales motion at the same time?
- How do you prioritize community feature requests vs your product roadmap? What to do with voluntary and unsolicited contributions?
- What remains the biggest challenge in data infra?
- If you were to start Streamlit again, what would you do differently?
Streamlit, about to raise its Series C, was acquired by Snowflake for $800M in March 2022. In this conversation with Adrien, we chatted about OSS metrics, licenses, open-core vs freemium vs free trial, PLG vs sales motion, third-party contributions, and lessons from building Streamlit. Insights belong to Adrien. Errors and omissions are my own.
Given Streamlit is an open-source product, what are the most important metrics you watch for while you build this product? Why?
Telemetry is a gray area in the open-source world, because the things you’d like to track, like utilization, are typically not the things that open-source projects are supposed to track. There are two kinds of utilization metrics:
- Indirect measures of utilization: downloads, GitHub stars, and engagement metrics on forums (Slack, Stack Overflow).
- Direct measures of utilization: which features were used when. This is a SaaS-like approach, and people don’t always like it.
Streamlit did the latter. We made it very clear when you install Streamlit that we’re going to collect the statistics, and here’s how you turn off the data collection. We wanted to be good citizens in that regard. This opt-out feature means we may not be aware of all utilization patterns. In return, we were able to get better visibility into the Streamlit community, such as the monthly active developers and viewers.
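To make the direct-measurement idea concrete, here is a minimal sketch of how opt-out telemetry could be rolled up into monthly active developers and per-feature usage. It is not Streamlit's actual pipeline: the event tuples, the `opted_out` flag, and the function names `monthly_active` and `feature_usage` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical event records: (user_id, event_date, feature, opted_out).
# In a real opt-out system, opted-out users would send nothing at all;
# the flag is kept here only to show what the aggregate misses.
EVENTS = [
    ("dev-1", date(2021, 1, 5), "slider", False),
    ("dev-1", date(2021, 1, 20), "chart", False),
    ("dev-2", date(2021, 1, 7), "slider", False),
    ("dev-3", date(2021, 1, 9), "chart", True),
    ("dev-2", date(2021, 2, 2), "slider", False),
]


def monthly_active(events):
    """Count distinct reporting users per (year, month)."""
    active = defaultdict(set)
    for user_id, day, _feature, opted_out in events:
        if opted_out:
            continue  # invisible to the project: the blind spot of opt-out
        active[(day.year, day.month)].add(user_id)
    return {month: len(users) for month, users in active.items()}


def feature_usage(events):
    """Count how often each feature was used by reporting users."""
    return Counter(
        feature for _user, _day, feature, opted_out in events if not opted_out
    )


if __name__ == "__main__":
    print(monthly_active(EVENTS))
    print(feature_usage(EVENTS))
```

In this toy data, `dev-3` is active in January but never shows up in either aggregate, which is the trade-off described above: respecting the opt-out means the counts are a lower bound on real usage.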
Active users are trailing metrics, not leading metrics. They don’t really inform product decisions but are an overall score. You brought up a good point that these metrics are more common in consumer software. Streamlit may consider itself a B2D company, D as in developers. B2D is not too different from B2C, so I wanted to optimize for virality and engagement. Taking some members of the community and making them famous is a really key strategy. We were doing all that stuff like crazy.
Have you considered freemium or free trial? What makes open source a better fit for Streamlit?
If you target an existing workload at companies, focus on exactly that customer set, make them as happy as possible, do better than the competition, and you might not have to open source. Hex is an example. Their pitch is, hey, this is about the notebook, which is super annoying, so we’re going to improve on it in six or seven ways.
However, for Streamlit, it was clear that we were inventing a new workload. Early adopters were usually groups working on super high-tech things, whose own processes were so wide open that they could decide everything themselves and fashion an instrument to work perfectly for them. These early adopters convinced me to start a company. For example, Uber was using Streamlit to figure out where to put chargers for the electric bikes. If you’re inventing a new workload, the strategy is that you have to become universal, so we just had to open source.
Charles: It’s a very similar approach, especially in the infrastructure world, where you really have to be the de facto standard. Thus, you need to earn users’ trust so they’re willing to invest in this platform to get the kind of snowball effect rolling.
Adrien: Exactly. It’s all like famous for being famous.
Streamlit uses an Apache2 license. Have you considered MongoDB and Elastic’s licensing model? Why not?
We could always transition to a model like that. Mongo is an example of a company that changed licenses. But the truth is that we never really got to a point where the Apache2 license was an issue, as we were out there trying to win the community. We did in some ways pull away from the pack of people who were doing similar things two or three years ago.
Our next big challenge was to monetize. We had a theory for how to do so, though it was certainly not proven. We were literally onboarding our first paying customers when Snowflake approached us about an acquisition. And we actually said no, because we had great term sheets from amazing investors and we had the revenue. Snowflake said, we don’t want you to figure out how to make money, because if you do, you’re gonna get way too expensive. Snowflake matched our term sheet valuation and went over a bit to capture the projected revenue. The term sheet we had was to raise $95 million, which buys years of runway, so we would have figured out the business problems along the way.
Is it possible to do both PLG and sales motion at the same time?
Charles: Integration is always a challenge with any acquisition. Specifically, Streamlit started as an open-source project, and it was about to get into monetization with product-led growth, which is different from Snowflake’s sales-driven model. How do we best integrate the two products together?
Adrien: The quick answer is yes. The question is, what does that actually look like?
I think what it looks like is perhaps less PLG. For me, true PLG looks like this: we’re gonna convince you to pay up like $1,000 a year for us, and then before you know it, you’ll be paying like a million dollars a year for us because we’re just gonna prove our worth to the entire organization and the adoption growth is bottom-up.
Our ambition at Snowflake is not to turn Snowflake’s sales motion into a PLG motion but to piggyback on Snowflake’s unbelievably successful sales motion. What we can be is beloved by developers and be a reason why a deal cuts in Snowflake’s direction. This is the lower ambition. The higher ambition is the above plus driving a ton of credit consumption and indulgence. Snowflake and Streamlit have a lot of joint users. Say the next prospect, who is doing diligence on Snowflake, asks internally: Snowflake comes with this Streamlit thing, who has heard of it? And the data science teams all say that would be awesome. That totally tips it toward Snowflake, right? And then all of a sudden, a massive workload moves over to Snowflake. That is success. Whether you call it PLG or not, I think it’s completely compatible with Snowflake’s sales motion.
How do you prioritize community feature requests vs your product roadmap? What to do with voluntary and unsolicited contributions?
Charles: I have this question because I get common feedback from open-source maintainers that contributions from individual community members are great, but once we accept their contributions, we have to maintain the features going forward in all future releases. By then the original contributors are gone. However, rejecting their contributions would be such a blow to their love of your products. Would you prioritize differently based on the feedback and contributions you got?
Adrien: The good news about your contributor story is that, in practice, it never works that way. For all the serious open-source projects that I know of, there are no contributions that are both nontrivial and drive-by. The actual so-called community contributions are more things like adding a comma to the README. For Streamlit, if someone wanted to merge a fundamentally new feature that lets you, for example, parse URL parameters and do whatever, we would just take a look at it and say no, because it was not on our roadmap, and it wasn’t the way we would have done it. We are not gonna let people check random things into Streamlit.
But the thing that the community does do extremely well, which is kind of evergreen in its own way, is to provide a ton of IP around the project. For example, every Stack Overflow answer and every piece of example code in the public repos. GitHub Copilot writes fantastic Streamlit code, which is amazing. You can literally add a comment like "show Yahoo stock pricing stream", and Copilot pops out a beautiful app. All those are community contributions.
What remains the biggest challenge in data infra?
Streamlit is just one piece of a huge ecosystem of data infrastructure, all of which is changing really quickly. Whether Snowflake keeps up with, or pulls away from, the pack is a question that far transcends Streamlit. We are just going to play a role in a positive direction. In many ways, the promise is still ahead of us, in the sense that the actual number of companies that were really committed to using us wasn’t that big, but those that did got really solid, even amazing, results. The challenge is replicating that experience at 10x, 100x, 1000x.
If you were to start Streamlit again, what would you do differently?
The biggest mistake I made was not hiring leaders and growing the organization’s maturity fast enough. I was worried that a hiring mistake could backfire on us, but in reality, the executives we hired worked out extremely well. When I interviewed them, I wished I had met them sooner because they could truly bring lots of value to the team. They really increased the execution velocity by making the organization scalable.