a16z interviews Hedra founder Michael Lingelbach: How generative video can grow from memes into the next big thing



Michael Lingelbach is the founder and CEO of Hedra. A former Stanford University computer science PhD student and former stage actor, he combines technical expertise with a passion for performance in leading Hedra's development of industry-leading generative audio-visual models. Hedra focuses on full-body embodied, dialogue-driven video generation, with technology supporting applications from virtual influencers to educational content and significantly lowering the barriers to content creation. This article is translated from the a16z Podcast and focuses on how AI technology is moving from viral meme content to enterprise-level applications, demonstrating the innovative potential of generative audio-visual technology.

The following is the conversation, compiled and edited by ChainCatcher (with abridgements).

TL;DR

  • Artificial intelligence is bridging consumer and enterprise scenarios: a Hedra-generated talking-baby advertisement promoting enterprise software shows how eagerly companies are embracing the technology.
  • Viral meme content has become a marketing tool for startups; the "Baby Podcast" trend quickly raised brand awareness, demonstrating a clever go-to-market strategy.
  • Full-body expression and dialogue-driven video generation technology fills creative gaps, dramatically reducing content production time and costs.
  • Creators such as actor Jon Lajoie build virtual influencers with distinct digital identities through content like the "Moses Podcast", giving it personality and appeal.
  • Content creators like "Mommy Bloggers" use technology to quickly produce videos, easily maintaining brand activity and audience connection.
  • Real-time interactive video models open up two-way conversations with virtual characters, bringing immersive experiences to education and entertainment.
  • Character-centric video generation technology emphasizes personality expression and multi-subject control, meeting dynamic content creation needs.
  • Integrated dialogue, action, and rendering platform strategies create smooth generative media experiences, catering to high-quality content demands.
  • Interactive avatar models support dynamic adjustment of video emotions and elements, signaling the next wave of content creation innovation.

(I) From Memes to Enterprise AI Integration

Justine: We're seeing very interesting crossover between consumer and enterprise AI. A few days ago I saw an advertisement in Forbes generated with Hedra: a talking baby promoting enterprise software. It shows we're in a new era where enterprises are embracing AI technology with tremendous enthusiasm.

Michael: As a startup, our responsibility is to draw inspiration from consumer usage signals and turn them into next-generation content production tools that enterprise users can rely on. Over the past few months, viral content generated with Hedra has attracted widespread attention, from early anime-style characters to "Baby Podcasts" and whatever this week's trending topic turns out to be. Memes are a very effective marketing strategy: by reaching a large audience, they quickly occupy users' minds, and the approach is becoming increasingly common among startups. For example, another a16z-backed company, Cluely, gained significant brand awareness through viral spread on Twitter. At its core, a meme is a technology-enabled carrier for rapid creativity, and short-form video now dominates cultural consciousness. Hedra's generative video technology lets users turn any creative idea into content within seconds.

(II) Why Creators and Influencers Choose Hedra

Justine: Could you explain why and how people use Hedra to create memes, and how that connects to your target market?

Michael: Hedra is the first company to deploy a full-body expressive, conversation-driven generative video model at scale. We have supported users in creating millions of pieces of content, and it caught on quickly because we fill a critical gap in the content creation stack. Previously, producing generative podcasts, animated character dialogue scenes, or singing videos was extremely costly, inflexible, or time-consuming. Our model is fast and low-cost, which has catalyzed the rise of virtual influencers.

Justine: Recently, CNBC published an article about virtual influencers driven by Hedra. Could you provide some specific examples of how influencers use Hedra?

Michael: For instance, the actor Jon Lajoie (who played Taco in 'The League') used Hedra to create a series of content from the "Moses Podcast" to the "Baby Podcast", and those characters now have identities of their own. Another example is Neural Viz, who built a universe of characters centered on identity on top of Hedra. Generative performance differs from pure media models: personality, consistency, and control have to be injected into the model, which is especially crucial for video performance. As a result, these virtual characters' distinct personalities are becoming popular even though they are not real people.

[Sections (III) through (V) are omitted in this abridged translation.]

(VI) Building an Integrated Generative Media Platform

Justine: Many companies, like Black Forest Labs, have made technological breakthroughs but still need partners like Hedra to deliver the experience to consumers and business users. How did you decide to build an integrated platform rather than limit yourself to a single technology?

Michael: It comes down to focus and user needs. When I founded Hedra, I found it very difficult to integrate dialogue into media. In the past, users had to overlay lip-sync separately, and the result lacked cohesion. The inspiration for our technology was to unify breathing, gestures, and other signals with dialogue to create a more natural video model. From a market perspective, we observed that willingness to pay differs across applications: some popular applications have low willingness to pay, while certain niches (such as professional content creators) have strong demand for a high-quality experience. We choose to integrate the best technologies, whether from Hedra or from partners like ElevenLabs, to ensure users get the best experience.

Matt: In the future, will AI characters generate text, scripts, voice, and visuals from a single model?

Michael: I think the industry is moving toward a multi-modal input-output paradigm. The challenge with a single model is control: users need to precisely adjust details such as voice, tone, or rhythm. Decoupling the inputs provides more control, but the future may trend toward fully multimodal models where users adjust the alignment of each modality through guidance signals.

[The remainder of the conversation is omitted in this abridged translation.]
