Actual AI Text-To-Video is Finally Here!

Introduction

The speaker introduces the topic of text-to-video and notes that while there have been some demos, true text-to-video has not been widely available until now. They present a demo of what can be done with text-to-video and explain how to access it.

Text-to-Video Demo

  • The speaker lists various demos created using text-to-video, including mountains and water in a Chinese painting, fireworks, a campfire at night in a snowy forest with a starry sky in the background, a clownfish swimming through a coral reef, ducks swimming in a pond, and more.
  • The speaker plays a video showcasing these various demos.
  • The speaker mentions that this technology is made possible by an open-source model called Stable Diffusion and explains where to access it.

Accessing Text-to-Video

  • The speaker explains that you can access text-to-video for free through Hugging Face's ModelScope Text-to-Video Synthesis tool, but warns that it may take some time due to high demand.
  • Alternatively, you can duplicate the Space yourself, but you will need a credit card on file with Hugging Face. It will likely cost less than two dollars to run.
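In the video this duplication is done through the web UI, but the same step can be sketched with the `huggingface_hub` client library (using it here is my own assumption, not something shown in the video; it requires a token with write access and a payment method on file for paid hardware):

```python
# Sketch: duplicating the ModelScope text-to-video Space programmatically.
# The library route is an assumption; the video uses the web UI instead.

SOURCE_SPACE = "damo-vilab/modelscope-text-to-video-synthesis"

def space_url(space_id: str) -> str:
    """Build the public URL for a Hugging Face Space."""
    return f"https://huggingface.co/spaces/{space_id}"

def clone_space(token: str):
    # huggingface_hub is a third-party dependency (pip install huggingface_hub).
    from huggingface_hub import duplicate_space
    # Copies the Space into your own account so you control its hardware.
    return duplicate_space(SOURCE_SPACE, token=token)

print(space_url(SOURCE_SPACE))
```

The duplicated copy is what you later upgrade to stronger hardware so generations run without queueing.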

Using Text-to-Video

The speaker walks through how to use text-to-video with Hugging Face's ModelScope Text-to-Video Synthesis tool.

Using Hugging Face's Tool

  • The speaker duplicates the Space and explains how to change the hardware setting from CPU basic to something stronger so that the model can run properly.
  • Once the Space is set up, the speaker demonstrates how to enter a prompt and generate a video.
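Submitting a prompt to the Space from code could look roughly like the sketch below, using the `gradio_client` library; the endpoint name is an assumption about this Space's API, not something confirmed in the video:

```python
# Sketch: calling the ModelScope Space from Python instead of the web UI.
# The api_name below is an assumption about how this Space exposes its endpoint.

SPACE_ID = "damo-vilab/modelscope-text-to-video-synthesis"

def generate_video(prompt: str):
    # gradio_client is a third-party dependency (pip install gradio_client).
    from gradio_client import Client
    client = Client(SPACE_ID)
    # Returns a local file path to the generated video clip.
    return client.predict(prompt, api_name="/predict")

demo_prompt = "a clownfish swimming through a coral reef"
print(demo_prompt)
```

In practice, entering the same prompt in the Space's text box and clicking generate does the equivalent.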

Setting up T4 Medium Instance

In this section, the speaker sets up a T4 medium instance to avoid runtime errors and generate videos without waiting in queues.

Confirming New Hardware

  • The speaker clicks "confirm new hardware" to set up the T4 medium instance.
  • The T4 medium boots up and eliminates the runtime error.
  • The speaker can now generate videos without waiting in queues.
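The hardware switch the speaker performs in the UI can also be sketched with `huggingface_hub` (the repo ID below is a hypothetical placeholder for your own duplicated Space, and using the library for this step is my assumption):

```python
# Sketch: upgrading a duplicated Space from CPU basic to a T4 medium GPU,
# the tier the speaker selects to eliminate the runtime error.

REPO_ID = "your-username/modelscope-text-to-video-synthesis"  # hypothetical duplicate
HARDWARE = "t4-medium"

def upgrade_hardware(token: str):
    # huggingface_hub is a third-party dependency (pip install huggingface_hub).
    from huggingface_hub import request_space_hardware
    # Requesting paid hardware requires a payment method on file.
    request_space_hardware(REPO_ID, hardware=HARDWARE, token=token)

print(HARDWARE)
```

Once the new hardware boots, generations run on your own instance with no queue.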

Generating Videos with Custom Prompts

In this section, the speaker generates videos using custom prompts on their own T4 system.

Green Alien Eating a Taco

  • The speaker inputs a prompt for a green alien eating a taco.
  • The video processes successfully on their own T4 system without any errors.

Detailed Green Alien Standing on Red Mars Landscape Eating Yellow Crunchy Taco

  • The speaker adds more detail to the previous prompt by including words like "Unreal Engine," "HD 4K," and "realistic."
  • The video processes successfully on their own T4 system but still has a watermark.
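The speaker's trick of appending quality keywords to a plain subject can be captured in a small helper; the function name and exact joining behavior are my own illustration, while the keywords come from the video:

```python
# Sketch: enriching a base prompt with detail keywords, mirroring how the
# speaker expanded "green alien eating a taco" for a more detailed result.

DETAIL_KEYWORDS = ["Unreal Engine", "HD 4K", "realistic"]

def enrich_prompt(subject: str, keywords=DETAIL_KEYWORDS) -> str:
    """Append comma-separated quality keywords to a base prompt."""
    return ", ".join([subject, *keywords])

print(enrich_prompt(
    "detailed green alien standing on a red Mars landscape eating a yellow crunchy taco"
))
```

Keyword stacking like this steers the model toward more detailed output, though, as the speaker notes, the generated clips still carry a watermark.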

Trying Different Subjects

  • The speaker tries different prompts such as a penguin kicking a soccer ball, a clownfish swimming through a coral reef, a monkey on roller skates, and a cat learning to play piano.
  • Some of the generated videos are successful while others lack detail or do not match the prompt exactly.

Examples of Text Generation Video Mock-up

  • The speaker shows examples of text-to-video mock-ups such as a giraffe underneath a microwave, a goldendoodle playing in a park by a lake, a panda bear driving a car, a teddy bear running in New York City, a drone fly-through of a fast-food restaurant on a dystopian alien planet, and a dog in a superhero outfit with a red cape flying through the sky.
  • The speaker notes that these are cherry-picked examples from thousands of generations.

Text-to-Video AI: Early Tech Breakthrough

In this section, the speaker discusses the early stages of text-to-video AI technology and how it has progressed over time.

Progression of Text-to-Video AI Technology

  • Less than a year ago, DALL-E was generating images that were less impressive than what can be created today with Midjourney version 5.
  • While it may take many prompts to generate a video that looks exactly like what is envisioned, the technology is still in its early stages and will likely improve rapidly.
  • The text-to-video AI tool demonstrated in the video is available for use, but using it effectively may require duplicating the Space and upgrading its hardware.

Future Tools for Emerging Technologies

  • To stay up to date on emerging technologies in the AI space, visit futuretools.io, where all of the coolest tools are curated.
  • If overwhelmed by too many tools, sign up for the free newsletter, which provides a summary of five cool tools every Friday along with news updates and videos.

Video description

We actually have a working text-to-video model where you enter a text prompt and it will attempt to generate a video from that text. Here's a breakdown of how you can use it right now.

  • Hugging Face Text-To-Video: https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
  • 🛠️ Explore hundreds of AI Tools: https://futuretools.io/
  • 📰 Weekly Newsletter: https://www.futuretools.io/newsletter
  • 😊 Discord Community: https://futuretools.io/discord
  • 🐤 Follow me on Twitter: https://twitter.com/mreflow
  • 🐺 My personal blog: https://mattwolfe.com/
  • 🌯 Buy me a burrito: https://ko-fi.com/mattwolfe
  • 🍭 My Backgrounds: https://www.futuretools.io/desktop-backgrounds

Outro music generated by Mubert https://mubert.com/render

#AI #NoCode #Futurism