Generative AI Video
Thai PM makes Mandarin AI video to reassure Chinese tourists amid border kidnappings
Using Generative AI to Automatically Create a Video Talk from an Article by Lak Lakshmanan
YouTube removed some of the channels and material after NBC News flagged them.

NEW YORK – YouTube CEO Neal Mohan announced Wednesday a slate of new artificial intelligence features coming to the platform. AI-assisted generative search could theoretically find that information somewhere online—in a user manual buried in a company’s website, for example—and create a video to show me exactly how to do what I want, just as it could explain that to me with words today.
This is why viral clips depicting extraordinary visuals and Hollywood-level output tend to be either single shots, or a ‘showcase montage’ of the system’s capabilities, where each shot features different characters and environments. Here, we are considering the prospect of true auteur full-length gen-AI productions, created by individuals, with consistent characters, cinematography, and visual effects at least on a par with the current state of the art in Hollywood.

Meta separately published a research paper for those who want a more exhaustive deep dive into the inner workings of the Meta Movie Gen models. In the paper, it claims a number of breakthroughs in model architecture, training objectives, data recipes, inference optimizations and evaluation protocols, and it believes these innovations enable Meta Movie Gen to significantly outperform its competitors. It’s all about enabling more precision for creators, who can use it to add, remove or swap out specific elements of a video, such as the background, objects in the video, or style modifications, the company said.
Capcom has started using generative AI to assist it with game development, mainly to reduce the time spent generating ideas for background elements. That includes creating the “thousands to tens of thousands” of ideas needed in game creation.

Meta Platforms Inc.’s artificial intelligence research team has showcased a new family of generative AI models for media that can generate and edit videos from simple text prompts.

The new AI video creation tool is called “Veo.” Creators will input text prompts to create AI images, which can then become the basis of the six-second clips. Mohan teased it with an AI-generated video of a dog and a sheep becoming friends.
“I think our experience with recommending the right content to the right viewer works in this AI world of scale, because we’ve been doing it at this huge scale,” says Ali. She also points out that YouTube’s standard guidelines still apply no matter what tool is used to craft the video. Free users must contend with watermarked videos, which can be a drawback for those looking to use the content commercially.
AI has been a controversial topic in video games and beyond, in part over fears about job losses. Electronic Arts CEO Andrew Wilson has said the rise in AI will lead to job losses in the short term but will ultimately create more jobs total, just like what happened with previous labor revolutions. Video game actors also remain on strike, due in part to concerns about the use of AI.
This is made possible by an extensive training dataset consisting of 100 million videos and 1 billion images, allowing the AI to replicate facial features and body movements with remarkable accuracy.

However, Capcom isn’t handing over the reins of game development to AI. Instead, the company uses these ideas to assist art directors and artists working on games.
At that point, you’ve basically got yourself a Star Trek-style holodeck experience on demand.

Pictory aims to automate the process of turning scripts and blogs into videos with minimal user effort. It’s great for those who want to make quick, snappy videos for marketing or just for fun.
And as Google rolls this out to a billion people, many of whom will be interacting with a conversational AI for the first time, what will that mean? There’s another hazard as well, though, which is that people ask Google all sorts of weird things. If you want to know someone’s darkest secrets, look at their search history. Google doesn’t just have to be able to deploy its AI Overviews when an answer can be helpful; it has to be extremely careful not to deploy them when an answer may be harmful.

Thanks to its ability to preserve context across a conversation, ChatGPT works well for performing searches that benefit from follow-up questions—like planning a vacation through multiple search sessions. OpenAI says users sometimes go “20 turns deep” in researching queries.
Meta introduces generative AI video advertising tools
But for every clip that generates a “wow,” there’s another that violates basic physics. Look for people and animals clipping through each other, or rotating their limbs in ways that in real life would mean a trip to the hospital. All this makes it extremely important to spot AI videos when they appear on platforms like Facebook, Telegram, and WhatsApp. If you can catch one, you won’t just be protecting yourself from disinformation — you’ll be protecting other people, since not everyone is equipped with a skeptic’s toolbox.

All I had to do was prompt the LLM to construct a series of slide contents (key points, title, etc.) from the article, and it did. It even returned the data to me in a structured format, conducive to using it from a computer program.
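That structured-output step can be sketched in a few lines. The JSON shape and the `parse_slides` helper below are hypothetical illustrations (the article does not specify its schema); the point is that a structured LLM response can be validated and consumed programmatically:

```python
import json

# Hypothetical JSON an LLM might return when asked to turn an
# article into slide contents (a title plus key points per slide).
llm_response = """
{
  "slides": [
    {"title": "Why AI video?",
     "keypoints": ["Meets readers where they are", "Reuses existing articles"]},
    {"title": "Pipeline",
     "keypoints": ["LLM drafts slide text", "TTS narrates", "Renderer assembles video"]}
  ]
}
"""

def parse_slides(raw: str) -> list:
    """Parse and lightly validate the structured slide spec."""
    data = json.loads(raw)
    slides = data["slides"]
    for slide in slides:
        if not isinstance(slide["title"], str):
            raise ValueError("slide title must be a string")
        if not all(isinstance(k, str) for k in slide["keypoints"]):
            raise ValueError("key points must be strings")
    return slides

slides = parse_slides(llm_response)
```

From here, each slide dict can be handed to a renderer or a text-to-speech step without any manual copying.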
You can describe what the bird in your yard looks like, or what the issue seems to be with your refrigerator, or that weird noise your car is making, and get an almost human explanation put together from sources previously siloed across the internet. It’s amazing, and once you start searching that way, it’s addictive.

The inserted backgrounds and clothing don’t distort unnaturally when Mosseri rapidly moves his arms or face, but the snippets we get to see are barely a second long. The early previews of OpenAI’s Sora video model also looked extremely polished, however, and the results we’ve seen since it became available to the public haven’t lived up to those expectations. We won’t know how good Instagram’s AI video tools truly are by comparison until they launch.
- When I asked him about it, he couldn’t really explain why the model chose the sources that it did, because the model itself makes that evaluation.
- Explore different styles and find your own with extensive camera controls.
- They also include the sixth-generation NVIDIA decoder, with 2x the decode speed for H.264 video.
- Advertisers can also upload a brand logo to guide genAI-created visual assets.
Still, Google is keen to get more of its enterprise customers using generative AI. Citing its own research, the tech giant says among companies using generative AI in production, 86 percent report an increase in revenue. However, a recent Appen survey found return on investment from AI projects fell by 4.6 percentage points from 2023 to 2024.

Millions have used the NVIDIA Broadcast app to turn offices and dorm rooms into home studios using AI-powered features that improve audio and video quality — without needing expensive, specialized equipment. A prepackaged workflow powered by the FLUX NIM microservice and ComfyUI can then generate high-quality images that match the 3D scene’s composition. The GeForce RTX 50 Series adds FP4 support to help address this issue.
Text-to-speech is itself a generative AI model (and another example of the translation superpower). The Google TTS service, which was introduced in 2018 (and presumably improved since then), was one of the first generative AI services in production and made available through an API. I don’t want to create a podcast, but I’ve often wished I could generate slides and a video talk from my blog posts — some people prefer paging through slides, and others prefer to watch videos, and this would be a good way to meet them where they are. The researchers say Go-with-the-Flow simply fine-tunes a base model, requiring no changes to the original pipeline or architecture, except the use of warped noise instead of pure IID Gaussian noise.
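The warped-noise idea can be illustrated with a toy sketch: instead of sampling fresh IID Gaussian noise for every frame, each frame's noise is the previous frame's noise displaced along a flow field. This is only a conceptual illustration, not the Go-with-the-Flow implementation — a constant integer shift via `np.roll` stands in for real per-pixel optical-flow warping:

```python
import numpy as np

def warped_noise_sequence(frames, h, w, flow=(0, 2), seed=0):
    """Generate noise frames where each frame is the previous one
    warped by a constant integer flow (dy, dx). Real methods warp
    by a dense, per-pixel optical-flow field instead."""
    rng = np.random.default_rng(seed)
    noise = [rng.standard_normal((h, w))]
    dy, dx = flow
    for _ in range(frames - 1):
        # np.roll is a crude stand-in for flow-based warping:
        # it shifts the previous frame's noise by (dy, dx) pixels.
        noise.append(np.roll(noise[-1], shift=(dy, dx), axis=(0, 1)))
    return np.stack(noise)

seq = warped_noise_sequence(frames=4, h=8, w=8)
```

Because each per-frame noise field is a warped copy of the last, the noise is temporally correlated, which is the property credited with improving motion consistency.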
Netflix-linked AI video model may look janky, but it’s a groundbreaking advance – Creative Bloq
Posted: Fri, 24 Jan 2025 11:00:30 GMT
In cases where any significant movement is needed, this atrophy of identity becomes severe. These technologies now show up most frequently as adjunct components in alternative architectures. Additionally, creating specific facial performances is pretty much a matter of luck in generative video, as is lip-sync for dialogue. However, diffusion-based methods, as we have seen, have short memories, and also a limited range of motion priors (examples of such actions, included in the training dataset) to draw on.
The feature uses Meta’s Movie Gen AI model and is set to arrive next year.
In effect, these points are equivalent to facial landmarks in ID-based systems, but generalize to any surface. Whisk, our newest experiment from Google Labs, lets you input or create images that convey the subject, scene and style you have in mind. Then, you can bring them together and remix them to create something uniquely your own, from a digital plushie to an enamel pin or sticker. Since GSplat took 34 years to come to the fore, it’s possible too that older contenders such as NeRF and GANs – and even latent diffusion models – are yet to have their day.
- Runway’s tools have been used in various projects, including films and music videos, showcasing their impact on modern storytelling.
- A few weeks after our call, OpenAI incorporated search into ChatGPT, supplementing answers from its language model with information from across the web.
- Vidu’s advancements come as generative AI tools gain global traction.
- ChatGPT: While Google brought AI to search, OpenAI brought search to ChatGPT.
By combining traditional diffusion models with a cutting-edge technique called Flow Matching, this functionality enhances both the quality and consistency of the final product. Veo has achieved state-of-the-art results in head-to-head comparisons of outputs by human raters over top video generation models. Veo 2 outperforms other leading video generation models, based on human evaluations of its performance.

He writes news, features and buying guides and keeps track of the best equipment and software for creatives, from video editing programs to monitors and accessories. A veteran news writer and photographer, he now works as a project manager at the London and Buenos Aires-based design, production and branding agency Hermana Creatives. There he manages a team of designers, photographers and video editors who specialise in producing visual content and design assets for the hospitality sector.
Meta’s Vision for the Future
Google has begun rolling out private access to its Veo and Imagen 3 generative AI models. Starting today, customers of the company’s Vertex AI Google Cloud package can begin using Veo to generate videos from text prompts and images. Then, as of next week, Google will make Imagen 3, its latest text-to-image framework, available to those same users. Even Luma itself recently updated its Dream Machine platform to include new still image generation and brainstorming boards, and also debuted an iOS app.
Of course, had I been using a Python IDE (rather than a Jupyter notebook), I could have avoided the search step completely — I could have written a comment and gotten the code generated for me. This is hugely helpful, and speeds up development using general purpose APIs.

At least as far as YouTube is concerned, the worst offenders of AI plagiarism work by downloading the video’s subtitles, passing them through some sort of AI model, and then generating another YouTube video based on the original creator’s work. Most subtitle files are the fairly straightforward .srt filetype, which only allows for timing and text information. But a more obscure subtitle filetype known as Advanced SubStation Alpha, or .ass, allows for all kinds of subtitle customization like orientation, formatting, font types, colors, shadowing, and many others.
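The .srt format mentioned above really is simple: numbered cues, a `start --> end` timecode line, then the text. A minimal parser (the `parse_srt` helper is my own sketch, handling only the basic cue structure) shows how little information the format carries:

```python
import re

def parse_srt(srt_text):
    """Parse basic SubRip (.srt) cues: index, 'start --> end' timing, text.
    Ignores malformed blocks rather than raising."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue
        start, end = (t.strip() for t in lines[1].split("-->"))
        cues.append({"index": int(lines[0]), "start": start,
                     "end": end, "text": "\n".join(lines[2:])})
    return cues

sample = """1
00:00:01,000 --> 00:00:03,500
Hello, world.

2
00:00:04,000 --> 00:00:06,000
This is a subtitle."""

cues = parse_srt(sample)
```

Since a cue is just timing plus plain text, it is easy to see why .srt files are the preferred raw material for the plagiarism pipeline described above — and why the richer .ass format, with its styling and positioning tags, is harder to repurpose automatically.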
But fundamentally, it’s just fetching information that’s already out there on the internet and showing it to you, in some sort of structured way.

Central to Vidu 2.0 is Shengshu’s universal vision transformer (U-ViT) model, combined with its proprietary full-stack inference accelerator. These innovations allow the platform to deliver high-quality videos at speeds and costs previously unattainable in the market.
As the capabilities of generative AI models have grown, you’ve probably seen how they can transform simple text prompts into hyperrealistic images and even extended video clips. Descript stands out from other AI video editing tools – particularly ones that are available free online – with its user-friendliness and range of features. Everything is set up to make it quick for anybody to start producing and editing videos. Handy features such as eye-contact correction and “one-click” studio-quality sound are signs that the team is looking to add valuable innovations as the tool evolves.
Despite the model’s slow speed, high operating cost, and sometimes off-kilter outputs, he says it was an eye-opening moment for them to see fresh video clips generated from a random prompt. Since its launch, Haiper has continued to push the boundaries of video AI, introducing several tools, including a built-in HD upscaler and keyframe conditioning for more precise control over video content. The platform continues to evolve with plans to expand its AI tools, including features that support longer video generation and advanced content customization. Luma Labs’ Dream Machine is one of the best interfaces for working with artificial intelligence video and image platforms. It can be used to create high-quality, realistic videos from text and images.
Video security analysis for privileged access management using generative AI and Amazon Bedrock – AWS Blog
Posted: Wed, 22 Jan 2025 19:37:24 GMT
Amid a mix of cultural and economic factors impacting the industry, developers are also still dealing with company enthusiasm for technology that some find ethically concerning. We hope these new video and music generation technologies will inspire more people to bring their ideas to life in vivid, transformative ways. Runway’s tools have been used in various projects, including films and music videos, showcasing their impact on modern storytelling.
The update also introduces a new feature offering templates (preset prompts) to simplify video creation. Users can add detailed actions or props with just a few clicks, bypassing the need to manually write prompts. For example, an outfit or setting can be applied directly from the preconfigured library, making the platform accessible even for beginners. Shengshu plans to expand the template library over time, offering users more options for their projects. With Veo’s rollout, Google says it’s the first hyperscale cloud provider to offer an image-to-video model.
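The preset-template idea described above amounts to a small library of prompt fragments keyed by name, which users combine with a few add-ons instead of writing prompts from scratch. A hypothetical sketch (the template names, texts, and `build_prompt` helper are invented for illustration, not Shengshu's actual library):

```python
# Hypothetical preset prompt library, keyed by template name.
TEMPLATES = {
    "city_night": "A rain-slicked city street at night, neon reflections, slow dolly shot",
    "cozy_cafe": "A warm cafe interior, soft morning light, handheld camera",
}

def build_prompt(template_name, extras=None):
    """Compose a video prompt from a preset template plus optional
    user-selected actions or props."""
    prompt = TEMPLATES[template_name]
    if extras:
        prompt += ", " + ", ".join(extras)
    return prompt

p = build_prompt("cozy_cafe", ["a cat on the counter"])
```

The appeal for beginners is that the hard part of prompt writing (scene, lighting, camera language) lives in the preset, while the user only picks a template and a few extras.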
You can also opt to manually force it to search the web if it does not do so on its own. OpenAI won’t reveal how many people are using its web search, but it says some 250 million people use ChatGPT weekly, all of whom are potentially exposed to it. These overviews take information from around the web and Google’s Knowledge Graph and use the company’s Gemini language model to create answers to search queries. Take featured snippets, the passages Google sometimes chooses to highlight and show atop the results themselves.
You can then edit and re-arrange the timeline simply by snipping and moving the text. Modern generative systems such as Luma and Kling allow users to specify a start and an end frame, and can perform this task by analyzing keypoints in the two images and estimating a trajectory between them. VFI is also used in the development of better video codecs, and, more generally, in optical flow-based systems (including generative systems) that utilize advance knowledge of coming keyframes to optimize and shape the interstitial content that precedes them.
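A toy version of that start/end-frame interpolation: given matched keypoints in the two frames, each point's position at an intermediate time can be estimated by linear interpolation. Production systems fit richer motion models, but the principle is the same; the `interpolate_keypoints` helper here is an illustrative sketch:

```python
import numpy as np

def interpolate_keypoints(start_pts, end_pts, t):
    """Linearly interpolate matched keypoints between a start frame
    (t=0.0) and an end frame (t=1.0).
    start_pts / end_pts: (N, 2) arrays of (x, y) coordinates."""
    start_pts = np.asarray(start_pts, dtype=float)
    end_pts = np.asarray(end_pts, dtype=float)
    return (1.0 - t) * start_pts + t * end_pts

start = [[0.0, 0.0], [10.0, 5.0]]   # keypoints in the start frame
end = [[4.0, 8.0], [10.0, 9.0]]     # the same keypoints in the end frame
mid = interpolate_keypoints(start, end, t=0.5)  # estimated halfway positions
```

Sampling `t` densely between 0 and 1 yields a per-point trajectory that an interpolation or generative model can use to shape the in-between frames.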