A roundup of how to make animation with AI, with all kinds of tools for you to try
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">A report from Synced (机器之心)</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Editor: Panda W</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Image generation, video generation, face animation with integrated speech synthesis, 3D character motion generation, and LLM-driven tools... it is all in this article.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Generative AI has become a major source of content on the internet: you can now find AI-generated text, code, audio, images, video, and animation. The article we present today comes from Lithuanian blogger and animator aulerius, who introduces and catalogs, by category, the generative AI techniques used in animation, with brief descriptions, examples, pros and cons, and related tools.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">He writes: "As an animator, I wish a resource like this had existed a year ago, when all I could do was comb through a chaotic internet on my own for possibilities and the developments that kept emerging."</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The intended audience is anyone interested in this field, especially animators and creatives unsure how to navigate new developments in AI. One more note: although video stylization is a related technique, this article mostly does not cover it.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-axegupay5k/23d54ac1beb0493ca2938771a790fb0f~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=XVyxHLcKNIcwefYMPNHFNkq5g24%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The structure of this article.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Image generation</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Image generation refers to techniques that generate images with AI models trained on still images.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Using generated images as assets</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Still images generated by any AI application can be used as assets in traditional workflows such as 2D cutout animation, digital manipulation, and collage, or as resources for other AI tools, for example by feeding them to an image-to-video (image2video) tool. Beyond serving as a source of images and assets, this approach still relies on common skills such as cutting and image editing.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/7f7cb62397b24e99929cd2198faede57~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=9jKMl9PEyrXQfrUVJihSA%2BL0%2BXk%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The short film Planets and Robots animates generated AI images with digital cutout techniques; the voiceover was likewise generated by an LLM from a script.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p><ul style="color: black;"><li>Easy for working animators to pick up</li><li>Useful for generating background art</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p><ul style="color: black;"><li>The results offer little real "novelty"</li><li>The animator still has to coordinate the assets and the animation</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools (any image generation model or app):</span></p><ul style="color: black;"><li>Stable Diffusion (SD, runs on your local machine), or these online apps: Craiyon</li><li>Invokeai (uses SD)</li><li>Enfugue (uses SD)</li><li>SkyBox AI — generates 360° scenes suitable for VR</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Plugins and add-ons:</span></p><ul style="color: black;"><li>ComfyUI nodes for use inside Blender</li><li>Stable Diffusion for Krita</li><li>ComfyUI for Krita — a simple, artist-friendly interface</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">There are also free demos on Hugging Face Spaces: https://huggingface.co/spaces</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools (any image generation model or app):</span></p><ul style="color: black;"><li>MidJourney</li><li>Runway</li><li>DALL·E 2</li><li>Adobe FireFly</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Note: tools used for the animation itself include After Effects, Moho, Blender...</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Frame-by-frame image generation</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">These techniques use generative diffusion image models in a spirit fairly true to animation's roots: motion sequences are generated frame by frame, much like the draw-then-shoot process of traditional animation. A key point is that these models have no concept of time or motion when generating each image; some mechanism, application, or extension is needed to achieve a degree of animation, that is, "temporal consistency".</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Animation produced this way often flickers. While many users of these tools work hard to clean the flicker up, animators treat it as an art form in its own right, called boiling.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Most common here are open-source models such as Stable Diffusion and the tools built on them. Users can configure them through public parameters and run them on local machines. By contrast, the model behind the MidJourney tool is not public and is designed primarily for image generation, so it cannot be used for frame-by-frame animation.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/b9a4640a0faf4c6195c814fda55a19c0~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=Oqw8czUco6P%2Be4gfNhhIr4mj5GM%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Animation can also be made with Stable WarpFusion, which involves an image-to-image workflow that turns an underlying input video into animation through warping (displacement). Video by Sagans.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Animating with frame-by-frame images usually means mixing and matching the following tools:</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One-shot tools (text-to-image)</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Some recent techniques can produce animation directly from text prompts and parameter tuning:</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Parameter interpolation (morphing)</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Parameters are interpolated gradually across each generated frame to produce a transition animation. The parameters can include any model-related setting, such as the text prompt itself or the underlying seed (a latent space walk).</span></p>
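<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The latent/seed "travel" idea can be sketched in a few lines. The snippet below is a toy illustration rather than any specific tool's API: it spherically interpolates between two random latent vectors, one step per frame; a real pipeline would decode each interpolated latent with the diffusion model. The 64-dimensional latent size and the slerp helper are assumptions for illustration.</span></p>

```python
import numpy as np

def slerp(t, v0, v1):
    """Spherical interpolation between two latents, as commonly used
    for seed/latent 'travel' animations."""
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    theta = np.arccos(np.clip(np.dot(v0n, v1n), -1.0, 1.0))
    if theta < 1e-6:  # nearly parallel: plain lerp is fine
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

rng = np.random.default_rng(0)
latent_a = rng.standard_normal(64)  # latent for seed A (toy size)
latent_b = rng.standard_normal(64)  # latent for seed B

n_frames = 24
frames = [slerp(i / (n_frames - 1), latent_a, latent_b) for i in range(n_frames)]
# each frames[i] would then be decoded by the model into one animation frame
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tools such as the seed-travel script automate exactly this kind of walk, typically adding easing curves between keyframed seeds.</span></p>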
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/2364362cdbe9440b9535529a76843a13~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=S7dTlkJLomFK3guWf01lk15ewI4%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Prompt editing: gradually shifting prompt weights to create an animated transition. A Depth ControlNet is used here to keep the overall shape of the hand consistent.</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Image-to-image (I2I) feedback loop</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">With image-to-image techniques, each generated frame is fed back as input to generate the next frame of the animation. This produces sequences of similar-looking frames even as other parameters and the seed change. The process is usually controlled by the "denoising strength" or "strength schedule" in Deforum. The starting frame can be an existing picture.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This is a core component of most animation built on Stable Diffusion, which in turn underlies many of the applications listed below. The technique is hard to balance and depends heavily on the sampler (noise scheduler) used.</span></p>
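<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The feedback loop itself is simple to state in code. In this sketch, img2img is a hypothetical stand-in for a real diffusion image-to-image call (it only mixes in noise), so just the loop structure is meaningful: each output frame becomes the next input, driven by a Deforum-style per-frame strength schedule.</span></p>

```python
import numpy as np

def img2img(image, prompt, strength, rng):
    """Hypothetical stand-in for a diffusion image-to-image call.
    A real pipeline would re-noise `image` by `strength` and denoise it
    toward `prompt`; here we only mimic the data flow by mixing noise."""
    noise = rng.standard_normal(image.shape)
    return (1 - strength) * image + strength * noise

rng = np.random.default_rng(42)
frame = np.zeros((64, 64, 3))      # the starting frame may be an existing picture
strength_schedule = [0.35] * 48    # "denoising strength" per frame, as in Deforum

frames = [frame]
for strength in strength_schedule:
    frame = img2img(frame, "a melting clock, oil painting", strength, rng)
    frames.append(frame)           # each output is fed back as the next input
```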
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p26-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/c85ec372129e41d1a25903a4321d4495~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=MvQ2YHeqrcPT5TEIO5u8rR0yW8g%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Starting from an initial image and using a slightly different prompt, the image morphs frame by frame into other forms.</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">2D or 3D transforms (built on the I2I loop)</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Each generated frame is incrementally transformed before being fed back into the I2I loop. 2D transforms are simple translation, rotation, and zoom. 3D techniques instead imagine a virtual camera moving through 3D space, which usually requires estimating the 3D depth of each generated frame and warping it according to the imagined camera motion.</span></p>
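<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">A minimal version of the 2D case: before each frame re-enters the I2I loop, it is nudged by a transform such as a slight zoom. The nearest-neighbour zoom below is a toy illustration of that per-frame step (Deforum exposes this as translation/rotation/zoom parameters); the zoom factor is exaggerated for visibility.</span></p>

```python
import numpy as np

def zoom_2d(frame, factor):
    """Crop-and-resample zoom using nearest-neighbour indexing — a toy
    version of the per-frame 2D transform applied before the frame is
    fed back into the I2I loop."""
    h, w = frame.shape[:2]
    yy = (np.arange(h) - h / 2) / factor + h / 2   # map output rows to source rows
    xx = (np.arange(w) - w / 2) / factor + w / 2   # map output cols to source cols
    yy = np.clip(yy.round().astype(int), 0, h - 1)
    xx = np.clip(xx.round().astype(int), 0, w - 1)
    return frame[np.ix_(yy, xx)]

frame = np.zeros((64, 64))
frame[24:40, 24:40] = 1.0        # a bright square in the centre
zoomed = zoom_2d(frame, 1.2)     # zoom in a little each frame -> "infinite zoom"
```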
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/7d7ea29a94624ad6bd1ea052436dcdba~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=yMJ8dcRKobmHmASPUwFQoN%2BhNDU%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">You have probably seen this kind of infinite-zoom animation. It looks as striking as it does because SD keeps constructing new detail as the image zooms.</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Experimental techniques: motion synthesis, hybrids, and more</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Motion synthesis aims to "imagine" the motion flow between successive generated frames, then use that flow to warp each frame in turn, injecting organic motion on top of the I2I loop. This usually relies on AI models trained for motion estimation on video (optical flow), except that they attend not to successive video frames but to successive generated frames (via the I2I loop), or use some hybrid approach.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Other techniques include pairing inpainting with warping, multi-stage processing, and even advanced tricks such as capturing snapshots of the model during training. Deforum, for example, gives users a great many areas to tweak.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/fbffc024eeb34238bcf088416e1bbd51~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=v3P684uBkslDV6GKhZoRBsWfSDk%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Made with SD-CN Animation, which uses a distinctive method that hallucinates motion between generated frames. The starting image serves only as a starting point and has no other role.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Transform techniques (image-to-image):</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In addition, input from some source can be used to drive the generated frames and the resulting animation:</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Blending (stylization) — mixing in a video source and/or conditioning (ControlNets)</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This is a broad family of methods that uses input video to blend into and influence the generated sequence. The input video is usually split into frames, and the typical use is stylizing live-action footage. Amid the current craze for stylized dance and performance videos, these techniques are often used for anime looks and flattering physiques. But you can use anything as input, such as rough frames of your own animation or any chaotic, abstract footage. There is wide-open potential here for imitating pixilation (the stop-motion technique performed with live actors) and replacement animation.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">At each frame, the input frame can either be blended directly with the generated image before being fed back into each I2I loop, or be used in a more advanced conditioning setup such as ControlNet.</span></p>
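<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The direct-blend variant is a small extension of the I2I loop: each source video frame is mixed into the current generated frame before the next image-to-image step. As before, img2img is a hypothetical stand-in for a real diffusion call; only the blending structure is the point.</span></p>

```python
import numpy as np

def img2img(image, strength, rng):
    """Hypothetical stand-in for a diffusion image-to-image step."""
    return (1 - strength) * image + strength * rng.standard_normal(image.shape)

rng = np.random.default_rng(7)
# toy "source footage": a clip that fades from black to white
video_frames = [np.full((64, 64, 3), i / 47) for i in range(48)]

blend = 0.4                      # how strongly the source steers each step
frame = video_frames[0]
stylized = []
for src in video_frames:
    mixed = (1 - blend) * frame + blend * src  # blend the source frame into the loop
    frame = img2img(mixed, strength=0.3, rng=rng)
    stylized.append(frame)
```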
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/41d9baaabfb7479ca651dd420bd56b51~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=uHPx2zh4l08KnPf7yAY2gAOftQg%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Deforum's hybrid mode with ControlNet conditioning; the original video is on the left. The masking and background blur were done separately and are unrelated to this technique.</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Optical-flow warping (applied to the I2I loop using video input)</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">"Optical flow" refers to motion estimated from video, represented as per-frame motion vectors that indicate how each pixel moves in screen space. Once the flow of the source video in a warping workflow has been estimated, it can be used to warp the generated frames, so that generated textures "stick" to objects as the objects or the camera move.</span></p>
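<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Mechanically, flow warping is just per-pixel resampling: each output pixel looks up where it "came from" according to the flow vectors. Below is a toy nearest-neighbour version, assuming a dense flow field with x and y displacements in the last axis:</span></p>

```python
import numpy as np

def warp_by_flow(frame, flow):
    """Backward-warp `frame` by a dense flow field (toy nearest-neighbour
    version of what warping workflows do with estimated optical flow).
    flow[..., 0] / flow[..., 1] are per-pixel x / y displacements."""
    h, w = frame.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    src_x = np.clip((xx - flow[..., 0]).round().astype(int), 0, w - 1)
    src_y = np.clip((yy - flow[..., 1]).round().astype(int), 0, h - 1)
    return frame[src_y, src_x]

frame = np.zeros((32, 32))
frame[8:12, 8:12] = 1.0          # a generated texture patch
flow = np.zeros((32, 32, 2))
flow[..., 0] = 3                 # the whole scene shifts 3 px to the right
warped = warp_by_flow(frame, flow)
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">After warping, the result is fed back into the I2I loop, so the diffusion model only has to "repaint" what the warp could not carry over.</span></p>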
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/12596f2ba10245a5a7fc2697cbf9537e~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=dsKN%2FO17EtVTmAaksRuqkGBcvuY%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Deforum's hybrid mode supports this technique under various settings. To reduce flicker, the cadence is also increased, which makes the warping work better. The masking and background blur were done separately and are unrelated to this technique.</span></p><p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Deriving conditions from 3D</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The conditioning used in a warping workflow can also be tied directly to 3D data, skipping a potentially blur-inducing estimation step that would otherwise be performed on the video frames.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">For example, openpose or depth data can be supplied directly from a virtual 3D scene rather than estimated from video (or from CG-rendered video). This allows the most modular and controllable 3D-native approach, and it works especially well when combined with methods that aid temporal consistency.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This may be the most promising crossover between existing techniques and AI for VFX, as this video shows: https://youtu.be/lFE8yI4i0Yw?si=-a-GvsaIVPrdaQKm</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/be1bc6e6132e455796547c199f3a39c6~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=QsLttZBwM97XjfC1dOgjG1USXfw%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One widely used tool applies this technique, simplifying and automating the process of generating character images in Blender for direct use with ControlNet. In this example, ControlNet takes a hand armature and produces openpose, depth, and normal-map images, yielding the SD result on the far right. (openpose was ultimately dropped, because it turned out not to work when only hands are present.)</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Combining all these techniques yields a seemingly endless set of parameters for shaping the generated animation (much like modular audio production). They can be "scheduled" with keyframes and drawn as curves with a tool like Parseq, or linked to audio and music to produce all manner of audio-reactive animation. With just that, Stable Diffusion can dance for you.</span></p>
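<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Scheduling itself is simple: keyframed values are expanded to one value per frame by interpolation. The sketch below parses a Deforum-style schedule string and linearly interpolates between keyframes; the exact "frame:(value)" syntax is an assumption modelled on Deforum's schedules, not a verified grammar.</span></p>

```python
import numpy as np

def parse_schedule(schedule, n_frames):
    """Expand a keyframe string like "0:(1.0), 30:(0.4)" into one value
    per frame via linear interpolation (Deforum-style scheduling)."""
    keys = []
    for part in schedule.split(","):
        frame, value = part.split(":")
        keys.append((int(frame.strip()), float(value.strip().strip("()"))))
    keys.sort()
    frames = [k[0] for k in keys]
    values = [k[1] for k in keys]
    return np.interp(np.arange(n_frames), frames, values)

# e.g. ease the denoising strength down and back up over 61 frames
strengths = parse_schedule("0:(0.65), 30:(0.3), 60:(0.65)", 61)
```

<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Audio-reactive animation works the same way, except the per-frame values come from an audio feature (for example, amplitude) instead of hand-placed keyframes.</span></p>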
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p><ul style="color: black;"><li>A brand-new, constantly evolving aesthetic that is unique to this medium.</li><li>Conceptual common ground with traditional animation techniques.</li><li>The easiest to customize, the most practical, and the most directable.</li><li>A modular, layered approach.</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p><ul style="color: black;"><li>Often flickers and can look chaotic.</li><li>Many technical factors to weigh and balance; mastery means a steep learning curve.</li><li>Inconvenient without capable local hardware (an NVIDIA GPU).</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools:</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Tools usable in the A1111 webui:</span></p><ul style="color: black;"><li>Small scripts for parameter-interpolation ("travel") animations: steps (https://github.com/vladmandic/sd-extension-steps-animation), prompt (https://github.com/Kahsolt/stable-diffusion-webui-prompt-travel), seed (https://github.com/yownas/seed_travel).</li><li>Deforum — the best workshop for every kind of SD animation need, integrating most of the techniques above.</li><li>Parseq — a popular visual parameter-sequencing tool for Deforum.</li><li>Deforum timeline helper — another parameter visualization and scheduling tool.</li><li>Deforumation — a GUI for controlling Deforum parameters in real time, with reactive adjustment and control.</li><li>TemporalKit — adopts some of EBsynth's principles; pairs with SD for consistent video stylization.</li><li>SD-CN Animation — still somewhat experimental; supports some hybrid stylization workflows as well as an interesting optical-flow motion synthesis (which introduces motion jitter).</li><li>TemporalNet — a ControlNet model usable in Deforum and other workflows, aimed at improving temporal consistency. A Python notebook (needs to run on Google Colab or Jupyter).</li><li>Stable WarpFusion — an experimental code toolkit for advanced video stylization and animation; shares many features with Deforum.</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Plugins and add-ons:</span></p><ul style="color: black;"><li>Dream Textures for Blender</li><li>Stability AI's Blender plugin</li><li>Openpose-style character rigs for Blender — for using ControlNet outside of Blender</li><li>Unreal Diffusion for Unreal Engine 5</li><li>After-Diffusion for After Effects (still in development)</li><li>A1111 or ComfyUI API for TouchDesigner — if you know what you are doing, this can drive animation and all sorts of other tasks</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools:</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">(usually also built on SD, but running in the "cloud" and simpler to use):</span></p><ul style="color: black;"><li>Stability AI's animation API</li><li>Kaiber's Flipbook mode — based on the Deforum code, according to its description</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Plugins and add-ons:</span></p><ul style="color: black;"><li>Diffusae for After Effects</li></ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Many more apps and tools are on the market, but if they are paid, most are built on the open-source Deforum code.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Note: ideally you have hardware good enough (i.e., a GPU) to run these tools locally. If not, you can try free, feature-limited services that run on remote machines, such as Google Colab. That said, Colab notebooks can also be run on local hardware.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Video generation techniques</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">These techniques use video generation AI models trained on moving footage, optionally enhanced with temporal compression at the neural-network level.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Currently these models share one trait: they can only handle very short clips (a few seconds), limited by the video memory available on the GPU. But the field is moving fast, and there are ways to stitch multiple generations into longer videos.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Video generation models</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This refers to models built and trained from scratch to work with video.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Today's results often shake badly, carry obvious AI artifacts, and look uncanny, much like image generation models did long ago. The field lags somewhat but is progressing quickly. Personally, I doubt the progress made in still image generation will be reproduced at the same rate for video, because video generation is far harder.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/2dd503815ad4491a9cffb1670bcfa92e~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=ZMSaZvc4ou4J4GCMN%2Fc7fXXp8w4%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paul Trillo <span style="color: black;">运用</span> Runway 的 Gen-2,仅<span style="color: black;">经过</span>图像和文本 prompt 让 AI 生成的视频。</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">视频链接:https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">我认为在这方面,动画和传统电影之间的界限很模糊。只要其结果还与现实有差异,<span style="color: black;">那样</span><span style="color: black;">咱们</span>就<span style="color: black;">能够</span>在<span style="color: black;">必定</span>程度上把它们看作是动画和视频艺术的一种怪异新流派。就<span style="color: black;">日前</span>而言,我认为<span style="color: black;">大众</span>还是别想着用这类技术做真实风格的电影了,只把它视为一种新形式的实验<span style="color: black;">媒介</span><span style="color: black;">就可</span>。玩得开心哦!</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">一步到位的工具(文本转视频):<span style="color: black;">运用</span>文本 prompt 生成全新的视频片段</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In theory, the possibilities here are unlimited: as long as you can describe it (just as with static image generation), you could use it for live performance or to generate any surreal, stylized content. In practice, however, collecting a dataset diverse and large enough to train a video model is much harder, so it is difficult to achieve niche aesthetic styles with these models through text conditioning alone.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">With this approach, you only get loose control over the creative work. It becomes far more powerful when combined with image or video conditioning (i.e., the transformation workflows).</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p26-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/898e84ced7c04a72aab5b417bea12f03~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=r%2BcMePfjh88UuNGpS4HHTLRGHWM%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">An animation-generation test by Kyle Wiggers, using Runway's Gen-2</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Transformation: a text prompt, further conditioned on an existing image or video</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Image-to-video generation</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Many video-generation tools let you condition the video on an image. This can mean generating strictly from the image you specify as the starting frame, or using the image only as a rough reference for semantics, composition, and color.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">People often generate the initial image with a conventional static image model, then feed it into the video model.</span></p>
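<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The two-stage workflow described above can be sketched as follows. This is a minimal sketch only: the two functions are hypothetical stubs standing in for real models (for example, Stable Diffusion for the still image and Stable Video Diffusion for the animation), and are not a real API.</span></p>

```python
# Sketch of the image-to-video workflow: a static image model produces the
# first frame, then an image-conditioned video model animates it.
# Both functions are stubs standing in for real models.

def generate_image(prompt: str) -> dict:
    """Stub for a text-to-image model: returns a fake 'image' record."""
    return {"prompt": prompt, "kind": "image"}

def image_to_video(image: dict, num_frames: int = 14) -> list:
    """Stub for an image-conditioned video model: the input image becomes
    frame 0, and the model 'imagines' the remaining frames."""
    return [image] + [{"prompt": image["prompt"], "kind": "frame", "t": t}
                      for t in range(1, num_frames)]

# The workflow: prompt -> still image -> animated clip.
still = generate_image("an album cover, oil painting, stormy sea")
clip = image_to_video(still, num_frames=14)
print(len(clip))  # 14 frames, the first of which is the source image
```

The point is only the ordering: the image model gives you full creative control over the first frame; the video model then decides how it moves.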
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/fd8f61cecd4a4451adf492ae59f08c36~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=0tlxkM6tFpkvaHWmkzqFRcanVoI%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Each of the clips here was generated from an album cover as the initial image; by Stable Reel</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Video-to-video generation</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Similar to the image-to-image process in image models, it is also possible to embed information from an input video into the video model, add a text prompt, and have it generate (denoise) the output.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">I don't fully understand the details, but the process seems to match the input clip not only at the per-frame level (as with stylization via Stable Diffusion) but also at the global and motion level. As with image-to-image generation, the process is controlled by the denoising strength.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/f635b5dd6b4149248b21fab8c49e9a37~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=ZYFZ4hnNyIQ66HEUUpxCPXVaG9U%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">With some luck and the right prompt, you can also feed in a video to "inspire" the model to reimagine the motion of the source clip and present it in a completely different form. Made with Zeroscope in the webui txt2vid, using vid2vid mode.</span></p>
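<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The denoising-strength control mentioned above works the same way as in image-to-image: the input is noised up to some intermediate step of the diffusion schedule, and denoising resumes from there. A minimal sketch of that step arithmetic (the standard img2img/vid2vid convention; the numbers are illustrative):</span></p>

```python
def start_step(num_steps: int, strength: float) -> int:
    """Standard img2img/vid2vid convention: denoising strength chooses how
    far into the noise schedule the input is pushed before denoising resumes.
    strength=0 keeps the input untouched; strength=1 ignores it entirely."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Skip the first (1 - strength) fraction of the schedule.
    return round(num_steps * (1.0 - strength))

# With 50 sampling steps:
print(start_step(50, 0.0))   # 50 -> no denoising steps run, output == input
print(start_step(50, 0.4))   # 30 -> 20 denoising steps, stays close to the source
print(start_step(50, 1.0))   # 0  -> full generation, source is only a vague hint
```

Low strength preserves the source clip's motion and composition; high strength lets the prompt reinvent it.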
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p>
<ul style="color: black;">
<li>This class of techniques has the greatest potential, and it will keep improving over time.</li>
<li>No entry barrier in terms of professional animation knowledge.</li>
<li>The results are often smoother, and usually more consistent, than those of frame-by-frame techniques.</li>
<li>For "transformation" workflows, this can be a simpler and more direct approach than frame-by-frame methods.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p>
<ul style="color: black;">
<li>The results usually look bizarre and obviously AI-generated, far more so than with static images. This is especially noticeable in photorealistic footage of people.</li>
<li>Computationally expensive; harder to run on local hardware than image AI.</li>
<li>Limited to short clips and short context (for now).</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools:</span></p>
<ul style="color: black;">
<li>Stable Video (SVD): an open-source video diffusion model from StabilityAI. Many hosted apps and tools are rapidly deploying it.</li>
<li>SVD ComfyUI implementation</li>
<li>SVD temporal ControlNet</li>
<li>MotionCtrl: an enhancement that allows controlling target motion and camera trajectories in various video models.</li>
<li>Emu Video: a preview demo of Meta's video-generation model.</li>
<li>A text-to-video extension for the A1111 webui, usable with the following models (if your hardware is up to it):
<ul>
<li>VideoCrafter</li>
<li>Zeroscope</li>
</ul>
</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Plugins and add-ons:</span></p>
<ul style="color: black;">
<li>Pallaidium for Blender: a multifunctional toolkit with generative features spanning image, video, and even audio.</li>
<li>In addition, you can find some free demos on Hugging Face Spaces.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools (with trial versions):</span></p>
<ul style="color: black;">
<li>Runway's Gen2</li>
<li>Kaiber's Motion mode</li>
<li>Pika Labs (limited beta)</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Note: ideally, you have hardware good enough (i.e., a GPU) to run these tools locally. If not, you can also try free services that run on remote machines, such as Google Colab, though most free or trial services are limited in functionality.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Image models augmented with motion modules</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">With the growing popularity of AnimateDiff, an emerging area augments existing image diffusion models with a video, or "motion", module. The results are closer to those of native video models (introduced above) than to those of frame-by-frame techniques. The advantage is that you can also use the tooling built for image models such as Stable Diffusion: any community-made checkpoint, LoRA, ControlNet, and other conditioning tools.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">You may even be able to provide video conditioning through ControlNet, as with frame-by-frame techniques. The community is still actively experimenting with this; some of the applicable techniques come from static image models (e.g., prompt traveling), others from native video models.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The animation below was made with AnimateDiff in ComfyUI, cycling through several different prompt subjects.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://www.instagram.com/p/Cx-iecPusza/?utm_source=ig_embed&utm_campaign=embed_video_watch_again</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The motion itself is usually very primitive: objects and flows are only loosely interpolated across the clip, and things often morph into something else. Still, this approach has better temporal consistency, and it is only in its infancy. It works best when the scene is abstract and contains no concrete objects.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p>
<ul style="color: black;">
<li>Benefits from the progress of existing image diffusion models.</li>
<li>Can be conditioned on video, via denoising or ControlNet.</li>
<li>Handles abstract, flowing motion well.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p>
<ul style="color: black;">
<li>Hard to produce complex, coherent motion for characters or unusual objects; deformation artifacts are common instead.</li>
<li>Like native video models, computationally expensive and harder to run on local hardware than image AI.</li>
<li>Limited by a short context window (for now), though some people are experimenting with workarounds.</li>
</ul>
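<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One common workaround for the short context window is to slide an overlapping window over the frame sequence and blend the shared frames between neighbouring chunks. A simplified sketch of that scheduling idea (the window and overlap sizes here are illustrative, and real implementations also handle the blending itself):</span></p>

```python
def context_windows(num_frames: int, context: int = 16, overlap: int = 4):
    """Split a long animation into overlapping windows that each fit the
    model's context, so neighbouring chunks share frames that can be blended
    for temporal consistency."""
    if context <= overlap:
        raise ValueError("context must exceed overlap")
    stride = context - overlap
    windows, start = [], 0
    while start + context < num_frames:
        windows.append(list(range(start, start + context)))
        start += stride
    # Final window is pinned to the end so every frame is covered.
    windows.append(list(range(max(0, num_frames - context), num_frames)))
    return windows

# 64 frames with a 16-frame context and a 4-frame overlap:
for w in context_windows(64, 16, 4):
    print(w[0], "..", w[-1])
```

Each window is generated separately by the motion-module model, and the overlapping frames are averaged (or otherwise blended) so the chunks join smoothly.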
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools:</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Implementations of AnimateDiff (SD v1.5) currently lead the way:</span></p>
<ul style="color: black;">
<li>A1111 webui extension for AnimateDiff</li>
<li>AnimateDiff implementation in ComfyUI</li>
<li>VisionCrafter: a GUI tool for projects such as the AnimateDiff implementation</li>
<li>For SD XL: Hotshot-XL</li>
<li>Multi-purpose implementation: Enfugue</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools:</span></p>
<ul style="color: black;">
<li>None so far, it seems.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Face animation combined with speech synthesis</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">You probably know this as the technology behind a popular meme format. You may have seen a mostly static character (the camera may move) with only the face moving as it speaks; that is usually a combination of AI face-animation and speech-synthesis tools.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Several technical steps and components are combined here. The source image is often made with image-generation AI, but any image with a face will do. The speech is generated from text and conditioned on the voice of the chosen character. Another tool (or another model in the same toolkit) then synthesizes face animation lip-synced to the audio, usually animating only the face and head region of the image. With a pre-trained digital avatar, the body can be made to move as well.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/2d3bc48ffff146c297f6be7d801df134~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=wyEURueygFHodC0Wy6mGT3hcoNY%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Before releasing the popular Balenciaga video, its author demonflyingfox published a step-by-step tutorial: https://youtu.be/rDp_8lPUbWY?si=BWNKe7-KTJpCrNjF</span></p>
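<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">The pipeline described above chains independent pieces in a fixed order. The functions below are hypothetical stubs (real counterparts would be, say, ElevenLabs for the speech and Wav2Lip for the lip sync), shown only to make the order of the steps and the face-only scope of the animation concrete:</span></p>

```python
def synthesize_speech(text: str, voice: str) -> dict:
    """Stub TTS step: text plus a target voice -> an audio track.
    Duration here is a rough chars-per-second guess, purely illustrative."""
    return {"audio": text, "voice": voice, "duration_s": len(text) / 15}

def lipsync_face(image: str, audio: dict) -> dict:
    """Stub lip-sync step: animates only the face/head region of the image,
    timed to the audio track; the rest of the frame stays static."""
    return {"source": image, "frames": int(audio["duration_s"] * 25),
            "animated_region": "face"}

# Step order: text -> speech -> lip-synced talking head.
speech = synthesize_speech("Welcome to the runway.", voice="deep-narrator")
talking_head = lipsync_face("portrait.png", speech)
print(talking_head["animated_region"])  # only the face moves
```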
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p>
<ul style="color: black;">
<li>Can be used to make meme animations easily.</li>
<li>…well, comedic value?</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p>
<ul style="color: black;">
<li>Usually looks unnatural. I can't yet think of a practical use for it.</li>
<li>Over-reliant on closed-source face-animation tools offered by paid apps.</li>
<li>Even if you train a digital avatar on your own footage, the result is too stiff, with poor dynamics.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools:</span></p>
<ul style="color: black;">
<li>ElevenLabs: has usage limits, but the quota seems to refresh every month.</li>
<li>Wav2Lip extension for the A1111 WebUI: a tool for generating lip-sync animation. It appears to be limited to the mouth region.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">You can also search the web for text-to-speech services directly; there are countless of them, but most won't match ElevenLabs in quality.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">As for full-face animation, as far as I know, only some paid apps currently offer trial versions, and they are quite restricted.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools (with trial versions):</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Face animation (usually paired with speech synthesis):</span></p>
<ul style="color: black;">
<li>D-ID</li>
<li>Heygen</li>
<li>Synesthesia</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Searching for "D-ID alternatives" will turn up many more.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">Generating 3D character motion</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">This refers to techniques that synthesize motion for 3D characters. They can be applied to 3D animated films, video games, and other interactive 3D applications. As in the image and video domains, emerging AI tools let you describe a character's motion in text. In addition, some tools can build motion from just a few key poses, or generate animation dynamically in real time within an interactive environment.</span></p>
<div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/tos-cn-i-6w9my0ksvp/92186a6e99b143cea93d2f64a90c2158~noop.image?_iz=58558&from=article.pc_detail&lk3s=953192f4&x-expires=1728100717&x-signature=BnIJSD9RI6JymPkQIxlRN1gQ3ZQ%3D" style="width: 50%; margin-bottom: 20px;"></div>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Video link: https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650907887&idx=4&sn=ca30f3fbde94ec74b32d75b638013594&chksm=84e46091b393e987d442c8c414bdb9b76741d60116bee3419f36a3cb5961906e5d33b4ab312b&token=1179435113&lang=zh_CN#rd</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Nikita's ingenious meta AI movie trailer, which presents the AI's motion-learning process as a comical short film.</span></p>
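<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">"Building motion from a few key poses," as mentioned above, ultimately means filling in the frames between sparse keyframes. The AI tools do this with learned, physics-aware models, but the underlying idea can be illustrated with plain linear interpolation of joint angles. This is a deliberately naive sketch; real tools such as Cascadeur do far more than interpolate:</span></p>

```python
def lerp_pose(pose_a: dict, pose_b: dict, t: float) -> dict:
    """Linearly interpolate each joint angle between two key poses (t in [0, 1])."""
    return {joint: (1 - t) * a + t * pose_b[joint] for joint, a in pose_a.items()}

def inbetween(key_a: dict, key_b: dict, num_frames: int) -> list:
    """Generate the frames between two key poses, inclusive of both keys."""
    return [lerp_pose(key_a, key_b, i / (num_frames - 1)) for i in range(num_frames)]

# Two key poses for a simple rig (joint angles in degrees):
crouch = {"knee": 90.0, "elbow": 45.0}
stand = {"knee": 0.0, "elbow": 0.0}
frames = inbetween(crouch, stand, num_frames=5)
print(frames[2])  # the midpoint pose: {'knee': 45.0, 'elbow': 22.5}
```

A learned model replaces the straight-line interpolation with trajectories that respect weight, balance, and momentum, which is exactly what makes the generated motion look physical.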
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Since this article focuses on generative tools, it does not cover AI applications that automate non-creative tasks, such as AI-driven motion tracking, compositing, and masking; examples include Move.ai and Wonder Dynamics.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p>
<ul style="color: black;">
<li>Can be integrated into existing 3D animation pipelines, reducing repetitive work; promising as an assistant for experienced animators.</li>
<li>Handles physics and weight well.</li>
<li>Dynamic character animation in future video games?</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p>
<ul style="color: black;">
<li>Seems limited to humanoid, bipedal characters.</li>
<li>Still needs other tools; it is only one component of a 3D animation pipeline, and you need to know what to do next.</li>
<li>Training is usually based on human motion data, which means that so far these tools can only produce motion grounded in real physics, not stylized or cartoon motion mechanics.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools (or services with some free functionality):</span></p>
<ul style="color: black;">
<li>Mootion</li>
<li>Omni Animation</li>
<li>Cascadeur: an animation assistant that creates smooth, physics-based animation and poses from minimal input. Highly controllable; it may become a future workhorse.</li>
<li>MDM, MotionDiffuse, and ReMoDiffuse implementations in ComfyUI.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools:</span></p>
<ul style="color: black;">
<li>Paid tiers of the free tools above offer more features and fewer usage limits.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">LLM-driven tools</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In theory, since large language models (LLMs) excel at programming tasks, especially after fine-tuning, we can have them program and write scripts inside animation software. That means AI could assist from start to finish in a conventional animation workflow; in the extreme case, it could do all the work for you while delegating the appropriate tasks to back-end processes.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">In practice, you can already try this! For example, Blender ships with a very extensive Python API that allows the tool to be driven through code, and several ChatGPT-like assistants for it already exist. The trend is inevitable: wherever there is code, LLMs will likely find a use.</span></p>
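<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">To make this concrete: an assistant of this kind wraps an LLM call and hands the generated Python to Blender's `bpy` API for execution. In the sketch below, the LLM call is a stub with a canned answer, and the returned script is only shown as a string; the `bpy` calls inside it are real Blender API, but nothing here executes them:</span></p>

```python
def ask_llm_for_blender_script(request: str) -> str:
    """Stub for an LLM call; a real assistant would send `request` to a chat
    model with instructions to reply with Blender Python only."""
    # Canned answer illustrating the kind of code such assistants return:
    # add a cube and keyframe its position at frames 1 and 24.
    return (
        "import bpy\n"
        "bpy.ops.mesh.primitive_cube_add(location=(0, 0, 0))\n"
        "cube = bpy.context.active_object\n"
        "cube.keyframe_insert(data_path='location', frame=1)\n"
        "cube.location.z = 5.0\n"
        "cube.keyframe_insert(data_path='location', frame=24)\n"
    )

script = ask_llm_for_blender_script("animate a cube rising over one second")
# Inside Blender, the assistant would run the generated code with exec(script).
print(script.splitlines()[0])
```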
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Pros:</span></p>
<ul style="color: black;">
<li>Potential: to eventually break through any technical barrier facing creative workers.</li>
<li>Can act as an assistant inside creative software, removing tedious repetitive tasks and helping you dig deep into documentation.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Cons:</span></p>
<ul style="color: black;">
<li>If AI can create everything for you, what is left for you as a creative worker?</li>
<li>For now, LLMs can only run on powerful remote machines, usually billed per token or by subscription.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Free tools:</span></p>
<ul style="color: black;">
<li>Blender Chat Companion: (similar to Blender Copilot) a ChatGPT implementation inside Blender, dedicated to its particular tasks. Uses the ChatGPT API, which requires payment.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Paid tools:</span></p>
<ul style="color: black;">
<li>Genmo: promises "creative general intelligence", using a multi-step process controlled entirely through a chat interface.</li>
<li>Blender Copilot: (similar to Blender Chat Companion) a ChatGPT implementation inside Blender, dedicated to its particular tasks. Uses the ChatGPT API, which requires payment.</li>
</ul>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Note: there is also the upcoming ChatUSD, a chatbot that can manipulate and manage USD, the standard originally created by Pixar to unify and simplify 3D data exchange and parallelization in animated film production. There is no further information yet, but NVIDIA seems to be embracing the standard and pushing it to become the standard for all kinds of 3D content, not just film.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">That's finally it! It is a lot, but I have probably still missed something. If you think anything is missing, or other relevant tools deserve a mention, please share them with us in the comments.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Original article: https://diffusionpilot.blogspot.com/2023/09/overview-ai-animation.html#id_generative_video_models</span></p>