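AI Startups: Chase the Trend, but Don't Put Blind Faith in the "Dividend". An Interview with AI Human-Computer Interaction Expert Ji Xiaobai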
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Since open-source image-generation AI, led by Stable Diffusion, took off in 2023, AI has made qualitative breakthroughs in both images and video. From voice AI to the AI video models at the frontier of innovation, artificial intelligence, after six decades of ups and downs, is gradually approaching the tipping point of industrialization.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Ji Xiaobai currently works at a leading global internet social media company and is also a long-time entrepreneur in image-generation AI. After earning a master's degree from a top global university, Ji has been devoted to research and entrepreneurship in turning image-generation AI into products, and the resulting work has won multiple awards. How can AI be turned into new quality productive forces? A Ziniu News reporter interviewed Ji Xiaobai, an expert in human-computer interaction.</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Yangtse Evening Post / Ziniu News reporter <strong style="color: blue;"><span style="color: black;">Wang Saisai</span></strong></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://q2.itc.cn/q_70/images03/20240523/be151f5a54944a6caf48a001deae1f73.jpeg" style="width: 50%; margin-bottom: 20px;"></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Image source: Visual China Group</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">I</p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Whether it is Siri on the iPhone, Xiao Ai or Tmall Genie, voice assistants have been seen as AI technology in product form from the day they were born. So why have most users not adopted them for frequent use?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: I think there are two main reasons: limited use cases and low efficiency. Voice AI can be traced back to the 1950s, and for a long stretch it relied on human maintenance to answer questions. In recent years it has made a qualitative leap: the voice AI offered by ChatGPT, for example, can not only answer a user's questions but also steer the conversation in a more valuable direction. But as a carrier of information, speech often has a lower information density than images, and in many scenarios it is hard to convey complex information fully by voice alone. In addition, voice AI is inherently weak at personalization. It struggles to sense your preferences from your tone and word choice, cannot obtain your usage data, and so cannot recommend content to you effectively. The inherent advantage of a graphical interface is that, whether on Taobao or Douyin, how long you stay after tapping in and which kinds of products you view most can all serve as the basis for recommending videos and products to you.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Yet voice AI remains an important battleground in the AI race. At the 2024 Beijing Auto Show, for instance, in-car AI voice interaction was a highlight of many new energy vehicles. What other directions can voice AI develop in?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: There are now many companies building intelligent-driving technology on large models, but even industry-leading speech recognition algorithms struggle to reach 100% transcription accuracy in noisy environments or with varying accents and speaking speeds. Yet in the machine age people have developed a certain habit: they are used to humans making mistakes but cannot accept machines making them. That is precisely what limits in-car AI voice interaction. Moreover, in the car, demand for voice AI is not high-frequency, and monetization is rather limited, mostly bundled into in-vehicle system subscription services. For ordinary AI entrepreneurs, the chance of success is small. By comparison, voice AI has a wider track in emotional companionship. Elderly people living alone often lack human contact, and a companion voice assistant can offer them emotional support through natural conversation; when keeping children company, a voice assistant can play fun interactive games with them; and some young people want to date their favorite anime characters, talk with idols they admire, or chat with virtual characters they like. These are all spaces where voice AI could be applied.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><strong style="color: blue;">II</strong></span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: At the end of last year, the British journal Nature published its forecast of ten major scientific developments for 2024, with advances in artificial intelligence and ChatGPT taking the top two places. How humans should treat AI that might possess consciousness has also become a topic of attention.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: Some researchers predict that AI will develop consciousness within 5 to 20 years. But what I do know is that in 2023 not a single grant went to "researching AI consciousness." I believe AI is still at an early stage. Right now people seem to hold some unrealistic fantasies about AI, assuming it has already evolved human-like thinking, with feelings and thoughts, able to write papers in place of humans and even fall in love with them. That is wishful thinking.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: In January 2023, a group of American artists filed a class action against three commercial generative AI companies, alleging that their image-generation software produced images in the style of the artists' works. The court held that AI-generated images are not themselves protected by copyright and that this did not violate copyright law. This year, Google faces a class action brought by three cartoonists and a photographer. How do you view these lawsuits?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: This also confirms that today's AI does not have human thinking. At present, when you ask AI to write a biomedical paper or paint an abstract painting, it is essentially learning from papers people have written and paintings people have made, then remixing them according to the current request and outputting the result. AI is not creative, especially in the image field. The copyright disputes that image-generation AI so often runs into arise because humans cannot control its output well. This shows there is a lot of room to explore more effective ways of human-computer interaction.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: More and more entrepreneurs are pouring into AI. In image generation, you could say three generations, old, middle-aged and young, have all joined in, only to find after entering that it is not easy. What is your view?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: In terms of input mode, today's human-computer interaction for image-generation AI falls into four types: generating images from text; from text plus an image; from text, an image and a style preference; and from text, an image and annotated key regions. In terms of rounds, it is either single-turn or multi-turn. Well-designed, simplified interaction does more to help consumer-facing (C-end) products succeed. How to give image-generation AI a good interaction experience, helping people express what they want accurately while lowering the learning curve and the barrier to entry, is what both entrepreneurs and ordinary users need, and it is the focus of my research.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Many entrepreneurs believe that demand on the C-end (individual users) is large, and that pursuing C-end expansion makes success easier.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: Some of Silicon Valley's early image-generation AI products were aimed at ordinary consumers, who paid per use or through a monthly subscription. Most consumers came to try something new, so user growth was fast, but retention and paid conversion were low, essentially because the ceiling for general-purpose image-generation AI is too low. Now the focus of image-generation AI is shifting from the C-end to the enterprise-facing B-end.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: What are the clear differences between B-end products and C-end products?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: B-end customers do not like randomness and cannot accept too much freedom. They generate images to get work done and cannot let the AI run wild. B-end products therefore need to offer richer multi-round generation and refinement capabilities in an accessible way, and this must be taken into account in the interaction design.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Then where can C-end products find practical application?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: There is definitely market space for C-end products. In the short term, the practical direction for image-generation AI in C-end products is widespread needs at a low price per customer. Do not fall into chasing a high price per customer. The essence of AI is to replace repetitive human work and expensive labor, so pursuing a high price per customer is a mistake. AI cannot create luxury goods, and it also rarely touches high-frequency needs. In daily life we go out to eat and buy groceries, then come home to scroll through short videos and sleep; we seldom need to create images. So a high price per customer and high frequency, in the traditional market sense, are both off the table. C-end entrepreneurs must look at a range of widespread needs, for example building an integrated AI image tool that includes AI makeup, one-click background replacement, one-click relighting, and AI-generated ID photos and art portraits. These are not high-frequency needs for any single user, but across society as a whole the cumulative number of uses will be considerable.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Among B-end users, e-commerce is a huge group and a key focus for ordinary AI entrepreneurs. What advice do you have for AI startups targeting e-commerce?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: E-commerce was actually the first industry to come into contact with image-generation AI, because the industry is highly sensitive to new information and has massive image-processing needs. E-commerce practitioners often come to talk with us at work. Their demand is enormous: whether it is image generation, batch post-processing or one-click outfit swapping, in essence it is about trading the cost of reshooting photos for the low cost of AI. However, their needs are very mixed and highly customized, and the volume for each specific type of need is not large, so image-generation AI actually serves this industry poorly. One business owner once approached us hoping to take flat-lay photos of hats, scarves, gloves and similar items and generate them directly onto a model. From a technical standpoint, we would have had to do custom development for the items, which carries a certain labor cost. After running the numbers, the e-commerce business found that hiring a model for a quick shoot was more efficient and cheaper. The project did not succeed in the end, but it offers a glimpse of the whole industry.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><strong style="color: blue;">III</strong></span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: On February 15, OpenAI released its text-to-video model Sora along with 48 videos it had generated, drawing intense public attention. In the medium to long term, where is image-generation AI headed?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: I think video generation is at the break of dawn. The market prospects for video are far greater than for images, which is also why products such as Douyin and TikTok have succeeded. After 20 years of internet development, people are thoroughly used to consuming video, and habits have shifted from reading static media to consuming dynamic content. I expect video generation to reach commercially usable maturity around 2025. Once the technology matures, OpenAI may build its own video platform and compete directly with Douyin and TikTok for video consumers, while also bringing change to the film and television industry. That industry will not necessarily want to generate video from scratch, but it will be interested in fixing continuity errors in footage, AI-generating grand backdrops for virtual sets, and producing visual effects that cannot be filmed. The film and television industry has a high willingness to pay, and working closely with it will be an important opportunity in the second half of this decade.</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ziniu News</strong><span style="color: black;">: Will AI replace human photography and image creation?</span></p>
<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Ji Xiaobai</strong><span style="color: black;">: In the long run, image-generation AI will work in collaboration with the human brain rather than replace human photography and image creation. Photography captures objective things, and it is also an expression of the photographer's emotions and thoughts. I love painting and photography. AI technology keeps evolving, but the subjective expression of humans actively creating can never be replaced.</span></p>