CP0000 - A Maverick Pig

My 2023

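The pandemic ended around New Year's Day 2023. After China lifted its zero-COVID policy and one large wave of infections passed, COVID faded from people's lives in 2023. The three-year pandemic was finally over. In 2023...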

Review: MobileDiffusion

This post covers a recent Google paper on porting diffusion models to mobile devices, MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices. Google's blog has a related post in English: http...

Review: Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

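These are reading notes on Animate Anyone. Animate Anyone introduces a method for generating an animated video from an image combined with a pose sequence. Method architecture diagram. The method can be summarized as...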

Living to 35

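Age anxiety. I am 35 now. Chinese internet companies are known for the saying that you lose your job at 35, and China's civil service exams also require applicants to be under 35. I think the vast majority of people doing internet-related work in China feel some anxiety about age...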

Review: Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

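This paper proposes Emu, a text-to-image model built on the latent diffusion framework. Using a small set of about 2000 high-quality images, the pre-trained model undergoes qua...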

Review: IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

This paper proposes an Image Prompt Adapter that provides Image Prompt-like functionality. From the figure above, we can see that IP-Adapter lets us achieve (1) Image Varia...

GigaGAN Text-to-Image: Scaling up GANs for Text-to-Image Synthesis

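These are notes on https://arxiv.org/abs/2303.05511. Over the past two years, large text-to-image models based on diffusion probabilistic models and autoregressive models have advanced rapidly. GANs, the SOTA generative networks of 2020 and 2021, in this wave of large text-to-image model development...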

Stable Diffusion XL Technical Report

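Preface. This is a Chinese translation of the technical report SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. SDXL is a new text-to-image model released by Stability AI after SD 1.5 and SD 2.0. The model is currently drawing lively discussion on reddit...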

An Introduction to the Imagen Text-to-Image Diffusion Model

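Review Imagen. This post is an introduction to the paper Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. Imagen is a text-to-image paper that Google published in May 2022. Its main framework is based on diffusion models. Relative to other text-to-image models (such as DALLE-...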

Text-to-Image Diffusion Models

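Text-to-image Diffusion Models in Generative AI: A Survey. This post is a translation and summary of Text-to-image Diffusion Models in Generative AI: A Survey. Since OpenAI released DALL-E in February 2021, text-to-image generation has come to the attention of both AI researchers and the general public. Then, in 2022, with DALL-E 2...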