我们的方法在概念上很简单。我们采用现成的扩散模型并使用它 估计图像的不同视图或变换中的噪声。 \(v_i\) 然后,通过应用逆视图、 、 \(v_i^{-1}\) 并一起平均。然后,该平均噪声估计值用于执行扩散步骤。
我们发现并非每个视图函数都适用于上述方法。当然, \(v_i\) 必须 是可逆的,但我们讨论了另外两个约束。
对扩散模型进行训练,以估计噪声数据 \(\mathbf{x}_t\) 中的噪声 准时步 \(t\)长。嘈杂的数据 \(\mathbf{x}_t\) 应具有以下形式 \[\mathbf{x}_t = w_t^{\text{signal}}\underbrace{\mathbf{x}_0}_{\text{signal}} + w_t^{\text{noise}}\underbrace{\epsilon\vphantom{\mathbf{x}_0}}_{\text{noise}}.\] 也就是说, \(\mathbf{x}_t\) 是纯信号 \(\mathbf{x_0}\) 的加权平均值 和纯噪声 \(\epsilon\),特别是权重 \(w_t^{\text{signal}}\) 和 \(w_t^{\text{noise}}\)。 因此,我们认为, \(v\) 必须保持信号和噪声之间的这种权重。这是可以实现的 通过使 \(v\) 线性,我们用方阵 \(\mathbf{A}\)表示。按线性度 \[\begin{aligned} v(\mathbf{x}_t) &= \mathbf{A}(w_t^{\text{signal}} \mathbf{x}_0+w_t^{\text{noise}} \epsilon)\\[7pt] &= w_t^{\text{signal}} \underbrace{\mathbf{A}\mathbf{x}_0}_{\text{new signal}} + w_t^{\text{noise}} \underbrace{\mathbf{A}\epsilon}_{\text{new noise}}. \end{aligned}\] Effectively, \(v\) acts on the signal and the noise independently, and combines the result with the correct weighting.
扩散模型的训练假设是噪声是从标准法线中提取的。 因此,我们必须确保转换后的噪声也遵循这些统计数据。也就是说,我们需要 \[\mathbf{A}\epsilon \sim \mathcal{N}(0, I).\] 对于线性变换,这等效于正交条件 \(\mathbf{A}\) 。 直观地说,正交矩阵遵循标准多元高斯分布的球对称性。
因此,要使转换与我们的方法一起使用, 它必须是正交的就足够了。
This project is inspired by previous work in this area, including:
Diffusion Illusions, by Ryan Burgert et al., which produces multi-view illusions, along with other visual effects, through score distillation sampling.
This colab notebook by Matthew Tancik, which introduces a similar idea to ours. We improve upon it significantly in terms of quality of illusions, range of transformations, and theoretical analysis.
Recent work by a pseudonymous artist, Ugleh, uses a Stable Diffusion model finetuned for generating QR codes to produce images whose global structure subtly matches a given template image.
@article{geng2023visualanagrams,
title = {Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models},
author = {Geng, Daniel and Park, Inbum and Owens, Andrew},
journal = {arXiv:2311.17919},
year = {2023},
month = {November},
abbr = {Preprint},
url = {https://arxiv.org/abs/2311.17919},
}