# StyleGAN

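• Because $z$ is usually drawn from a normalized distribution, its factors are hard to disentangle. The initial 8-layer FC network maps $z$ to $w$ in order to disentangle the features encoded in the latent variable.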

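• Style transfer: $\operatorname{AdaIN}(\mathbf{x}_{i}, \mathbf{y})=\mathbf{y}_{s, i} \frac{\mathbf{x}_{i}-\mu(\mathbf{x}_{i})}{\sigma(\mathbf{x}_{i})}+\mathbf{y}_{b, i}$. The feature map $\mathbf{x}_i$ originally carries its own statistics $\mu(\mathbf{x}_i), \sigma(\mathbf{x}_i)$. After normalization it is multiplied by the scale $\mathbf{y}_{s,i}$ (which plays the role of a standard deviation) and shifted by the bias $\mathbf{y}_{b,i}$ (which plays the role of a mean), giving the style in the feature space $Y_i$. In the original AdaIN these are $\sigma(\mathbf{y})$ and $\mu(\mathbf{y})$ of a style image; in StyleGAN they come from a learned affine transform of $w$.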
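
A minimal NumPy sketch of the AdaIN formula above, for a single sample. The function name `adain`, the `(C, H, W)` layout, and the `eps` stabilizer are assumptions made for illustration; they are not taken from the paper's code.

```python
import numpy as np

def adain(x, y_s, y_b, eps=1e-5):
    """Adaptive instance normalization for one sample (illustrative sketch).

    x   : feature maps, shape (C, H, W)
    y_s : per-channel style scale, shape (C,)
    y_b : per-channel style bias, shape (C,)
    """
    mu = x.mean(axis=(1, 2), keepdims=True)     # per-channel mean of x
    sigma = x.std(axis=(1, 2), keepdims=True)   # per-channel std of x
    x_norm = (x - mu) / (sigma + eps)           # strip x's own statistics
    return y_s[:, None, None] * x_norm + y_b[:, None, None]

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(4, 8, 8))
y_s = np.array([0.5, 1.0, 2.0, 4.0])
y_b = np.array([-1.0, 0.0, 1.0, 2.0])
out = adain(x, y_s, y_b)
```

After the call, each channel of `out` has a mean close to `y_b` and a standard deviation close to `y_s`, which is why the scale and bias can be read as the new "style" statistics.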

• Explains the properties of its generator architecture

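• Style Mixing: sample two different latent codes $z_1, z_2$, pass them through the FC network to get $w_1, w_2$, and generate two images A and B. Then feed the upper part of the network in figure (b) with the $w_1$ styles and the lower part with the $w_2$ styles to obtain a new image $C$. Varying how many layers count as the "upper part" versus the "lower part" shows which features each layer actually controls.
• Noise: the observation is that slightly perturbing the outputs of different layers has different effects. The authors' hypothesized reason: We hypothesize that at any point in the generator, there is pressure to introduce new content as soon as possible, and the easiest way for our network to create stochastic variation is to rely on the noise provided.
• Proposes two methods for quantifying the degree of feature disentanglement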
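
The style-mixing procedure described above can be sketched as building a per-layer list of style vectors with a crossover point. The helper name `mix_styles`, the 18 layers, and the 512-dim vectors are illustrative assumptions, and the random vectors only stand in for the outputs of the FC mapping network.

```python
import numpy as np

def mix_styles(w1, w2, num_layers, crossover):
    """Per-layer styles: w1 for layers [0, crossover), w2 for the rest."""
    if not 0 <= crossover <= num_layers:
        raise ValueError("crossover must lie in [0, num_layers]")
    return [w1 if layer < crossover else w2 for layer in range(num_layers)]

rng = np.random.default_rng(0)
w1 = rng.normal(size=512)   # stands in for FC(z1)
w2 = rng.normal(size=512)   # stands in for FC(z2)

# Early layers take their style from w1, later layers from w2.
styles = mix_styles(w1, w2, num_layers=18, crossover=4)
```

Sweeping `crossover` from small to large values corresponds to changing how many layers belong to the "upper part", which is how the notes suggest probing what each layer controls.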

• Perceptual path length

• Motivation: when interpolating from $z_1$ to $z_2$, the image should change as little as possible

• This is to avoid that features that are absent in either endpoint may appear in the middle of a linear interpolation path.
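
One way to turn this motivation into a number is to compare images generated at two nearby points on an interpolation path and average the scaled distance. The sketch below is a toy version under loud assumptions: `toy_generator` (identity) and `squared_l2` stand in for a real generator and a perceptual distance, and the spherical interpolation, the step `eps=1e-4`, and the sample count are illustrative choices.

```python
import numpy as np

def slerp(a, b, t):
    """Spherical interpolation between the directions of a and b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if omega < 1e-8:                 # nearly parallel: nothing to interpolate
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def path_length(generator, distance, latent_dim, n_samples=200, eps=1e-4, seed=0):
    """Average of distance(G(z(t)), G(z(t + eps))) / eps**2 over random paths."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        z1, z2 = rng.normal(size=(2, latent_dim))
        t = rng.uniform(0.0, 1.0)
        img_a = generator(slerp(z1, z2, t))
        img_b = generator(slerp(z1, z2, t + eps))
        total += distance(img_a, img_b) / eps**2
    return total / n_samples

def toy_generator(z):
    return z                         # assumption: identity instead of a real GAN

def squared_l2(a, b):
    return float(np.sum((a - b) ** 2))   # assumption: not a perceptual metric

score = path_length(toy_generator, squared_l2, latent_dim=16)
```

A lower score means the output changes less per unit step along the path; with a real generator the distance would be computed in a perceptual feature space rather than pixel space.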

• Linear separability

• Motivation: in a disentangled space it should be possible to find a direction vector that corresponds to a single feature
• We propose another metric that quantifies this effect by measuring how well the latent-space points can be separated into two distinct sets via a linear hyperplane, so that each set corresponds to a specific binary attribute of the image.
• Use SVM…

# Enhancing photorealism enhancement

• G-buffer
• Image Enhancement Network
• Modified from HRNetV2 (which performs strongly on dense prediction tasks)
• Training Objectives
• LPIPS Loss: focuses on structural differences
• Realism Score
• A specific sampling strategy during training

## Methods

### Layout Differences cause Artifacts

• Solution? Sample only matching patches. So what counts as "matching" between different datasets?
• Crop size: 7% of the full image per patch
• Extract a 1 x 1 x 512 feature map from the last ReLU layer of a VGG network pretrained on ImageNet
• Let $p_i, p_j$ be patches from different datasets; call them "matching" if the cosine similarity of their features is > 0.5.
• Using FAISS could accelerate the process!
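
The matching rule above can be sketched with a brute-force cosine-similarity matrix. The function name `matching_pairs` and the tiny 3-dim features are illustrative assumptions; real features would be the 512-dim VGG vectors, and an approximate nearest-neighbor library such as FAISS would replace the dense matrix at scale.

```python
import numpy as np

def matching_pairs(feats_a, feats_b, threshold=0.5):
    """Index pairs (i, j) whose cosine similarity exceeds the threshold.

    feats_a : (Na, D) patch features from one dataset
    feats_b : (Nb, D) patch features from the other dataset
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T                                  # (Na, Nb) cosine similarities
    i_idx, j_idx = np.nonzero(sim > threshold)
    return list(zip(i_idx.tolist(), j_idx.tolist())), sim

# Toy features (assumption: 3-dim instead of 512-dim for readability).
feats_a = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
feats_b = np.array([[2.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0],
                    [0.0, 0.0, 3.0]])
pairs, sim = matching_pairs(feats_a, feats_b)
```

Because the features are normalized first, the dot product equals the cosine similarity, so the same normalized vectors could be handed to an inner-product index.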

### Metrics

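IS, FID, and KID are commonly used scores for measuring image realism. The problem with KID, however, is that it only measures realism: even if the structure of the depicted objects has changed, KID does not reflect it. This work therefore proposes a new metric, sKVD: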

• Extract patches of 1/8 image size from the semantic label maps of source and target datasets
• Downsample these patches to 16x16 resolution to obtain a 256-dim vector
• For each such vector from the synthetic dataset, we find the nearest neighbor in the set of vectors from the real dataset. We retain pairs of vectors with more than 50% matching entries.
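
The pair-retention step above can be sketched as follows. The notes do not say which metric defines "nearest", so this sketch assumes the nearest neighbor is the real vector with the largest fraction of matching label entries; the function name `retained_pairs` and the toy label values are also assumptions. It covers only the patch-pairing step, not the rest of the sKVD computation.

```python
import numpy as np

def retained_pairs(syn, real, min_match=0.5):
    """Pair each synthetic label vector with its nearest real label vector.

    syn  : (Ns, 256) integer label vectors from the synthetic dataset
    real : (Nr, 256) integer label vectors from the real dataset
    Returns (i, j, match_fraction) for pairs agreeing on > min_match entries.
    """
    # Fraction of equal entries for every synthetic/real pair: shape (Ns, Nr).
    match = (syn[:, None, :] == real[None, :, :]).mean(axis=2)
    nearest = match.argmax(axis=1)       # assumption: nearest = most matches
    pairs = []
    for i, j in enumerate(nearest):
        frac = match[i, j]
        if frac > min_match:
            pairs.append((i, int(j), float(frac)))
    return pairs

real = np.zeros((2, 256), dtype=int)
real[1] = 1                              # second real patch: all label 1
syn = np.zeros((2, 256), dtype=int)
syn[0, :64] = 2                          # 192 of 256 entries match real[0]
syn[1] = 3                               # matches nothing, so it is dropped
pairs = retained_pairs(syn, real)
```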

# Standardized Max Logits

A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation

Key Idea: standardize the max logits to align the different distributions and reflect the relative meanings of max logits within each predicted class.

1. Standardize the max logits in a class-wise manner
2. Iterative boundary suppression
   1. Propagate the SMLs of the neighboring non-boundary pixels to the boundary regions
   2. Start from the outer areas of the boundary and move to the inner areas
   3. To be specific, we assume the boundary width as a particular value and update the boundaries by iteratively reducing the boundary width at each iteration.
3. Dilated smoothing
   1. Gaussian kernel
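
The first step, class-wise standardization of the max logits, can be sketched in NumPy as below. The function names, the `(C, H, W)` logit layout, and the small stabilizer added to the standard deviation are assumptions for illustration. In this toy example the statistics are estimated on the same logits they are applied to; in practice they would be collected beforehand on training data. Boundary suppression and dilated smoothing are not covered here.

```python
import numpy as np

def class_stats(logits, num_classes):
    """Mean and std of the max logit for each predicted class.

    logits : (C, H, W) segmentation logits
    """
    pred = logits.argmax(axis=0)
    max_logit = logits.max(axis=0)
    mean = np.zeros(num_classes)
    std = np.ones(num_classes)
    for c in range(num_classes):
        vals = max_logit[pred == c]
        if vals.size > 0:                 # classes never predicted keep (0, 1)
            mean[c] = vals.mean()
            std[c] = vals.std() + 1e-8    # avoid division by zero
    return mean, std

def standardized_max_logits(logits, mean, std):
    """Standardize each pixel's max logit with its predicted class's stats."""
    pred = logits.argmax(axis=0)
    max_logit = logits.max(axis=0)
    return (max_logit - mean[pred]) / std[pred]

rng = np.random.default_rng(0)
# Toy logits whose scale differs per class, so raw max logits are not comparable.
logits = rng.normal(size=(3, 16, 16)) * np.array([1.0, 2.0, 3.0])[:, None, None]
mean, std = class_stats(logits, num_classes=3)
sml = standardized_max_logits(logits, mean, std)
```

After standardization, the scores within each predicted class have roughly zero mean and unit variance, so a single threshold on `sml` means the same thing across classes.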