大工至善|大学至真 (blog): http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Computer Science] [2017] Image Quality Enhancement Based on Deep Learning

1262 reads 2021-5-9 17:15 | Category: Research Notes | Source: Repost



This post presents the doctoral thesis of Ruxin Wang (University of Technology Sydney, Australia), 172 pages in total.

 

Enhancing image quality is a classical image processing problem that has received considerable attention over the past several decades. Most vision tasks expect a high-quality input image, so degradations such as noise, low resolution, and blur must be removed. Conventional techniques for this task have made great progress, but recent deep models substantially outperform them. Deep learning owes this success to its high representational capacity and the strong nonlinearity of its models. In this thesis, we explore the development of advanced deep models for image quality enhancement by studying several fundamental issues, each with a different motivation. We are first motivated by a pivotal property of the human perceptual system: similar visual cues can stimulate the same neuron and induce similar neurological signals. Image degradations, however, can cause similar local structures in an image to produce dissimilar observations. Because conventional neural networks do not take this property into account, we develop the (stacked) non-local auto-encoder, which exploits self-similar information in natural images to make signal propagation through the network more stable. The expectation is that similar structures should induce similar network propagation. This is achieved by constraining the difference between the hidden representations of non-local similar image blocks during training. Applying the proposed model to image restoration, we then develop a collaborative stabilisation step to further rectify forward propagation.
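The constraint on hidden representations of similar image blocks can be illustrated with a small NumPy sketch. This is not the thesis's formulation, which the abstract does not spell out. It assumes a single sigmoid encoder layer, Euclidean distance between raw patches as the similarity measure, and a hypothetical function name `nonlocal_penalty` with a hypothetical neighbour count `k`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nonlocal_penalty(patches, W, b, k=3):
    """Mean squared distance between each patch's hidden code and the hidden
    codes of its k most similar patches (hypothetical regulariser)."""
    hidden = sigmoid(patches @ W + b)                  # (n, h) hidden codes
    # Pairwise squared Euclidean distances between the raw patches.
    sq = np.sum(patches ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * patches @ patches.T
    np.fill_diagonal(dist, np.inf)                     # exclude self-matches
    neighbours = np.argsort(dist, axis=1)[:, :k]       # (n, k) neighbour indices
    diff = hidden[:, None, :] - hidden[neighbours]     # (n, k, h) code differences
    return np.mean(np.sum(diff ** 2, axis=2))

# Toy data: 20 flattened 4x4 patches and an untrained 16 -> 8 encoder.
rng = np.random.default_rng(0)
patches = rng.normal(size=(20, 16))
W = 0.1 * rng.normal(size=(16, 8))
b = np.zeros(8)
penalty = nonlocal_penalty(patches, W, b)
```

In training, a term like this would presumably be added to the reconstruction loss with a weighting factor, so that similar blocks are pushed toward similar hidden codes. The weighting and the neighbour search strategy are design choices the abstract does not specify.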

When applying deep models to image quality enhancement tasks, a natural question is which factor is more critical: receptive field size or model depth. To answer it, we focus on the single image super-resolution task and propose a strategy based on dilated convolution to investigate how the two factors affect performance. Our exhaustive investigations suggest that single image super-resolution is more sensitive to changes in receptive field size than to variations in model depth, and that the model depth must be congruent with the receptive field size to improve performance. These findings inspire us to design a shallower architecture that saves computation and memory while remaining comparably effective to a much deeper architecture. Finally, we study the general non-blind image deconvolution problem. In practice, with existing deconvolution techniques, the residual between the sharp image and the estimate depends heavily on both the sharp image and the noise. These techniques require a separate restoration model for each blur kernel and noise level, which leads to low computational efficiency or highly redundant model parameters. For general-purpose use, we therefore propose a very deep convolutional neural network that can handle different kernels and noise levels while remaining effective and efficient. Instead of directly outputting the deconvolved result, the model predicts the residual between a pre-deconvolved image and the corresponding sharp image, which makes training easier and yields restored images with suppressed artifacts.
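To make the receptive-field versus depth comparison above concrete, the sketch below computes the receptive field of a stack of stride-1 convolutions, where each layer adds `(kernel_size - 1) * dilation` pixels. The two layer configurations are illustrative assumptions, not the architectures studied in the thesis, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(layers):
    """Receptive field (in pixels, along one axis) of stacked stride-1
    convolutions given as (kernel_size, dilation) pairs."""
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Illustrative configurations (assumptions, not taken from the thesis).
deep_plain = [(3, 1)] * 8                            # eight ordinary 3x3 layers
shallow_dilated = [(3, 1), (3, 2), (3, 4), (3, 1)]   # four layers, two dilated

print(receptive_field(deep_plain))       # 17
print(receptive_field(shallow_dilated))  # 17
```

Dilation lets the four-layer stack see the same 17-pixel window as the eight-layer one, which is how the two factors can be varied independently in an experiment.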

 

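1. Introduction

2. Literature Review

3. Non-local Auto-encoder with Collaborative Stabilisation

4. Receptive Field vs. Model Depth

5. Residual Learning in Non-blind Image Deconvolution

6. Conclusion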






https://wap.sciencenet.cn/blog-69686-1285726.html
