Blog of 大工至善|大学至真 http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Information Technology] [2016.07] 2D and 3D Tracking and Modelling

889 views | 2021-2-13 16:14 | Category: Research Notes | Source: Repost



This post presents a doctoral thesis from the University of Surrey, UK (author: Karel Lebeda), 222 pages in total.

 


Visual tracking of unknown objects in unconstrained video-sequences is extremely challenging due to a number of unsolved issues. This thesis explores several of these and examines possible approaches to tackle them.

The unconstrained nature of real-world input sequences creates huge variation in the appearance of the target object due to changes in pose and lighting. Additionally, the object can be occluded by either parts of itself, other elements of the scene, or the frame boundaries. Observations may also be corrupted due to low resolution, motion blur, large frame-to-frame displacement, or incorrect exposure or focus of the camera. Finally, some objects are inherently difficult to track due to their (low) texture, specular/transparent nature, non-rigid deformations, etc.

Conventional trackers depend heavily on the texture of the target. This causes issues with transparent or untextured objects. Edge points can be used in cases where standard feature points are scarce; these, however, suffer from the aperture problem. To address this, the first contribution of this thesis explores the idea of virtual corners, using pairs of non-adjacent line correspondences, tangent to edges in the image. Furthermore, the chapter investigates the possibility of long-term tracking, introducing a re-detection scheme to handle occlusions while limiting drift of the object model. The outcome of this research is an edge-based tracker, able to track in scenarios including untextured objects, full occlusions, and sequences of significant length. The tracker, besides reporting excellent results in standard benchmarks, is demonstrated to successfully track the longest sequence published to date.
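To make the virtual-corner idea a little more concrete, here is a minimal Python sketch. It assumes that a virtual corner can be pictured as the intersection of two lines tangent to edges at different places in the image; this is one reading of the abstract, not the thesis's actual algorithm. The function names `line_through` and `virtual_corner` are hypothetical.

```python
# Illustrative sketch only: an edge point alone is ambiguous along its edge
# (the aperture problem), but two non-parallel tangent lines intersect in a
# point that is constrained in both directions.

def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through points p and q."""
    (x1, y1), (x2, y2) = p, q
    # Cross product of (x1, y1, 1) and (x2, y2, 1).
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def virtual_corner(line1, line2, eps=1e-9):
    """Intersection of two homogeneous lines, or None if they are (nearly) parallel."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    w = a1 * b2 - a2 * b1  # zero when the lines are parallel
    if abs(w) < eps:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)

# Two hypothetical tangent lines: one along a horizontal edge, one along a vertical edge.
horizontal = line_through((0, 0), (1, 0))
vertical = line_through((3, 1), (3, 2))
print(virtual_corner(horizontal, vertical))
```

The intersection need not lie on the object itself, which is why such a point can be called "virtual". Parallel tangents give no usable corner, so the sketch returns `None` in that case.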

Some of the issues in visual tracking are caused by suboptimal utilisation of the image information. The object of interest can easily occupy as little as ten or even one percent of the video frame area. This causes difficulties in challenging scenarios such as sudden camera shakes or full occlusions. To improve tracking in such cases, the next major contribution of this thesis explores relationships within the context of visual tracking, with a focus on causality. These include causal links between the tracked object and other elements of the scene such as the camera motion or other objects. Properties of such relationships are identified in a framework based on information theory. The resulting technique can be employed as a causality-based motion model to improve the results of virtually any tracker.
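The abstract does not say which information-theoretic quantity is used. As an illustration only, the sketch below estimates transfer entropy, one standard measure of directed dependence between two discrete time series, with a history length of one. The sequences `camera_motion` and `object_motion` are made-up toy data, and `transfer_entropy` is a hypothetical helper, not code from the thesis.

```python
from collections import Counter
from math import log2

def transfer_entropy(source, target):
    """Plug-in estimate of transfer entropy source -> target in bits (history length 1)."""
    n = len(target) - 1           # number of observed transitions
    triples = Counter()           # (target[t+1], target[t], source[t])
    pairs_now = Counter()         # (target[t], source[t])
    pairs_next = Counter()        # (target[t+1], target[t])
    singles = Counter()           # target[t]
    for t in range(n):
        triples[(target[t + 1], target[t], source[t])] += 1
        pairs_now[(target[t], source[t])] += 1
        pairs_next[(target[t + 1], target[t])] += 1
        singles[target[t]] += 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_now[(y_now, x_now)]            # p(y_next | y_now, x_now)
        p_given_self = pairs_next[(y_next, y_now)] / singles[y_now]  # p(y_next | y_now)
        te += p_joint * log2(p_given_both / p_given_self)
    return te

# Toy data (assumption): the object's quantised motion copies the camera's with a one-frame delay.
camera_motion = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
object_motion = [0] + camera_motion[:-1]

print("camera -> object:", transfer_entropy(camera_motion, object_motion))
print("object -> camera:", transfer_entropy(object_motion, camera_motion))
```

With so few samples the plug-in estimate is biased, so the reverse direction is not exactly zero. In this toy case the camera-to-object direction still scores higher, which is the kind of asymmetry a causality-based motion model could exploit.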

Significant effort has previously been devoted to rapid learning of object properties on the fly. However, state-of-the-art approaches still often fail in cases such as rapid out-of-plane rotations, when the appearance changes suddenly. One of the major contributions of this thesis is a radical rethinking of the traditional wisdom of modelling 3D motion as appearance change. Instead, 3D motion is modelled as 3D motion. This intuitive but previously unexplored approach provides new possibilities in visual tracking research.
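The contrast between "appearance change" and "3D motion" can be illustrated with a toy pinhole-camera sketch. It assumes a rigid square model, a rotation about the vertical axis, and arbitrary values for the focal length and camera distance; none of these come from the thesis, and the helper names are hypothetical.

```python
from math import cos, sin, radians

# Hypothetical rigid 3D model: the four corners of a planar square.
MODEL = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]

def rotate_y(point, angle_deg):
    """Rotate a 3D point about the vertical (y) axis: an out-of-plane rotation."""
    x, y, z = point
    a = radians(angle_deg)
    return (cos(a) * x + sin(a) * z, y, -sin(a) * x + cos(a) * z)

def project(point, focal=500.0, distance=5.0):
    """Pinhole projection; focal length and camera distance are assumed values."""
    x, y, z = point
    depth = z + distance
    return (focal * x / depth, focal * y / depth)

def projected_width(angle_deg):
    """Horizontal extent of the model in the image after rotating it by angle_deg."""
    us = [project(rotate_y(p, angle_deg))[0] for p in MODEL]
    return max(us) - min(us)

for angle in (0, 45, 80):
    print(angle, round(projected_width(angle), 1))
```

The 2D footprint shrinks sharply as the square turns away from the camera, which a 2D tracker has to absorb as an appearance change. In the 3D view the model itself never changes: only a single rotation parameter does.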

Firstly, 3D tracking is more general, as large out-of-plane motion is often fatal for 2D trackers, but helps 3D trackers to build better models. Secondly, the tracker’s internal model of the object can be used in many different applications and it could even become the main motivation, with tracking supporting reconstruction rather than vice versa.

This effectively bridges the gap between visual tracking and Structure from Motion. The proposed method is capable of successfully tracking sequences with extreme out-of-plane rotation, which poses a considerable challenge to 2D trackers. This is done by creating realistic 3D models of the targets, which then aid in tracking.

In the majority of the thesis, the assumption is made that the target’s 3D shape is rigid. This is, however, a relatively strong limitation. In the final chapter, tracking and dense modelling of non-rigid targets are explored, demonstrating results in even more generic (and therefore challenging) scenarios. This final advancement truly generalises the tracking problem with support for long-term tracking of low-texture and non-rigid objects in sequences with camera shake, shot cuts and significant rotation.

Taken together, these contributions address some of the major sources of failure in visual tracking. The presented research advances the field of visual tracking, facilitating tracking in scenarios which were previously infeasible. Excellent results are demonstrated in these challenging scenarios. Finally, this thesis demonstrates that 3D reconstruction and visual tracking can be used together to tackle difficult tasks.

 

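1. Introduction

2. Related Work

3. Long-Term 2D Tracking of Untextured Objects

4. Causality-Based Motion Models in Visual Tracking

5. Tracking and Structure from Motion

6. Dense 3D Tracking of Non-Rigid Objects

7. Conclusions and Future Work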






https://wap.sciencenet.cn/blog-69686-1272016.html
