大工至善|大学至真分享 http://blog.sciencenet.cn/u/lcj2212916



Viewed 1220 times | 2021-8-10 19:16 | Category: Research Notes | Source: Reposted


This is a master's thesis from Arizona State University, USA (author: Harshitha Katpally), 131 pages in total.






Capturing the information in an image in a natural language sentence is considered a difficult problem for computers. Image captioning involves not just detecting objects in images but also understanding the interactions between those objects so that they can be translated into relevant captions. Expertise in computer vision paired with natural language processing is therefore considered crucial for this purpose. The sequence-to-sequence modelling strategy of deep neural networks is the traditional approach to generating a sequential list of words that are combined to describe the image. However, these models suffer from high variance: they do not generalize well from the training data. The main focus of this thesis is to reduce this variance, which helps in generating better captions. To achieve this, ensemble learning techniques, which have a reputation for addressing the high-variance problem in machine learning algorithms, have been explored. Three ensemble techniques, namely k-fold ensemble, bootstrap aggregation ensemble, and boosting ensemble, have been evaluated in this thesis. For each of these techniques, three output combination approaches have been analyzed. Extensive experiments have been conducted on the Flickr8k dataset, which contains 8000 images with 5 different captions for every image. The BLEU score, which is considered to be the standard metric for evaluating natural language processing (NLP) problems, is used to evaluate the predictions. Based on this metric, the analysis shows that ensemble learning performs significantly better and generates more meaningful captions than any of the individual models used.
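To make the ensemble setup in the abstract more concrete, here is a minimal Python sketch of how training data might be partitioned for a k-fold ensemble and resampled for a bootstrap aggregation ensemble, followed by one simple way to combine the members' outputs. The function names (`k_fold_splits`, `bootstrap_sample`, `majority_caption`) are hypothetical, and the majority-vote rule is only an assumption for illustration: the abstract does not say which three output combination approaches the thesis uses. Boosting is left out because it depends on how the members are trained.

```python
import random
from collections import Counter


def k_fold_splits(n_items, k):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    indices = list(range(n_items))
    # Assign items to folds round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        val_idx = folds[held_out]
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train_idx, val_idx


def bootstrap_sample(n_items, rng):
    """Draw n_items indices with replacement (one bagging training set)."""
    return [rng.randrange(n_items) for _ in range(n_items)]


def majority_caption(captions):
    """Return the caption predicted most often; ties go to the earliest one.

    Assumption: this voting rule is illustrative, not the thesis's method.
    """
    counts = Counter(captions)
    return max(captions, key=lambda c: counts[c])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    for train_idx, val_idx in k_fold_splits(10, 5):
        print("train:", train_idx, "held out:", val_idx)
    print("bootstrap indices:", bootstrap_sample(10, rng))
    # Captions that three hypothetical ensemble members produced for one image.
    print(majority_caption(["a dog runs", "a dog jumps", "a dog runs"]))
```

In a k-fold ensemble each member would be trained on one `train_idx` set, while in bagging each member would be trained on its own `bootstrap_sample`. Either way the members see different data, which is what lets their combined prediction vary less than any single model's.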


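1. Introduction

2. Project Background

3. Related Work

4. Image Captioning System Architecture

5. Ensemble Learning for Image Captioning

6. Analysis

7. Conclusion and Future Work

Appendix: Selected Python source code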



