Labyrinth's blog: devoted to research on pedestrian traffic and evacuation dynamics



Read 3905 times | 2011-2-12 09:50 | Personal category: 百家 (Hundred Schools) | System category: Research notes | Images

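A recent issue of Science carried a special section titled Dealing with Data. The first figure of one article in it, Metaknowledge, shows how a new student in a field, a well-known professor, and a computer differ in the information they extract from the same article. It is interesting and worth thinking about. I am posting it here; I am not sure whether this raises a copyright issue, and if it does I will take it down.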
Fig. 1. Readers vary in the information they extract from an article. A new
graduate student perceives a tiny fraction of available information, focusing on
familiar authors, terms, references, and institutions. Her evaluation is limited to
categorical classification (e.g., of the authors) into known and unknown (“important”
and “unimportant”). For comparison she has the small collection of papers she has
read. A leading scientist perceives a wealth of latent data, assembling individuals
into mentorship relations and locating terms, as well as graphical and mathematical
idioms, in historical and theoretical context. His evaluations generate rank orders
based on his experience in the field. He can compare a paper to thousands, and
searches a large literature efficiently. An appropriately trained computer would
complement this expertise with quantification and scale. It can rapidly access
quantitative and relational information about authors, terms, and institutions, and
order these items along a range of measures. For comparison it can already access a
large fraction of the scientific literature—millions of articles and an increasing pool
of digitized books; in the future it will scrape further data from Web pages, online
databases, video records of conferences, etc.
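Reading the caption, I realize that most of the time I still read papers at the level of the new graduate student. I extract a fair amount of information, but it does not form a coherent whole: I fail to grasp a problem from a macroscopic perspective, and my understanding and summary of it fall short. This is something I need to improve.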



