Sharing the Wisdom of Computing and the Insight of Philosophy http://blog.sciencenet.cn/u/huangfuqiang


Robust Resource Management for Parallel Computing Systems

Read 3444 times | 2010-5-15 08:44 | Personal category: Parallel Computing and Distributed Processing | System category: Blog News | Tags: for, Parallel, Robust, resource, Management

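At the Tsinghua University Department of Computer Science and Technology academic seminar series (2010, Lecture 4), Professor H. J. Siegel of the Colorado State University Department of Electrical and Computer Engineering gave a talk titled "Robust Resource Management for Parallel Computing Systems". The questions addressed in the talk are as follows:
The English text is taken from the PDF: http://www.cs.tsinghua.edu.cn/web/ListDetail.aspx?id=350&tid=17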
      Abstract:
What does it mean for a computer system to be “robust” (鲁棒)? How can robustness be described? How does one determine if a claim of robustness is true? How can one decide which of two systems is more robust? Parallel computing systems are often heterogeneous mixtures of machines, used to execute collections of tasks with diverse computational requirements. A critical research problem is how to allocate resources to tasks to optimize some performance objective. However, systems frequently have degraded performance due to uncertainties, such as inaccurate estimates of actual workload parameters. It is important for system performance to be robust against uncertainty. To accomplish this, we present a stochastic model for deriving the robustness of a resource allocation. This model assumes that stochastic (experiential) information is available for a parameter whose actual values are uncertain. The robustness of a resource allocation is quantified as the probability that a user-specified level of system performance can be met. We show how to use this stochastic model to evaluate the robustness of resource assignments and to design resource management heuristics that produce robust allocations. The stochastic robustness analysis approach can be applied to a variety of computing and communication system environments, including parallel, distributed, cluster, grid, Internet, cloud, embedded, multicore, content distribution networks, wireless networks, and sensor networks. Furthermore, the robustness model is generally applicable to design problems throughout various scientific and engineering fields.
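The abstract defines robustness as the probability that a user-specified performance level can be met, but it gives no formula or algorithm. The sketch below is one possible reading of that definition, not the speaker's method. It assumes the performance measure is makespan (the finishing time of the busiest machine), that execution times are independent, and that the "experiential" information takes the form of past execution-time samples. The task names, machine names, sample values, and the function `estimate_robustness` are all hypothetical, and the probability is estimated by plain Monte Carlo sampling.

```python
import random

# Hypothetical past execution times (seconds) of each task on each machine.
# These numbers are made up for illustration only.
SAMPLES = {
    "t1": {"m1": [4.0, 5.0, 6.0], "m2": [8.0, 9.0, 10.0]},
    "t2": {"m1": [3.0, 4.0, 7.0], "m2": [2.0, 3.0, 4.0]},
    "t3": {"m1": [5.0, 6.0, 9.0], "m2": [4.0, 5.0, 6.0]},
}


def estimate_robustness(allocation, samples, deadline, trials=10_000, seed=0):
    """Monte Carlo estimate of P(makespan <= deadline) for one allocation.

    allocation maps each task to the machine it runs on. Tasks on the same
    machine are assumed to run one after another, and execution times are
    assumed independent (both are simplifying assumptions).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    met = 0
    for _ in range(trials):
        finish = {}  # machine -> total busy time in this trial
        for task, machine in allocation.items():
            drawn = rng.choice(samples[task][machine])
            finish[machine] = finish.get(machine, 0.0) + drawn
        if max(finish.values()) <= deadline:
            met += 1
    return met / trials


# Two candidate allocations of the same three tasks.
alloc_a = {"t1": "m1", "t2": "m2", "t3": "m2"}
alloc_b = {"t1": "m1", "t2": "m1", "t3": "m2"}

if __name__ == "__main__":
    for name, alloc in (("A", alloc_a), ("B", alloc_b)):
        r = estimate_robustness(alloc, SAMPLES, deadline=9.0)
        print(f"allocation {name}: estimated robustness {r:.3f}")
```

Under these assumptions, enumerating the sample combinations by hand gives exact probabilities of 8/9 for allocation A and 5/9 for allocation B at a deadline of 9 seconds, so the estimates should land close to those values and A would be judged the more robust of the two. A resource management heuristic of the kind the abstract mentions could use such an estimate as the score it tries to maximize.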
Speaker:
H. J. Siegel is the George T. Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering at Colorado State University (CSU), where he is also a Professor of Computer Science. He is Director of the CSU Information Science and Technology Center (ISTeC), a university-wide organization for enhancing CSU’s activities pertaining to the design and innovative application of computer, communication, and information systems. Before joining CSU, he was a Professor at Purdue University from 1976 to 2001. He received two B.S. degrees from the Massachusetts Institute of Technology (MIT), and the M.A., M.S.E., and Ph.D. degrees from Princeton University. He is a Fellow of the IEEE and a Fellow of the ACM. Prof. Siegel has co-authored over 370 published technical papers in the areas of parallel and distributed computing. He was a Coeditor-in-Chief of the Journal of Parallel and Distributed Computing, and was on the Editorial Boards of the IEEE Transactions on Parallel and Distributed Systems and the IEEE Transactions on Computers.

Professor H. J. Siegel's personal homepage





https://wap.sciencenet.cn/blog-89075-324650.html
