    RAS Modeling of an HPC Switch System
    Update time: 2008-12-19

    Speaker: Dr. Dong Tang, Senior Staff Engineer, Sun Microsystems, Inc.
    Time: 9:30am, Dec 29, 2008 (MON)
    Place: 440, Institute of Computing Technology, Chinese Academy of Sciences

    Abstract:
    Interconnecting the sheer number of server nodes in a petascale HPC system is vital to the development of such systems. InfiniBand has emerged as a compelling interconnect technology, offering more scalability and significantly better cost performance than any other known protocol. This talk will discuss reliability, availability, and serviceability (RAS) modeling and analysis, with respect to hardware faults, of the Sun Datacenter Switch 3456 system, the world’s largest standards-based InfiniBand switch, with direct capacity to host up to 3,456 server nodes. The talk will also discuss the improvement in system reliability gained from redundant ports and deferred repair strategies.

    Bio:
    Dr. Dong Tang is a Senior Staff Engineer at Sun Microsystems, Inc.
    He works on reliability, availability, and serviceability (RAS) assessment for Sun hardware and software products. He designed and developed RAScad, a Sun internal RAS architecture modeling tool for system designers. Dr. Tang received his MS degree from ICT, CAS in 1983 and his PhD degree from the University of Illinois at Urbana-Champaign in 1992. Before joining Sun in 1999, he was a Senior Research Engineer at SoHaR Inc. for six years and Principal Investigator for several US government-sponsored research projects.
    He was the architect of MEADEP, a commercial dependability modeling and data analysis tool. He has frequently served on program committees for international conferences in the dependable computing area. He was Program Co-Chair for the 2005 International Conference on Dependable Systems and Networks (DSN 2005).

     
