Argus DC3500 User Manual

Content summary

Page 1 - CHIPP – CSCS F2F Meeting

CHIPP – CSCS F2F Meeting, Zürich, August 19th 2014. Sadaf Alam, George Brown, Miguel Gila, Gianni Ricciardi

Page 2 - HPC operations 2

Statistics – Swiss resources
  • Compute (8x nodes, ~3.5k HS06)
      – Deployed and fully operational.
      – Priority increase for Swiss users is pending.
  •

Page 3 - Tier 2 status and plans

Operations © CSCS 2014 - HPC operations 11

Page 4 - HPC operations 4

Operations
  • Deployed virtualization servers; evaluating oVirt and/or RHEV
  • Reserved 4 full nodes for ATLAS, 4 for CMS and 2 for LHCb
      – A manual f

Page 5 - GPFS problems (inodes)

Operations
Next maintenance on Sept 3 or 17. Significant changes:
  • Swiss users mapping and priority
      – need to define specific mappings for CMSch a

Page 6 - Statistics – CPU Usage

Operations
GPFS2 (https://wiki.chipp.ch/twiki/bin/view/LCGTier2/ServiceGPFS2)
  • First production service fully configured using Puppet
  • Observed pe

Page 7 - (HS06-hours)

Plans

Page 8 - (extra)

Phases of Phoenix (timeline figure; axis: 2012, 2013, 2014, 2015; labels: Phase H, Phase J, Phase K, Phase F+G; marker: Now)

Page 9 - Statistics – Storage usage

Pledges (table columns: Phase; Compute power actual/pledged [HS06]; Storage actual/pledged [TB]; Scratch actual/desired [GB/s]) Ph

Page 10 - Statistics – Swiss resources

Decommissions & purchases
Purchases
  • Storage for a total of 720 TiB
      – intended to replace 3x half-racks of IBM DC3500
  • 20x compute nodes (~8.5

Page 11 - Operations

Thank you for your attention

Page 12

Agenda
  • 9:45 - Coffee, presentation and agenda
  • 10:15 - Tier-2 status and plans
      – CSCS (40')
      – UNIBE-LHEP (20')
  • 11:15 - Tier-3 stat

Page 13

Extra slides

Page 14

NetApp problems (Swiss users storage)
  • Initial tests run on the storage were successful.
  • When the system was put in production under heavy I/O, p

Page 15 - HPC operations 15

GPFS issues
  • Metadata inode exhaustion
      – Due to several identified problems, inodes were exhausted on metadata servers.
      – This caused the whole cl

Page 16 - Phases of Phoenix

dCache issues
  • Information system not properly handling this
  • dCache provided an official fix for this in release 2.6.31.
      – We run 2.6.27

Page 17 - Pledges

Swiss National Argus service
  • 3 KVM VMs on 3 different KVM hosts.
  • Load-balanced with a common DNS alias: argus.lcg.cscs.ch
      – Similar to current

Page 18 - HPC operations 18

Tier 2 status and plans CSCS

Page 19 - Thank you for your attention

Status

Page 20 - Extra slides

Statistics – Availability & Reliability
  • Relatively stable operation with small hiccups:
      – GPFS: inode usage above threshold and IB cable broke

Page 21 - HPC operations 21

Statistics – CPU Usage
  • CPU usage increased (especially during July)

Page 22 - GPFS issues

Statistics – CPU Usage from EGI perspective
  • Computation hours restored to previous values over past months:
  • There is still a mismatch between l

Page 23 - HPC operations 23

Statistics – CPU Usage from EGI perspective (extra)
  • Total computation hours (HS06) (SUM)

Page 24 - Swiss National Argus service

Statistics – Storage usage
