The ATLAS Computing Model embraces the Grid paradigm and a high degree of decentralization and sharing of computing resources. The required level of computing resources means that off-site facilities will be vital to the operation of ATLAS in a way that was not the case for previous CERN-based experiments.
The primary event processing occurs at CERN in a Tier-0 facility. The RAW data is archived at CERN and copied (along with the primary processed data) to the Tier-1 facilities around the world. These facilities archive the raw data, provide the reprocessing capacity, provide access to the various processed versions, and allow scheduled analysis of the processed data by physics analysis groups. Derived datasets produced by the physics groups are copied to the Tier-2 facilities for further analysis. The Tier-2 facilities also provide the simulation capacity for the experiment, with the simulated data housed at Tier-1s. In addition, Tier-2 centres will provide analysis facilities, and some will provide the capacity to produce calibrations based on processing raw data. A CERN Analysis Facility provides an additional analysis capacity, with an important role in the calibration and algorithmic development work.
ATLAS has adopted an object-oriented approach to software, based primarily on the C++ programming language, but with some components implemented using FORTRAN and Java. A component-based model has been adopted, whereby applications are built up from collections of plug-compatible components based on a variety of configuration files. This capability is supported by a common framework that provides common data-processing support. This approach results in great flexibility in meeting the basic processing needs of the experiment, and also for responding to changing requirements throughout its lifetime. The heavy use of abstract interfaces allows for different implementations to be provided, supporting different persistency technologies, or optimized for the offline or high-level trigger environments.
The Athena framework is an enhanced version of the Gaudi framework that was originally developed by the LHCb experiment, but is now a common ATLAS-LHCb project. Major design principles are the clear separation of data and algorithms, and of transient (in-memory) and persistent (in-file) data. All levels of processing of ATLAS data, from high-level trigger to event simulation, reconstruction and analysis, take place within the Athena framework; in this way it is easier for code developers and users to test and run algorithmic code, with the assurance that all geometry and conditions data will be the same for all types of applications (simulation, reconstruction, analysis, visualization).
One of the principal challenges for ATLAS computing is to develop and operate a data storage and management infrastructure able to meet the demands of a yearly data volume of O(10 PB) utilized by data processing and analysis activities spread around the world. The ATLAS Computing Model establishes the environment and operational requirements that ATLAS data-handling systems must support, and, together with the operational experience gained to date in test beams and data challenges, provides the primary guidance for the development of the data management systems.
The ATLAS Databases and Data Management Project (DB Project) leads and coordinates ATLAS activities in these areas, with a scope encompassing technical databases (detector production, installation and survey data), detector geometry, online/TDAQ databases, conditions databases (online and offline), event data, offline processing configuration and book-keeping, distributed data management, and distributed database and data management services. The project is responsible for ensuring the coherent development, integration, and operational capability of the distributed database and data management software and infrastructure for ATLAS across these areas.
The ATLAS Computing Model foresees the distribution of raw and processed data to Tier-1 and Tier-2 centres, so as to be able to exploit fully the computing resources that are made available to the Collaboration. Additional computing resources will be available for data processing and analysis at Tier-3 centres and other computing facilities to which ATLAS may have access. A complex set of tools and distributed services, enabling the automatic distribution and processing of the large amounts of data, has been developed and deployed by ATLAS in cooperation with the LHC Computing Grid (LCG) Project and with the middleware providers of the three large Grid infrastructures we use: EGEE, OSG and NorduGrid. The tools are designed in a flexible way, in order to have the possibility to extend them to use other types of Grid middleware in the future. These tools, and the service infrastructure on which they depend, were initially developed in the context of centrally managed, distributed Monte Carlo production exercises. They will be re-used wherever possible to create systems and tools for individual users to access data and compute resources, providing a distributed analysis environment for general usage by the ATLAS Collaboration. The first version of the production system was deployed in summer 2004 and has been used since the second half of 2004. It was used for Data Challenge 2, for the production of simulated data for the 5th ATLAS Physics Workshop (Rome, June 2005) and for the reconstruction and analysis of the 2004 Combined Test-Beam data.
The main computing operations that ATLAS will have to run comprise the preparation, distribution and validation of ATLAS software, and the computing and data management operations run centrally on Tier-0, Tier-1s and Tier-2s. The ATLAS Virtual Organization will allow production and analysis users to run jobs and access data at remote sites using the ATLAS-developed Grid tools.
In the past few years the Computing Model has been tested and developed by running Data Challenges of increasing scope and magnitude, as was proposed by the LHC Computing Review in 2001. We have run two major Data Challenges since 2002 and performed other massive productions in order to provide simulated data to the physicists and to reconstruct and analyse real data coming from test-beam activities; this experience is now useful in setting up the operations model for the start of LHC data-taking in 2007.
The Computing Model, together with the knowledge of the resources needed to store and process each ATLAS event, gives rise to estimates of required resources that can be used to design and set up the various facilities. It is not assumed that all Tier-1s or Tier-2s will be of the same size; however, in order to ensure a smooth operation of the Computing Model, all Tier-1s should have broadly similar proportions of disk, tape and CPU, and the same should apply for the Tier-2s.
The organization of the ATLAS Software & Computing Project reflects all areas of activity within the project itself. Strong high-level links have been established with other parts of the ATLAS organization, such as the T-DAQ Project and Physics Coordination, through cross-representation in the respective steering boards. The Computing Management Board, and in particular the Planning Officer, acts to make sure that software and computing developments take place coherently across sub-systems and that the project as a whole meets its milestones. The International Computing Board assures the information flow between the ATLAS Software & Computing Project and the national resources and their Funding Agencies.
Copyright © CERN 2005