A prototype for the evolution of ATLAS EventIndex based on Apache Kudu storage
EPJ Web of Conferences EDP Sciences 214 (2019)
Abstract:
The ATLAS EventIndex has been in operation since the beginning of LHC Run 2 in 2015. Like all software projects, its components have been constantly evolving and improving in performance. The main data store in Hadoop, based on MapFiles and HBase, can work for the rest of Run 2, but new solutions are being explored for the future. Kudu offers an interesting environment, with a mixture of Big Data and relational database features, which looks promising at the design level. This environment is used to build a prototype to measure the scaling capabilities as functions of data input rates, total data volumes, and data query and retrieval rates. In these proceedings we report on the selected data schemas and on the current performance measurements with the Kudu prototype.
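As a concrete illustration of the kind of schema such a prototype involves, the sketch below uses the kudu-python client to create an event-lookup table. The column set, table name, and partitioning are assumptions for illustration, not the schema selected in the paper.

    # Sketch of a Kudu table for event lookup (kudu-python client).
    # Columns, table name, and bucket count are illustrative assumptions.
    import kudu
    from kudu.client import Partitioning

    client = kudu.connect(host='kudu-master.example.org', port=7051)

    builder = kudu.schema_builder()
    builder.add_column('run_number').type(kudu.int32).nullable(False)
    builder.add_column('event_number').type(kudu.int64).nullable(False)
    builder.add_column('lumi_block').type(kudu.int32)
    builder.add_column('guid').type(kudu.string)  # file holding the event
    builder.set_primary_keys(['run_number', 'event_number'])
    schema = builder.build()

    # Hash-partition on the key so ingestion spreads across tablet servers,
    # which is what scaling measurements vs. input rate would exercise.
    partitioning = Partitioning().add_hash_partitions(
        column_names=['run_number', 'event_number'], num_buckets=16)

    client.create_table('eventindex_prototype', schema, partitioning)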
Conditions evolution of an experiment in mid-life, without the crisis (in ATLAS)
23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018) EPJ Web of Conferences 214 (2019)
Abstract:
The ATLAS experiment is approaching mid-life: the long shutdown period (LS2) between LHC Runs 1 and 2 (ending in 2018) and the future collision data-taking of Runs 3 and 4 (starting in 2021). In advance of LS2, we have been assessing the future viability of existing computing infrastructure systems. This will permit changes to be implemented in time for Run 3. In systems with broad impact, such as the conditions database, making assessments now is critical, as the full chain of operations from online data-taking to offline processing can be considered: evaluating capacity at peak times, looking for bottlenecks, identifying areas of high maintenance, and considering where new technology may serve to do more with less. We have been considering changes to the storage and distribution infrastructure of the ATLAS Conditions Database, informed by similar systems of other experiments. We have also examined how new technologies may help and how we might provide more RESTful services to clients. In this presentation, we give an overview of the identified constraints and considerations, and our conclusions for the best way forward: balancing preservation of critical elements of the existing system with the deployment of new technology in areas where the existing system falls short.
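Where the abstract mentions providing more RESTful services to clients, a minimal sketch of what such a client call might look like is given below; the endpoint, path, and parameter names are hypothetical, not an ATLAS interface.

    # Hypothetical REST client for a conditions lookup. The base URL,
    # path, and parameter names are invented for illustration only.
    import requests

    BASE = 'https://conditions.example.org/api'  # hypothetical service

    def get_payload(tag: str, run: int, lumi_block: int) -> bytes:
        """Fetch the conditions payload valid for (run, lumi_block) under a tag."""
        resp = requests.get(f'{BASE}/payloads',
                            params={'tag': tag, 'run': run, 'lb': lumi_block},
                            timeout=10)
        resp.raise_for_status()
        return resp.content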
Optimizing access to conditions data in ATLAS event data processing
23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018) EDP Sciences (2019)
Abstract:
The processing of ATLAS event data requires access to conditions data stored in database systems. This data includes, for example, alignment, calibration, and configuration information, which may be characterized by large volumes, diverse content, and/or information that evolves over time as refinements are made in those conditions. Additional layers of complexity are added by the need to provide this information across the worldwide ATLAS computing grid, and by the sheer number of simultaneously executing processes on the grid, each demanding a unique set of conditions to proceed. Distributing this data efficiently to all the processes that require it has proven to be an increasing challenge with the growing needs and numbers of event-wise tasks. In this presentation, we briefly describe the systems in which we have collected information about the database content and the use of conditions in event data processing. We then explain how this information has been used not only to refine reconstruction software and job configuration, but also to guide modifications of the underlying conditions data configuration and, in some cases, rewrites of the data in the database into a more harmonious form for offline usage in the processing of both real and simulated data.
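Since the abstract turns on conditions that evolve over time, a short sketch of interval-of-validity (IOV) lookup may help: each payload is valid from a start key until the next one begins, and an event's (run, lumi block) selects the covering payload. This illustrates the idea only, not the ATLAS implementation; the class and example values are invented.

    # Toy interval-of-validity lookup: payloads are keyed by the
    # (run, lumi_block) at which they become valid; an event selects the
    # most recent payload at or before its own position.
    import bisect

    class IovFolder:
        def __init__(self):
            self._since = []     # sorted (run, lumi_block) start keys
            self._payloads = []  # payload valid from the matching start key

        def store(self, since, payload):
            i = bisect.bisect_left(self._since, since)
            self._since.insert(i, since)
            self._payloads.insert(i, payload)

        def lookup(self, run, lumi_block):
            i = bisect.bisect_right(self._since, (run, lumi_block)) - 1
            if i < 0:
                raise KeyError('no conditions valid for this event')
            return self._payloads[i]

    folder = IovFolder()
    folder.store((350000, 0), 'alignment-v1')    # valid from run 350000
    folder.store((358000, 100), 'alignment-v2')  # refinement from LB 100
    print(folder.lookup(358000, 250))            # -> 'alignment-v2'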
The challenges of mining logging data in ATLAS
23rd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2018) EPJ Web of Conferences 214 (2019)
Abstract:
Processing ATLAS event data requires a wide variety of auxiliary information from geometry, trigger, and conditions database systems. This information is used to dictate the course of processing and to refine the measurement of particle trajectories and energies, constructing a complete and accurate picture of the remnants of particle collisions. Such processing occurs on a worldwide computing grid, necessitating wide-area access to this information. Event processing tasks may deploy thousands of jobs. Each job calls for a unique set of information from the databases via SQL queries to dedicated Squid proxies in the ATLAS Frontier system, which is designed to pass a query on to the database only if its result has not already been cached from another request. Many queries passing through Frontier are logged in an Elasticsearch cluster, along with pointers to the associated tasks and jobs, various metrics, and states at the time of execution. PanDA, which deploys the jobs, stores various configuration files as well as many log files after each job completes. Information is stored at each stage, but no single system contains all the information needed to draw a complete picture. This presentation describes the challenges of mining information from these sources to compile a view of database usage by jobs and tasks, and to assemble a global picture of the coherence and competition of tasks in resource usage, in order to identify inefficiencies and bottlenecks within the overall system.
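The caching behaviour attributed to Frontier above, answering from cache when possible and querying the database only on a miss, is the classic cache-aside pattern; a minimal sketch follows, with hypothetical names and with the real system's HTTP layer and cache-expiration policies omitted.

    # Cache-aside sketch of the Frontier idea: a query reaches the
    # database only if its result is not already cached.
    import hashlib

    class QueryCache:
        def __init__(self, database):
            self._db = database  # any object with execute(sql) -> result
            self._cache = {}     # query digest -> cached result
            self.hits = self.misses = 0

        def query(self, sql):
            key = hashlib.sha256(sql.encode()).hexdigest()
            if key in self._cache:
                self.hits += 1   # served from cache; database untouched
                return self._cache[key]
            self.misses += 1     # first request for this query
            result = self._db.execute(sql)
            self._cache[key] = result
            return result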
Search for diboson resonances in hadronic final states in 139 fb⁻¹ of pp collisions at √s = 13 TeV with the ATLAS detector
Journal of High Energy Physics Springer 2019:9 (2019) 91