GStat
Developer(s) | Joanna Huang (Academia Sinica), Laurence Field, David Horat (CERN) |
---|---|
Stable release | 2.0 RC 2 / 19 February 2010 |
Operating system | Scientific Linux 5 |
Type | Grid computing |
License | Apache License, Version 2.0 |
Website | http://cern.ch/gridinfo |
GStat is a web application for displaying information about grid services, the grid information system itself, and related metrics. The system is designed in a modular way so that the software can be reused in different application scenarios.[1]
History
GStat has evolved over the past few years from a simple CGI script that displayed a summary of a grid infrastructure into a production-quality service providing rich features such as information content testing and infrastructure monitoring. An evolutionary approach to its development has enabled GStat to add functionality in response to real use cases and to become a key operational tool. GStat 2.0 is a major redesign of the original version, intended to meet the future demands of an evolving infrastructure and to integrate easily with other operational tools.
GStat is the result of a collaboration between Academia Sinica and the Grid Technology Group at CERN. The main purpose of the joint project is to align GStat with the direction taken by the WLCG monitoring group with respect to operational tools and, in addition, to ensure that GStat can contribute to middleware certification and site validation.
GStat is compatible with version 1.3 of the Grid Laboratory Uniform Environment (GLUE) data model and takes its information from existing Berkeley Database Information Index (BDII) instances. Work is under way on compatibility with version 2.0 of the GLUE data model.
The initial version of GStat was designed and developed by Min Tsai. The current team members can be contacted through the Grid Information Product Team webpage.
High-level system architecture
GStat provides a method to visualize a grid infrastructure from an operational perspective, based on information found in the grid information system. Even in the absence of an information system, information about the existence of grid services needs to be communicated. The existence of grid services and the communication of their existence define the grid infrastructure, so one of the main concepts in GStat 2.0 is that it should be bootstrapped by the information system endpoint that defines the view of the grid infrastructure. GStat periodically takes a snapshot of the information system and maintains a cache of the main entities found in the infrastructure, which provides the basic structure for the visualization. The main entities cache is also used to configure a monitoring framework that monitors the information system and reports the health of the components from which it is composed, along with further performance metrics. The information obtained from the information system itself and from monitoring it is used to produce various displays that address specific use cases.
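The step from snapshot to entity cache can be illustrated with a minimal sketch. It is not GStat's code: the `SNAPSHOT` data, the `Site` class and the `build_entity_cache` function are hypothetical, and the attribute names are only loosely modelled on the LDAP rendering of the Grid Laboratory Uniform Environment 1.3 schema.

```python
from dataclasses import dataclass, field

# Hypothetical snapshot: one dict per information system entry. The attribute
# names are an assumption loosely based on GLUE 1.3, not GStat's own schema.
SNAPSHOT = [
    {"GlueSiteUniqueID": "SITE-A", "GlueSiteName": "Site A"},
    {"GlueServiceUniqueID": "bdii.site-a.example.org",
     "GlueServiceType": "bdii_site",
     "GlueForeignKey": "GlueSiteUniqueID=SITE-A"},
    {"GlueServiceUniqueID": "ce.site-a.example.org",
     "GlueServiceType": "CE",
     "GlueForeignKey": "GlueSiteUniqueID=SITE-A"},
    {"GlueSiteUniqueID": "SITE-B", "GlueSiteName": "Site B"},
]


@dataclass
class Site:
    """A main entity kept in the cache, with the services attached to it."""
    unique_id: str
    name: str
    services: list = field(default_factory=list)


def build_entity_cache(snapshot):
    """Extract sites and their services from a snapshot into a dict by site ID."""
    sites = {}
    # First pass: register every site found in the snapshot.
    for entry in snapshot:
        if "GlueSiteUniqueID" in entry:
            site_id = entry["GlueSiteUniqueID"]
            sites[site_id] = Site(site_id, entry.get("GlueSiteName", ""))
    # Second pass: attach each service to the site its foreign key names.
    for entry in snapshot:
        if "GlueServiceUniqueID" in entry:
            key, _, site_id = entry.get("GlueForeignKey", "").partition("=")
            if key == "GlueSiteUniqueID" and site_id in sites:
                sites[site_id].services.append(entry["GlueServiceUniqueID"])
    return sites


cache = build_entity_cache(SNAPSHOT)
for site in cache.values():
    print(site.unique_id, site.services)
```

A cache of this shape is small enough to drive both the display structure and the list of hosts to monitor, which is the role the text assigns to the main entities cache.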
The GStat architecture makes a clear separation between data, infrastructure monitoring, content validation and visualization. At the core is the data model used to maintain a snapshot of the information system and a cache of the main entities. Probes are used to monitor the information system components, and validation checks are used to ensure that the information content is correct. A visualization framework is used for displaying the resulting data. The modular approach enables the software to be reused in other application scenarios.
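As an illustration of keeping content validation separate from the data itself, a validation check can be written as a pure function over one snapshot entry. The sketch below is an assumption for illustration only: `REQUIRED_SITE_ATTRIBUTES`, `validate_site` and the specific rules are hypothetical and do not reproduce GStat's actual checks.

```python
# Hypothetical list of attributes a site entry is expected to publish.
REQUIRED_SITE_ATTRIBUTES = ("GlueSiteUniqueID", "GlueSiteName")


def validate_site(entry):
    """Return a list of (severity, message) findings for one site entry."""
    findings = []
    # Rule 1: required attributes must be present and non-empty.
    for attribute in REQUIRED_SITE_ATTRIBUTES:
        if not entry.get(attribute):
            findings.append(("ERROR", f"missing {attribute}"))
    # Rule 2: if a latitude is published, it must be a number in [-90, 90].
    latitude = entry.get("GlueSiteLatitude")
    if latitude is not None:
        try:
            if not -90.0 <= float(latitude) <= 90.0:
                findings.append(("WARNING", "latitude out of range"))
        except ValueError:
            findings.append(("ERROR", "latitude is not a number"))
    return findings


good = {"GlueSiteUniqueID": "SITE-A", "GlueSiteName": "Site A",
        "GlueSiteLatitude": "46.2"}
bad = {"GlueSiteUniqueID": "SITE-B", "GlueSiteLatitude": "123"}

print(validate_site(good))
print(validate_site(bad))
```

Because the function only reads an entry and returns findings, the same check could in principle be called from a monitoring probe, a test suite or a display without touching the stored data.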
GStat is uniquely positioned to support modern-day Big Data initiatives.
Implementation
The GStat architecture is implemented using two main frameworks: Django and Nagios. Django is an open source web application framework, written in Python, which follows the model–view–controller architectural pattern. Django models provide the core data model of the system. The snapshot script takes a snapshot of the information system and uses the Django framework to store the information. The import-entities script extracts the main entities, such as Sites and Services, from the snapshot and maintains a cache of entities. In addition, certain attributes are extracted from the snapshot and stored in RRD databases using the gstat-update-rrd script. Nagios is an open source monitoring framework and is used in GStat both to monitor the information system components and to validate the information content through custom probes. These monitoring probes can be reused by other Nagios-based monitoring tools and can also be executed on the command line, which makes them easy to incorporate into other test suites. Django is also used for the visualization aspects of GStat. The entity cache provides the main structure for the displays, while the snapshot and the results from testing provide more detailed information.
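The reason such probes work both under Nagios and on the command line is the Nagios plugin convention: a probe prints one status line and returns exit code 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The following is a minimal sketch of a probe in that style, not one of GStat's probes. The snapshot-freshness check, the function names and the `--age`, `--warn` and `--crit` options are all hypothetical.

```python
import argparse

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_snapshot_age(age_seconds, warn, crit):
    """Map the age of a snapshot to a Nagios status code and a message."""
    if age_seconds < 0:
        return UNKNOWN, "snapshot age is negative"
    if age_seconds >= crit:
        return CRITICAL, f"snapshot is {age_seconds}s old (limit {crit}s)"
    if age_seconds >= warn:
        return WARNING, f"snapshot is {age_seconds}s old (limit {warn}s)"
    return OK, f"snapshot is {age_seconds}s old"


def run_probe(argv):
    """Parse hypothetical command-line options, print a status line, return the code."""
    parser = argparse.ArgumentParser(description="Hypothetical snapshot freshness probe")
    parser.add_argument("--age", type=int, required=True, help="snapshot age in seconds")
    parser.add_argument("--warn", type=int, default=600, help="warning threshold in seconds")
    parser.add_argument("--crit", type=int, default=1800, help="critical threshold in seconds")
    args = parser.parse_args(argv)
    code, message = check_snapshot_age(args.age, args.warn, args.crit)
    print(f"{LABELS[code]}: {message}")
    return code
```

Saved as a script, the probe would end with `sys.exit(run_probe(sys.argv[1:]))` so that Nagios, a shell or another test suite can read the result from the exit status alone.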