DevOps Architecture – DevOps Monitoring

DevOps Monitoring Introduction

This article describes how application and infrastructure architecture can contribute to the selection and application of the DevOps monitoring tooling.

DevOps Monitoring Definitions

Monitoring layer model:

There are thousands of monitoring tools to choose from for monitoring the deployment pipeline and production. The monitoring layer model groups the monitor functions that these tools provide into layers.

Event blacklist:

Events that should be signaled by the monitor tooling are listed on the event blacklist. This means that an alert occurs when the event takes place. The operations team can then take action.

Event whitelist:

Events that are not relevant are placed on the whitelist.
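The two lists can be read as a routing rule for incoming events. The following is a minimal sketch under stated assumptions: the event IDs (such as "EVT-1001"), the names, and the "unclassified" outcome for events on neither list are illustrative and are not prescribed by the model.

```python
from enum import Enum


class Action(Enum):
    ALERT = "alert"          # event is on the blacklist: signal the operations team
    IGNORE = "ignore"        # event is on the whitelist: not relevant
    UNCLASSIFIED = "review"  # assumption: events on neither list are flagged for review


# Hypothetical event IDs, for illustration only.
EVENT_BLACKLIST = {"EVT-1001", "EVT-1002"}
EVENT_WHITELIST = {"EVT-2001"}


def classify_event(event_id: str) -> Action:
    """Decide what the monitor tooling should do with an event."""
    if event_id in EVENT_BLACKLIST:
        return Action.ALERT
    if event_id in EVENT_WHITELIST:
        return Action.IGNORE
    return Action.UNCLASSIFIED
```

Treating unknown events as a separate outcome is a design choice: it makes gaps in the two lists visible instead of silently ignoring them.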

False negative:

The application does not raise an error on an event, while in reality an incorrect situation has occurred.

False positive:

The application raises an error that causes an exception, while in fact there is no error.

DevOps Monitoring Concepts

Monitoring layer model

Figure 1 gives an overview of the various monitor functions that can be used to monitor a service.

Figure 1, Monitor Layer Model [BEST 2016].

F1. System Monitoring (Element Monitoring)

The basis of the monitor layer model is system monitoring, which measures all application and infrastructure elements. System monitoring is distinguished from other forms of monitoring by its elementary level, which is why it is also referred to as element monitoring. Only individual application and infrastructure components are measured, without activating them. This monitoring is essential for the technical management of the infrastructure. System monitoring is made up of the following monitoring functions.

F1.1	Service Monitoring	The monitoring of both infrastructure services and application services. The services are measured with a management protocol, usually a Simple Network Management Protocol (SNMP) get.
F1.2	Resource Monitoring	Monitoring the resource behaviour of a single component, such as Central Processing Unit (CPU), memory and network bandwidth.
F1.3	Built-in Monitoring	Built-in monitoring is a monitor facility that is programmed as part of the application. Its added value is that it measures application runtime counters which are not visible outside the application.
F1.4	Event Monitoring	Event Monitoring includes all infrastructure components and infrastructure services, as well as application events. The events can usually be read from local log files such as the application log, security log and error log file.

Table 1, System Monitoring.
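As an illustration of built-in monitoring (F1.3), the sketch below keeps runtime counters inside the application and exposes a snapshot that a monitor could poll. The class name, method names and counter names are assumptions made for this example; how the snapshot reaches the monitor tool is left open.

```python
import threading


class RuntimeCounters:
    """In-process counters that are otherwise invisible outside the application."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = {}

    def increment(self, name: str, amount: int = 1) -> None:
        """Add `amount` to the named counter, creating it if needed."""
        with self._lock:
            self._counts[name] = self._counts.get(name, 0) + amount

    def snapshot(self) -> dict:
        """Return a copy of all counters for the monitor to read."""
        with self._lock:
            return dict(self._counts)
```

The application would call `increment("orders_processed")` at the relevant places, and the monitor facility would read `snapshot()` periodically.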

F2. Application Monitoring

Application monitoring provides an image of one individual application (subsystem) or service by measuring the underlying system (system monitoring), infrastructure services and application interfaces. This level is important for application management because it displays all relevant information about one application in one view. Application monitoring consists of the system monitor function as well as the following monitor functions.

F2.1	Application Interface Monitoring	Application interface monitoring is specifically intended for application administrators, so that they can determine at a glance whether an application is operating as expected. It must therefore be determined per application which infrastructure components, infrastructure services and application services are important. In addition, measurements are made at the application level to verify the interaction between the infrastructure and the application code.
F2.2	Infrastructure Service Monitoring	Even if an infrastructure service is “live”, this does not mean that it is working properly or is even accessible. To establish that the service is working, a dynamic test can be performed at the infrastructure level in which the service is actually called.

Table 2, Application Monitoring.
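The dynamic test described under F2.2 can be sketched as a check that actually opens a connection to the service instead of only looking at its process status. This is a minimal sketch that assumes the service listens on a TCP port; the function name is hypothetical, and a real test would usually go one step further and issue a protocol-level request.

```python
import socket


def service_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A successful connection only shows that the service is reachable, which is why it complements rather than replaces the measurements of the system monitoring layer.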

F3. Information System Monitoring

To perform a user function, multiple applications are often required. These are not always separately identifiable to the user. To make agreements with users at this level, the information system monitor layer is defined in addition to the application monitoring layer. An information system is essentially a chain of applications. Measuring an information system end to end (E2E) makes it possible to monitor as closely as possible from the user's perspective. These measurements are usually performed by walking through the transaction paths, which is why monitoring at this level is also referred to as transaction monitoring.

E2E infrastructure measurements are also performed to support this measurement. E2E infrastructure monitoring shows whether there is connectivity between the various components of the infrastructure. This is especially relevant for geographically separated locations and large computing centers with multiple logical networks (Local Area Network (LAN), Wide Area Network (WAN), and so forth).

In addition to application monitoring, information system monitoring comprises the following functions.

F3.1	Information System E2E-monitoring	The availability of a single application does not by itself mean that the user can work. Applications are often coupled into complete chains (information systems).
F3.2	Infrastructure E2E-monitoring	Infrastructure E2E monitoring measures the entire infrastructure layer. This monitoring is often used for Geographically Separated Units (WAN).
F3.3	Infrastructure Domain Monitoring	If there are multiple vendors, or the network is subdivided into physically or logically separated domains, it is often desirable to monitor these domains separately, for example in terms of availability. Different standards may also apply to these domains, such as the DeMilitarized Zone (DMZ) domain versus the office automation network.

Table 3, Information System Monitoring.

F4. Chain Monitoring

The chain monitoring is aimed at giving an image of the performance of the total chain as a whole from different perspectives:

The user’s perception;
The performance that the business process requires;
The information processing by the chain.

In addition to the information system monitoring, the chain monitoring includes the following functionalities.

F4.1	Information Stream E2E-monitoring	At this level, information flows are monitored. An E2E information flow measurement does not measure an ICT service, but purely the information flows of the business processes.
F4.2	Business Process E2E-monitoring	This function measures the business process from front to back, based on the related information systems.
F4.3	End User E2E-monitoring	At this level, service provisioning as experienced by the user is measured. This gives an image of the service provisioning from an end user perspective (see chapter IV.7).

Table 4, Chain Monitoring.

DevOps Monitoring Best Practices

The DevOps process includes the phase “monitor”. Monitoring a service is essential for DevOps. The monitor function is aimed at monitoring the correct functioning of the service and proactively detecting incidents. In addition, the monitor service enables the measurement of the SLA norms. However, realizing a good monitor service is not easy. This article describes what should be considered in the DevOps process to provide a good monitoring facility.

Monitor principles:

A number of principles help ensure that the monitor provision is used properly.

Monitor Functionality	DevOps Monitoring functionality must correspond with the service level agreement.
Correlation	In addition to E2E monitoring, system monitoring must also be applied.
Monitor classification	For each class of objects to be monitored, the same monitoring tool should be used wherever possible.

Table 5, Monitor principles.

DevOps Monitoring tools and SLAs

The choice of a DevOps monitoring tool in production depends on the content of the SLA. For example, if an SLA only formulates performance indicators and norms at the level of a hardware platform and network components, then it is enough to monitor the SLA with a system monitoring tool. However, if an SLA describes performance indicators at the information system level, then monitoring will need to be adjusted to that level.

Some tools do this by treating an information system as a sum of underlying objects and defining the availability, capacity, etc. as a function of the sum of the parts. Other tools monitor the information system itself, for example by monitoring the flow of transactions. Matching the DevOps monitoring tool functionality to the level at which the SLA norms are defined provides reliable SLA reporting.
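To make the "function of the sum of the parts" concrete, the sketch below derives the availability of an information system from its components. It rests on two assumptions that are not stated in the text: every component is required (a serial chain) and components fail independently, so the combined availability is the product of the individual availabilities. The function names and figures are illustrative.

```python
import math


def serial_availability(component_availabilities: list[float]) -> float:
    """Availability of a chain in which every component is required.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return math.prod(component_availabilities)


def meets_sla_norm(availability: float, norm: float) -> bool:
    """Compare a measured or derived availability with the SLA norm."""
    return availability >= norm


# Example: three components at 99.9%, 99.5% and 99.0% give roughly 98.4% for the chain.
chain = serial_availability([0.999, 0.995, 0.99])
```

The example shows why component-level norms cannot simply be copied to the information system level: each component meets a 99% norm, but the chain as a whole does not.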

DevOps process:

Table 6 provides an overview of the DevOps monitoring activities per DevOps phase and monitor function.

		Plan	Code	Build	Test	Release	Operate	Monitor
F1.1	Service Monitoring	Tool selection	–	–	System test	Deploy	–	React to exception
F1.2	Resource Monitoring	Tool selection	–	–	System test	Deploy	–	React to exception
F1.3	Built-in Monitoring	–	Program monitor	Unit test	System test	Deploy	–	React to exception
F1.4	Event Monitoring	Tool selection	Program exception	Unit test	System test	Deploy	–	React to exception
F2.1	Application Interface Monitoring	–	Program exception	–	Integration test	Deploy	–	React to exception
F2.2	Infrastructure Service Monitoring	Tool selection	–	–	System test	Deploy	–	React to exception
F3.1	Information System E2E-monitoring	Tool selection	Program dummy transaction	–	EUX test (FAT/UAT)	Deploy	–	React to exception
F3.2	Infrastructure E2E-monitoring	Tool selection	–	–	Infra E2E test (PAT)	Deploy	–	React to exception
F3.3	Infrastructure Domain Monitoring	Tool selection	–	–	Infra domain test (PAT)	Deploy	–	React to exception
F4.1	Information Stream E2E-monitoring	Tool selection	Label information	–	Information E2E test (FAT/UAT)	Deploy	–	React to exception
F4.2	Business Process E2E-monitoring	Tool selection	Label / status information	–	E2E test (FAT/UAT)	Deploy	–	React to exception
F4.3	End User E2E-monitoring	Tool selection	Label information	–	RUM test (FAT/UAT)	Deploy	–	React to exception

Table 6, DevOps monitoring per DevOps phase and monitor function.

Plan:

The plan phase of the DevOps process must take into account the cost and time involved in adjusting the monitor tooling. Freeware tools also take time to configure. Often, however, monitoring facilities are centrally organized, and a new Agile project can make use of existing facilities. Built-in monitoring and application interface monitoring, on the other hand, need to be programmed. This has to be budgeted in the plan phase.

Code:

During programming, it must be known how exceptions are captured by the monitor facility. Simply writing an event to a log file is not good enough: the event must follow a format that the monitor can interpret. The easiest way to achieve this is to define Standard Rules & Guidelines (SRG). A Standard must be met, a Rule may be waived if there is a valid reason, and a Guideline is a recommendation. The following best practice standards apply to building a proper DevOps monitoring facility while programming:

S1. Each event has a unique number
S2. Each event refers to the software configuration item that raised the exception
S3. Each event has a severity code assigned
S4. Each event also defines the recovery action
S5. Each new event will be registered on the product backlog of the OPS team
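Standards S1 to S4 can be expressed as a fixed event record that is written in a machine-readable format. The sketch below is one possible shape under stated assumptions: the field names, the severity values and the choice of JSON as log format are illustrative and are not prescribed by the standards. S5 is a process step and is not modelled here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MonitorEvent:
    event_number: int        # S1: unique number of the event
    configuration_item: str  # S2: software configuration item that raised the exception
    severity: str            # S3: severity code, e.g. "critical" (hypothetical values)
    recovery_action: str     # S4: recovery action for the operations team
    message: str = ""        # free-text detail, not required by S1 to S4


def to_log_line(event: MonitorEvent) -> str:
    """Serialize the event as one JSON line that a monitor tool can parse."""
    return json.dumps(asdict(event), sort_keys=True)
```

A call such as `to_log_line(MonitorEvent(1001, "billing-service", "critical", "restart queue consumer"))` yields a single log line in which every field required by the standards is present.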

During the build phase, it should be known which monitor functions are applicable. The service desk and operations team are important stakeholders to be involved.

The following aspects are important during programming:

Built-in monitoring must be programmed. Importantly, maintenance of this feature may require new deployments of the application.
Event monitoring requires that events are programmed in the correct places in the application. The most important places are those where an external call is made to another module, another application or an infrastructure service.
Application interface monitoring requires knowledge of how the application communicates with other applications. An example is implementing a check that determines whether two applications use the same version of the communication protocol.
E2E Information Systems Monitoring (EUX) can best take place based on both read and write / mutation transactions. For write and mutation transactions, however, it must be taken into account that these transactions must not contaminate the information administration. A dummy account can be used. However, this account must be isolated throughout the application, and abuse of such an account must be prevented.
Information flow E2E monitoring requires that data formats are measurable. This means that information items must be labelled. This labelling should preferably be included in the programming phase.
Business process E2E monitoring. To monitor a business process, the application must label information to determine which part of which business process the transaction belongs to.
End User E2E Monitoring (RUM) requires labelling to determine which network packet contains a particular type of transaction.
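The labelling and the dummy transactions mentioned in the list above can be combined in one label that travels with each transaction. This is a sketch only: the class, the field names and the example values are assumptions, and real isolation of dummy accounts involves more than a single flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionLabel:
    """Label attached to a transaction so that monitors can classify it."""
    business_process: str    # hypothetical name, e.g. "order-to-cash"
    process_step: str        # part of the business process the transaction belongs to
    synthetic: bool = False  # True for dummy transactions created for monitoring


def belongs_in_business_administration(label: TransactionLabel) -> bool:
    """Dummy transactions must not contaminate the real information administration."""
    return not label.synthetic
```

Downstream processing such as reporting or invoicing would filter on this check, while the monitor tooling uses the same label to follow the transaction through the chain.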

Build:

In the build phase only unit test cases can be used, because otherwise the requirement of a maximum of 5 minutes for a build process cannot be met. Therefore, the complete DevOps monitoring facility can only be tested later in the deployment pipeline. This means that most aspects of the monitor function cannot be tested until a check-in has taken place. It is therefore important to take as many measures as possible in the code phase to prevent defects.

The build must check, based on a source code check, that the format in which each exception is programmed complies with the SRG. During the build, it must also be verified that each programmed error message is present in the event blacklist or event whitelist.

In the code phase, the programmer must define the unit tests that will be used in the build phase to detect errors in the exception handling. Both false positives and false negatives must be tested.
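Such a build check can be sketched as a scan of the source code for event identifiers, followed by a comparison with both lists. The sketch assumes that events are referenced in the code by IDs of the hypothetical form "EVT-" plus four digits; the pattern and the function name are illustrative.

```python
import re

# Assumption: events are referenced in source code as EVT-0000 style identifiers.
EVENT_ID_PATTERN = re.compile(r"\bEVT-\d{4}\b")


def unlisted_event_ids(source_code: str, blacklist: set[str], whitelist: set[str]) -> set[str]:
    """Return event IDs used in the source that appear on neither list."""
    found = set(EVENT_ID_PATTERN.findall(source_code))
    return found - blacklist - whitelist
```

A build step would fail when the returned set is not empty, so that a new event cannot reach production without being classified.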
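Both directions can be covered with ordinary unit tests: one test asserts that a valid situation raises no exception, and the other asserts that an incorrect situation does raise the expected one. The sketch below uses Python's `unittest`; the validation function, the exception class and the event number are hypothetical examples.

```python
import unittest


class InvalidAmountError(Exception):
    """Hypothetical application exception that carries an event number."""
    event_number = 1001


def validate_amount(amount: float) -> float:
    """Raise InvalidAmountError for negative amounts, otherwise return the amount."""
    if amount < 0:
        raise InvalidAmountError(f"negative amount: {amount}")
    return amount


class ExceptionHandlingTest(unittest.TestCase):
    def test_valid_input_raises_no_exception(self):
        # Guards against an error being reported when nothing is wrong.
        self.assertEqual(validate_amount(10.0), 10.0)

    def test_invalid_input_raises_exception(self):
        # Guards against an incorrect situation going unreported.
        with self.assertRaises(InvalidAmountError) as ctx:
            validate_amount(-1.0)
        self.assertEqual(ctx.exception.event_number, 1001)
```

Tests of this kind are small enough to run within the time limit of the build, for example with `python -m unittest`.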

Test:

Only after the object code has been created in the build phase and the source code has been checked into the baseline is it possible to use system integration tests and system tests to determine whether the monitor facility detects exceptions at this level.

It is often said that automating the acceptance tests in the deployment pipeline should only include the happy path. An important exception to this statement is the set of acceptance tests that test all exceptions on the blacklist.

Release:

A successful deployment of application and infrastructure components should also include the deployment of adjustments to the monitor facility. Otherwise, new events will not be recognized. Deploying a monitor tool adjustment is usually straightforward. However, regression testing of the monitor facility is a challenge. In practice, very little time is spent on this aspect. As a result, DevOps monitoring facilities are often less effective than originally intended.

Additionally, the scripting in the deployment pipeline should raise exceptions when the automatic deployment is not successful. Monitor facilities must therefore also monitor the deployment pipeline, not only for availability but also for exceptions.
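One way to give pipeline scripting explicit exceptions is to wrap every deployment step and raise an error when its exit code is non-zero. This is a minimal sketch; the exception class and function name are hypothetical, and forwarding the exception to the monitor facility is left out.

```python
import subprocess


class DeploymentStepFailed(Exception):
    """Raised when a deployment step exits with a non-zero return code."""


def run_deployment_step(name: str, command: list[str]) -> None:
    """Run one pipeline step and raise an exception if it fails."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise DeploymentStepFailed(
            f"step {name!r} failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
```

Because a failing step raises an exception instead of only writing output to the console, the pipeline stops at that step and the failure can be reported as an event.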

Operate:

Keeping a system up and running usually means recovering from service deviations. Therefore, the operations team is a key stakeholder in determining which aspects should be monitored. Deviations can often be resolved automatically. The collaboration between Dev and Ops teams makes it possible to provide maximum support here.

Monitor:

Monitoring a service applies not only to production but to the entire DTAP (Development, Test, Acceptance, Production) cycle. An important point that must not be overlooked is that in practice more and more DevOps monitoring tools are used side by side. They should communicate well with each other. This collaboration of tools can impact the code phase, because the communication between the monitor tools is often based on the format of the events as they are programmed in the code phase.


Author: Bart de Best

Publisher: ITpedia
