Monitoring Golang web apps using Gin and Prometheus

The content of this page hasn't been updated for years and might refer to discontinued products and projects.

Monitoring series:
Monitoring Apache Spark with Prometheus
Monitoring multiple federated clusters with Prometheus - the secure way
Application monitoring with Prometheus and Pipeline
Building a cloud cost management system on top of Prometheus
Monitoring Spark with Prometheus, reloaded
Hands on Thanos
Monitoring Vault on Kubernetes using Cloud Native technologies

In-depth application observability is a must-have feature for us; we put significant effort into monitoring all our customers’ Kubernetes deployments on the Pipeline platform, as well as the platform itself. The platform is made up of several components, some of which expose functionality through REST interfaces. To learn more about how these components behave, we have enabled monitoring for them using Prometheus.

Find the right tools

All our web applications are written in Golang, and we use the Gin Web Framework to implement and expose RESTful APIs. We use Prometheus all across the Pipeline platform, so we needed a metrics exporter that would feed Prometheus from different applications. We wanted to collect general metrics (the kind observed by most web applications) as well as some metrics of our own, so we chose this library, or more specifically our fork of it, to which we added a small feature that gives us some specific information. The library comes with a generic set of metrics that fits almost all of our needs:

requests_total - HTTP requests processed, partitioned by status code and HTTP method
request_duration_seconds - HTTP request latencies in seconds
response_size_bytes - HTTP response sizes in bytes
request_size_bytes - HTTP request sizes in bytes

Customize the exporter

By default, the Prometheus exporter library takes every path parameter into account and creates a separate counter for each distinct URL. This can lead to a proliferation of metric instances (especially counters, in our case). We wanted more control over this, so we left the original behavior as it was and added a feature that makes it possible to configure which parameters are taken into account during counting.

For example, in the case of requests_total, the original behavior generates metrics like the following from URLs of the form http://[host]:[port]/api/v1/providers/aws/services/eks/regions/us-east-1:

# separate counter for every path parameter (provider, service, region)
http_requests_total{code="200",host="localhost:9091",method="GET",url="/api/v1/providers/aws/services/eks/regions/us-east-1"} 1
http_requests_total{code="200",host="localhost:9091",method="GET",url="/api/v1/providers/aws/services/compute/regions/us-east-1"} 1
...

Note that there will be a separate Prometheus counter instance for every combination of provider, service and region, as well as any other path parameters.

With our modification, we can provide a list of “important” path parameters, so the number of counter instances can be reduced, making it possible to focus exclusively on relevant counters.
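
As a rough illustration of why this matters, the number of counter instances is the product of the number of distinct values of each path parameter that is kept in the label. The sketch below uses made-up sizes (5 providers, 10 services, 20 regions), and `seriesCount` is a hypothetical helper, not part of the library.

```go
package main

import "fmt"

// seriesCount returns how many distinct label combinations (and therefore
// counter instances) exist when every listed dimension varies independently.
func seriesCount(dimensions ...int) int {
	total := 1
	for _, n := range dimensions {
		total *= n
	}
	return total
}

func main() {
	// Hypothetical sizes, chosen only for illustration.
	providers, services, regions := 5, 10, 20

	fmt.Println("all path parameters kept:", seriesCount(providers, services, regions))
	fmt.Println("only provider kept:", seriesCount(providers))
}
```

Each of these instances is multiplied again by the number of status codes and HTTP methods observed, so trimming path parameters has a large effect.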

“Important” parameters are passed to the Prometheus middleware when it’s created:

p := ginprometheus.NewPrometheus("http", []string{"provider"})

With the above modification, the example URL will only generate metrics like:

# counters for separate providers only
http_requests_total{code="200",host="localhost:9091",method="GET",url="/api/v1/providers/aws/services/:service/regions/:region"} 1
http_requests_total{code="200",host="localhost:9091",method="GET",url="/api/v1/providers/azure/services/:service/regions/:region"} 1
...

(Path parameters not explicitly passed to Prometheus are replaced with their name)
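
The replacement rule can be sketched as follows. This is not the fork's actual code: `labelURL` is a hypothetical helper that assumes the route template and the parameter values of the current request are available, and it only shows how the `url` label value could be derived from them.

```go
package main

import (
	"fmt"
	"strings"
)

// labelURL builds the value of the url label from a route template such as
// "/providers/:provider/regions/:region". Parameters listed in important are
// replaced by their actual value; all others keep their ":name" placeholder.
func labelURL(template string, values map[string]string, important []string) string {
	keep := make(map[string]bool, len(important))
	for _, name := range important {
		keep[name] = true
	}

	segments := strings.Split(template, "/")
	for i, seg := range segments {
		if !strings.HasPrefix(seg, ":") {
			continue // a static path segment, left untouched
		}
		name := strings.TrimPrefix(seg, ":")
		if v, ok := values[name]; ok && keep[name] {
			segments[i] = v
		}
	}
	return strings.Join(segments, "/")
}

func main() {
	template := "/api/v1/providers/:provider/services/:service/regions/:region"
	values := map[string]string{"provider": "aws", "service": "eks", "region": "us-east-1"}

	fmt.Println(labelURL(template, values, []string{"provider"}))
}
```

Passing an empty list of important parameters collapses every request on that route into a single label value, which is the other extreme of the trade-off.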

Please note that the information we get with our change can also be retrieved from the unmodified metrics by aggregating them with Prometheus queries. The change only reduces the number of metrics and metric values that are stored.

Alter the web app’s code by adding monitoring lines

Let’s add the necessary code to the application and enable monitoring. Provided we want to count the requests served by a given endpoint, the following steps need to be taken:

Define the metric structs

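// Imports used by the snippets in this section. The logger is an assumption:
// any package that provides Infof and Errorf will do.
import (
	"strconv"

	"github.com/gin-gonic/gin"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	log "github.com/sirupsen/logrus"
)

// Metric describes a single metric and holds its registered collector.
type Metric struct {
	MetricCollector prometheus.Collector
	ID              string
	Name            string
	Description     string
	Type            string
	Args            []string
}

// reqCnt counts processed HTTP requests, labeled by status code.
var reqCnt = &Metric{
	ID:          "reqCnt",
	Name:        "requests_total",
	Description: "the number of HTTP requests processed",
	Type:        "counter_vec",
	Args:        []string{"status"},
}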

Create a Prometheus client

// Prometheus bundles the request counter with the router that exposes it.
type Prometheus struct {
	reqCnt        *prometheus.CounterVec
	router        *gin.Engine
	listenAddress string

	Metric      *Metric
	MetricsPath string
}

// newPrometheus registers the metric and prepares a dedicated metrics router.
func newPrometheus(subsystem string) *Prometheus {
	p := &Prometheus{
		Metric:        reqCnt,
		MetricsPath:   "/metrics",
		listenAddress: ":9901",
	}

	p.registerMetrics(subsystem)
	p.router = gin.Default()

	return p
}

// registerMetrics creates the counter vector and registers it with the
// default Prometheus registry.
func (p *Prometheus) registerMetrics(subsystem string) {
	metric := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Subsystem: subsystem,
			Name:      reqCnt.Name,
			Help:      reqCnt.Description,
		},
		reqCnt.Args,
	)
	if err := prometheus.Register(metric); err != nil {
		log.Errorf("%s could not be registered: %v", reqCnt.Name, err)
	} else {
		log.Infof("%s registered.", reqCnt.Name)
	}
	p.reqCnt = metric

	reqCnt.MetricCollector = metric
}
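
To show what a counter vector does conceptually, here is a toy stand-in written with the standard library only. It is not the Prometheus client: `counterVec`, `inc` and `render` are hypothetical names, and the output only imitates the shape of the text exposition format seen in the earlier examples. It keeps one counter per label value, which is what `WithLabelValues(status).Inc()` relies on.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// counterVec is a toy labeled counter: it keeps one monotonically
// increasing value per label value.
type counterVec struct {
	mu     sync.Mutex
	name   string
	label  string
	values map[string]float64
}

func newCounterVec(name, label string) *counterVec {
	return &counterVec{name: name, label: label, values: map[string]float64{}}
}

// inc increments the counter that belongs to the given label value.
func (c *counterVec) inc(labelValue string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[labelValue]++
}

// render writes one line per label value, sorted for stable output.
func (c *counterVec) render() string {
	c.mu.Lock()
	defer c.mu.Unlock()

	keys := make([]string, 0, len(c.values))
	for k := range c.values {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s=%q} %g\n", c.name, c.label, k, c.values[k])
	}
	return b.String()
}

func main() {
	// Subsystem "http" plus name "requests_total" yields this full name.
	reqCnt := newCounterVec("http_requests_total", "status")
	reqCnt.inc("200")
	reqCnt.inc("200")
	reqCnt.inc("404")
	fmt.Print(reqCnt.render())
}
```

Every new label value creates a new entry in the map, which is the same growth the exporter customization described earlier is meant to limit.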

Register the handler that collects/exports the desired metric as middleware

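// use adds the middleware to a gin engine.
func (p *Prometheus) use(e *gin.Engine) {
	e.Use(p.handlerFunc())
	p.setMetricsPath(e)
}

// handlerFunc counts every finished request, labeled by its HTTP status code.
func (p *Prometheus) handlerFunc() gin.HandlerFunc {
	return func(c *gin.Context) {
		// Do not count scrapes of the metrics endpoint itself. Compare the
		// path only, so a query string does not defeat the check.
		if c.Request.URL.Path == p.MetricsPath {
			c.Next()
			return
		}
		// Run the remaining handlers first so the final status code is known.
		c.Next()

		status := strconv.Itoa(c.Writer.Status())
		p.reqCnt.WithLabelValues(status).Inc()
	}
}

// setMetricsPath serves the metrics on the client's own router and address,
// separate from the application engine that is passed in.
func (p *Prometheus) setMetricsPath(e *gin.Engine) {
	p.router.GET(p.MetricsPath, prometheusHandler())
	go func() {
		if err := p.router.Run(p.listenAddress); err != nil {
			log.Errorf("metrics listener stopped: %v", err)
		}
	}()
}

// prometheusHandler adapts the standard Prometheus HTTP handler to gin.
func prometheusHandler() gin.HandlerFunc {
	h := promhttp.Handler()
	return func(c *gin.Context) {
		h.ServeHTTP(c.Writer, c.Request)
	}
}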

Initialize Prometheus when the application starts

// initialize Prometheus metric; router is the application's *gin.Engine
p := newPrometheus("http")
p.use(router)

Examples

Our web applications in which monitoring through Gin is enabled have the following layout:

Cloudinfo diagram

We have enabled monitoring for all of our web applications; check our repositories for more details.

The results on the Prometheus UI look as follows (screenshots are taken from our cloudinfo app):

requests_total metrics from Cloudinfo:

Basic metrics

Duration of data collection, or an error when a provider is unavailable:

Provider metrics

Duration of scraping rapidly changing prices:

Spot price metrics
