Prometheus
Prometheus is an open-source service monitoring system and time-series database. Prometheus stores time-series data: collections of samples stored continuously along the time dimension under the same series (same metric name and label set).
Edited 2021-05-08 16:31:20

Prometheus
Overview
Targets performance and availability monitoring only; it does not provide log monitoring or similar features
Because queries against monitoring data usually cover only the last few days, Prometheus's local storage was designed to hold short-term data (about a month), not large amounts of historical data
Metric data carries no unit definition; users must distinguish or pre-define the units of their metrics themselves to avoid data whose units are ambiguous
Design
Metrics
Format
[a-zA-Z:][a-zA-Z0-9_:]*
http_request_total{status="200"}
Names starting with __: reserved for internal use
{__name__="http_request_total", status="200"}
Types
Counter
Gauge
Histogram
Summary
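As an illustrative sketch (metric names and values here are made up, not from the original), the four metric types look like this in the Prometheus text exposition format:

```
# TYPE http_request_total counter
http_request_total{status="200"} 1027
# TYPE node_memory_free_bytes gauge
node_memory_free_bytes 823912448
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.5"} 129
http_request_duration_seconds_sum 53.4
http_request_duration_seconds_count 144
# TYPE rpc_duration_seconds summary
rpc_duration_seconds{quantile="0.99"} 0.32
rpc_duration_seconds_sum 17.5
rpc_duration_seconds_count 2693
```

A Counter only increases, a Gauge can go up and down, a Histogram exposes bucketed counts plus sum/count, and a Summary exposes precomputed quantiles plus sum/count.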
Samples
Stored as time series
Metric
Sample value
Timestamp
Data collection
Push
Real-time
Agent is stateless
Master maintains state
Agent controls the reporting interval
Agent must be configured with the Master's address
Pull
Periodic collection
Agent needs some ability to maintain state
Master is stateless
Master controls the sampling frequency
Agent does not need to know about the Master
Approach
Fetches data through a unified RESTful API
Starts one thread per collection target to scrape on a schedule
Configuration updates
Call the reload endpoint
Send kill -HUP <Prometheus PID> (not recommended)
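Assuming the server runs on the default port with the lifecycle API enabled, the two reload methods above look like this (host/port are examples):

```shell
# Preferred: call the reload endpoint (requires --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Discouraged alternative: send SIGHUP to the running process
kill -HUP $(pidof prometheus)
```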
Service discovery
Static file configuration
Dynamic discovery
Container management systems
Kubernetes
The following uses the Kubernetes integration as an example of how monitoring targets are discovered automatically. (1) Configure the Kubernetes API address and authentication credentials in Prometheus so it can connect to the Kubernetes API to fetch information. (2) Prometheus's service discovery component continuously watches the cluster for changes: when a new host joins the cluster, it obtains the new host's hostname and IP; when a new container is created, it obtains the new Pod's name, namespace, labels, and so on. Likewise, when a machine or container is deleted, Prometheus is notified of the event and updates its list of scrape targets.
Cloud platforms
EC2
Azure
Service discovery components
DNS
Zookeeper
Data processing
Relabeling
Label filtering
Keep
Drop
Hash
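The three filtering actions above map to `relabel_configs` actions; a minimal sketch, assuming example label names not taken from the original:

```yaml
relabel_configs:
  # keep only targets whose 'env' label is 'prod'
  - source_labels: [env]
    regex: prod
    action: keep
  # drop targets from test namespaces
  - source_labels: [__meta_kubernetes_namespace]
    regex: test-.*
    action: drop
  # shard targets across scrapers by hashing the address
  - source_labels: [__address__]
    modulus: 4
    target_label: __tmp_hash
    action: hashmod
```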
Data storage
Local
V3
One file per time window
Introduces a WAL
Chunk files are at most 512 MB
Deletes data by dropping whole blocks
Queries load blocks and merge them with in-memory data
Concepts
Block
The smallest block holds 2h of monitoring data. With a step count of 3 and a step factor of 3, the block sizes are 2h, 6h, and 18h. As the data volume grows, the TSDB compacts small blocks into larger ones.
chunk
Holds the compressed time-series data. Each chunk file is at most 512 MB; data beyond that is split across multiple chunk files, which are named with sequential numbers.
index
TOC (table of contents)
Symbol table
Series list
Label index table
Postings table
meta.json
Block metadata
tombstones
Soft-deletes data
WAL
Write-ahead Log
128 MB segments by default
32 KB pages
Parameters
Retained for 15d by default
storage.tsdb.retention
Storage path
storage.tsdb.path
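The two parameters above are passed as command-line flags; a typical invocation might look like this (the paths are examples):

```shell
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus \
  --storage.tsdb.retention=15d
```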
Remote
Implements the Read interface
Implements the Write interface
Add
AddFast
Commit
Rollback
Compatibility
fanout
Parameters
Adapter address
Timeout
Relabel rules
Request queue
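The four parameters above correspond to fields of the `remote_write`/`remote_read` configuration; a sketch with a placeholder adapter URL:

```yaml
remote_write:
  - url: "http://remote-adapter:9201/write"   # adapter address (placeholder)
    remote_timeout: 30s                       # timeout
    write_relabel_configs:                    # relabel rules
      - source_labels: [__name__]
        regex: go_.*
        action: drop
    queue_config:                             # request queue
      capacity: 2500
      max_shards: 200
remote_read:
  - url: "http://remote-adapter:9201/read"
    remote_timeout: 30s
```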
Querying
PromQL
Supports queries only (read-only)
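A few illustrative PromQL queries over the http_request_total metric used earlier in this map:

```
# instantaneous value of all series with status="200"
http_request_total{status="200"}

# per-second request rate over the last 5 minutes
rate(http_request_total[5m])

# request rate aggregated by status code
sum by (status) (rate(http_request_total[5m]))
```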
Alerting
AlertManager
Features
Alert grouping
Alert inhibition
Alert silencing
Components
Routing
Receivers
High availability
(1) Prometheus sends raw alert messages to AlertManager through the alerting API that AlertManager exposes. (2) Besides alerts, the AlertManager API also accepts silence requests; each kind is stored in its own provider. (3) The provider offers a subscribe interface through which the Dispatcher component obtains alert data, groups it, and applies the user-defined rules for the inhibition and silencing stages. (4) Alerts that pass the silencing stage enter the routing stage and are finally dispatched as notifications.
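The grouping, routing, receiver, and inhibition pieces above map to an Alertmanager configuration like the following sketch (the receiver endpoint is a placeholder):

```yaml
route:
  group_by: ['alertname', 'cluster']   # alert grouping
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'default'
receivers:
  - name: 'default'
    webhook_configs:
      - url: 'http://example.local/alert'   # placeholder endpoint
inhibit_rules:
  # critical alerts suppress matching warning alerts
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname']
```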
Gossip protocol for eventual consistency
Clustering
Federation (deprecated)
Thanos
Components
Querier
Sidecar
Store
Compactor
Node discovery
Gossip
Historical data storage
Historical data downsampling
Setup
Prometheus
create namespace kube-prometheus
apiVersion: v1
kind: Namespace
metadata:
  name: kube-prometheus
create configMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      external_labels:
        monitor: 'neal-monitor'
      scrape_timeout: 15s
    scrape_configs:
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9090']
      - job_name: 'redis'
        scrape_interval: 5s
        static_configs:
          - targets: ['redis:9121']
      - job_name: 'mongo-exporter'
        scrape_interval: 5s
        static_configs:
          - targets: ['mongo-exporter:9216']
create rbac
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [ "" ]
    resources: [ "nodes", "nodes/proxy", "services", "endpoints", "pods" ]
    verbs: [ "get", "watch", "list" ]
  - apiGroups: [ "" ]
    resources: [ "configmaps", "nodes/metrics", "ingresses" ]
    verbs: [ "get", "list", "watch" ]
  - nonResourceURLs: [ "/metrics" ]
    verbs: [ "get" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: kube-prometheus
create Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: kube-prometheus
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
        - name: prometheus
          image: prom/prometheus
          command:
            - "/bin/prometheus"
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus"
            - "--storage.tsdb.retention=24h"
            - "--web.enable-lifecycle"
          ports:
            - name: http
              containerPort: 9090
              protocol: TCP
          volumeMounts:
            - mountPath: /prometheus
              subPath: prometheus
              name: data
            - mountPath: /etc/prometheus
              name: config-volume
          securityContext:
            runAsUser: 0
      volumes:
        - name: data
          emptyDir: { }
        - name: config-volume
          configMap:
            name: prometheus-config
create prometheus svc
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: kube-prometheus
  labels:
    app: prometheus
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - name: web
      port: 9090
      targetPort: 9090
create Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: traefik-default-ingress
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  tls:
    - secretName: traefik-ssl
  rules:
    - host: prometheus.local  # replace with your domain
      http:
        paths:
          - path: /
            backend:
              serviceName: prometheus
              servicePort: 9090
kube-Operator
Clone Git
https://github.com/prometheus-operator/prometheus-operator
Update prometheus-serviceMonitorKubelet.yaml
Change https-metrics to http-metrics
kubectl apply -f /manifests/setup
kubectl apply -f /manifests
Verify
kubectl get crd | grep coreos
kubectl get pods -n monitoring
kubectl port-forward svc/prometheus-k8s 9090:9090 -n monitoring
Redis-Operator
apply yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: kube-prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9121"
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:4
          resources:
            requests:
              cpu: 50m
              memory: 50Mi
          ports:
            - containerPort: 6379
        - name: redis-exporter
          image: oliver006/redis_exporter:latest
          resources:
            requests:
              cpu: 50m
              memory: 50Mi
          ports:
            - containerPort: 9121
---
kind: Service
apiVersion: v1
metadata:
  name: redis
  namespace: kube-prometheus
spec:
  selector:
    app: redis
  ports:
    - name: redis
      port: 6379
      targetPort: 6379
    - name: prom
      port: 9121
      targetPort: 9121
Mongo-Operator
git clone code
https://github.com/percona/mongodb_exporter
make init
build Dockerfile
FROM ubuntu
LABEL app=mongo-exporter
ADD file/mongodb_exporter /opt
ENTRYPOINT ["/opt/mongodb_exporter"]
publish-docker-images
docker login
docker tag $image $tag
docker push $tag
apply mongoDB
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: kube-prometheus
spec:
  selector:
    matchLabels:
      app: mongodb
  serviceName: mongodb
  replicas: 1
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo
#         command: ["mongod", "--replSet", "ft-dev", "--bind_ip_all"]
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb
  namespace: kube-prometheus
  labels:
    name: mongodb
spec:
  ports:
    - port: 27017
      targetPort: 27017
  selector:
    app: mongodb
apply mongo-exporter
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-exporter
  namespace: kube-prometheus
  labels:
    app: mongo-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo-exporter
  template:
    metadata:
      labels:
        app: mongo-exporter
    spec:
      containers:
        - name: mongo-exporter
          image: xwin1989/mongodb-exportor:latest
          args: ["--mongodb.uri=mongodb://mongodb:27017/admin", "--discovering-mode", "--disable.replicasetstatus"]
          ports:
            - containerPort: 9216
              name: http
---
apiVersion: v1
kind: Service
metadata:
  name: mongo-exporter
  namespace: kube-prometheus
  labels:
    app: mongo-exporter
spec:
  ports:
    - port: 9216
      targetPort: 9216
  selector:
    app: mongo-exporter
grafana
apply yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: kube-prometheus
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: kube-prometheus
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:7.5.2
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: kube-prometheus
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer
Kubernetes
Target configuration
Static configuration
Service discovery configuration
Node
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
    - role: node
Service
Pod
Endpoints
Ingress
Monitoring
Container monitoring
Targets
Kubernetes nodes
CPU
Load
Disk
Memory
System components
kube-scheduler
kube-DNS/CoreDNS
Orchestration metrics
Deployment
Pod
Daemonset
StatefulSet
cAdvisor
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
Container metrics
CPU
Memory
Disk I/O
Network I/O
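Typical queries over the standard cAdvisor metrics for the four resource areas above (metric names are the usual cAdvisor ones, shown here as a sketch):

```
# per-container CPU usage rate
rate(container_cpu_usage_seconds_total[5m])

# working-set memory per container
container_memory_working_set_bytes

# network receive throughput per container
rate(container_network_receive_bytes_total[5m])
```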
apiserver monitoring
Configuration
- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
    - role: endpoints
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
Service monitoring
kube-state-metrics monitoring
Host monitoring
Grafana
Data sources
Dashboard
Plugins
Alerting
Slack
Exporter
Service categories
Online services
Offline services
Batch jobs
Collection methods
HTTP/HTTPS
TCP
Local files
Standard protocols
Internals
Node exporter
node_exporter.go
NodeCollector
execute update()
return system metric
Redis exporter
Info ALL
MySQL exporter
Required grants: process list, server location, databases, tables
read performance_schema
read information_schema
Client library
Uses atomic variables to keep concurrently written data consistent
CAS
Quantile calculation
Two buffers
Hot buffer
Cold buffer
Ordered sample set (stream)
Hot buffer swaps on size limit or timeout
Hot/cold buffer swap
Cold buffer samples inserted into a temporary set
Quicksort
Merged into the stream set
Data compression
Causes precision loss
Mongo exporter
serverStatus
electionMetrics
extra_info
flowControl
globalLock
locks
Collection
Database
Global
logicalSessionRecordCache
mem
metrics
Stage
commands
cursor
queryExecutor
repl
opcounters
network
security
storageEngine
tcmalloc
wiredTiger
twoPhaseCommitCoordinator
transactions
wiredTiger
systemMetrics
cpu
disks
memory
netstat
vmstat