I ran into the following problem while upgrading the Loki chart. Here is the official documentation:
https://grafana.com/docs/loki/latest/setup/upgrade/
Here is my old values file for chart version 5.22:
https://github.com/grafana/loki/blob/helm-loki-5.22.0/production/helm/loki/values.yaml
loki:
  compactor:
    shared_store: s3
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
  storage:
    bucketNames:
      chunks: "${infra_alias}-${loki_chunks_bucket}"
      ruler: "${infra_alias}-${loki_ruler_bucket}"
      admin: "${infra_alias}-${loki_admin_bucket}"
    type: s3
    s3:
      region: "${region}"
  limits_config:
    retention_period: ${retention_period}
  server:
    grpc_server_max_recv_msg_size: 8388608
    grpc_server_max_send_msg_size: 8388608
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
serviceAccount:
  create: true
  name: "${loki_service_account}"
  annotations:
    eks.amazonaws.com/role-arn: "${aws_iam_role}"
monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: true
    alerting: true
    labels:
      release: kube-prometheus-stack
  serviceMonitor:
    enabled: true
    labels:
      release: kube-prometheus-stack
    interval: 15s
  lokiCanary:
    resources:
      requests:
        cpu: ${canary_requests_cpu}
        memory: ${canary_requests_memory}
      limits:
        cpu: ${canary_limits_cpu}
        memory: ${canary_limits_memory}
    tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"
  selfMonitoring:
    grafanaAgent:
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "infra"
          effect: "NoSchedule"
write:
  replicas: 3
  resources:
    requests:
      cpu: ${write_requests_cpu}
      memory: ${write_requests_memory}
    limits:
      cpu: ${write_limits_cpu}
      memory: ${write_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
read:
  replicas: 3
  resources:
    requests:
      cpu: ${read_requests_cpu}
      memory: ${read_requests_memory}
    limits:
      cpu: ${read_limits_cpu}
      memory: ${read_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
backend:
  replicas: 3
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
singleBinary:
  replicas: 0
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 100Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
gateway:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
When upgrading to version 6.22
https://github.com/grafana/loki/blob/helm-loki-6.22.0/production/helm/loki/values.yaml
the values file has to be edited: some settings are removed and others are added. The updated file:
deploymentMode: SimpleScalable
loki:
  compactor:
    delete_request_store: s3
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
  storage:
    bucketNames:
      chunks: "${infra_alias}-${loki_chunks_bucket}"
      ruler: "${infra_alias}-${loki_ruler_bucket}"
      admin: "${infra_alias}-${loki_admin_bucket}"
    type: s3
    s3:
      region: "${region}"
  schemaConfig:
    configs:
      - from: 2022-01-11
        store: boltdb-shipper
        object_store: s3
        schema: v12
        index:
          prefix: loki_index_
          period: 24h
      - from: 2024-11-28
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storageConfig:
    tsdb_shipper:
      active_index_directory: var/loki/index
      cache_location: var/loki/index_cache
      cache_ttl: 24h
  limits_config:
    retention_period: ${retention_period}
  server:
    grpc_server_max_recv_msg_size: 8388608
    grpc_server_max_send_msg_size: 8388608
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
serviceAccount:
  create: true
  name: "${loki_service_account}"
  annotations:
    eks.amazonaws.com/role-arn: "${aws_iam_role}"
monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: true
    alerting: true
  serviceMonitor:
    enabled: true
    interval: 15s
  lokiCanary:
    resources:
      requests:
        cpu: ${canary_requests_cpu}
        memory: ${canary_requests_memory}
      limits:
        cpu: ${canary_limits_cpu}
        memory: ${canary_limits_memory}
    tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"
  selfMonitoring:
    grafanaAgent:
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "infra"
          effect: "NoSchedule"
write:
  replicas: 3
  resources:
    requests:
      cpu: ${write_requests_cpu}
      memory: ${write_requests_memory}
    limits:
      cpu: ${write_limits_cpu}
      memory: ${write_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
read:
  replicas: 3
  resources:
    requests:
      cpu: ${read_requests_cpu}
      memory: ${read_requests_memory}
    limits:
      cpu: ${read_limits_cpu}
      memory: ${read_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
backend:
  replicas: 3
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
singleBinary:
  replicas: 0
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 100Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
gateway:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
  ## We used nginxConfig and extraContainers because they fix the issue. The explanation is provided below
  ## https://github.com/grafana/loki/issues/9522#issuecomment-2132161625
  nginxConfig:
    serverSnippet: |
      location = /stub_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
      }
      location = /metrics {
        proxy_pass http://127.0.0.1:9113/metrics;
      }
  extraContainers:
    - name: nginx-exporter
      securityContext:
        allowPrivilegeEscalation: false
      image: nginx/nginx-prometheus-exporter:1.3.0
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9113
          name: http-exporter
      resources:
        limits:
          memory: 128Mi
          cpu: 500m
        requests:
          memory: 64Mi
          cpu: 100m
chunksCache:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
resultsCache:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
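The compactor block is one place where the two values files differ. As a small sketch based only on those two files, the shared_store key from the 5.22 values is dropped and delete_request_store takes its place, while the other compactor keys stay as they were:

```yaml
loki:
  compactor:
    # shared_store: s3          # present in the 5.22 values, gone in the 6.22 values
    delete_request_store: s3    # new key in the 6.22 values
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
```

The commented-out line is kept here only to show the old key next to the new one; it should not stay in a real values file.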
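In other words, I removed the settings that no longer appear in the new file (for example shared_store: s3 under compactor, whose place is taken by delete_request_store: s3) and added: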
schemaConfig:
  configs:
    - from: 2022-01-11
      store: boltdb-shipper
      object_store: s3
      schema: v12
      index:
        prefix: loki_index_
        period: 24h
    - from: 2024-11-28
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: loki_index_
        period: 24h
storageConfig:
  tsdb_shipper:
    active_index_directory: var/loki/index
    cache_location: var/loki/index_cache
    cache_ttl: 24h
In this fragment, store: boltdb-shipper is the old index store and store: tsdb is the new one.
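Note that the from date of the tsdb entry has to be the day on which you perform the migration (not an earlier one), so that the old data stays visible. In my case that is
- from: 2024-11-28
If you set it to 2024-11-27 instead, you will not see the data for that day even though it exists in the old boltdb-shipper store: Loki will look for 2024-11-27 in the tsdb store, which has nothing for that date. The Loki documentation goes one step further and recommends a from date in the future, so that no data already written falls under the new entry.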
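To make the date rule concrete, here is the same schemaConfig fragment again with comments marking which entry serves which dates. It is only an annotated sketch of the configuration already shown, not an extra setting; the comments rely on the fact that each entry applies from its own from date until the from date of the next entry.

```yaml
schemaConfig:
  configs:
    # Serves data from 2022-01-11 up to (not including) 2024-11-28.
    # Queries for these days still go through the old boltdb-shipper index.
    - from: 2022-01-11
      store: boltdb-shipper
      object_store: s3
      schema: v12
      index:
        prefix: loki_index_
        period: 24h
    # Serves data from 2024-11-28 onward: written to and read from TSDB.
    # Moving this date back to 2024-11-27 would make Loki look up that day
    # in TSDB, where no index exists for it, so the day would appear empty.
    - from: 2024-11-28
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: loki_index_
        period: 24h
```

The old entry is kept in the list on purpose: removing it would leave Loki without a schema for everything written before the switch.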
https://github.com/grafana/loki/issues/9522#issuecomment-2132161625
Following that comment, we added an exporter and a /metrics location:
nginxConfig:
  serverSnippet: |
    location = /stub_status {
      stub_status on;
      allow 127.0.0.1;
      deny all;
    }
    location = /metrics {
      proxy_pass http://127.0.0.1:9113/metrics;
    }
extraContainers:
  - name: nginx-exporter
    securityContext:
      allowPrivilegeEscalation: false
    image: nginx/nginx-prometheus-exporter:1.3.0
    imagePullPolicy: IfNotPresent
    ports:
      - containerPort: 9113
        name: http-exporter
    resources:
      limits:
        memory: 128Mi
        cpu: 500m
      requests:
        memory: 64Mi
        cpu: 100m