I ran into the following problem while upgrading the Loki Helm chart. Here is the official upgrade guide:
https://grafana.com/docs/loki/latest/setup/upgrade/
And here is my old values file, for chart version 5.22:
https://github.com/grafana/loki/blob/helm-loki-5.22.0/production/helm/loki/values.yaml
```yaml
loki:
  compactor:
    shared_store: s3
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
  storage:
    bucketNames:
      chunks: "${infra_alias}-${loki_chunks_bucket}"
      ruler: "${infra_alias}-${loki_ruler_bucket}"
      admin: "${infra_alias}-${loki_admin_bucket}"
    type: s3
    s3:
      region: "${region}"
  limits_config:
    retention_period: ${retention_period}
  server:
    grpc_server_max_recv_msg_size: 8388608
    grpc_server_max_send_msg_size: 8388608

tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "infra"
    effect: "NoSchedule"
  - key: "dedicated"
    operator: "Equal"
    value: "infra"
    effect: "NoSchedule"

serviceAccount:
  create: true
  name: "${loki_service_account}"
  annotations:
    eks.amazonaws.com/role-arn: "${aws_iam_role}"

monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: true
    alerting: true
    labels:
      release: kube-prometheus-stack
  serviceMonitor:
    enabled: true
    labels:
      release: kube-prometheus-stack
    interval: 15s
  lokiCanary:
    resources:
      requests:
        cpu: ${canary_requests_cpu}
        memory: ${canary_requests_memory}
      limits:
        cpu: ${canary_limits_cpu}
        memory: ${canary_limits_memory}
    tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"
  selfMonitoring:
    grafanaAgent:
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "infra"
          effect: "NoSchedule"

write:
  replicas: 3
  resources:
    requests:
      cpu: ${write_requests_cpu}
      memory: ${write_requests_memory}
    limits:
      cpu: ${write_limits_cpu}
      memory: ${write_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

read:
  replicas: 3
  resources:
    requests:
      cpu: ${read_requests_cpu}
      memory: ${read_requests_memory}
    limits:
      cpu: ${read_limits_cpu}
      memory: ${read_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

backend:
  replicas: 3
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

singleBinary:
  replicas: 0
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 100Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

gateway:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
```
When upgrading to chart version 6.22:
https://github.com/grafana/loki/blob/helm-loki-6.22.0/production/helm/loki/values.yaml
the values file has to be adjusted. Here is my resulting values for 6.22:
```yaml
deploymentMode: SimpleScalable

loki:
  compactor:
    delete_request_store: s3
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_delay: 2h
    retention_delete_worker_count: 150
  storage:
    bucketNames:
      chunks: "${infra_alias}-${loki_chunks_bucket}"
      ruler: "${infra_alias}-${loki_ruler_bucket}"
      admin: "${infra_alias}-${loki_admin_bucket}"
    type: s3
    s3:
      region: "${region}"
  schemaConfig:
    configs:
      - from: 2022-01-11
        store: boltdb-shipper
        object_store: s3
        schema: v12
        index:
          prefix: loki_index_
          period: 24h
      - from: 2024-11-28
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  storageConfig:
    tsdb_shipper:
      active_index_directory: var/loki/index
      cache_location: var/loki/index_cache
      cache_ttl: 24h
  limits_config:
    retention_period: ${retention_period}
  server:
    grpc_server_max_recv_msg_size: 8388608
    grpc_server_max_send_msg_size: 8388608

tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "infra"
    effect: "NoSchedule"

serviceAccount:
  create: true
  name: "${loki_service_account}"
  annotations:
    eks.amazonaws.com/role-arn: "${aws_iam_role}"

monitoring:
  dashboards:
    enabled: false
  rules:
    enabled: true
    alerting: true
  serviceMonitor:
    enabled: true
    interval: 15s

lokiCanary:
  resources:
    requests:
      cpu: ${canary_requests_cpu}
      memory: ${canary_requests_memory}
    limits:
      cpu: ${canary_limits_cpu}
      memory: ${canary_limits_memory}
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

selfMonitoring:
  grafanaAgent:
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
    tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "infra"
        effect: "NoSchedule"

write:
  replicas: 3
  resources:
    requests:
      cpu: ${write_requests_cpu}
      memory: ${write_requests_memory}
    limits:
      cpu: ${write_limits_cpu}
      memory: ${write_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

read:
  replicas: 3
  resources:
    requests:
      cpu: ${read_requests_cpu}
      memory: ${read_requests_memory}
    limits:
      cpu: ${read_limits_cpu}
      memory: ${read_limits_memory}
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

backend:
  replicas: 3
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 10Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

singleBinary:
  replicas: 0
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  persistence:
    enableStatefulSetAutoDeletePVC: true
    size: 100Gi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

gateway:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 768Mi
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
  ## We used nginxConfig and extraContainers because they fix the issue. The explanation is provided below:
  ## https://github.com/grafana/loki/issues/9522#issuecomment-2132161625
  nginxConfig:
    serverSnippet: |
      location = /stub_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
      }
      location = /metrics {
        proxy_pass http://127.0.0.1:9113/metrics;
      }
  extraContainers:
    - name: nginx-exporter
      securityContext:
        allowPrivilegeEscalation: false
      image: nginx/nginx-prometheus-exporter:1.3.0
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9113
          name: http-exporter
      resources:
        limits:
          memory: 128Mi
          cpu: 500m
        requests:
          memory: 64Mi
          cpu: 100m

chunksCache:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"

resultsCache:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "infra"
      effect: "NoSchedule"
```
In other words, comparing the two files, I removed the compactor's `shared_store: s3` (in chart 6.x it is replaced by `delete_request_store: s3`) and added:
```yaml
schemaConfig:
  configs:
    - from: 2022-01-11
      store: boltdb-shipper
      object_store: s3
      schema: v12
      index:
        prefix: loki_index_
        period: 24h
    - from: 2024-11-28
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: loki_index_
        period: 24h
storageConfig:
  tsdb_shipper:
    active_index_directory: var/loki/index
    cache_location: var/loki/index_cache
    cache_ttl: 24h
```
Here `store: boltdb-shipper` is the old index store and `store: tsdb` is the new one.
Note that for the tsdb entry you must set `from:` to the date of the day you perform the migration, so that old data stays visible. In my case that is:

- from: 2024-11-28

If I had instead set 2024-11-27, then even though the old boltdb-shipper store does hold data for that day, I would not see it, because the tsdb store has no data for 2024-11-27.
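To make the date logic above concrete, here is a small illustrative sketch (not Loki's actual code; the names are made up) of how a query timestamp maps to a schema period: each entry applies from its `from:` date until the next entry's `from:` date.

```python
from datetime import date

# Illustrative model of the two schemaConfig entries from the values file.
SCHEMA_CONFIGS = [
    {"from": date(2022, 1, 11), "store": "boltdb-shipper", "schema": "v12"},
    {"from": date(2024, 11, 28), "store": "tsdb", "schema": "v13"},
]

def active_config(query_date: date) -> dict:
    """Return the schema entry covering query_date: the last entry
    whose `from` date is on or before it."""
    chosen = None
    for cfg in sorted(SCHEMA_CONFIGS, key=lambda c: c["from"]):
        if cfg["from"] <= query_date:
            chosen = cfg
    if chosen is None:
        raise ValueError("date predates all schema periods")
    return chosen

# Data written on 2024-11-27 still falls in the boltdb-shipper period;
# from 2024-11-28 onward reads and writes go to tsdb.
assert active_config(date(2024, 11, 27))["store"] == "boltdb-shipper"
assert active_config(date(2024, 11, 28))["store"] == "tsdb"
```

This is why the tsdb `from:` date must be the migration day: a query for 2024-11-27 is routed to whichever store the config says covered that day, and tsdb has nothing there.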
https://github.com/grafana/loki/issues/9522#issuecomment-2132161625
Following the workaround from the issue above, we added an nginx exporter sidecar and a `/metrics` location:
```yaml
nginxConfig:
  serverSnippet: |
    location = /stub_status {
      stub_status on;
      allow 127.0.0.1;
      deny all;
    }
    location = /metrics {
      proxy_pass http://127.0.0.1:9113/metrics;
    }
extraContainers:
  - name: nginx-exporter
    securityContext:
      allowPrivilegeEscalation: false
    image: nginx/nginx-prometheus-exporter:1.3.0
    imagePullPolicy: IfNotPresent
    ports:
      - containerPort: 9113
        name: http-exporter
    resources:
      limits:
        memory: 128Mi
        cpu: 500m
      requests:
        memory: 64Mi
        cpu: 100m
```
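With the values edited, the upgrade itself is an ordinary `helm upgrade`. The release name `loki` and namespace `loki` below are assumptions for my setup; adjust them to yours:

```shell
# Assumed release name and namespace; adjust to your installation.
helm repo update

# Optional: preview the rendered changes first (needs the helm-diff plugin).
helm diff upgrade loki grafana/loki --version 6.22.0 -n loki -f values.yaml

# Apply the upgrade.
helm upgrade loki grafana/loki --version 6.22.0 -n loki -f values.yaml

# Sanity check: the exporter sidecar should serve Prometheus metrics on 9113.
kubectl -n loki port-forward deploy/loki-gateway 9113:9113 &
curl -s http://127.0.0.1:9113/metrics | head
```

The `kubectl`/`curl` check hits the exporter container directly; scrapes through the gateway's `/metrics` location should return the same output.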