mirror of
https://github.com/bitnami/charts.git
synced 2026-03-07 08:07:55 +08:00
[bitnami/spark] Adds support for Spark metrics (#4605)
* Initial metrics support
* Adds different annotations for master and worker
* Adds support for prometheus-operator, and update the values-production and README
* Fix indentation in values.yaml and values-production.yaml
* Fixes annotations, and use PodMonitor instead of ServiceMonitor
* Renames servicemonitor to podmonitor

Co-authored-by: rafael <rafael@bitnami.com>
Parent: 129c744a38
Commit: 53bec24c61
@@ -21,4 +21,4 @@ name: spark
 sources:
 - https://github.com/bitnami/bitnami-docker-spark
 - https://spark.apache.org/
-version: 4.1.0
+version: 4.2.0
@@ -47,17 +47,35 @@ The command removes all the Kubernetes components associated with the chart and

The following table lists the configurable parameters of the Spark chart and their default values.

### Global parameters

| Parameter                 | Description                                     | Default                                                 |
|---------------------------|-------------------------------------------------|---------------------------------------------------------|
| `global.imageRegistry`    | Global Docker image registry                    | `nil`                                                   |
| `global.imagePullSecrets` | Global Docker registry secret names as an array | `[]` (does not add image pull secrets to deployed pods) |

### Common parameters

| Parameter          | Description                                                                                               | Default |
|--------------------|-----------------------------------------------------------------------------------------------------------|---------|
| `nameOverride`     | String to partially override common.names.fullname template with a string (will prepend the release name) | `nil`   |
| `fullnameOverride` | String to fully override common.names.fullname template with a string                                     | `nil`   |

### Spark parameters

| Parameter           | Description                                                                                               | Default                                                 |
|---------------------|-----------------------------------------------------------------------------------------------------------|---------------------------------------------------------|
| `image.registry`    | Spark image registry                                                                                      | `docker.io`                                             |
| `image.repository`  | Spark image name                                                                                          | `bitnami/spark`                                         |
| `image.tag`         | Spark image tag                                                                                           | `{TAG_NAME}`                                            |
| `image.pullPolicy`  | Spark image pull policy                                                                                   | `IfNotPresent`                                          |
| `image.pullSecrets` | Specify docker-registry secret names as an array                                                          | `[]` (does not add image pull secrets to deployed pods) |
| `nameOverride`      | String to partially override common.names.fullname template with a string (will prepend the release name) | `nil`                                                   |
| `fullnameOverride`  | String to fully override common.names.fullname template with a string                                     | `nil`                                                   |

### Spark master parameters

| Parameter            | Description                                                           | Default |
|----------------------|-----------------------------------------------------------------------|---------|
| `master.debug`       | Specify if debug values should be set on the master                   | `false` |
| `master.webPort`     | Specify the port where the web interface will listen on the master    | `8080`  |
| `master.clusterPort` | Specify the port where the master listens to communicate with workers | `7077`  |

@@ -87,6 +105,11 @@ The following table lists the configurable parameters of the Spark chart and th

| `master.readinessProbe.timeoutSeconds`   | When the probe times out                                                                    | 5 |
| `master.readinessProbe.failureThreshold` | Minimum consecutive failures for the probe to be considered failed after having succeeded   | 6 |
| `master.readinessProbe.successThreshold` | Minimum consecutive successes for the probe to be considered successful after having failed | 1 |

### Spark worker parameters

| Parameter            | Description                                                              | Default |
|----------------------|--------------------------------------------------------------------------|---------|
| `worker.debug`       | Specify if debug values should be set on workers                         | `false` |
| `worker.webPort`     | Specify the port where the web interface will listen on the worker       | `8080`  |
| `worker.clusterPort` | Specify the port where the worker listens to communicate with the master | `7077`  |

@@ -124,6 +147,11 @@ The following table lists the configurable parameters of the Spark chart and th

| `master.extraEnvVars`      | Extra environment variables to pass to the worker container                                                                                | `{}`  |
| `worker.extraVolumes`      | Array of extra volumes to be added to the Spark worker deployment (evaluated as template). Requires setting `worker.extraVolumeMounts`     | `nil` |
| `worker.extraVolumeMounts` | Array of extra volume mounts to be added to the Spark worker deployment (evaluated as template). Normally used with `worker.extraVolumes`. | `nil` |

### Security parameters

| Parameter                            | Description                                                             | Default    |
|--------------------------------------|-------------------------------------------------------------------------|------------|
| `security.passwordsSecretName`       | Secret to use when using security configuration to set custom passwords | No default |
| `security.rpc.authenticationEnabled` | Enable the RPC authentication                                           | `false`    |
| `security.rpc.encryptionEnabled`     | Enable the encryption for RPC                                           | `false`    |

@@ -132,6 +160,11 @@ The following table lists the configurable parameters of the Spark chart and th

| `security.ssl.needClientAuth`     | Enable the client authentication                          | `false`    |
| `security.ssl.protocol`           | Set the SSL protocol                                      | `TLSv1.2`  |
| `security.certificatesSecretName` | Set the name of the secret that contains the certificates | No default |

### Exposure parameters

| Parameter             | Description             | Default     |
|-----------------------|-------------------------|-------------|
| `service.type`        | Kubernetes Service type | `ClusterIP` |
| `service.webPort`     | Spark client port       | `80`        |
| `service.clusterPort` | Spark cluster port      | `7077`      |

@@ -149,6 +182,26 @@ The following table lists the configurable parameters of the Spark chart and th

| `ingress.hosts[0].tlsHosts`  | Array of TLS hosts for ingress record (defaults to `ingress.hosts[0].name` if `nil`) | `nil`             |
| `ingress.hosts[0].tlsSecret` | TLS Secret (certificates)                                                            | `spark.local-tls` |

### Metrics parameters

| Parameter                                 | Description                                                                                                                    | Default                                                      |
|-------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------|
| `metrics.enabled`                         | Start a sidecar Prometheus exporter                                                                                            | `false`                                                      |
| `metrics.service.port`                    | Metrics service port                                                                                                           | `9117`                                                       |
| `metrics.service.annotations`             | Annotations for enabling Prometheus to access the metrics endpoints                                                            | `{prometheus.io/scrape: "true", prometheus.io/port: "9117"}` |
| `metrics.resources.limits`                | The resources limits for the metrics exporter container                                                                        | `{}`                                                         |
| `metrics.resources.requests`              | The requested resources for the metrics exporter container                                                                     | `{}`                                                         |
| `metrics.podMonitor.enabled`              | Create a PodMonitor resource for scraping metrics using Prometheus Operator                                                    | `false`                                                      |
| `metrics.podMonitor.namespace`            | Namespace where the PodMonitor resource should be created                                                                      | `nil`                                                        |
| `metrics.podMonitor.interval`             | Specify the interval at which metrics should be scraped                                                                        | `30s`                                                        |
| `metrics.podMonitor.scrapeTimeout`        | Specify the timeout after which the scrape is ended                                                                            | `nil`                                                        |
| `metrics.masterAnnotations`               | Annotations for enabling Prometheus to access the metrics endpoint of the master nodes                                         | `{prometheus.io/scrape: "true", prometheus.io/port: "8080"}` |
| `metrics.workerAnnotations`               | Annotations for enabling Prometheus to access the metrics endpoint of the worker nodes                                         | `{prometheus.io/scrape: "true", prometheus.io/port: "8081"}` |
| `metrics.prometheusRule.enabled`          | Set this to true to create PrometheusRule resources for Prometheus                                                             | `false`                                                      |
| `metrics.prometheusRule.additionalLabels` | Additional labels so that PrometheusRule resources will be discovered by Prometheus                                            | `{}`                                                         |
| `metrics.prometheusRule.namespace`        | Namespace where the PrometheusRule resource should be created                                                                  | the same namespace as spark                                  |
| `metrics.prometheusRule.rules`            | [Rules](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) to be created, check values for an example | `[]`                                                         |

Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`. For example,

```console
```
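As a concrete sketch of the `--set` pattern described above, the metrics support added in this commit could be enabled at install time like so (`my-release` is a placeholder release name; the parameters are those from the Metrics parameters table):

```console
$ helm install my-release bitnami/spark \
    --set metrics.enabled=true \
    --set metrics.podMonitor.enabled=true
```

Setting `metrics.podMonitor.enabled=true` only makes sense when the Prometheus Operator CRDs are already installed in the cluster, since the chart then creates a PodMonitor resource.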
bitnami/spark/templates/podmonitor.yaml (new file, 29 lines)
@@ -0,0 +1,29 @@
{{- if and .Values.metrics.enabled .Values.metrics.podMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: {{ include "common.names.fullname" . }}
  {{- if .Values.metrics.podMonitor.namespace }}
  namespace: {{ .Values.metrics.podMonitor.namespace }}
  {{- end }}
  labels:
    {{- include "common.labels.standard" . | nindent 4 }}
    app.kubernetes.io/component: metrics
    {{- if .Values.metrics.podMonitor.additionalLabels }}
    {{- toYaml .Values.metrics.podMonitor.additionalLabels | nindent 4 }}
    {{- end }}
spec:
  podMetricsEndpoints:
    - port: http
      {{- if .Values.metrics.podMonitor.interval }}
      interval: {{ .Values.metrics.podMonitor.interval }}
      {{- end }}
      {{- if .Values.metrics.podMonitor.scrapeTimeout }}
      scrapeTimeout: {{ .Values.metrics.podMonitor.scrapeTimeout }}
      {{- end }}
  namespaceSelector:
    matchNames:
      - {{ .Release.Namespace }}
  selector:
    matchLabels: {{- include "common.labels.matchLabels" . | nindent 6 }}
{{- end }}
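For orientation, under a hypothetical release named `my-release` in the `default` namespace, with `metrics.enabled=true`, `metrics.podMonitor.enabled=true`, and `metrics.podMonitor.interval=30s`, the template above would render roughly the following manifest (a sketch; the exact label set comes from the `common` library chart helpers):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: my-release-spark
  labels:
    app.kubernetes.io/component: metrics
spec:
  podMetricsEndpoints:
    - port: http
      interval: 30s
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app.kubernetes.io/instance: my-release
      app.kubernetes.io/name: spark
```

Note the selector matches the pods directly (via `common.labels.matchLabels`), which is why a PodMonitor is used here rather than a ServiceMonitor, per the commit message.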
bitnami/spark/templates/prometheusrule.yaml (new file, 23 lines)
@@ -0,0 +1,23 @@
{{- if and .Values.metrics.enabled .Values.metrics.prometheusRule.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: {{ include "common.names.fullname" . }}
  {{- with .Values.metrics.prometheusRule.namespace }}
  namespace: {{ . }}
  {{- end }}
  labels:
    {{- include "common.labels.standard" . | nindent 4 }}
    {{- with .Values.metrics.prometheusRule.additionalLabels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
  {{- if .Values.commonAnnotations }}
  annotations: {{- include "common.tplvalues.render" ( dict "value" .Values.commonAnnotations "context" $ ) | nindent 4 }}
  {{- end }}
spec:
  {{- with .Values.metrics.prometheusRule.rules }}
  groups:
    - name: {{ include "common.names.fullname" $ }}
      rules: {{ tpl (toYaml .) $ | nindent 8 }}
  {{- end }}
{{- end }}
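The `metrics.prometheusRule.rules` value feeds the `groups` section of the template above (each rule is evaluated through `tpl`, so it may reference `.Release` or `.Chart`). A hedged example of what such a values entry might look like — the alert name, metric, and job label here are illustrative only, not part of the chart:

```yaml
metrics:
  prometheusRule:
    enabled: true
    rules:
      - alert: SparkWorkersDown
        # Hypothetical expression: fires when any worker target is down.
        expr: count(up{job="{{ include "common.names.fullname" . }}-worker"} == 0) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: Some Spark workers are down
```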
@@ -17,8 +17,14 @@ spec:
       {{- if .Values.master.extraPodLabels }}
       {{- include "common.tplvalues.render" (dict "value" .Values.master.extraPodLabels "context" $) | nindent 8 }}
       {{- end }}
+      {{- if or .Values.master.podAnnotations .Values.metrics.enabled }}
+      annotations:
       {{- if .Values.master.podAnnotations }}
-      annotations: {{- include "common.tplvalues.render" (dict "value" .Values.master.podAnnotations "context" $) | nindent 8 }}
+      {{- include "common.tplvalues.render" (dict "value" .Values.master.podAnnotations "context" $) | nindent 8 }}
       {{- end }}
+      {{- if and .Values.metrics.enabled }}
+      {{- include "common.tplvalues.render" ( dict "value" .Values.metrics.masterAnnotations "context" $) | nindent 8 }}
+      {{- end }}
+      {{- end }}
     spec:
       {{- include "spark.imagePullSecrets" . | nindent 6 }}
@@ -66,6 +72,10 @@ spec:
             - name: BASH_DEBUG
               value: {{ ternary "1" "0" .Values.image.debug | quote }}
             {{- end }}
+            {{- if .Values.metrics.enabled }}
+            - name: SPARK_METRICS_ENABLED
+              value: "true"
+            {{- end }}
             - name: SPARK_DAEMON_MEMORY
               value: {{ .Values.master.daemonMemoryLimit | quote }}
             {{- if .Values.master.clusterPort }}
@@ -17,8 +17,14 @@ spec:
       {{- if .Values.worker.extraPodLabels }}
       {{- include "common.tplvalues.render" (dict "value" .Values.worker.extraPodLabels "context" $) | nindent 8 }}
       {{- end }}
+      {{- if or .Values.worker.podAnnotations .Values.metrics.enabled }}
+      annotations:
       {{- if .Values.worker.podAnnotations }}
-      annotations: {{- include "common.tplvalues.render" (dict "value" .Values.worker.podAnnotations "context" $) | nindent 8 }}
+      {{- include "common.tplvalues.render" (dict "value" .Values.worker.podAnnotations "context" $) | nindent 8 }}
       {{- end }}
+      {{- if and .Values.metrics.enabled }}
+      {{- include "common.tplvalues.render" ( dict "value" .Values.metrics.workerAnnotations "context" $) | nindent 8 }}
+      {{- end }}
+      {{- end }}
     spec:
       {{- include "spark.imagePullSecrets" . | nindent 6 }}
@@ -68,6 +74,10 @@ spec:
               value: {{ ternary "1" "0" .Values.image.debug | quote }}
             - name: SPARK_DAEMON_MEMORY
               value: {{ .Values.worker.daemonMemoryLimit | quote }}
+            {{- if .Values.metrics.enabled }}
+            - name: SPARK_METRICS_ENABLED
+              value: "true"
+            {{- end }}
             ## There are some environment variables whose existence needs
             ## to be checked because Spark checks if they are null instead of an
             ## empty string
@@ -259,6 +259,69 @@ worker:
  ## Max number of workers when using autoscaling
  replicasMax: 10

## Metrics configuration
##
metrics:
  enabled: true

  ## Prometheus metrics service parameters
  ##
  service:
    ## Metrics port
    ##
    port: 9117
    ## Annotations for the Prometheus metrics service
    ##
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "{{ .Values.metrics.service.port }}"

  ## Prometheus PodMonitor
  ## ref: https://github.com/coreos/prometheus-operator
  ## https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#endpoint
  ##
  podMonitor:
    ## If the operator is installed in your cluster, set to true to create a PodMonitor entry
    ##
    enabled: false
    ## Specify the namespace in which the podMonitor resource will be created
    # namespace: ""
    ## Specify the interval at which metrics should be scraped
    ##
    # interval: 30s
    ## Specify the timeout after which the scrape is ended
    # scrapeTimeout: 10s

  masterAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: "{{ .Values.master.webPort }}"

  workerAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: "{{ .Values.worker.webPort }}"

  ## Custom PrometheusRule to be defined
  ## The value is evaluated as a template, so, for example, the value can depend on .Release or .Chart
  ## ref: https://github.com/coreos/prometheus-operator#customresourcedefinitions
  ##
  prometheusRule:
    enabled: false
    additionalLabels: {}
    namespace: ''
    ## These are just example rules, please adapt them to your needs.
    ## Make sure to constrain the rules to the current postgresql service.
    ## rules:
    ##   - alert: HugeReplicationLag
    ##     expr: pg_replication_lag{service="{{ template "postgresql.fullname" . }}-metrics"} / 3600 > 1
    ##     for: 1m
    ##     labels:
    ##       severity: critical
    ##     annotations:
    ##       description: replication for {{ template "postgresql.fullname" . }} PostgreSQL is lagging by {{ "{{ $value }}" }} hour(s).
    ##       summary: PostgreSQL replication is lagging by {{ "{{ $value }}" }} hour(s).
    ##
    rules: []

## Security configuration
##
security:
@@ -276,6 +339,11 @@ security:
  ##
  storageEncryptionEnabled: true

+  ## Name of the secret that contains the certificates
+  ## It should contain two keys called "spark-keystore.jks" and "spark-truststore.jks" with the files in JKS format.
+  ##
+  certificatesSecretName: my-certificates-secret
+
  ## SSL configuration
  ##
  ssl:
@@ -283,11 +351,6 @@ security:
    needClientAuth: true
    protocol: TLSv1.2

-  ## Name of the secret that contains the certificates
-  ## It should contain two keys called "spark-keystore.jks" and "spark-truststore.jks" with the files in JKS format.
-  ##
-  certificatesSecretName: my-certificates-secret
-
  ## Service parameters
  ##
  service:
@@ -259,6 +259,69 @@ worker:
  ## Max number of workers when using autoscaling
  replicasMax: 5

## Metrics configuration
##
metrics:
  enabled: false

  ## Prometheus metrics service parameters
  ##
  service:
    ## Metrics port
    ##
    port: 9117
    ## Annotations for the Prometheus metrics service
    ##
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "{{ .Values.metrics.service.port }}"

  ## Prometheus PodMonitor
  ## ref: https://github.com/coreos/prometheus-operator
  ## https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#endpoint
  ##
  podMonitor:
    ## If the operator is installed in your cluster, set to true to create a PodMonitor entry
    ##
    enabled: false
    ## Specify the namespace in which the podMonitor resource will be created
    # namespace: ""
    ## Specify the interval at which metrics should be scraped
    ##
    # interval: 30s
    ## Specify the timeout after which the scrape is ended
    # scrapeTimeout: 10s

  masterAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: "{{ .Values.master.webPort }}"

  workerAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: "{{ .Values.worker.webPort }}"

  ## Custom PrometheusRule to be defined
  ## The value is evaluated as a template, so, for example, the value can depend on .Release or .Chart
  ## ref: https://github.com/coreos/prometheus-operator#customresourcedefinitions
  ##
  prometheusRule:
    enabled: false
    additionalLabels: {}
    namespace: ''
    ## These are just example rules, please adapt them to your needs.
    ## Make sure to constrain the rules to the current postgresql service.
    ## rules:
    ##   - alert: HugeReplicationLag
    ##     expr: pg_replication_lag{service="{{ template "postgresql.fullname" . }}-metrics"} / 3600 > 1
    ##     for: 1m
    ##     labels:
    ##       severity: critical
    ##     annotations:
    ##       description: replication for {{ template "postgresql.fullname" . }} PostgreSQL is lagging by {{ "{{ $value }}" }} hour(s).
    ##       summary: PostgreSQL replication is lagging by {{ "{{ $value }}" }} hour(s).
    ##
    rules: []

## Security configuration
##
security:
@@ -319,7 +382,7 @@ service:
  ## set the LoadBalancer service type to internal only.
  ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
  ##
-  annotations: {}
+  annotations:

## Ingress parameters
##