[bitnami/spark] Release 4.1.1-debian-12-r6 (#91154)

Signed-off-by: Bitnami Bot <bitnami.bot@broadcom.com>
Authored by Bitnami Bot on 2026-02-27 16:03:31 +01:00, committed by GitHub.
parent 8d845ff11c
commit c8264f1daa
2 changed files with 61 additions and 138 deletions

Dockerfile

@@ -8,7 +8,7 @@ ARG JAVA_EXTRA_SECURITY_DIR="/bitnami/java/extra-security"
ARG TARGETARCH
LABEL org.opencontainers.image.base.name="docker.io/bitnami/minideb:bookworm" \
-      org.opencontainers.image.created="2026-02-19T04:59:03Z" \
+      org.opencontainers.image.created="2026-02-27T14:11:56Z" \
org.opencontainers.image.description="Application packaged by Broadcom, Inc." \
org.opencontainers.image.documentation="https://github.com/bitnami/containers/tree/main/bitnami/spark/README.md" \
org.opencontainers.image.source="https://github.com/bitnami/containers/tree/main/bitnami/spark" \
@@ -31,9 +31,9 @@ RUN --mount=type=secret,id=downloads_url,env=SECRET_DOWNLOADS_URL \
mkdir -p /tmp/bitnami/pkg/cache/ ; cd /tmp/bitnami/pkg/cache/ || exit 1 ; \
COMPONENTS=( \
"nss-wrapper-1.1.16-0-linux-${OS_ARCH}-debian-12" \
-      "python-3.12.12-16-linux-${OS_ARCH}-debian-12" \
+      "python-3.12.12-23-linux-${OS_ARCH}-debian-12" \
"jre-17.0.18-10-1-linux-${OS_ARCH}-debian-12" \
-      "spark-4.1.1-1-linux-${OS_ARCH}-debian-12" \
+      "spark-4.1.1-2-linux-${OS_ARCH}-debian-12" \
) ; \
for COMPONENT in "${COMPONENTS[@]}"; do \
if [ ! -f "${COMPONENT}.tar.gz" ]; then \
@@ -56,7 +56,7 @@ RUN /opt/bitnami/scripts/spark/postunpack.sh
RUN /opt/bitnami/scripts/java/postunpack.sh
ENV APP_VERSION="4.1.1" \
BITNAMI_APP_NAME="spark" \
-    IMAGE_REVISION="5" \
+    IMAGE_REVISION="6" \
JAVA_HOME="/opt/bitnami/java" \
LD_LIBRARY_PATH="/opt/bitnami/python/lib:/opt/bitnami/spark/venv/lib/python3.12/site-packages/numpy.libs:$LD_LIBRARY_PATH" \
LIBNSS_WRAPPER_PATH="/opt/bitnami/common/lib/libnss_wrapper.so" \

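The component bumps above (`python-3.12.12-23`, `spark-4.1.1-2`) feed the Dockerfile's download loop. As a rough, standalone illustration of what that loop iterates over, this sketch derives the per-architecture tarball names; `OS_ARCH` is hardcoded here for the example (in the real build it comes from `TARGETARCH`), and the actual download, checksum verification, and unpack steps are omitted.

```shell
# Illustrative only: build the tarball names the Dockerfile's COMPONENTS
# loop would fetch for one architecture.
OS_ARCH="amd64"  # the real Dockerfile maps this from TARGETARCH
COMPONENTS=(
  "nss-wrapper-1.1.16-0-linux-${OS_ARCH}-debian-12"
  "python-3.12.12-23-linux-${OS_ARCH}-debian-12"
  "jre-17.0.18-10-1-linux-${OS_ARCH}-debian-12"
  "spark-4.1.1-2-linux-${OS_ARCH}-debian-12"
)
for COMPONENT in "${COMPONENTS[@]}"; do
  # The real loop downloads "${COMPONENT}.tar.gz" if it is not cached,
  # verifies its checksum, and unpacks it under /opt/bitnami.
  echo "${COMPONENT}.tar.gz"
done
```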
README.md

@@ -1,15 +1,13 @@
# Bitnami Secure Image for Apache Spark
## What is Apache Spark?
> Apache Spark is a high-performance engine for large-scale computing tasks, such as data processing, machine learning and real-time data streaming. It includes APIs for Java, Python, Scala and R.
[Overview of Apache Spark](https://spark.apache.org/)
Trademarks: This software listing is packaged by Bitnami. The respective trademarks mentioned in the offering are owned by the respective companies, and use of them does not imply any affiliation or endorsement.
-## TL;DR
+## <a id="tl-dr"></a> TL;DR
-### Docker Compose
+### <a id="docker-compose"></a> Docker Compose
```console
docker run --name spark bitnami/spark:latest
@@ -17,7 +15,7 @@ docker run --name spark bitnami/spark:latest
You can find the available configuration options in the [Environment Variables](#environment-variables) section.
-## Why use Bitnami Secure Images?
+## <a id="why-use-bitnami-secure-images"></a> Why use Bitnami Secure Images?
Those are hardened, minimal CVE images built and maintained by Bitnami. Bitnami Secure Images are based on the cloud-optimized, security-hardened enterprise [OS Photon Linux](https://vmware.github.io/photon/). Why choose BSI images?
@@ -34,19 +32,19 @@ Each image comes with valuable security metadata. You can view the metadata in [
If you are looking for our previous generation of images based on Debian Linux, please see the [Bitnami Legacy registry](https://hub.docker.com/u/bitnamilegacy).
-## How to deploy Apache Spark in Kubernetes?
+## <a id="deploy-in-kubernetes"></a> How to deploy Apache Spark in Kubernetes
Deploying Bitnami applications as Helm Charts is the easiest way to get started with our applications on Kubernetes. Read more about the installation in the [Bitnami Apache Spark Chart GitHub repository](https://github.com/bitnami/charts/tree/master/bitnami/spark).
-## Why use a non-root container?
+## <a id="why-non-root"></a> Why use a non-root container?
Non-root container images add an extra layer of security and are generally recommended for production environments. However, because they run as a non-root user, privileged tasks are typically off-limits. Learn more about non-root containers [in our docs](https://techdocs.broadcom.com/us/en/vmware-tanzu/application-catalog/tanzu-application-catalog/services/tac-doc/apps-tutorials-work-with-non-root-containers-index.html).
-## Supported tags and respective `Dockerfile` links
+## <a id="supported-tags"></a> Supported tags and respective `Dockerfile` links
Learn more about the Bitnami tagging policy and the difference between rolling tags and immutable tags [in our documentation page](https://techdocs.broadcom.com/us/en/vmware-tanzu/application-catalog/tanzu-application-catalog/services/tac-doc/apps-tutorials-understand-rolling-tags-containers-index.html).
-## Get this image
+## <a id="get-this-image"></a> Get this image
The recommended way to get the Bitnami Apache Spark Docker Image is to pull the prebuilt image from the [Docker Hub Registry](https://hub.docker.com/r/bitnami/spark).
@@ -70,9 +68,17 @@ cd bitnami/APP/VERSION/OPERATING-SYSTEM
docker build -t bitnami/APP:latest .
```
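For this particular release, the generic `APP`/`VERSION`/`OPERATING-SYSTEM` placeholders in the build instructions above would plausibly resolve as follows; the directory layout is an assumption inferred from this commit's tag (`4.1.1-debian-12-r6`), not confirmed by the source.

```shell
# Hypothetical concrete values for this release; adjust if the repository
# layout differs.
APP="spark"
VERSION="4.1.1"
OPERATING_SYSTEM="debian-12"
echo "cd bitnami/${APP}/${VERSION}/${OPERATING_SYSTEM}"
echo "docker build -t bitnami/${APP}:latest ."
```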
-## Configuration
+## <a id="using-docker-compose"></a> Using `docker-compose.yaml`
-### Environment variables
+Please be aware this file has not undergone internal testing. Consequently, we advise its use exclusively for development or testing purposes. For production-ready deployments, we highly recommend utilizing its associated [Bitnami Helm chart](https://github.com/bitnami/charts/tree/main/bitnami/spark).
+## <a id="configuration"></a> Configuration
+The following sections describe environment variables and related settings.
+### <a id="environment-variables"></a> Environment variables
+The following tables list the main variables you can set.
#### Customizable environment variables
@@ -117,30 +123,11 @@ Additionally, more environment variables natively supported by Apache Spark can
For example, you could still use `SPARK_WORKER_CORES` or `SPARK_WORKER_MEMORY` to configure the number of cores and the amount of memory to be used by a worker machine.
-When you start the spark image, you can adjust the configuration of the instance by passing one or more environment variables either on the docker-compose file or on the `docker run` command line. If you want to add a new environment variable:
+When you start the spark image, you can adjust the configuration of the instance by passing one or more environment variables either on the docker-compose file or on the `docker run` command line.
-- For docker-compose add the variable name and value under the application section in the [`docker-compose.yml`](https://github.com/bitnami/containers/blob/main/bitnami/spark/docker-compose.yml) file present in this repository:
+### <a id="security"></a> Security
-```yaml
-spark:
-  ...
-  environment:
-    - SPARK_MODE=master
-  ...
-```
-- For manual execution add a -e option with each variable and value:
-```console
-docker run -d --name spark \
-  --network=spark_network \
-  -e SPARK_MODE=master \
-  bitnami/spark
-```
-### Security
-The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC encryption and local storage encryption easily using the following env vars in all the nodes of the cluster.
+The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC encryption and local storage encryption easily using the following environment variables in all the nodes of the cluster.
```diff
+ SPARK_RPC_AUTHENTICATION_ENABLED=yes
@@ -154,7 +141,7 @@ The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC
Additionally, SSL configuration can be easily activated following the next steps:
-1. Enable SSL configuration by setting the following env vars:
+1. Enable SSL configuration by setting the following environment variables:
```diff
+ SPARK_SSL_ENABLED=yes
@@ -167,11 +154,11 @@ Additionally, SSL configuration can be easily activated following the next steps
> Please note that `KEY_PASSWORD`, `KEYSTORE_PASSWORD`, and `TRUSTSTORE_PASSWORD` are placeholders that needs to be updated with a correct value.
-2. You need to mount your spark keystore and truststore files to `/opt/bitnami/spark/conf/certs`. Please note they should be called `spark-keystore.jks` and `spark-truststore.jks` and they should be in JKS format.
+2. You need to mount your spark `keystore` and `truststore` files to `/opt/bitnami/spark/conf/certs`. Please note they should be called `spark-keystore.jks` and `spark-truststore.jks` and they should be in JKS format.
-### Setting up an Apache Spark Cluster
+### <a id="setting-up-spark-cluster"></a> Setting up an Apache Spark cluster
-A Apache Spark cluster can easily be setup with the default docker-compose.yml file from the root of this repo. The docker-compose includes two different services, `spark-master` and `spark-worker.`
+An Apache Spark cluster can easily be set up with the default docker-compose.yml file from the root of this repository. The docker-compose file includes two services, `spark-master` and `spark-worker`.
By default, when you deploy the docker-compose file you will get an Apache Spark cluster with 1 master and 1 worker.
@@ -181,28 +168,11 @@ If you want N workers, all you need to do is start the docker-compose deployment
docker-compose up --scale spark-worker=3
```
-### Mount a custom configuration file
+### <a id="mount-custom-configuration"></a> Mount a custom configuration file
The image looks for configuration in the `conf/` directory of `/opt/bitnami/spark`.
-#### Using docker-compose
-```yaml
-...
-volumes:
-  - /path/to/spark-defaults.conf:/opt/bitnami/spark/conf/spark-defaults.conf
-...
-```
-#### Using the command line
-```console
-docker run --name spark -v /path/to/spark-defaults.conf:/opt/bitnami/spark/conf/spark-defaults.conf bitnami/spark:latest
-```
After that, your changes will be taken into account in the server's behaviour.
-### Installing additional jars
+### <a id="installing-additional-jars"></a> Installing additional jars
By default, this container bundles a generic set of jar files but the default image can be extended to add as many jars as needed for your specific use case. For instance, the following Dockerfile adds [`aws-java-sdk-bundle-1.11.704.jar`](https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-bundle/1.11.704):
@@ -218,7 +188,7 @@ RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.7
In a similar way that in the previous section, you may want to use a different version of Hadoop jars.
-Go to <https://spark.apache.org/downloads.html> and copy the download url bundling the Hadoop version you want and matching the Apache Spark version of the container. Extend the Bitnami container image as below:
+Go to <https://spark.apache.org/downloads.html> and copy the download URL bundling the Hadoop version you want and matching the Apache Spark version of the container. Extend the Bitnami container image as below:
```Dockerfile
FROM bitnami/spark:latest
@@ -238,13 +208,13 @@ $ pyspark
'2.7.4'
```
-### FIPS configuration in Bitnami Secure Images
+### <a id="fips-configuration"></a> FIPS configuration in Bitnami Secure Images
The Bitnami Apache Spark Docker image from the [Bitnami Secure Images](https://go-vmware.broadcom.com/contact-us) catalog includes extra features and settings to configure the container with FIPS capabilities. You can configure the next environment variables:
- `OPENSSL_FIPS`: whether OpenSSL runs in FIPS mode or not. `yes` (default), `no`.
-## Logging
+## <a id="logging"></a> Logging
The Bitnami Apache Spark Docker image sends the container logs to the `stdout`. To view the logs:
@@ -260,43 +230,45 @@ docker-compose logs spark
You can configure the containers [logging driver](https://docs.docker.com/engine/admin/logging/overview/) using the `--log-driver` option if you wish to consume the container logs differently. In the default configuration docker uses the `json-file` driver.
-## Maintenance
+## <a id="maintenance"></a> Maintenance
-### Backing up your container
+The following sections describe how to back up, restore, and upgrade the image.
+### <a id="backing-up-container"></a> Backing up your container
To backup your data, configuration and logs, follow these simple steps:
-#### Step 1: Stop the currently running container
+1. Stop the currently running container
-```console
-docker stop spark
-```
+    ```console
+    docker stop spark
+    ```
-or using Docker Compose:
+    or using Docker Compose:
-```console
-docker-compose stop spark
-```
+    ```console
+    docker-compose stop spark
+    ```
-#### Step 2: Run the backup command
+2. Run the backup command
-We need to mount two volumes in a container we will use to create the backup: a directory on your host to store the backup in, and the volumes from the container we just stopped so we can access the data.
+    We need to mount two volumes in a container we will use to create the backup: a directory on your host to store the backup in, and the volumes from the container we just stopped so we can access the data.
-```console
-docker run --rm -v /path/to/spark-backups:/backups --volumes-from spark busybox \
-  cp -a /bitnami/spark /backups/latest
-```
+    ```console
+    docker run --rm -v /path/to/spark-backups:/backups --volumes-from spark busybox \
+      cp -a /bitnami/spark /backups/latest
+    ```
-or using Docker Compose:
+    or using Docker Compose:
-```console
-docker run --rm -v /path/to/spark-backups:/backups --volumes-from `docker-compose ps -q spark` busybox \
-  cp -a /bitnami/spark /backups/latest
-```
+    ```console
+    docker run --rm -v /path/to/spark-backups:/backups --volumes-from `docker-compose ps -q spark` busybox \
+      cp -a /bitnami/spark /backups/latest
+    ```
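The `cp -a` in the backup commands archives the data directory recursively while preserving ownership, permissions, and timestamps. This standalone sketch mimics that copy step on local temporary directories (no Docker involved), with a made-up `spark-defaults.conf` standing in for the real container data:

```shell
# Stand-in for the container's /bitnami/spark data directory.
src="$(mktemp -d)/bitnami/spark"
mkdir -p "${src}/conf"
echo "spark.master spark://master:7077" > "${src}/conf/spark-defaults.conf"

# Stand-in for the mounted /backups volume; 'cp -a' archives the whole tree.
backups="$(mktemp -d)"
cp -a "${src}" "${backups}/latest"
ls "${backups}/latest/conf"
```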
-### Restoring a backup
+### <a id="restoring-backup"></a> Restoring your backup
-Restoring a backup is as simple as mounting the backup as volumes in the container.
+To restore your backup, mount the backup as volumes in the container.
```console
docker run -v /path/to/spark-backups/latest:/bitnami/spark bitnami/spark:latest
@@ -313,52 +285,9 @@ services:
...
```
-### Upgrade this image
+## <a id="notable-changes"></a> Notable changes
-Bitnami provides up-to-date versions of spark, including security patches, soon after they are made upstream. We recommend that you follow these steps to upgrade your container.
-#### Step 1: Get the updated image
-```console
-docker pull bitnami/spark:latest
-```
-or if you're using Docker Compose, update the value of the image property to
-`bitnami/spark:latest`.
-#### Step 2: Stop and backup the currently running container
-Before continuing, you should backup your container's data, configuration and logs.
-Follow the steps on [creating a backup](#backing-up-your-container).
-#### Step 3: Remove the currently running container
-```console
-docker rm -v spark
-```
-or using Docker Compose:
-```console
-docker-compose rm -v spark
-```
-#### Step 4: Run the new image
-Re-create your container from the new image, [restoring your backup](#restoring-a-backup) if necessary.
-```console
-docker run --name spark bitnami/spark:latest
-```
-or using Docker Compose:
-```console
-docker-compose up spark
-```
-## Notable Changes
+The following subsections describe notable changes.
### 4.0.0-debian-10-r2
@@ -372,13 +301,7 @@ docker-compose up spark
- This image now has an aws-cli and two jars: `hadoop-aws` and `aws-java-sdk` for provide an easier way to use AWS.
-## Using `docker-compose.yaml`
-Please be aware this file has not undergone internal testing. Consequently, we advise its use exclusively for development or testing purposes. For production-ready deployments, we highly recommend utilizing its associated [Bitnami Helm chart](https://github.com/bitnami/charts/tree/main/bitnami/spark).
If you detect any issue in the `docker-compose.yaml` file, feel free to report it or contribute with a fix by following our [Contributing Guidelines](https://github.com/bitnami/containers/blob/main/CONTRIBUTING.md).
-## License
+## <a id="license"></a> License
Copyright &copy; 2026 Broadcom. The term "Broadcom" refers to Broadcom Inc. and/or its subsidiaries.