mirror of https://github.com/bitnami/containers.git
synced 2026-03-29 00:09:34 +08:00
[bitnami/spark] Release 4.1.1-debian-12-r6 (#91154)
Signed-off-by: Bitnami Bot <bitnami.bot@broadcom.com>
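The release title encodes the image tag published for this revision. As a hypothetical sketch (the version and revision values are taken from this commit's Dockerfile changes; the `<version>-<os>-r<revision>` tag layout is an assumption based on the release title, not confirmed by the diff itself), the corresponding immutable tag can be assembled like this:

```shell
# Values taken from this commit's Dockerfile diff (APP_VERSION, IMAGE_REVISION);
# the <version>-<os>-r<revision> tag layout is assumed from the release title.
APP_VERSION="4.1.1"
OS_FLAVOUR="debian-12"
IMAGE_REVISION="6"
TAG="${APP_VERSION}-${OS_FLAVOUR}-r${IMAGE_REVISION}"
echo "${TAG}"
# docker pull "bitnami/spark:${TAG}"  # uncomment to pull (requires Docker and registry access)
```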
@@ -8,7 +8,7 @@ ARG JAVA_EXTRA_SECURITY_DIR="/bitnami/java/extra-security"
 ARG TARGETARCH
 
 LABEL org.opencontainers.image.base.name="docker.io/bitnami/minideb:bookworm" \
-      org.opencontainers.image.created="2026-02-19T04:59:03Z" \
+      org.opencontainers.image.created="2026-02-27T14:11:56Z" \
       org.opencontainers.image.description="Application packaged by Broadcom, Inc." \
       org.opencontainers.image.documentation="https://github.com/bitnami/containers/tree/main/bitnami/spark/README.md" \
       org.opencontainers.image.source="https://github.com/bitnami/containers/tree/main/bitnami/spark" \
@@ -31,9 +31,9 @@ RUN --mount=type=secret,id=downloads_url,env=SECRET_DOWNLOADS_URL \
     mkdir -p /tmp/bitnami/pkg/cache/ ; cd /tmp/bitnami/pkg/cache/ || exit 1 ; \
     COMPONENTS=( \
       "nss-wrapper-1.1.16-0-linux-${OS_ARCH}-debian-12" \
-      "python-3.12.12-16-linux-${OS_ARCH}-debian-12" \
+      "python-3.12.12-23-linux-${OS_ARCH}-debian-12" \
       "jre-17.0.18-10-1-linux-${OS_ARCH}-debian-12" \
-      "spark-4.1.1-1-linux-${OS_ARCH}-debian-12" \
+      "spark-4.1.1-2-linux-${OS_ARCH}-debian-12" \
     ) ; \
     for COMPONENT in "${COMPONENTS[@]}"; do \
       if [ ! -f "${COMPONENT}.tar.gz" ]; then \
@@ -56,7 +56,7 @@ RUN /opt/bitnami/scripts/spark/postunpack.sh
 RUN /opt/bitnami/scripts/java/postunpack.sh
 ENV APP_VERSION="4.1.1" \
     BITNAMI_APP_NAME="spark" \
-    IMAGE_REVISION="5" \
+    IMAGE_REVISION="6" \
     JAVA_HOME="/opt/bitnami/java" \
     LD_LIBRARY_PATH="/opt/bitnami/python/lib:/opt/bitnami/spark/venv/lib/python3.12/site-packages/numpy.libs:$LD_LIBRARY_PATH" \
     LIBNSS_WRAPPER_PATH="/opt/bitnami/common/lib/libnss_wrapper.so" \
@@ -1,15 +1,13 @@
 # Bitnami Secure Image for Apache Spark
 
 ## What is Apache Spark?
 
 > Apache Spark is a high-performance engine for large-scale computing tasks, such as data processing, machine learning and real-time data streaming. It includes APIs for Java, Python, Scala and R.
 
 [Overview of Apache Spark](https://spark.apache.org/)
 Trademarks: This software listing is packaged by Bitnami. The respective trademarks mentioned in the offering are owned by the respective companies, and use of them does not imply any affiliation or endorsement.
 
-## TL;DR
+## <a id="tl-dr"></a> TL;DR
 
-### Docker Compose
+### <a id="docker-compose"></a> Docker Compose
 
 ```console
 docker run --name spark bitnami/spark:latest
@@ -17,7 +15,7 @@ docker run --name spark bitnami/spark:latest
 
 You can find the available configuration options in the [Environment Variables](#environment-variables) section.
 
-## Why use Bitnami Secure Images?
+## <a id="why-use-bitnami-secure-images"></a> Why use Bitnami Secure Images?
 
 Those are hardened, minimal CVE images built and maintained by Bitnami. Bitnami Secure Images are based on the cloud-optimized, security-hardened enterprise [OS Photon Linux](https://vmware.github.io/photon/). Why choose BSI images?
 
@@ -34,19 +32,19 @@ Each image comes with valuable security metadata. You can view the metadata in [
 
 If you are looking for our previous generation of images based on Debian Linux, please see the [Bitnami Legacy registry](https://hub.docker.com/u/bitnamilegacy).
 
-## How to deploy Apache Spark in Kubernetes?
+## <a id="deploy-in-kubernetes"></a> How to deploy Apache Spark in Kubernetes
 
 Deploying Bitnami applications as Helm Charts is the easiest way to get started with our applications on Kubernetes. Read more about the installation in the [Bitnami Apache Spark Chart GitHub repository](https://github.com/bitnami/charts/tree/master/bitnami/spark).
 
-## Why use a non-root container?
+## <a id="why-non-root"></a> Why use a non-root container?
 
 Non-root container images add an extra layer of security and are generally recommended for production environments. However, because they run as a non-root user, privileged tasks are typically off-limits. Learn more about non-root containers [in our docs](https://techdocs.broadcom.com/us/en/vmware-tanzu/application-catalog/tanzu-application-catalog/services/tac-doc/apps-tutorials-work-with-non-root-containers-index.html).
 
-## Supported tags and respective `Dockerfile` links
+## <a id="supported-tags"></a> Supported tags and respective `Dockerfile` links
 
 Learn more about the Bitnami tagging policy and the difference between rolling tags and immutable tags [in our documentation page](https://techdocs.broadcom.com/us/en/vmware-tanzu/application-catalog/tanzu-application-catalog/services/tac-doc/apps-tutorials-understand-rolling-tags-containers-index.html).
 
-## Get this image
+## <a id="get-this-image"></a> Get this image
 
 The recommended way to get the Bitnami Apache Spark Docker Image is to pull the prebuilt image from the [Docker Hub Registry](https://hub.docker.com/r/bitnami/spark).
 
@@ -70,9 +68,17 @@ cd bitnami/APP/VERSION/OPERATING-SYSTEM
 docker build -t bitnami/APP:latest .
 ```
 
-## Configuration
+## <a id="using-docker-compose"></a> Using `docker-compose.yaml`
 
-### Environment variables
+Please be aware this file has not undergone internal testing. Consequently, we advise its use exclusively for development or testing purposes. For production-ready deployments, we highly recommend utilizing its associated [Bitnami Helm chart](https://github.com/bitnami/charts/tree/main/bitnami/spark).
+
+## <a id="configuration"></a> Configuration
+
+The following sections describe environment variables and related settings.
+
+### <a id="environment-variables"></a> Environment variables
+
+The following tables list the main variables you can set.
 
 #### Customizable environment variables
 
@@ -117,30 +123,11 @@ Additionally, more environment variables natively supported by Apache Spark can
 
 For example, you could still use `SPARK_WORKER_CORES` or `SPARK_WORKER_MEMORY` to configure the number of cores and the amount of memory to be used by a worker machine.
 
-When you start the spark image, you can adjust the configuration of the instance by passing one or more environment variables either on the docker-compose file or on the `docker run` command line. If you want to add a new environment variable:
+When you start the spark image, you can adjust the configuration of the instance by passing one or more environment variables either on the docker-compose file or on the `docker run` command line.
 
-- For docker-compose add the variable name and value under the application section in the [`docker-compose.yml`](https://github.com/bitnami/containers/blob/main/bitnami/spark/docker-compose.yml) file present in this repository:
-
-    ```yaml
-    spark:
-      ...
-      environment:
-        - SPARK_MODE=master
-      ...
-    ```
-
-- For manual execution add a -e option with each variable and value:
-
-    ```console
-    docker run -d --name spark \
-      --network=spark_network \
-      -e SPARK_MODE=master \
-      bitnami/spark
-    ```
-
-### Security
+### <a id="security"></a> Security
 
-The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC encryption and local storage encryption easily using the following env vars in all the nodes of the cluster.
+The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC encryption and local storage encryption easily using the following environment variables in all the nodes of the cluster.
 
 ```diff
 + SPARK_RPC_AUTHENTICATION_ENABLED=yes
@@ -154,7 +141,7 @@ The Bitnami Apache Spark docker image supports enabling RPC authentication, RPC
 
 Additionally, SSL configuration can be easily activated following the next steps:
 
-1. Enable SSL configuration by setting the following env vars:
+1. Enable SSL configuration by setting the following environment variables:
 
     ```diff
     + SPARK_SSL_ENABLED=yes
@@ -167,11 +154,11 @@ Additionally, SSL configuration can be easily activated following the next steps
 
 > Please note that `KEY_PASSWORD`, `KEYSTORE_PASSWORD`, and `TRUSTSTORE_PASSWORD` are placeholders that needs to be updated with a correct value.
 
-2. You need to mount your spark keystore and truststore files to `/opt/bitnami/spark/conf/certs`. Please note they should be called `spark-keystore.jks` and `spark-truststore.jks` and they should be in JKS format.
+2. You need to mount your spark `keystore` and `truststore` files to `/opt/bitnami/spark/conf/certs`. Please note they should be called `spark-keystore.jks` and `spark-truststore.jks` and they should be in JKS format.
 
-### Setting up an Apache Spark Cluster
+### <a id="setting-up-spark-cluster"></a> Setting up an Apache Spark cluster
 
-A Apache Spark cluster can easily be setup with the default docker-compose.yml file from the root of this repo. The docker-compose includes two different services, `spark-master` and `spark-worker.`
+A Apache Spark cluster can easily be set up with the default docker-compose.yml file from the root of this repository. The docker-compose includes two different services, `spark-master` and `spark-worker.`
 
 By default, when you deploy the docker-compose file you will get an Apache Spark cluster with 1 master and 1 worker.
 
@@ -181,28 +168,11 @@ If you want N workers, all you need to do is start the docker-compose deployment
 docker-compose up --scale spark-worker=3
 ```
 
-### Mount a custom configuration file
+### <a id="mount-custom-configuration"></a> Mount a custom configuration file
 
 The image looks for configuration in the `conf/` directory of `/opt/bitnami/spark`.
 
-#### Using docker-compose
-
-```yaml
-...
-volumes:
-  - /path/to/spark-defaults.conf:/opt/bitnami/spark/conf/spark-defaults.conf
-...
-```
-
-#### Using the command line
-
-```console
-docker run --name spark -v /path/to/spark-defaults.conf:/opt/bitnami/spark/conf/spark-defaults.conf bitnami/spark:latest
-```
-
-After that, your changes will be taken into account in the server's behaviour.
-
-### Installing additional jars
+### <a id="installing-additional-jars"></a> Installing additional jars
 
 By default, this container bundles a generic set of jar files but the default image can be extended to add as many jars as needed for your specific use case. For instance, the following Dockerfile adds [`aws-java-sdk-bundle-1.11.704.jar`](https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-bundle/1.11.704):
 
@@ -218,7 +188,7 @@ RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.7
 
 In a similar way that in the previous section, you may want to use a different version of Hadoop jars.
 
-Go to <https://spark.apache.org/downloads.html> and copy the download url bundling the Hadoop version you want and matching the Apache Spark version of the container. Extend the Bitnami container image as below:
+Go to <https://spark.apache.org/downloads.html> and copy the download URL bundling the Hadoop version you want and matching the Apache Spark version of the container. Extend the Bitnami container image as below:
 
 ```Dockerfile
 FROM bitnami/spark:latest
@@ -238,13 +208,13 @@ $ pyspark
 '2.7.4'
 ```
 
-### FIPS configuration in Bitnami Secure Images
+### <a id="fips-configuration"></a> FIPS configuration in Bitnami Secure Images
 
 The Bitnami Apache Spark Docker image from the [Bitnami Secure Images](https://go-vmware.broadcom.com/contact-us) catalog includes extra features and settings to configure the container with FIPS capabilities. You can configure the next environment variables:
 
 - `OPENSSL_FIPS`: whether OpenSSL runs in FIPS mode or not. `yes` (default), `no`.
 
-## Logging
+## <a id="logging"></a> Logging
 
 The Bitnami Apache Spark Docker image sends the container logs to the `stdout`. To view the logs:
 
@@ -260,43 +230,45 @@ docker-compose logs spark
 
 You can configure the containers [logging driver](https://docs.docker.com/engine/admin/logging/overview/) using the `--log-driver` option if you wish to consume the container logs differently. In the default configuration docker uses the `json-file` driver.
 
-## Maintenance
+## <a id="maintenance"></a> Maintenance
 
-### Backing up your container
+The following sections describe how to back up, restore, and upgrade the image.
+
+### <a id="backing-up-container"></a> Backing up your container
 
 To backup your data, configuration and logs, follow these simple steps:
 
-#### Step 1: Stop the currently running container
+1. Stop the currently running container
 
-```console
-docker stop spark
-```
+    ```console
+    docker stop spark
+    ```
 
-or using Docker Compose:
+    or using Docker Compose:
 
-```console
-docker-compose stop spark
-```
+    ```console
+    docker-compose stop spark
+    ```
 
-#### Step 2: Run the backup command
+2. Run the backup command
 
-We need to mount two volumes in a container we will use to create the backup: a directory on your host to store the backup in, and the volumes from the container we just stopped so we can access the data.
+    We need to mount two volumes in a container we will use to create the backup: a directory on your host to store the backup in, and the volumes from the container we just stopped so we can access the data.
 
-```console
-docker run --rm -v /path/to/spark-backups:/backups --volumes-from spark busybox \
-  cp -a /bitnami/spark /backups/latest
-```
+    ```console
+    docker run --rm -v /path/to/spark-backups:/backups --volumes-from spark busybox \
+      cp -a /bitnami/spark /backups/latest
+    ```
 
-or using Docker Compose:
+    or using Docker Compose:
 
-```console
-docker run --rm -v /path/to/spark-backups:/backups --volumes-from `docker-compose ps -q spark` busybox \
-  cp -a /bitnami/spark /backups/latest
-```
+    ```console
+    docker run --rm -v /path/to/spark-backups:/backups --volumes-from `docker-compose ps -q spark` busybox \
+      cp -a /bitnami/spark /backups/latest
+    ```
 
-### Restoring a backup
+### <a id="restoring-backup"></a> Restoring your backup
 
-Restoring a backup is as simple as mounting the backup as volumes in the container.
+To restore your backup, mount the backup as volumes in the container.
 
 ```console
 docker run -v /path/to/spark-backups/latest:/bitnami/spark bitnami/spark:latest
@@ -313,52 +285,9 @@ services:
     ...
 ```
 
-### Upgrade this image
+## <a id="notable-changes"></a> Notable changes
 
-Bitnami provides up-to-date versions of spark, including security patches, soon after they are made upstream. We recommend that you follow these steps to upgrade your container.
-
-#### Step 1: Get the updated image
-
-```console
-docker pull bitnami/spark:latest
-```
-
-or if you're using Docker Compose, update the value of the image property to
-`bitnami/spark:latest`.
-
-#### Step 2: Stop and backup the currently running container
-
-Before continuing, you should backup your container's data, configuration and logs.
-
-Follow the steps on [creating a backup](#backing-up-your-container).
-
-#### Step 3: Remove the currently running container
-
-```console
-docker rm -v spark
-```
-
-or using Docker Compose:
-
-```console
-docker-compose rm -v spark
-```
-
-#### Step 4: Run the new image
-
-Re-create your container from the new image, [restoring your backup](#restoring-a-backup) if necessary.
-
-```console
-docker run --name spark bitnami/spark:latest
-```
-
-or using Docker Compose:
-
-```console
-docker-compose up spark
-```
-
-## Notable Changes
+The following subsections describe notable changes.
 
 ### 4.0.0-debian-10-r2
 
@@ -372,13 +301,7 @@ docker-compose up spark
 
 - This image now has an aws-cli and two jars: `hadoop-aws` and `aws-java-sdk` for provide an easier way to use AWS.
 
-## Using `docker-compose.yaml`
-
-Please be aware this file has not undergone internal testing. Consequently, we advise its use exclusively for development or testing purposes. For production-ready deployments, we highly recommend utilizing its associated [Bitnami Helm chart](https://github.com/bitnami/charts/tree/main/bitnami/spark).
-
-If you detect any issue in the `docker-compose.yaml` file, feel free to report it or contribute with a fix by following our [Contributing Guidelines](https://github.com/bitnami/containers/blob/main/CONTRIBUTING.md).
-
-## License
+## <a id="license"></a> License
 
 Copyright © 2026 Broadcom. The term "Broadcom" refers to Broadcom Inc. and/or its subsidiaries.
 