[bitnami/spark] Update to /bitnami/spark/README.md "Installing additional Jars" example (#54903)

* Update README.md

Updating the README.md "Installing additional jars" example to work correctly.

Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>

* Update bitnami/spark/README.md

Co-authored-by: Felipe V.C. Serafim <43246350+fevisera@users.noreply.github.com>
Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>

* Update README.md

Updating example to add curl using install_packages instead of a multi-stage Docker build.

Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>

* Update README.md

Updating a second example in the documentation that also needed to install curl.

Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>

* Update README.md

Updating Hadoop jars example

Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>

---------

Signed-off-by: JMaki <7774872+JMaki3@users.noreply.github.com>
Co-authored-by: Felipe V.C. Serafim <43246350+fevisera@users.noreply.github.com>
JMaki
2024-01-26 03:46:12 -08:00
committed by GitHub
parent 4f769bc23a
commit 2e3623f3e8

@@ -202,6 +202,8 @@ By default, this container bundles a generic set of jar files but the default im
```Dockerfile
FROM bitnami/spark
USER root
RUN install_packages curl
USER 1001
RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.704/aws-java-sdk-bundle-1.11.704.jar --output /opt/bitnami/spark/jars/aws-java-sdk-bundle-1.11.704.jar
```
@@ -213,12 +215,13 @@ In a similar way that in the previous section, you may want to use a different v
Go to <https://spark.apache.org/downloads.html> and copy the download url bundling the Hadoop version you want and matching the Apache Spark version of the container. Extend the Bitnami container image as below:
```Dockerfile
-FROM bitnami/spark:3.0.0
+FROM bitnami/spark:3.5.0
USER root
RUN install_packages curl
USER 1001
RUN rm -r /opt/bitnami/spark/jars && \
-curl --location http://mirror.cc.columbia.edu/pub/software/apache/spark/spark-3.0.0/spark-3.0.0-bin-hadoop2.7.tgz | \
-tar --extract --gzip --strip=1 --directory /opt/bitnami/spark/ spark-3.0.0-bin-hadoop2.7/jars/
+curl --location https://dlcdn.apache.org/spark/spark-3.5.0/spark-3.5.0-bin-hadoop3.tgz | \
+tar --extract --gzip --strip=1 --directory /opt/bitnami/spark/ spark-3.5.0-bin-hadoop3/jars/
```
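Reviewer note: in the updated example the Spark version appears in three places (the `FROM` tag, the download URL, and the archive member passed to `tar`), and all three have to agree. The sketch below is one possible way to derive the last two from a single pair of values. The function names are hypothetical and not part of the README; the URL layout is assumed to match the `dlcdn.apache.org` path used in the example, which may only carry current releases.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the Bitnami README): derive the download
# URL and the jars/ archive member from one Spark version and Hadoop profile.

spark_dist_name() {
    # $1 = Spark version (e.g. 3.5.0), $2 = Hadoop profile (e.g. hadoop3)
    printf 'spark-%s-bin-%s' "$1" "$2"
}

spark_dist_url() {
    # Assumes the dlcdn.apache.org layout shown in the Dockerfile example.
    printf 'https://dlcdn.apache.org/spark/spark-%s/%s.tgz' \
        "$1" "$(spark_dist_name "$1" "$2")"
}

spark_jars_member() {
    # Path inside the tarball that the example extracts with --strip=1.
    printf '%s/jars/' "$(spark_dist_name "$1" "$2")"
}

# Usage: print both values for the versions used in the example.
spark_dist_url 3.5.0 hadoop3; echo
spark_jars_member 3.5.0 hadoop3; echo
```

Changing only the two arguments keeps the URL and the `tar` member consistent; the `FROM` tag still has to be updated by hand to the same Spark version.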
You can check the Hadoop version by running the following commands in the new container image: