The next thing to configure is the MinIO instance. Remember: since MinIO has an S3-compliant interface, it handles the translation between S3 and GCS. Think of a good pair of credentials (you don't need to remember them, just use some random generator) and enter them into MINIO_ACCESS_KEY and MINIO_SECRET_KEY, for example MINIO_ACCESS_KEY=anotherusername and MINIO_SECRET_KEY=2ndpasswordrandomlygenerateme (you can also set MINIO_BROWSER=off to disable the web UI). To generate a unique random key, you can use `openssl rand -hex 32`. To point the MinIO client at the instance, run `mc alias set <ALIAS> <MINIO-ENDPOINT> YOUR-ACCESS-KEY YOUR-SECRET-KEY`; the keys must be supplied by argument or standard input. The alias is simply a short name for your MinIO service, and the endpoint, access key, and secret key are the ones supplied by your MinIO service.
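If you'd rather not shell out to openssl, here is a minimal Go sketch (not part of the original setup; the key lengths are arbitrary) that generates a random access key and secret key with crypto/rand:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"log"
)

// randomHex returns a hex-encoded random string of n bytes,
// similar to `openssl rand -hex n`.
func randomHex(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		log.Fatal(err)
	}
	return hex.EncodeToString(b)
}

func main() {
	// Shorter output for the access key, longer for the secret key.
	fmt.Printf("MINIO_ACCESS_KEY=%s\n", randomHex(10))
	fmt.Printf("MINIO_SECRET_KEY=%s\n", randomHex(32))
}
```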
After the last adventure of getting the rack built and acquiring the machines, it was time to set up the software. Originally, I had planned to do this in a day or two, but in practice, it ran like so many other 'simple' projects and some things I had assumed would be 'super quick' ended up taking much longer than planned.
Software-wise, I ended up deciding on using K3s for the Kubernetes deployment, and Rook with Ceph for the persistent volumes. And while I don't travel nearly as much as I used to, I also set up tailscale for VPN access from the exciting distant location of my girlfriend's house (and in case we ended up having to leave due to air quality).
Building the base image for the Raspberry Pis
For the Raspberry Pis I decided to use the Ubuntu Raspberry Pi image as the base. The Raspberry Pis boot off of microsd cards, which allows us to pre-build system images rather than running through the install process on each instance. My desktop is an x86, but by following this guide, I was able to set up an emulation layer so I could cross-build the image for the ARM Raspberry Pis.
I pre-installed the base layer with Avahi (so the workers can find the leader), ZFS (to create a local storage layer to back our volumes), and the necessary container tools. This step ended up taking a while, but I made the most of it by re-using the same image on multiple workers. I also had this stage copy over some configuration files, which didn't depend on having emulation set up.
However, not everything is easily baked into an image. For example, at first boot, the leader node installs K3s and generates a certificate. Also, when each worker first boots, it connects to the leader and fetches the configuration required to join the cluster. Ubuntu has a mechanism for this (called cloud-init), but rather than figure out a new system I went with the old school self-disabling init-script to do the 'first boot' activities.
Setting up the Jetsons & my one x86 machine
Unlike the Raspberry Pis, the Jetson AGXs & x86 machines have internal storage that they boot from. While the Jetson Nano does boot from a microsd card, the images available are installer images that require user interaction to set up. Thankfully, since I wrote everything down in a shell script, it was fairly simple to install the same packages and do the same setup as on the Raspberry Pis.
By default, K3s uses containerd to execute its containers. I found another interesting blog post on using K3s on Jetsons, and the main changes that I needed for the setup were to switch from containerd to docker and to configure docker to use the 'nvidia' runtime as the default.
Getting the cluster to work
So, despite pre-baking the images, and having scripts to install 'everything,' I ended up running into a bunch of random problems along the way. These spanned everything from hardware to networking to my software setup.
The leader node started pretty much as close to perfect as possible, and one of the two worker Raspberry Pis came right up. The second worker Pi kept spitting out malformed packets on the switch – and I'm not really sure what's going on with that one – but the case did melt a little bit, which makes me think there might have been a hardware issue with that node. I did try replacing the network cable and putting it into a different port, but got the same results. When I replaced it with a different Pi everything worked just fine, so I'll debug the broken node when I've got some spare time.
I also had some difficulty with my Jetson Nano not booting. At first, I thought maybe the images I was baking were no good, but then I tried the stock image along with a system reset, and that didn't get me any further. Eventually I tried a new microsd card along with the stock image and shorting out pin 40 and it booted like a champ.
On the networking side, I have a fail-over configured for my home network. However, despite thinking I had my router configured to fail over only if the primary connection has an outage, and not to do any load-balancing otherwise, I kept getting random connection issues. Once I disabled the fail-over connections the networking issues disappeared. I'm not completely sure what's going on with this part, but for now, I can just manually do a failover if sonic goes out.
On the software side, Avahi worked fine on all of the ARM boards but for some reason doesn't seem to be working on the x86 node. The only difference that I could figure was that the x86 node has a static lease configured with the DHCP server, but I don't think that would cause this issue. While having local DNS between the worker nodes would be useful, this was getting near the end of the day, so I just added the leader to the x86 node's hosts file and called it a day. The software issues lead us nicely into the self-caused issues I had trying to get persistent volumes working.
Getting persistent volumes working
One of the concepts I'm interested in playing with is fault tolerance. One potential mechanism for this is using persistent volumes to store some kind of state and recovering from them. In this situation we want our volumes to remain working even if we take a node out of service, so we can't just depend on local volume path provisioning to test this out.
There are many different projects that could provide persistent volumes on Kubernetes. My first attempt was with GlusterFS; however, the Gluster Kubernetes project has been 'archived.' So after some headaches, I moved on to trying Rook and Ceph. Getting Rook and Ceph running together ended up being quite the learning adventure; both Kris and Duffy jumped on a video call with me to help figure out what was going on. After a lot of debugging, they noticed that it was an architecture issue – namely, many of the CSI containers were not yet cross-compiled for ARM. We did a lot of sleuthing and found unofficial multi-arch versions of these containers. Since then, the raspbernetes project has started cross-compiling the CSI containers, and I've switched to using those, as it's a bit simpler to keep track of.
Adding an object store
During my first run of Apache Spark on the new cluster, I was reminded of the usefulness of an object store. I'm used to working in an environment where I have an object store available. Thankfully MinIO is available to provide an S3-compatible object store on Kube. It can be backed by the persistent volumes I set up using Rook & Ceph. It can also use local storage, but I decided to use it as a first test of the persistent volumes. Once I had fixed the issues with Ceph, MinIO deployed relatively simply using a helm chart.
While MinIO does build docker containers for arm64 and amd64, it gives them separate tags. Since I've got a mix of x86 machines and ARM machines in the same cluster, I ended up using an unofficial multi-arch build. I did end up pinning it to the x86 machine for now, since I haven't had the time to recompile the kernels on the ARM machines to support rbd.
Getting kubectl working from my desktop
Once I had K3s set up, I wanted to be able to access it from my desktop without having to SSH to a node in the cluster. The K3s documentation says to copy /etc/rancher/k3s/k3s.yaml from the cluster to your local ~/.kube/config and replace the string localhost with the IP/DNS of the leader. Since I had multiple existing clusters, I copied the part under each top-level key to the corresponding key, changing the 'default' string to k3s as I copied so that I could remember the context better. The first time I did this I got the whitespace mixed up, which led to Error in configuration: context was not found for specified context: k3s – but after I fixed my YAML everything worked :)
Setting up a VPN solution
While shelter in place has made accessing my home network remotely less important, I do still occasionally get out of the house while staying within my social bubble. Some of my friends from University/Co-Op are now at a company called tailscale, which does magic with WireGuard to allow even double-natted networks to have VPNs. Since I was doing this part as an afterthought, I didn't have tailscale installed on all of the nodes, so I followed the instructions to enable subnets (note: I missed enabling the 'Enable subnet routes' option in the admin console the first time) and have my desktop act as a 'gateway' host for the K8s cluster when I'm 'traveling.' With tailscale set up, I was able to run kubectl from my laptop at Nova's place :)
Josh Patterson has a blog post on using tailscale with RAPIDS.
Conclusion & alternatives
The setup process was a bit more painful than I expected, but it was mostly due to my own choices. In retrospect, building images and flashing them was relatively slow with the emulation required on my old desktop. It would have been much easier to do a non-distributed volume deployment, like local volumes, but I want to set up PVs that I can experiment with for fault recovery. Nova pointed out that I could have set up sshfs or NFS and could have gotten PVs working with a lot less effort, but by the time we had that conversation the sunk cost fallacy had me believing just one more 'quick fix' was needed and then it would all magically work. Instead of K3s I could have used kubeadm, but that seemed relatively heavyweight. Instead of installing K3s 'manually', the k3sup project could have simplified some of this work. However, since I have a mix of different types of nodes, I wanted a bit more control.
Now that the cluster is set up, I'm going to test the cluster out some more with Apache Spark, the distributed computing program I'm most familiar with. Once we've made sure the basics are working with Spark, I'm planning on exploring how to get dask running. You can follow along with my adventures on my YouTube channel over here, or subscribe to the mailing list to keep up to date when I write a new post.
Blobs are a common abstraction for storing unstructured data on Cloud storage services and accessing them via HTTP. This guide shows how to work with blobs in the Go CDK.
The blob package supports operations like reading and writing blobs (using standard io package interfaces), deleting blobs, and listing blobs in a bucket.
Subpackages contain driver implementations of blob for various services, including Cloud and on-prem solutions. You can develop your application locally using fileblob, then deploy it to multiple Cloud providers with minimal initialization reconfiguration.
Opening a Bucket
The first step in interacting with unstructured storage is to instantiate a portable *blob.Bucket for your storage service.
The easiest way to do so is to use blob.OpenBucket and a service-specific URL pointing to the bucket, making sure you 'blank import' the driver package to link it in.
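For instance, a minimal sketch (the directory path is a placeholder) that opens a local fileblob bucket by URL; the blank import is what registers the file:// scheme:

```go
package main

import (
	"context"
	"log"

	"gocloud.dev/blob"
	_ "gocloud.dev/blob/fileblob" // blank import links in the file:// driver
)

func main() {
	ctx := context.Background()

	// Open a bucket backed by a local directory (placeholder path).
	bucket, err := blob.OpenBucket(ctx, "file:///path/to/dir")
	if err != nil {
		log.Fatal(err)
	}
	defer bucket.Close()
}
```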
See Concepts: URLs for general background and the guide below for URL usage for each supported service.
Alternatively, if you need fine-grained control over the connection settings, you can call the constructor function in the driver package directly.
You may find the wire package useful for managing your initialization code when switching between different backing services.
See the guide below for constructor usage for each supported service.
Prefixed Buckets
You can wrap a *blob.Bucket to always operate on a subfolder of the bucket using blob.PrefixedBucket:
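A minimal sketch, reusing a bucket opened as above (the prefix is a placeholder and should end with "/"):

```go
// Wrap an already-open *blob.Bucket so that all operations happen
// under the "a/subfolder/" prefix.
prefixed := blob.PrefixedBucket(bucket, "a/subfolder/")
defer prefixed.Close()

// Writes to "file.txt" actually go to "a/subfolder/file.txt".
if err := prefixed.WriteAll(ctx, "file.txt", []byte("hello"), nil); err != nil {
	log.Fatal(err)
}
```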
Alternatively, you can configure the prefix directly in the blob.OpenBucket URL:
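A sketch assuming an S3 bucket named my-bucket; the prefix query parameter is appended to the service URL:

```go
// The "prefix" query parameter makes every operation on the bucket
// happen under "a/subfolder/".
bucket, err := blob.OpenBucket(ctx, "s3://my-bucket?region=us-west-1&prefix=a/subfolder/")
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```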
Using a Bucket
Once you have opened a bucket for the storage provider you want, you can store and access data from it using the standard Go I/O patterns described below. Other operations like listing and reading metadata are documented in the blob package documentation.
Writing Data to a Bucket
To write data to a bucket, you create a writer, write data to it, and then close the writer. Closing the writer commits the write to the provider, flushing any buffers, and releases any resources used while writing, so you must always check the error of Close.
The writer implements io.Writer, so you can use any functions that take an io.Writer like io.Copy or fmt.Fprintln.
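A short sketch, reusing the ctx and bucket opened earlier and a placeholder key name, of creating a writer, writing to it, and checking the error from Close:

```go
// Create a *blob.Writer for the key "foo.txt".
w, err := bucket.NewWriter(ctx, "foo.txt", nil)
if err != nil {
	log.Fatal(err)
}
// The writer is an io.Writer, so fmt.Fprintln works.
_, writeErr := fmt.Fprintln(w, "Hello, World!")
// Always close the writer and check its error: Close is what commits the write.
closeErr := w.Close()
if writeErr != nil {
	log.Fatal(writeErr)
}
if closeErr != nil {
	log.Fatal(closeErr)
}
```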
In some cases, you may want to cancel an in-progress write to avoid the blob being created or overwritten. A typical reason for wanting to cancel a write is encountering an error in the stream your program is copying from. To abort a write, you cancel the Context you pass to the writer. Again, you must always Close the writer to release the resources, but in this case you can ignore the error because the write's failure is expected.
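A sketch of aborting a write by canceling the context (again reusing the open bucket; the key is a placeholder):

```go
// Create a cancelable child context for this write.
writeCtx, cancelWrite := context.WithCancel(ctx)
defer cancelWrite()

w, err := bucket.NewWriter(writeCtx, "foo.txt", nil)
if err != nil {
	log.Fatal(err)
}

// ... copy data into w; if the source stream fails, abort the write:
cancelWrite()

// Close must still be called to release resources, but its error can be
// ignored here because the write is expected to fail.
w.Close()
```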
Reading Data from a Bucket
Once you have written data to a bucket, you can read it back by creating a reader. The reader implements io.Reader, so you can use any functions that take an io.Reader like io.Copy or io/ioutil.ReadAll. You must always close a reader after using it to avoid leaking resources.
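A sketch of reading the blob back (same assumed bucket and key), copying its contents to stdout:

```go
// Open a *blob.Reader for the key written above.
r, err := bucket.NewReader(ctx, "foo.txt", nil)
if err != nil {
	log.Fatal(err)
}
// Readers must always be closed to avoid leaking resources.
defer r.Close()

// The reader is an io.Reader, so io.Copy works.
if _, err := io.Copy(os.Stdout, r); err != nil {
	log.Fatal(err)
}
```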
Many storage providers provide efficient random-access to data in buckets. To start reading from an arbitrary offset in the blob, use NewRangeReader.
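And a sketch of a range read; the offset of 1024 and length of 4096 are arbitrary placeholders (a length of -1 reads to the end):

```go
// NewRangeReader reads a byte range: offset 1024, length 4096.
r, err := bucket.NewRangeReader(ctx, "foo.txt", 1024, 4096, nil)
if err != nil {
	log.Fatal(err)
}
defer r.Close()

if _, err := io.Copy(os.Stdout, r); err != nil {
	log.Fatal(err)
}
```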
Deleting Blobs
You can delete blobs using the Bucket.Delete method.
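For example (same assumed bucket and key):

```go
// Delete the blob; Delete returns an error if the key does not exist.
if err := bucket.Delete(ctx, "foo.txt"); err != nil {
	log.Fatal(err)
}
```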
Other Usage Samples
Supported Storage Services
Google Cloud Storage
Google Cloud Storage (GCS) URLs in the Go CDK closely resemble the URLs you would see in the gsutil CLI.
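For example, assuming a bucket named my-bucket and the gcsblob driver blank-imported:

```go
// The gs:// scheme requires a blank import of gocloud.dev/blob/gcsblob.
bucket, err := blob.OpenBucket(ctx, "gs://my-bucket")
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```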
blob.OpenBucket will use Application Default Credentials; if you have authenticated via gcloud auth login, it will use those credentials. See Application Default Credentials to learn about authentication alternatives, including using environment variables.
Full details about acceptable URLs can be found under the API reference for gcsblob.URLOpener.
GCS Constructor
The gcsblob.OpenBucket constructor opens a GCS bucket. You must first create a *net/http.Client that sends requests authorized by Google Cloud Platform credentials. (You can reuse the same client for any other API that takes in a *gcp.HTTPClient.) You can find functions in the gocloud.dev/gcp package to set this up for you.
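A sketch of the constructor path, using helpers from gocloud.dev/gcp to build an authorized HTTP client (the bucket name is a placeholder):

```go
import (
	"context"

	"gocloud.dev/blob"
	"gocloud.dev/blob/gcsblob"
	"gocloud.dev/gcp"
)

func openGCSBucket(ctx context.Context) (*blob.Bucket, error) {
	// Find Application Default Credentials on the machine.
	creds, err := gcp.DefaultCredentials(ctx)
	if err != nil {
		return nil, err
	}
	// Build an authenticated *gcp.HTTPClient from the credentials.
	client, err := gcp.NewHTTPClient(
		gcp.DefaultTransport(),
		gcp.CredentialsTokenSource(creds))
	if err != nil {
		return nil, err
	}
	// Open the bucket ("my-bucket" is a placeholder).
	return gcsblob.OpenBucket(ctx, client, "my-bucket", nil)
}
```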
S3
S3 URLs in the Go CDK closely resemble the URLs you would see in the AWS CLI. You should specify the region query parameter to ensure your application connects to the correct region.
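For example, assuming a bucket named my-bucket in us-west-1, with the s3blob driver blank-imported:

```go
// The s3:// scheme requires a blank import of gocloud.dev/blob/s3blob.
bucket, err := blob.OpenBucket(ctx, "s3://my-bucket?region=us-west-1")
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```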
blob.OpenBucket will create a default AWS Session with the SharedConfigEnable option enabled; if you have authenticated with the AWS CLI, it will use those credentials. See AWS Session to learn about authentication alternatives, including using environment variables.
Full details about acceptable URLs can be found under the API reference for s3blob.URLOpener.
S3 Constructor
The s3blob.OpenBucket constructor opens an S3 bucket. You must first create an AWS session with the same region as your bucket:
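A sketch using the aws-sdk-go v1 session API (the region and bucket name are placeholders):

```go
import (
	"context"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"gocloud.dev/blob"
	"gocloud.dev/blob/s3blob"
)

func openS3Bucket(ctx context.Context) (*blob.Bucket, error) {
	// Create a session in the same region as the bucket.
	sess, err := session.NewSession(&aws.Config{
		Region: aws.String("us-west-1"),
	})
	if err != nil {
		return nil, err
	}
	// Open the bucket ("my-bucket" is a placeholder).
	return s3blob.OpenBucket(ctx, sess, "my-bucket", nil)
}
```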
S3-Compatible Servers
The Go CDK can also interact with S3-compatible storage servers that recognize the same REST HTTP endpoints as S3, like Minio, Ceph, or SeaweedFS. You can change the endpoint by changing the Endpoint field on the *aws.Config you pass to s3blob.OpenBucket. If you are using blob.OpenBucket, you can switch endpoints by adding query parameters to the S3 URL like so:
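For example, pointing at a local MinIO endpoint (the host, port, and bucket name are placeholders):

```go
// endpoint, disableSSL, and s3ForcePathStyle are s3blob URL query options.
bucket, err := blob.OpenBucket(ctx,
	"s3://my-bucket?"+
		"endpoint=minio.local:9000&"+
		"disableSSL=true&"+
		"s3ForcePathStyle=true")
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```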
See aws.ConfigFromURLParams for more details on supported URL options for S3.
Azure Blob Storage
Azure Blob Storage URLs in the Go CDK allow you to identify Azure Blob Storage containers when opening a bucket with blob.OpenBucket. Go CDK uses the environment variables AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_KEY, and AZURE_STORAGE_SAS_TOKEN to configure the credentials. AZURE_STORAGE_ACCOUNT is required, along with one of the other two.
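For example, assuming the environment variables above are set and the azureblob driver is blank-imported (the container name is a placeholder):

```go
// The azblob:// scheme requires a blank import of gocloud.dev/blob/azureblob.
bucket, err := blob.OpenBucket(ctx, "azblob://my-container")
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```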
Full details about acceptable URLs can be found under the API reference for azureblob.URLOpener.
Azure Blob Constructor
The azureblob.OpenBucket constructor opens an Azure Blob Storage container. azureblob operates on Azure Storage Block Blobs. You must first create Azure Storage credentials and then create an Azure Storage pipeline before you can open a container.
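A sketch assuming the pipeline-based azureblob API; the account name, account key, and container name are placeholders:

```go
import (
	"context"

	"github.com/Azure/azure-storage-blob-go/azblob"
	"gocloud.dev/blob"
	"gocloud.dev/blob/azureblob"
)

func openAzureContainer(ctx context.Context) (*blob.Bucket, error) {
	// Placeholder credentials; in practice these come from configuration.
	accountName := azureblob.AccountName("mystorageaccount")
	accountKey := azureblob.AccountKey("base64-encoded-key")

	// Build a credential and a pipeline for the Azure Storage SDK.
	credential, err := azureblob.NewCredential(accountName, accountKey)
	if err != nil {
		return nil, err
	}
	pipeline := azureblob.NewPipeline(credential, azblob.PipelineOptions{})

	// Open the Block Blob container ("my-container" is a placeholder).
	return azureblob.OpenBucket(ctx, pipeline, accountName, "my-container", nil)
}
```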
Local Storage
The Go CDK provides blob drivers for storing data in memory and on the local filesystem. These are primarily intended for testing and local development, but may be useful in production scenarios where an NFS mount is used.
Local storage URLs take the form of either mem:// or file:/// URLs. Memory URLs are always mem:// with no other information and always create a new bucket. File URLs convert slashes to the operating system's native file separator, so on Windows, C:\foo\bar would be written as file:///C:/foo/bar.
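For example (the file path is a placeholder):

```go
// Requires blank imports of gocloud.dev/blob/memblob and gocloud.dev/blob/fileblob.

// An in-memory bucket; every mem:// URL creates a new, empty bucket.
memBucket, err := blob.OpenBucket(ctx, "mem://")
if err != nil {
	log.Fatal(err)
}
defer memBucket.Close()

// A bucket backed by a local directory (placeholder path).
dirBucket, err := blob.OpenBucket(ctx, "file:///path/to/dir")
if err != nil {
	log.Fatal(err)
}
defer dirBucket.Close()
```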
Local Storage Constructors
You can create an in-memory bucket with memblob.OpenBucket:
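For example:

```go
// import "gocloud.dev/blob/memblob"

// memblob.OpenBucket never returns an error; pass nil for default options.
bucket := memblob.OpenBucket(nil)
defer bucket.Close()
```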
You can use a local filesystem directory with fileblob.OpenBucket:
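For example (the directory is a placeholder and must already exist):

```go
// import "gocloud.dev/blob/fileblob"

// Open a bucket rooted at a local directory.
bucket, err := fileblob.OpenBucket("/path/to/dir", nil)
if err != nil {
	log.Fatal(err)
}
defer bucket.Close()
```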