Deploying Weights and Biases on GCP and Istio

aesher9o1
Dec 7, 2022

A lot goes into training a model: cleaning your data, versioning it, splitting it into training and validation sets, and then the painstaking process of training the model and sharing the findings with your team.

If you work in an NLP-focused company like mine, which relies on ML models and researcher collaboration to share data and models, the entire lifecycle management process can quickly become a disaster.

Until we arrived at Weights & Biases (W&B).

Marketing jargon aside, let me quickly walk you through how to deploy it on your private cloud.

Infra required:
1. Redis
2. Pub/Sub
3. Cloud Storage
4. MySQL
5. Kubernetes cluster

Notice: We will not focus on securing these resources here. You should deploy them in line with your company's security and compliance requirements.

Deploying Redis

Head over to Memorystore in the GCP console and create a Redis instance with AUTH enabled.
Then open the security section of the instance, where you can see the AUTH string for your Redis instance.

REDIS: redis://:<REDIS_PASSWORD>@<REDIS_IP>:<REDIS_PORT>/0

Save this string to be used later.
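If the AUTH string ever contains characters that are reserved in URLs (such as "@" or "/"), pasting it into the connection string by hand can produce a URL that parses incorrectly. Here is a minimal Python sketch that assembles the string in the format above. The function name build_redis_url and the sample values are hypothetical, and whether the W&B server decodes a percent-encoded password is an assumption worth checking against your own deployment.

```python
from urllib.parse import quote


def build_redis_url(password: str, host: str, port: int = 6379, db: int = 0) -> str:
    """Assemble redis://:<REDIS_PASSWORD>@<REDIS_IP>:<REDIS_PORT>/<db>."""
    # Percent-encode the password so characters like "@" or "/" cannot
    # be mistaken for URL delimiters.
    encoded_password = quote(password, safe="")
    return f"redis://:{encoded_password}@{host}:{port}/{db}"


if __name__ == "__main__":
    # Sample values only; substitute your instance's AUTH string and IP.
    print(build_redis_url("example-auth-string", "10.0.0.3"))
```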

Creating a service account with required permissions

Go to Service Accounts in the GCP console and create a service account with the permissions needed to use Pub/Sub and Cloud Logging.

Creating a Pub/Sub Queue and cloud bucket

Create a Pub/Sub topic with a subscription, and a Cloud Storage bucket for W&B to access.

Give the service account you created admin rights to your bucket.

We then need to set up a notification on the bucket so that changes to it are published to the Pub/Sub topic, which is how W&B learns about them. You can do that by running the following command:

gcloud storage buckets notifications create gs://BUCKET_NAME --topic=TOPIC_NAME
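The ConfigMap later in this post refers to this queue through a BUCKET_QUEUE value of the form pubsub:/<GOOGLE_PROJECT_NAME>/<TOPIC_NAME>/<SUBSCRIPTION_NAME>. A small Python sketch like the one below can assemble that value and catch an empty or malformed part early. The function name build_bucket_queue and the sample names are hypothetical, and the validation rules are my own assumption rather than anything W&B enforces.

```python
def build_bucket_queue(project: str, topic: str, subscription: str) -> str:
    """Assemble pubsub:/<GOOGLE_PROJECT_NAME>/<TOPIC_NAME>/<SUBSCRIPTION_NAME>."""
    parts = {"project": project, "topic": topic, "subscription": subscription}
    for label, value in parts.items():
        # An empty part or a stray slash would shift the path segments.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"pubsub:/{project}/{topic}/{subscription}"


if __name__ == "__main__":
    # Sample names only; use your own project, topic, and subscription.
    print(build_bucket_queue("my-project", "wandb-topic", "wandb-subscription"))
```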

Creating your SQL server

Go ahead and create a MySQL instance in Cloud SQL for W&B to use. Make sure to change the network settings so that the pods in your K8s cluster can communicate with the SQL server.

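You should then have a user with a password to access this SQL instance. The connection string takes the following form, where <MYSQL_DATABASE> stands for the address of the instance and <DATABASE_NAME> for the database W&B should use:

MYSQL: mysql://<MYSQL_USER>:<MYSQL_PASSWORD>@<MYSQL_DATABASE>/<DATABASE_NAME>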
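As with Redis, a password containing URL-reserved characters can break this string. The following Python sketch builds the URL and then parses it back as a sanity check. The helper names build_mysql_url and check_mysql_url and the sample values are hypothetical, and it is an assumption that the W&B server accepts percent-encoded credentials.

```python
from urllib.parse import quote, urlsplit


def build_mysql_url(user: str, password: str, host: str, database: str) -> str:
    """Assemble mysql://<MYSQL_USER>:<MYSQL_PASSWORD>@<host>/<DATABASE_NAME>."""
    # Percent-encode credentials so "@", ":" or "/" inside them do not
    # change how the URL is split.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"mysql://{encoded_user}:{encoded_password}@{host}/{database}"


def check_mysql_url(url: str) -> None:
    """Raise ValueError if the URL lacks a scheme, user, host, or database."""
    parts = urlsplit(url)
    if parts.scheme != "mysql":
        raise ValueError("scheme must be mysql")
    if not parts.username or not parts.hostname:
        raise ValueError("user and host are required")
    if not parts.path.strip("/"):
        raise ValueError("database name is required")


if __name__ == "__main__":
    # Sample values only; substitute your own user, password, and address.
    url = build_mysql_url("wandb", "example-password", "10.1.2.3:3306", "wandb_local")
    check_mysql_url(url)
    print(url)
```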

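Deploying your Kubernetes service

The manifest below defines a ConfigMap, a Deployment, a Service, and an Istio Gateway and VirtualService. The Deployment mounts a Kubernetes secret named service-account-credentials, which must contain the key file of the service account as key.json.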

apiVersion: v1
kind: ConfigMap
metadata:
  name: wandb-config
data:
  LICENSE: <W&B_ENTERPRISE_LICENSE>
  MYSQL: mysql://<MYSQL_USER>:<MYSQL_PASSWORD>@<MYSQL_DATABASE>/<DATABASE_NAME>
  BUCKET: gs://<BUCKET_NAME>
  HOST: "https://wandb.quillbot.dev"
  REDIS: redis://:<REDIS_PASSWORD>@<REDIS_IP>:<REDIS_PORT>/0
  BUCKET_QUEUE: pubsub:/<GOOGLE_PROJECT_NAME>/<TOPIC_NAME>/<SUBSCRIPTION_NAME>
  LOGGING_ENABLED: "true"
  SLACK_CLIENT_ID: "858162898657.1612441232960"
  SLACK_SECRET: <SLACK_SECRET>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wandb
spec:
  selector:
    matchLabels:
      app: wandb
  template:
    metadata:
      labels:
        app: wandb
    spec:
      volumes:
        - name: google-cloud-key
          secret:
            secretName: service-account-credentials
      containers:
        - name: wandb
          volumeMounts:
            - mountPath: /var/secrets/google
              name: google-cloud-key
          image: wandb/local
          envFrom:
            - configMapRef:
                name: wandb-config
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/secrets/google/key.json
          resources:
            limits:
              memory: "1G"
              cpu: "500m"
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: wandb
spec:
  type: ClusterIP
  selector:
    app: wandb
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: dev-gateway
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: tls-certs
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: wandb-ingress
spec:
  hosts:
    - "*"
  gateways:
    - dev-gateway
  http:
    - route:
        - destination:
            host: wandb
            port:
              number: 80
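
Before applying the manifest, it can help to confirm that every <...> placeholder has been replaced with a real value. The Python sketch below scans manifest text for leftover placeholders. The function name find_unfilled_placeholders and the pattern it uses are my own assumptions, based on how placeholders are written in this post (uppercase letters, digits, underscores, and "&" between angle brackets). It needs Python 3.9 or newer.

```python
import re

# Matches placeholders written like <BUCKET_NAME> or <W&B_ENTERPRISE_LICENSE>.
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9_&]*>")


def find_unfilled_placeholders(manifest_text: str) -> list[str]:
    """Return each distinct placeholder still present, in order of first appearance."""
    seen: list[str] = []
    for match in PLACEHOLDER.findall(manifest_text):
        if match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    # Sample text only; in practice, read your manifest file into a string.
    sample = "BUCKET: gs://<BUCKET_NAME>\nLOGGING_ENABLED: \"true\"\n"
    print(find_unfilled_placeholders(sample))
```

An empty list means no placeholders of that shape remain. It does not validate the values themselves.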
