Deploying Weights and Biases on GCP and Istio

Dec 7, 2022

A lot goes into training a model: cleaning your data, versioning it, splitting your data for training and validation, and then the painstaking process of training your model and sharing the findings with your team members.

If you work at an NLP-focused company like mine, where researchers rely on each other to share data and models, the entire lifecycle management process can quickly become a disaster.

Until we arrived at Weights and Biases (W&B).

Marketing jargon aside, let me quickly walk you through how to deploy it on your private cloud.

Infra required:
1. Redis
2. Pub/Sub
3. Cloud Storage
4. MySQL
5. Kubernetes cluster

Note: We will not focus on securing these resources here. Make sure to deploy them in line with your company's security and compliance requirements.

Deploying Redis

Head over to Memorystore, where you should be able to create a Redis instance. Enable AUTH for the instance.
Then open the security section of the instance, where you should see the AUTH string for your Redis instance. It serves as <REDIS_PASSWORD> in the connection string:

REDIS: redis://:<REDIS_PASSWORD>@<REDIS_IP>:<REDIS_PORT>/0

Save this string to be used later.
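
If a password ever contains characters such as @ or /, the connection string breaks unless they are percent-encoded. Here is a minimal Python sketch of how the string could be assembled. The helper name build_redis_url and the sample values are made up for illustration, and it assumes the consumer decodes percent-encoded passwords the way standard URL parsers do.

```python
from urllib.parse import quote


def build_redis_url(password: str, host: str, port: int = 6379, db: int = 0) -> str:
    """Return a redis:// URL with an empty username and an encoded password."""
    # safe="" makes quote() encode "/" as well, which would otherwise
    # be read as the start of the path.
    encoded = quote(password, safe="")
    return f"redis://:{encoded}@{host}:{port}/{db}"


# Hypothetical values for illustration only.
redis_url = build_redis_url("0f2a6c1e-example", "10.0.0.3")
print(redis_url)
```

The trailing /0 selects Redis database 0, matching the string above.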

Creating a service account with required permissions

Go to Service Accounts and create a service account with the permissions needed to use Pub/Sub and Cloud Logging.

Creating a Pub/Sub topic and Cloud Storage bucket

Create a Pub/Sub topic, a subscription on that topic, and a Cloud Storage bucket for W&B to access.

Give the service account you created admin rights to your bucket.

We then need to set up a notification on the bucket so that W&B learns about changes to it through Pub/Sub. You can do that by running the following command:

gcloud storage buckets notifications create gs://BUCKET_NAME --topic=TOPIC_NAME
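
The bucket and Pub/Sub names end up in two settings later in this post, BUCKET and BUCKET_QUEUE. As a rough sketch of that mapping, the snippet below builds both values from the resource names. The function name bucket_settings and the sample names are hypothetical, and the formats are simply the ones used in the ConfigMap in this post.

```python
def bucket_settings(project: str, bucket: str, topic: str, subscription: str) -> dict:
    """Map GCP resource names to the BUCKET and BUCKET_QUEUE values."""
    names = (
        ("project", project),
        ("bucket", bucket),
        ("topic", topic),
        ("subscription", subscription),
    )
    for label, name in names:
        # A slash inside a name would shift every later path segment.
        if not name or "/" in name:
            raise ValueError(f"invalid {label} name: {name!r}")
    return {
        "BUCKET": f"gs://{bucket}",
        "BUCKET_QUEUE": f"pubsub:/{project}/{topic}/{subscription}",
    }


# Hypothetical resource names for illustration only.
settings = bucket_settings("my-project", "wandb-files", "wandb-topic", "wandb-sub")
print(settings["BUCKET"])
print(settings["BUCKET_QUEUE"])
```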

Creating your SQL server

Go ahead and create a MySQL instance for W&B to use. Make sure to change the network settings so that the pods in your K8s cluster can communicate with the SQL server.

You should then have a user with a password to access this SQL instance.

MYSQL: mysql://<MYSQL_USER>:<MYSQL_PASSWORD>@<MYSQL_HOST>/<DATABASE_NAME>
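
Before pasting the string into your config, it can help to check that every part is where you expect it. Below is a small sketch using only Python's standard library. The function name check_mysql_url and the sample values are hypothetical, and the fallback port 3306 is MySQL's standard default.

```python
from urllib.parse import unquote, urlsplit


def check_mysql_url(url: str) -> dict:
    """Split a mysql:// connection string and fail loudly if a part is missing."""
    parts = urlsplit(url)
    if parts.scheme != "mysql":
        raise ValueError(f"expected the mysql:// scheme, got {parts.scheme!r}")
    if not parts.username or not parts.password:
        raise ValueError("user and password are both required")
    if not parts.hostname:
        raise ValueError("host is missing")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("database name is missing")
    return {
        "user": unquote(parts.username),
        "host": parts.hostname,
        "port": parts.port or 3306,  # fall back to MySQL's default port
        "database": database,
    }


# Hypothetical values for illustration only.
info = check_mysql_url("mysql://wandb:s3cret@10.1.2.3/wandb_local")
print(info)
```

The password is deliberately left out of the returned dictionary so that it does not end up in logs.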

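Deploying your Kubernetes service

The manifest below expects two Secrets to already exist: service-account-credentials, holding the service account key as key.json, and tls-certs, holding the TLS certificate for the Istio gateway.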

apiVersion: v1
kind: ConfigMap
metadata:
  name: wandb-config
data:
  LICENSE: <W&B_ENTERPRISE_LICENSE>
  MYSQL: mysql://<MYSQL_USER>:<MYSQL_PASSWORD>@<MYSQL_HOST>/<DATABASE_NAME>
  BUCKET: gs://<BUCKET_NAME>
  HOST: "https://wandb.quillbot.dev"
  REDIS: redis://:<REDIS_PASSWORD>@<REDIS_IP>:<REDIS_PORT>/0
  BUCKET_QUEUE: pubsub:/<GOOGLE_PROJECT_NAME>/<TOPIC_NAME>/<SUBSCRIPTION_NAME>
  LOGGING_ENABLED: "true"
  SLACK_CLIENT_ID: "<SLACK_CLIENT_ID>"
  SLACK_SECRET: "<SLACK_SECRET>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wandb
spec:
  selector:
    matchLabels:
      app: wandb
  template:
    metadata:
      labels:
        app: wandb
    spec:
      volumes:
        - name: google-cloud-key
          secret:
            secretName: service-account-credentials
      containers:
        - name: wandb
          volumeMounts:
            - mountPath: /var/secrets/google
              name: google-cloud-key
          image: wandb/local
          envFrom:
            - configMapRef:
                name: wandb-config
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/secrets/google/key.json
          resources:
            limits:
              memory: "1G"
              cpu: "500m"
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: wandb
spec:
  type: ClusterIP
  selector:
    app: wandb
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: dev-gateway
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: tls-certs
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: wandb-ingress
spec:
  hosts:
    - "*"
  gateways:
    - dev-gateway
  http:
    - route:
        - destination:
            host: wandb
            port:
              number: 80
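
Before you apply the manifest, a quick sanity check on the ConfigMap values can save you a failed rollout. The sketch below flags missing keys and leftover <PLACEHOLDER> markers. The function name find_config_problems is made up, and treating these six keys as required is an assumption of this sketch, not something W&B documents here.

```python
import re

# Keys this sketch treats as required (an assumption; adjust to your setup).
REQUIRED_KEYS = ("LICENSE", "MYSQL", "BUCKET", "HOST", "REDIS", "BUCKET_QUEUE")
# Matches leftover template markers such as <REDIS_PASSWORD>.
PLACEHOLDER = re.compile(r"<[A-Z0-9_&]+>")


def find_config_problems(data: dict) -> list:
    """Return a list of human-readable problems found in ConfigMap data."""
    problems = []
    for key in REQUIRED_KEYS:
        if not data.get(key):
            problems.append(f"{key}: missing or empty")
    for key, value in data.items():
        if PLACEHOLDER.search(str(value)):
            problems.append(f"{key}: still contains a placeholder")
    return problems


# Hypothetical, partly filled-in config for illustration only.
sample = {
    "LICENSE": "example-license",
    "MYSQL": "mysql://wandb:s3cret@10.1.2.3/wandb_local",
    "BUCKET": "gs://wandb-files",
    "HOST": "https://wandb.example.com",
    "REDIS": "redis://:<REDIS_PASSWORD>@10.0.0.3:6379/0",
}
for problem in find_config_problems(sample):
    print(problem)
```

An empty list means every required key is set and no template marker was left behind.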