A lot goes into training a model: cleaning your data, versioning it, splitting your data for training and validation, and then the painstaking process of training your model and sharing the findings with your team members.
If you work at an NLP-focused company like mine, where researchers constantly share data and models with each other, managing that entire lifecycle can quickly become a disaster.
That was the case for us until we arrived at Weights & Biases (W&B).
Marketing jargon aside, let me quickly walk you through how to deploy it on your private cloud.
3. Cloud Storage
5. Kubernetes cluster
Note: We will not focus on securing these resources here. You should deploy them with the security controls your company's compliance requirements call for.
Head over to Memorystore, where you should be able to create a Redis instance. Enable AUTH for the instance.
Head over to the security section of the instance, and you should be able to see the AUTH string for your Redis instance.
Save this string to be used later.
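If you prefer the command line to the console, the two Redis steps above can be sketched with `gcloud` roughly as follows. The instance name, region, and size are hypothetical placeholders, and the commands are wrapped in functions so that nothing runs until you call them.

```shell
# Hypothetical placeholder values; replace them with your own.
REDIS_INSTANCE="wandb-redis"
REGION="us-central1"

# Create a Memorystore for Redis instance with AUTH enabled.
# The 1 GiB size is an assumption, not a W&B recommendation.
create_redis_instance() {
  gcloud redis instances create "$REDIS_INSTANCE" \
    --region="$REGION" \
    --size=1 \
    --enable-auth
}

# Print the AUTH string of the instance so you can save it for later.
get_redis_auth_string() {
  gcloud redis instances get-auth-string "$REDIS_INSTANCE" \
    --region="$REGION"
}
```

Run `create_redis_instance` once, then `get_redis_auth_string` whenever you need to look the string up again.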
Creating a service account with required permissions
Go to Service Accounts and create a service account that has the permissions it needs to use Pub/Sub and Cloud Logging.
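The same step can be sketched on the command line. The project ID and account name below are hypothetical, and the two roles are my assumption of what "the right permissions" means here (a Pub/Sub editor role and a log-writing role); adjust them to whatever your setup actually needs.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="my-project"
SA_NAME="wandb-sa"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

create_wandb_service_account() {
  # Create the service account itself.
  gcloud iam service-accounts create "$SA_NAME" \
    --project="$PROJECT_ID" \
    --display-name="W&B service account"

  # Grant project-level roles for Pub/Sub and Cloud Logging.
  # These role choices are assumptions, not taken from the W&B docs.
  for role in roles/pubsub.editor roles/logging.logWriter; do
    gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="serviceAccount:${SA_EMAIL}" \
      --role="$role"
  done
}
```

Calling `create_wandb_service_account` creates the account and then adds one IAM binding per role.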
Creating a Pub/Sub Queue and cloud bucket
Create a Pub/Sub topic to act as the queue, and a Cloud Storage bucket for W&B to access.
Give the service account you created admin rights to your bucket.
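As a rough command-line equivalent of the steps above, assuming hypothetical names for the topic, bucket, and service account, and assuming that "admin rights" maps to the `roles/storage.admin` role on the bucket:

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="my-project"
TOPIC_NAME="wandb-file-events"
BUCKET_NAME="my-wandb-bucket"
LOCATION="us-central1"
SA_EMAIL="wandb-sa@${PROJECT_ID}.iam.gserviceaccount.com"

create_topic_and_bucket() {
  # The Pub/Sub topic that will receive bucket notifications.
  gcloud pubsub topics create "$TOPIC_NAME" --project="$PROJECT_ID"

  # The Cloud Storage bucket W&B will store files in.
  gcloud storage buckets create "gs://${BUCKET_NAME}" \
    --project="$PROJECT_ID" \
    --location="$LOCATION"

  # Give the service account admin rights on this bucket only.
  gcloud storage buckets add-iam-policy-binding "gs://${BUCKET_NAME}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="roles/storage.admin"
}
```

Scoping the binding to the bucket, rather than the whole project, keeps the service account's storage access limited to what W&B uses.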
We will then need to set up a notification channel on Pub/Sub so that W&B knows about changes in the bucket. You can do that by running the following command:
gcloud storage buckets notifications create gs://BUCKET_NAME --topic=TOPIC_NAME
Creating your SQL server
Go ahead and create an SQL instance for W&B to use. Make sure to change the network settings so that the pods in your K8s cluster are able to communicate with the SQL server.
You should then have a user with a password that can access this SQL instance.
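A minimal command-line sketch of this step might look like the following. It assumes a MySQL 8.0 instance, and the instance name, machine tier, region, and user name are all hypothetical. It does not cover the network settings, which depend on how your cluster is set up.

```shell
# Hypothetical placeholder values; replace them with your own.
SQL_INSTANCE="wandb-sql"
REGION="us-central1"
DB_USER="wandb"

create_wandb_sql() {
  # Refuse to run without a password supplied through the environment.
  : "${DB_PASSWORD:?set DB_PASSWORD before calling this function}"

  # Create the Cloud SQL instance (MySQL 8.0 and the tier are assumptions).
  gcloud sql instances create "$SQL_INSTANCE" \
    --database-version=MYSQL_8_0 \
    --tier=db-n1-standard-2 \
    --region="$REGION"

  # Create the user W&B will connect as.
  gcloud sql users create "$DB_USER" \
    --instance="$SQL_INSTANCE" \
    --password="$DB_PASSWORD"
}
```

Export `DB_PASSWORD` in your shell before calling `create_wandb_sql`, so the password never has to be written into the script.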
Deploying your Kubernetes service
- name: google-cloud-key
- name: wandb
- mountPath: /var/secrets/google
- name: GOOGLE_APPLICATION_CREDENTIALS
- containerPort: 8080
- port: 80
istio: ingressgateway # use Istio default gateway implementation
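The manifest mounts a volume named `google-cloud-key` at `/var/secrets/google` and sets `GOOGLE_APPLICATION_CREDENTIALS`, so the cluster needs a secret holding the service account's key file. One way to create it is sketched below. The secret name, the `key.json` file name, and the service account email are assumptions on my part, so match them to whatever your manifest references.

```shell
# Hypothetical placeholder values; replace them with your own.
SA_EMAIL="wandb-sa@my-project.iam.gserviceaccount.com"
KEY_FILE="./key.json"
SECRET_NAME="google-cloud-key"

create_gcp_key_secret() {
  # Download a new JSON key for the service account.
  gcloud iam service-accounts keys create "$KEY_FILE" \
    --iam-account="$SA_EMAIL"

  # Store the key in a Kubernetes secret under the file name key.json.
  kubectl create secret generic "$SECRET_NAME" \
    --from-file=key.json="$KEY_FILE"
}
```

With these assumed names, `GOOGLE_APPLICATION_CREDENTIALS` would point to `/var/secrets/google/key.json` inside the pod. Delete the local key file once the secret exists.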