Google Cloud Storage Upload Guide
After training your nanochat model on Lambda Labs, use this guide to upload all weights and artifacts to Google Cloud Storage.
Quick Start
# After training completes, SSH to your Lambda instance
ssh ubuntu@<INSTANCE_IP>
# Navigate to project directory
cd ~/nanochatAquaRat
# Run upload script
bash scripts/upload_to_gcs.sh --bucket gs://your-bucket-name
The script will:
- Check/install gcloud CLI if needed
- Verify authentication and bucket access
- Show what will be uploaded and ask for confirmation
- Upload all artifacts with progress output (a sketch of the core upload step follows this list)
- Ask if you want to terminate the Lambda instance
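The exact logic lives in scripts/upload_to_gcs.sh. As a rough sketch of the core upload step (the directory list and default run-name format here are assumptions based on the layout described later in this guide):
# Hypothetical sketch of the script's core loop -- the real script adds
# auth checks, confirmation prompts, and flag parsing
BUCKET="gs://your-bucket-name"
RUN_NAME="aquarat-$(date +%Y%m%d-%H%M%S)"   # assumed default naming scheme
for dir in checkpoints report tokenizer eval_bundle; do
  src="$HOME/.cache/nanochat/$dir"
  # rsync is idempotent, so re-running resumes an interrupted upload
  [ -d "$src" ] && gsutil -m rsync -r "$src" "$BUCKET/runs/$RUN_NAME/$dir"
done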
Prerequisites
1. Create a GCS Bucket
# From your local machine
gcloud storage buckets create gs://your-bucket-name \
--location=us-central1 \
--uniform-bucket-level-access
Or create via console: https://console.cloud.google.com/storage/create-bucket
2. Set Up Authentication
Option A: Service Account (Recommended for Automation)
On your local machine:
# Create service account
gcloud iam service-accounts create nanochat-uploader \
--display-name="Nanochat Model Uploader"
# Grant storage permissions (objectAdmin lets gsutil rsync list and
# overwrite existing objects; objectCreator alone cannot list the bucket)
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:nanochat-uploader@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
# Create and download key
gcloud iam service-accounts keys create ~/nanochat-key.json \
--iam-account=nanochat-uploader@YOUR_PROJECT_ID.iam.gserviceaccount.com
Copy key to Lambda instance:
scp ~/nanochat-key.json ubuntu@<INSTANCE_IP>:~/
On Lambda instance:
# Use $HOME rather than ~ -- tilde does not expand after --key-file=
gcloud auth activate-service-account --key-file="$HOME/nanochat-key.json"
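The key grants write access to your bucket, so restrict its file permissions on the instance:
chmod 600 ~/nanochat-key.json   # readable only by the ubuntu user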
Option B: User Account (Simpler for Manual Use)
On Lambda instance:
gcloud auth login --no-launch-browser
# The command prints a URL; open it in a local browser and follow the prompts
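Whichever option you use, confirm that credentials and bucket access work before starting a large upload:
# Show the active account
gcloud auth list
# Listing the bucket exercises both authentication and IAM permissions
gcloud storage ls gs://your-bucket-name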
Usage
Basic Upload
bash scripts/upload_to_gcs.sh --bucket gs://my-models
Custom Run Name
bash scripts/upload_to_gcs.sh \
--bucket gs://my-models \
--run-name depth20-experiment1
Exclude Large Dataset Files
bash scripts/upload_to_gcs.sh \
--bucket gs://my-models \
--exclude-data
Dry Run (Preview Only)
bash scripts/upload_to_gcs.sh \
--bucket gs://my-models \
--dry-run
Auto-Terminate After Upload
bash scripts/upload_to_gcs.sh \
--bucket gs://my-models \
--auto-terminate
What Gets Uploaded
From ~/.cache/nanochat/:
| Directory | Contents | Typical Size |
|---|---|---|
| checkpoints/ | Model weights (.pt, .pkl files) | 500 MB - 2 GB |
| report/ | Training reports and markdown summaries | 1-10 MB |
| tokenizer/ | BPE tokenizer files | 10-50 MB |
| eval_bundle/ | Evaluation datasets | 50-200 MB |
| aqua/ | AQuA-RAT dataset (optional) | 100-500 MB |
| mechanistic_interpretability/ | DeepMind interp tools | 10-100 MB |
Total: Typically 1-5 GB per training run
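To estimate upload time and storage cost for your own run, check the local sizes first:
# Per-directory sizes under the nanochat cache
du -sh ~/.cache/nanochat/*/
# Grand total
du -sh ~/.cache/nanochat/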
Upload Structure
Files are organized in GCS as:
gs://your-bucket/
└── runs/
    ├── aquarat-20251023-143022/
    │   ├── checkpoints/
    │   │   ├── base_final.pt
    │   │   ├── mid_final.pt
    │   │   ├── sft_final.pt
    │   │   └── rl_final.pt
    │   ├── report/
    │   │   └── report.md
    │   ├── tokenizer/
    │   └── ...
    └── depth20-experiment1/
        └── ...
Download Weights Later
Download Entire Run
gsutil -m rsync -r \
gs://your-bucket/runs/aquarat-20251023-143022/ \
./local_checkpoints/
Download Just Checkpoints
gsutil -m cp -r \
gs://your-bucket/runs/aquarat-20251023-143022/checkpoints/ \
./checkpoints/
Download Single File
gsutil cp \
gs://your-bucket/runs/aquarat-20251023-143022/checkpoints/rl_final.pt \
./rl_final.pt
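To sanity-check a download, compare checksums: gsutil stat prints the hashes GCS stores for an object, and gsutil hash computes the same digests locally (objects uploaded as parallel composites may lack an MD5):
# Hashes recorded in GCS for the object
gsutil stat gs://your-bucket/runs/aquarat-20251023-143022/checkpoints/rl_final.pt
# Matching MD5 for the local copy
gsutil hash -m ./rl_final.pt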
Cost Considerations
Storage Costs
- Standard storage: ~$0.02/GB/month
- Nearline storage (30+ days): ~$0.01/GB/month
- Coldline storage (90+ days): ~$0.004/GB/month
Example: a 2 GB model in Standard storage costs 2 × $0.02 = $0.04 per month.
Network Egress
- Upload (ingress): Free
- Download to same region: Free
- Download to internet: ~$0.12/GB
Tip: Keep your GCS bucket in the same region as any GCP compute that reads it. Downloads to a Lambda instance or a laptop count as internet egress regardless of region.
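To check where a bucket lives:
gcloud storage buckets describe gs://your-bucket-name --format="value(location)"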
Lifecycle Management
Auto-delete objects after 90 days (a variant that archives them to cheaper storage instead follows below):
cat > lifecycle.json << EOF
{
"lifecycle": {
"rule": [
{
"action": {"type": "Delete"},
"condition": {"age": 90}
}
]
}
}
EOF
gsutil lifecycle set lifecycle.json gs://your-bucket
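To archive old runs instead of deleting them, swap in a SetStorageClass action; this uses the same documented lifecycle schema:
cat > lifecycle-archive.json << EOF
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
        "condition": {"age": 90}
      }
    ]
  }
}
EOF
gsutil lifecycle set lifecycle-archive.json gs://your-bucket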
Troubleshooting
"gcloud: command not found"
The script auto-installs gcloud on Linux. If it fails:
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
"Permission denied" Error
Check that your service account has a role that can list and write objects (e.g. roles/storage.objectAdmin):
gcloud projects get-iam-policy YOUR_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:serviceAccount:nanochat-uploader*"
Upload Interrupted
The script uses gsutil rsync, so re-running will resume:
bash scripts/upload_to_gcs.sh --bucket gs://your-bucket
# Will skip already-uploaded files
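To preview what still needs to transfer without re-running the script, gsutil rsync accepts -n for a dry run (the paths below assume the layout shown under Upload Structure):
gsutil -m rsync -r -n \
  ~/.cache/nanochat/checkpoints/ \
  gs://your-bucket/runs/your-run-name/checkpoints/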
Verify Upload
# List all files in the run
gsutil ls -r gs://your-bucket/runs/your-run-name/
# Check specific checkpoints
gsutil ls gs://your-bucket/runs/your-run-name/checkpoints/
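gsutil du reports the run's total footprint, which should roughly match the local du output:
gsutil du -sh gs://your-bucket/runs/your-run-name/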
Integration with Lambda Launcher
You can add GCS credentials to the automated launcher:
# In scripts/launch_lambda_training.py
# Add to the cloud-init user-data:
write_files:
- path: /home/ubuntu/.config/gcloud/application_default_credentials.json
content: |
{your service account key JSON}
Or pass as environment variable:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
python scripts/launch_lambda_training.py \
--inject-env GOOGLE_APPLICATION_CREDENTIALS \
...
Best Practices
- Name runs descriptively: Use --run-name depth20-lr1e4-batch32
- Exclude data when iterating: Use --exclude-data to save bandwidth
- Dry run first: Always use --dry-run to preview
- Service accounts for automation: Easier than user auth
- Regional buckets: Match Lambda instance region when possible
- Lifecycle policies: Auto-archive old models
- Download to Lambda: If re-training, download previous checkpoints to Lambda first
Security Notes
- Service account keys are sensitive - treat them like passwords
- Use least-privilege IAM roles (don't grant roles/owner)
- Rotate service account keys regularly (commands below)
- Consider Workload Identity if using GKE
- Don't commit keys to git (add them to .gitignore)
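For the key-rotation point above, the relevant commands are:
# List existing keys and their creation dates
gcloud iam service-accounts keys list \
  --iam-account=nanochat-uploader@YOUR_PROJECT_ID.iam.gserviceaccount.com
# Create a replacement key, then delete the old one by its KEY_ID
gcloud iam service-accounts keys create ~/nanochat-key-new.json \
  --iam-account=nanochat-uploader@YOUR_PROJECT_ID.iam.gserviceaccount.com
gcloud iam service-accounts keys delete KEY_ID \
  --iam-account=nanochat-uploader@YOUR_PROJECT_ID.iam.gserviceaccount.com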
Support
- GCS Documentation: https://cloud.google.com/storage/docs
- gsutil Reference: https://cloud.google.com/storage/docs/gsutil
- IAM Permissions: https://cloud.google.com/storage/docs/access-control/iam-permissions
Quick Reference:
# Upload
bash scripts/upload_to_gcs.sh --bucket gs://my-bucket
# Download
gsutil -m cp -r gs://my-bucket/runs/NAME/checkpoints/ ./
# List runs
gsutil ls gs://my-bucket/runs/
# Delete old run
gsutil -m rm -r gs://my-bucket/runs/old-run/