After investigation, we've narrowed the issue to when we ran a mandatory GCP certificate migration across our regions. To prevent downtime, we pre-provisioned extra capacity ahead of the cutover, but one of our autoscaler partners cleaned up those machines before the migration window hit. The region is now stable, and we've put mechanisms in place to make sure this doesn't happen again.