I wouldn't be surprised if they encrypt them at rest, but at some point the weights have to be loaded into VRAM.
Newer NVIDIA cards (H100 and up) support both in-memory model encryption and a ‘trusted’ execution environment with remote attestation. Not sure how widely this is used in frontier model deployments, but at least the vendor-claimed perf overhead is ‘3%’ [0].
[0] https://www.spheron.network/blog/confidential-gpu-computing-...