Proposed that larger instance sizes would yield non-linear performance gains for the service's highly variable workload, where response complexity varies significantly between requests. Load testing confirmed the hypothesis: larger instances reduced the probability that any single host would receive a disproportionate number of expensive requests at the same time, and also improved on-host cache utilization. Together, these effects allowed higher CPU utilization while still meeting latency SLAs. Implemented enhanced CPU monitoring to verify that larger instances prevented the sub-second CPU spikes that would otherwise violate latency SLAs.
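
The pooling effect behind this result can be illustrated with a small calculation. This is a minimal sketch, not the load test itself. It assumes each in-flight request is independently "expensive" with a fixed probability, and that a host is at risk of a CPU spike when more than a set fraction of its in-flight requests are expensive. The function name, the 10% expensive rate, the 25% threshold, and the host sizes (8 vs. 32 concurrent requests) are all hypothetical values chosen for illustration. It does not model the cache-utilization benefit.

```python
from math import comb


def overload_probability(concurrent_requests: int,
                         p_expensive: float,
                         max_expensive_fraction: float) -> float:
    """Probability that more than `max_expensive_fraction` of a host's
    in-flight requests are expensive, under a binomial model
    (assumption: requests are independent and identically distributed)."""
    # Largest number of expensive requests the host tolerates.
    limit = int(max_expensive_fraction * concurrent_requests)
    # P(X <= limit) for X ~ Binomial(concurrent_requests, p_expensive).
    tolerated = sum(
        comb(concurrent_requests, k)
        * p_expensive ** k
        * (1 - p_expensive) ** (concurrent_requests - k)
        for k in range(limit + 1)
    )
    return 1.0 - tolerated


# Hypothetical comparison: a small host serving 8 concurrent requests
# versus a host four times larger serving 32, with the same traffic mix.
small = overload_probability(8, p_expensive=0.10, max_expensive_fraction=0.25)
large = overload_probability(32, p_expensive=0.10, max_expensive_fraction=0.25)

print(f"small host overload probability: {small:.4f}")
print(f"large host overload probability: {large:.4f}")
```

Under these assumptions the larger host is much less likely to exceed the threshold, because the share of expensive requests on a host concentrates around its mean as the number of concurrent requests grows. That is why the benefit scales non-linearly with instance size rather than in proportion to it.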