On the PyTorch side, the inference setup mirrored that of MLX. A pre-trained BERT model, sourced from the Hugging Face Transformers library, was loaded into PyTorch. As with MLX, synthetic input data, comprising randomly generated input_ids, token_type_ids, and attention_mask tensors, was fed through the model, and the inference time was recorded.
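
For concreteness, a minimal sketch of this setup might look like the following. The checkpoint name ("bert-base-uncased"), batch size, and sequence length are assumptions, since the text does not specify them; the timing method shown (a single timed forward pass under torch.no_grad) is likewise illustrative rather than the exact benchmark harness used.

```python
import time

import torch
from transformers import BertModel

# Load a pre-trained BERT model from Hugging Face Transformers.
# The checkpoint name is an assumption; the source does not name it.
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Synthetic inputs: random token ids, all-zero token type ids, and an
# all-ones attention mask. Batch size and sequence length are assumptions.
batch_size, seq_len = 1, 128
vocab_size = model.config.vocab_size

input_ids = torch.randint(0, vocab_size, (batch_size, seq_len))
token_type_ids = torch.zeros(batch_size, seq_len, dtype=torch.long)
attention_mask = torch.ones(batch_size, seq_len, dtype=torch.long)

# Time a single forward pass with gradient tracking disabled.
with torch.no_grad():
    start = time.perf_counter()
    outputs = model(
        input_ids=input_ids,
        token_type_ids=token_type_ids,
        attention_mask=attention_mask,
    )
    elapsed = time.perf_counter() - start

print(f"Inference time: {elapsed:.4f} s")
```

In practice, a benchmark of this kind would typically discard a few warm-up iterations and average over many runs, since the first forward pass can include one-time setup costs.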