GPT-2: from fine-tuning to production

Wasiq Malik

• Fine-tuned GPT-2 to generate full-length articles conditioned on the semantics of an input prompt (a fine-tuning sketch follows this list).
• Built training/inference pipelines with Hugging Face Transformers and a REST API with Cortex (cortex.dev); see the predictor sketch below.
• Deployed on an AWS EKS cluster with V100 GPUs, achieving a 2-4s inference time.
• Provided production bug-fix support and maintenance for writeme.ai after go-live.
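
A minimal sketch of what the Transformers fine-tuning pipeline could look like; the dataset file (articles.txt), hyperparameters, and output directory are illustrative assumptions, not the production values. Mixed precision (fp16) is the usual choice on V100s.

```python
# Hypothetical fine-tuning sketch; paths and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2-large"  # GPT-2 XL was fine-tuned the same way
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained(model_name)

# Assumed dataset: one article per line in a plain-text file.
dataset = load_dataset("text", data_files={"train": "articles.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Causal LM collator: labels are the input ids, shifted inside the model.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-articles",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    fp16=True,  # mixed precision on V100 GPUs
    save_total_limit=2,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
trainer.save_model("gpt2-articles/final")
```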
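Serving went through Cortex, which wraps a Python predictor class in a REST endpoint and handles the EKS deployment. The class and method names below follow Cortex's pre-1.0 Python predictor interface as I recall it; the model path and payload schema are assumptions.

```python
# predictor.py - sketch of a Cortex Python predictor (assumed interface).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast


class PythonPredictor:
    def __init__(self, config):
        # Load the fine-tuned weights once per replica; the path is the
        # hypothetical output directory from the training sketch above.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-articles/final")
        self.model = GPT2LMHeadModel.from_pretrained("gpt2-articles/final")
        self.model.to(self.device).eval()

    def predict(self, payload):
        # Assumed request body: {"prompt": "...", "max_length": 512}
        inputs = self.tokenizer(payload["prompt"], return_tensors="pt").to(self.device)
        with torch.no_grad():
            output_ids = self.model.generate(
                **inputs,
                max_length=payload.get("max_length", 512),
                do_sample=True,
                top_p=0.95,
                temperature=0.9,
                pad_token_id=self.tokenizer.eos_token_id,
            )
        return {"article": self.tokenizer.decode(output_ids[0], skip_special_tokens=True)}
```

Loading the model in `__init__` keeps it resident on the GPU across requests, so each call only pays for tokenization and generation, which is what keeps latency in the 2-4s range.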

Posted Apr 28, 2024

Fine-tuned GPT-2 Large and XL, deployed for inference on an AWS EKS cluster with V100 GPUs.