GPT-2: From Fine-Tuning to Production

Wasiq Malik

• Fine-tuned GPT-2 to generate full-length articles conditioned on an input prompt and its semantics.
• Built training and inference pipelines with Hugging Face Transformers and exposed the model through a REST API built on cortex.dev (a training sketch follows this list).
• Deployed on an AWS EKS cluster with V100 GPUs, achieving 2-4 s inference latency per request.
• Provided post-launch support for writeme.ai, handling production bugs and ongoing maintenance.
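As a rough illustration of the training side, here is a minimal sketch of how such a fine-tuning pipeline could be wired up with Hugging Face Transformers. The dataset file (articles.txt), model size, and hyperparameters are illustrative assumptions, not the project's actual configuration.

    # Minimal fine-tuning sketch with Hugging Face Transformers.
    # articles.txt and all hyperparameters below are placeholders.
    from datasets import load_dataset
    from transformers import (
        DataCollatorForLanguageModeling,
        GPT2LMHeadModel,
        GPT2TokenizerFast,
        Trainer,
        TrainingArguments,
    )

    model_name = "gpt2-large"  # the project also fine-tuned gpt2-xl
    tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
    model = GPT2LMHeadModel.from_pretrained(model_name)

    # Hypothetical corpus of articles, one document per line.
    dataset = load_dataset("text", data_files={"train": "articles.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="gpt2-articles",
            per_device_train_batch_size=2,
            gradient_accumulation_steps=8,
            num_train_epochs=3,
            fp16=True,          # V100s support mixed precision
            save_total_limit=2,
        ),
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model("gpt2-articles")

Gradient accumulation and mixed precision are typical choices for fitting GPT-2 Large/XL on a single V100; the project's exact settings are not reproduced here.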

Posted Apr 28, 2024

Fine-tuned GPT-2 Large and XL, deployed for inference on an AWS EKS cluster with V100 GPUs.
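On the serving side, the REST API built on cortex.dev ultimately wraps a generation call along the lines of the minimal sketch below; the checkpoint path, prompt, and sampling parameters are assumptions for illustration, not the project's actual settings.

    # Minimal generation sketch for the inference path behind the REST endpoint.
    # "gpt2-articles" is a hypothetical fine-tuned checkpoint directory.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-articles")
    model = GPT2LMHeadModel.from_pretrained("gpt2-articles").to(device).eval()

    prompt = "The future of renewable energy in Europe"  # example prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_length=512,            # full-length article budget
            do_sample=True,            # nucleus sampling for varied articles
            top_p=0.95,
            temperature=0.9,
            no_repeat_ngram_size=3,
            pad_token_id=tokenizer.eos_token_id,
        )

    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Keeping the model resident on the GPU and bounding the generation length are the usual levers for latencies in the 2-4 s range on a V100, though the project's exact serving configuration is not shown here.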
