• Created a synthetic data generation pipeline, expanding the dataset to over 10,000 examples per category.
• Ensured diversity across multiple domains while maintaining consistency in fallacy representation.
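A template-based generator is one common way to build such a pipeline. The sketch below is illustrative only — the fallacy names, templates, and slot fillers are assumptions, not the project's actual data:

```python
import random

# Hypothetical templates and fillers; the real pipeline's categories
# and phrasing are not shown in the source.
TEMPLATES = {
    "ad_hominem": "You can't trust {person}'s argument about {topic}; {person} is {insult}.",
    "slippery_slope": "If we allow {action}, soon {extreme_outcome} will follow.",
}

FILLERS = {
    "person": ["Alex", "Dr. Lee"],
    "topic": ["climate policy", "tax reform"],
    "insult": ["a known liar", "totally unqualified"],
    "action": ["later school start times", "remote work"],
    "extreme_outcome": ["no one will attend school at all", "offices will disappear entirely"],
}

def generate_examples(fallacy: str, n: int, seed: int = 0) -> list[dict]:
    """Fill a fallacy template with random slot values to produce labeled examples."""
    rng = random.Random(seed)  # seeded for reproducibility
    template = TEMPLATES[fallacy]
    examples = []
    for _ in range(n):
        slots = {k: rng.choice(v) for k, v in FILLERS.items()}
        examples.append({"text": template.format(**slots), "label": fallacy})
    return examples
```

Varying the slot fillers across domains (politics, science, everyday life) is one way to get the cross-domain diversity mentioned above while keeping the fallacy structure itself constant.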
2. Model Fine-tuning:
• Utilized the Anyscale platform to fine-tune Llama 2 and 3 models on the custom dataset.
• Implemented a systematic workflow for model training and evaluation.
3. Tool Development:
• Developed Python scripts for data generation, validation, and model testing.
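A validation script for such a pipeline typically checks each generated example before it enters the training set. This is a minimal sketch under assumed field names (`text`, `label`) and an assumed length limit:

```python
def validate_example(ex: dict, known_labels: set[str]) -> list[str]:
    """Return a list of problems with one example; an empty list means valid."""
    problems = []
    text = ex.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("missing or empty 'text'")
    elif len(text) > 2000:  # hypothetical length cap
        problems.append("'text' exceeds 2000 characters")
    if ex.get("label") not in known_labels:
        problems.append(f"unknown label: {ex.get('label')!r}")
    return problems

def validate_dataset(examples: list[dict], known_labels: set[str]) -> dict[int, list[str]]:
    """Map example index -> problems, covering every invalid example."""
    report = {}
    for i, ex in enumerate(examples):
        issues = validate_example(ex, known_labels)
        if issues:
            report[i] = issues
    return report
```

Running this before fine-tuning catches malformed or mislabeled records early, when they are cheap to fix.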
RESULTS AND IMPACT
Fine-tuning substantially improved the models' accuracy in both fallacy detection and generation. Potential applications include educational tools, content moderation, and argument-analysis systems.