Fine-Tuning Gemma 2B on a Domain-Specific (Personal) Dataset
This repository contains the code and data for fine-tuning a small language model (SLM) for the specific domain of Indian history. The project demonstrates how to adapt a pre-trained language model to better understand and generate text relevant to this historical context.
Large language models (LLMs) have shown remarkable capabilities in natural language processing tasks, but fine-tuning them for specific domains remains crucial to unlocking their full potential. This project adapts the Gemma 2B model to Indian history using a dedicated Indian history dataset: the base model is quantized with BitsAndBytes and fine-tuned with LoRA adapters (configured through PEFT's LoraConfig) for causal language modeling in this domain.
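The notebooks themselves are not reproduced here, so the following is a minimal sketch of the model setup described above, assuming the Hugging Face transformers, bitsandbytes, and peft libraries. The checkpoint name, quantization settings, and LoRA hyperparameters are illustrative assumptions, not the repository's exact values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-2b-it"  # assumed checkpoint; the project fine-tunes an instruction-tuned Gemma 2B

# Quantize the base model to 4-bit so it fits on a single consumer GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters instead of updating all 2B parameters.
lora_config = LoraConfig(
    r=8,                      # adapter rank (assumed value)
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```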
Clone the repository:

git clone https://github.com/AjayK47/Gemma-Model-Finetuning-Using-Lora.git
The repository contains three notebooks:

- dataset-preprocessing.ipynb — prepares the custom Indian history dataset for training
- gemma-it-finetuned.ipynb — fine-tunes the Gemma 2B instruction-tuned model with LoRA
- gemma-finetuned-model-inference.ipynb — runs inference with the fine-tuned model (see the sketch below)
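As a rough illustration of what the inference notebook does, the sketch below loads the base model, attaches a saved LoRA adapter, and generates an answer. The adapter directory, prompt, and generation settings are assumptions, not values taken from the notebook.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b-it"       # assumed base checkpoint
adapter_dir = "gemma-it-finetuned"   # hypothetical directory holding the saved LoRA adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_dir)  # wrap the base weights with the adapter

prompt = "Who founded the Maurya Empire?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```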
Install the dependencies:
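The repository's exact requirements are not listed in this excerpt; a typical dependency set for this stack (an assumption, not the project's pinned versions) is:

pip install torch transformers datasets accelerate bitsandbytes peft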
The project follows these key steps (a minimal training sketch follows the list):

1. Preprocess the custom Indian history dataset (dataset-preprocessing.ipynb).
2. Load Gemma 2B with BitsAndBytes quantization.
3. Attach LoRA adapters via LoraConfig and fine-tune for causal language modeling (gemma-it-finetuned.ipynb).
4. Run inference with the fine-tuned model (gemma-finetuned-model-inference.ipynb).
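A minimal sketch of the fine-tuning step, assuming the quantized, LoRA-wrapped model and tokenizer from the setup snippet above and a JSON dataset with a "text" column; the file name, column name, and hyperparameters are illustrative assumptions.

```python
from datasets import load_dataset
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

# Hypothetical data file produced by dataset-preprocessing.ipynb.
dataset = load_dataset("json", data_files="indian_history.json", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,  # the quantized, LoRA-wrapped model prepared earlier
    args=TrainingArguments(
        output_dir="gemma-it-finetuned",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Causal LM collator: copies input_ids to labels (no masked-LM objective).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.model.save_pretrained("gemma-it-finetuned")  # saves only the adapter weights
```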
This project is licensed under the MIT License.