metadata
license: mit
datasets:
  - sweatSmile/neet-biology-qa
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased
pipeline_tag: question-answering
library_name: transformers
tags:
  - neet
  - biology
  - exam
  - bio

DistilBERT NEET Biology MCQ Classifier (NEET_BioBERT)

This model is DistilBERT (base uncased) fine-tuned to pick the correct option in NEET-style multiple-choice biology questions. Given a question and four choices (A, B, C, D), it selects the best answer.
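A minimal inference sketch follows. It assumes the checkpoint loads with the transformers `AutoModelForMultipleChoice` class (the card describes the task as multiple-choice classification) and that the repository id is `Neural-Hacker/NEET_BioBERT`. Neither is confirmed by this card. The helper names `build_pairs` and `predict` and the sample question are illustrative.

```python
LETTERS = ["A", "B", "C", "D"]

# Assumed Hub repository id; replace it with the actual model path.
MODEL_ID = "Neural-Hacker/NEET_BioBERT"


def build_pairs(question, options):
    """Pair the question with each option, as multiple-choice models expect."""
    if len(options) != len(LETTERS):
        raise ValueError("expected exactly four options")
    return [question] * len(options), list(options)


def predict(question, options, model_id=MODEL_ID):
    """Return the letter of the highest-scoring option."""
    # Heavy imports are deferred so build_pairs works without them installed.
    import torch
    from transformers import AutoModelForMultipleChoice, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMultipleChoice.from_pretrained(model_id)
    model.eval()

    first, second = build_pairs(question, options)
    enc = tokenizer(first, second, padding=True, truncation=True, return_tensors="pt")
    # Multiple-choice heads expect tensors shaped (batch, num_choices, seq_len).
    inputs = {name: tensor.unsqueeze(0) for name, tensor in enc.items()}

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 4)
    return LETTERS[int(logits.argmax(dim=-1))]


# Example call (downloads the checkpoint on first use):
# predict(
#     "Which organelle is the main site of ATP synthesis in eukaryotic cells?",
#     ["Ribosome", "Mitochondrion", "Golgi apparatus", "Lysosome"],
# )
```

Each option is encoded together with the question, so the model scores four question-option pairs and the highest logit determines the answer letter.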


Training Data

Source: sweatSmile/neet-biology-qa (NEET Biology QA Dataset)

Domain: NEET (Undergraduate Medical Entrance Exam) – Biology

Format: Each question has 4 options with one correct answer

Dataset Size: 793 questions

Split: 80% train / 20% validation
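The 80/20 split can be sketched in plain Python as follows. The seed, the shuffling, and the choice to round the validation share up are assumptions; the card gives only the ratio and the total of 793 questions.

```python
import math
import random


def split_indices(n_examples, val_fraction=0.2, seed=42):
    """Shuffle example indices and split them into train and validation lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed value
    n_val = math.ceil(n_examples * val_fraction)  # assumed: validation share rounds up
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(793)
print(len(train_idx), len(val_idx))  # 634 159
```

With the Hugging Face datasets library, `Dataset.train_test_split(test_size=0.2)` performs a comparable shuffled split.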


Training Configuration

Base Model: distilbert-base-uncased

Epochs: 10

Batch Size: 4

Learning Rate: 5e-5

Weight Decay: 0.01

Task Type: Multiple Choice Classification
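The hyperparameters above map onto keyword arguments of the transformers `TrainingArguments` class. The sketch below keeps them in a plain dictionary and estimates the number of optimizer steps. The `output_dir` value, the 634-example training set (80% of 793, rounded down), a single device, and no gradient accumulation are assumptions rather than details from this card.

```python
import math

# Keys follow transformers.TrainingArguments parameter names.
hparams = {
    "output_dir": "neet-distilbert-mcq",  # hypothetical output path
    "num_train_epochs": 10,
    "per_device_train_batch_size": 4,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
}

n_train = 634  # assumed: 80% of 793 questions, rounded down
steps_per_epoch = math.ceil(n_train / hparams["per_device_train_batch_size"])
total_steps = steps_per_epoch * hparams["num_train_epochs"]
print(steps_per_epoch, total_steps)  # 159 1590
```

With transformers installed, `TrainingArguments(**hparams)` would build the corresponding configuration object.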


Results

Validation Accuracy: 72.96% (~73%)

Final Training Loss: ~0.35
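As a sanity check on the reported accuracy, the sketch below searches for a whole number of correct answers that rounds to 72.96%. The validation-set size of 159 (20% of 793, rounded up) is an assumption; the card does not state it. The 25% figure is the expected accuracy of uniform random guessing among four options.

```python
def matching_correct_counts(n_val, reported_pct, digits=2):
    """Return every count k for which k / n_val rounds to the reported percentage."""
    return [k for k in range(n_val + 1) if round(100 * k / n_val, digits) == reported_pct]


n_val = 159  # assumed validation size
print(matching_correct_counts(n_val, 72.96))  # [116]
print(100 / 4)  # 25.0, chance level for four options
```

Under that assumption, the reported figure corresponds to 116 of 159 validation questions answered correctly, roughly 48 percentage points above chance.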


Limitations

Trained on a relatively small dataset (793 questions).

Limited to NEET-level biology content; not suitable for physics or chemistry.

Does not support:

Assertion-reasoning questions

Diagram-based questions

Paragraph/Case study type questions


Intended Use

Educational Research

AI-powered NEET Biology assistants

MCQ practice evaluation

Baseline model for future fine-tuning with larger datasets


NOTE:

This model is not recommended as a finished, exam-ready solution without further fine-tuning and validation.


License: MIT