---
license: apache-2.0
datasets:
- 4nkh/theme_data
language:
- en
metrics:
- precision
- f1
- recall
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- multi-label
- theme_detection
- mentorship
- entrepreneurship
- startup success
- json automation
---

# Theme classification model (multi-label)

This repository contains a fine-tuned BERT model for classifying short texts into community-oriented themes. The model was trained locally and pushed to the Hugging Face Hub.

## Model details

- Model architecture: `bert-base-uncased` (fine-tuned)
- Problem type: multi-label classification
- Labels: `mentorship`, `entrepreneurship`, `startup success`
- Training data: `train_theme.jsonl` (included)
- Final evaluation (example run):
  - eval_loss: 0.1822
  - eval_micro/f1: 1.0
  - eval_macro/f1: 1.0
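
Because the problem type is multi-label, fine-tuning uses a per-label sigmoid with binary cross-entropy rather than a softmax over classes. A minimal sketch of that loss computation, using synthetic logits and 0/1 targets (not values from the actual training run):

```python
import torch
import torch.nn as nn

# Synthetic batch: 2 examples, 3 labels (mentorship, entrepreneurship, startup success).
logits = torch.tensor([[2.0, -1.0, 0.5],
                       [-0.5, 1.5, -2.0]])
targets = torch.tensor([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0]])

# BCEWithLogitsLoss applies a sigmoid to each logit, so every label is
# scored independently — an example can belong to several themes at once.
loss = nn.BCEWithLogitsLoss()(logits, targets)
print(loss.item())
```

This is the objective `transformers` selects internally when a sequence-classification model is configured with `problem_type="multi_label_classification"`.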

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = "4nkh/theme_model"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

texts = ["Our co-op paired first-time founders with veteran shop owners to troubleshoot setbacks."]
inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
probs = torch.sigmoid(logits)  # independent per-label probabilities
preds = (probs >= 0.5).int()   # threshold each label at 0.5
print("probs", probs.numpy(), "preds", preds.numpy())
```
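
To map the 0/1 prediction vector back to label names, look up the model's `id2label` mapping. A minimal sketch with a hard-coded mapping (with a loaded model, read `model.config.id2label` instead):

```python
import torch

# Hypothetical id2label mapping for illustration; with a loaded model,
# use model.config.id2label rather than hard-coding it.
id2label = {0: "mentorship", 1: "entrepreneurship", 2: "startup success"}

preds = torch.tensor([[1, 0, 1]])  # e.g. output of (probs >= 0.5).int()
active = [id2label[i] for i, flag in enumerate(preds[0].tolist()) if flag]
print(active)  # ['mentorship', 'startup success']
```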

## Notes

- This model uses a threshold of 0.5 for multi-label predictions. Adjust thresholds per class as needed.
- To re-train or fine-tune further, see `train_theme_model.py` in this folder.
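
Per-class thresholds can be applied with a single broadcast comparison; the threshold values below are illustrative, not tuned:

```python
import torch

labels = ["mentorship", "entrepreneurship", "startup success"]
# Illustrative per-class thresholds — tune these on a validation set.
thresholds = torch.tensor([0.5, 0.4, 0.6])

probs = torch.tensor([[0.72, 0.35, 0.61]])  # e.g. torch.sigmoid(logits)
preds = (probs >= thresholds).int()         # thresholds broadcast over the batch

predicted = [labels[i] for i in (preds[0] == 1).nonzero().flatten().tolist()]
print(predicted)  # ['mentorship', 'startup success']
```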

## License

Apache-2.0 (see the `license` field in the model card metadata above).