Tanaos – Train task specific LLMs without training data, for offline NLP and Text Classification

tanaos-intent-classifier-v1: A small but performant intent classification model

This model was created by Tanaos with the Artifex Python library.

This is an intent classification model based on FacebookAI/roberta-base and fine-tuned on a synthetic dataset to classify text into one of 12 different intent categories:

Intent               Description
greeting             Greeting or saying hello.
farewell             Saying goodbye or farewell.
thank_you            Expressing gratitude or thanks.
affirmation          Agreeing or confirming something.
negation             Disagreeing or denying something.
small_talk           Engaging in casual or light conversation with no specific purpose.
bot_capabilities     Inquiries about the bot's features or abilities.
feedback_positive    Providing positive feedback about the bot, service, or experience.
feedback_negative    Providing negative feedback about the bot, service, or experience.
clarification        Asking for clarification or more information about a previous statement or question.
suggestion           Offering a suggestion or recommendation for improvement.
language_change      Requesting a change in the language being used by the bot or information about language options.

These categories were chosen to cover common user intents in chatbot and virtual assistant interactions, regardless of the application domain. The result is a general-purpose intent classification model that can be applied across industries and use cases.

This model does not have an explicit fallback or unknown category. Instead, it is trained to always classify input text into one of the predefined intent categories, even if the input does not clearly belong to any of them. We advise users to treat any prediction whose score field is lower than 0.5 as implicitly belonging to the fallback or unknown intent category.
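
The thresholding rule above can be applied as a small post-processing step. The following is a minimal sketch, assuming predictions have the `[{'label': ..., 'score': ...}]` shape shown in the usage examples of this card. The names `with_fallback`, `FALLBACK_LABEL` and `SCORE_THRESHOLD` are hypothetical and are not part of Artifex or Transformers.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]

FALLBACK_LABEL = "unknown"   # hypothetical label; the model itself never emits it
SCORE_THRESHOLD = 0.5        # cutoff recommended in the paragraph above


def with_fallback(predictions: List[Prediction],
                  threshold: float = SCORE_THRESHOLD) -> str:
    """Return the top-scoring label, or FALLBACK_LABEL if its score is below threshold."""
    if not predictions:
        return FALLBACK_LABEL
    # Pick the highest-scoring prediction in case several are returned.
    top = max(predictions, key=lambda p: p["score"])
    if top["score"] < threshold:
        return FALLBACK_LABEL
    return top["label"]


print(with_fallback([{"label": "greeting", "score": 0.9955}]))   # greeting
print(with_fallback([{"label": "small_talk", "score": 0.31}]))   # unknown
```

The 0.5 cutoff is a starting point; a stricter or looser value may suit applications where a wrong intent is more or less costly than a fallback reply.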

How to Use

Via the Tanaos API

The Tanaos API provides sub-100ms end-to-end latency (faster than running locally), which makes it a good fit for real-time applications or for cases where you don't want to self-host. It is free to use:

  1. Sign up for a free account at https://platform.tanaos.com/
  2. Create a free API Key from the API Keys section
  3. Replace <YOUR_API_KEY> in the code below with your API Key and use this snippet:
import requests

session = requests.Session()

ic_out = session.post(
    "https://slm.tanaos.com/models/intent-classification",
    headers={
        "X-API-Key": "<YOUR_API_KEY>",
    },
    json={
        "text": "Hey there, how are you doing?"
    }
)

print(ic_out.json()["data"])
# >>> [{'label': 'greeting', 'score': 0.9955}]
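
For production use it may help to wrap the call in a function that sets a timeout and surfaces HTTP errors. The sketch below reuses the endpoint, header and `data` field from the snippet above; the helper name `classify_intent` and the 5-second default timeout are assumptions, not part of the Tanaos API. The `session` argument is expected to behave like a `requests.Session`.

```python
# Endpoint taken from the snippet above.
API_URL = "https://slm.tanaos.com/models/intent-classification"


def classify_intent(session, text, api_key, timeout=5.0):
    """POST `text` to the API and return the parsed `data` field.

    `session` should behave like `requests.Session`; `timeout` (seconds)
    is an assumed default, tune it for your application.
    """
    response = session.post(
        API_URL,
        headers={"X-API-Key": api_key},
        json={"text": text},
        timeout=timeout,          # fail fast instead of hanging on network problems
    )
    response.raise_for_status()   # raise an exception on 4xx/5xx responses
    return response.json()["data"]


# Usage (requires `pip install requests` and a valid API key):
#   import requests
#   with requests.Session() as session:
#       print(classify_intent(session, "Hey there, how are you doing?", "<YOUR_API_KEY>"))
```

Passing the session in as an argument lets one connection be reused across many requests and makes the helper easy to exercise with a stub in tests.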

Via the Artifex library (pip install artifex)

from artifex import Artifex

intent_classifier = Artifex().intent_classifier

print(intent_classifier("Hey there, how are you doing?"))
# >>> [{'label': 'greeting', 'score': 0.9955}]

Via the Transformers library

from transformers import pipeline

clf = pipeline("text-classification", model="tanaos/tanaos-intent-classifier-v1")

print(clf("Hey there, how are you doing?"))
# >>> [{'label': 'greeting', 'score': 0.9955}]

How to fine-tune (without training data)

Use the Artifex library to fine-tune the model for languages other than English, or for custom domains and intent categories. Artifex generates the synthetic training data on the fly, so you do not need a dataset of your own. Install Artifex with

pip install artifex

Fine-tune to any language

from artifex import Artifex

intent_classifier = Artifex().intent_classifier

model_output_path = "./output_model/"

intent_classifier.train(
    domain="asistente de oficina",
    classes={
        "send_email": "Intención de enviar un correo electrónico a alguien.",
        "schedule_meeting": "Intención de programar una reunión con alguien.",
        "cancel_meeting": "Intención de cancelar una reunión programada previamente.",
        "reschedule_meeting": "Intención de cambiar la fecha o la hora de una reunión programada previamente.",
    },
    language="spanish",
    output_path=model_output_path
)

intent_classifier.load(model_output_path)
print(intent_classifier("Por favor, envía un correo electrónico a Juan para programar una reunión."))

# >>> [{'label': 'send_email', 'score': 0.9965}]

Fine-tune to custom domain or intent classes

from artifex import Artifex

intent_classifier = Artifex().intent_classifier

model_output_path = "./output_model/"

intent_classifier.train(
    domain="customer support",
    classes={
        "report_issue": "Intent to report a problem or issue with a product or service.",
        "request_refund": "Intent to request a refund for a product or service.",
        "ask_product_info": "Intent to ask for more information about a product or service.",
    },
    output_path=model_output_path
)

intent_classifier.load(model_output_path)
print(intent_classifier("I would like to report a problem with my recent order."))

# >>> [{'label': 'report_issue', 'score': 0.9933}]

Model Description

  • Base model: FacebookAI/roberta-base
  • Task: Text classification (intent classification)
  • Languages: English
  • Fine-tuning data: A synthetic, custom dataset of 10,000 utterances, each belonging to one of 12 different intent categories.

Training Details

This model was trained using the Artifex Python library

pip install artifex

by providing the following instructions and generating 10,000 synthetic training samples:

from artifex import Artifex


intent_classifier = Artifex().intent_classifier

intent_classifier.train(
    classes={
        "greeting": "Intent to greet or say hello.",
        "farewell": "Intent to say goodbye or farewell.",
        "thank_you": "Intent to express gratitude or thanks.",
        "affirmation": "Intent to agree or confirm something.",
        "negation": "Intent to disagree or deny something.",
        "small_talk": "Intent to engage in casual or light conversation with no specific purpose.",
        "bot_capabilities": "Inquiries about the bot's features or abilities.",
        "feedback_positive": "Intent to provide positive feedback about the bot, service, or experience.",
        "feedback_negative": "Intent to provide negative feedback about the bot, service, or experience.",
        "clarification": "Intent to ask for clarification or more information about a previous statement or question.",
        "suggestion": "Intent to offer a suggestion or recommendation for improvement.",
        "language_change": "Intent to request a change in the language being used by the bot or information about language options.",
    },
    num_samples=10000
)

Intended Uses

This model is intended to:

  • Classify user intents in chatbot and virtual assistant interactions.
  • Be used in various industries and applications, such as customer support, virtual assistants, and conversational agents.
  • Assist in routing user queries to appropriate responses or actions based on detected intent.
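
Routing on the detected intent can be sketched with a plain dispatch table. This is a minimal illustration, assuming a single prediction in the `{'label': ..., 'score': ...}` shape shown in the usage examples of this card. The handler functions, the `ROUTES` mapping and the `route` helper are hypothetical placeholders for your own application logic.

```python
from typing import Callable, Dict


# Hypothetical handlers: replace these with your application's real actions.
def greet(text: str) -> str:
    return "Hello! How can I help you?"


def say_goodbye(text: str) -> str:
    return "Goodbye!"


def describe_capabilities(text: str) -> str:
    return "I can answer questions about your account."


def fallback(text: str) -> str:
    return "Sorry, I did not understand that."


# Map intent labels to handlers; labels without an entry use the fallback.
ROUTES: Dict[str, Callable[[str], str]] = {
    "greeting": greet,
    "farewell": say_goodbye,
    "bot_capabilities": describe_capabilities,
}


def route(text: str, prediction: dict, min_score: float = 0.5) -> str:
    """Dispatch `text` to the handler registered for the predicted intent."""
    # Low-confidence predictions are treated as the implicit unknown intent.
    if prediction["score"] < min_score:
        return fallback(text)
    handler = ROUTES.get(prediction["label"], fallback)
    return handler(text)


prediction = {"label": "greeting", "score": 0.9955}  # one classifier result
print(route("Hey there, how are you doing?", prediction))
# prints: Hello! How can I help you?
```

A dictionary keeps the mapping from intent to action in one place, so adding a handler for another of the 12 categories is a one-line change.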

Not intended for:

  • Highly specialized domains where intents differ significantly from the predefined categories.