Introduction

Running SLMs in web browsers

This repository is part of a playbook of experiments on fine-tuning small language models (SLMs) with LoRA, exporting them to ONNX, and running them locally on ONNX-compatible runtimes such as JavaScript (Node.js) and WASM (in the browser).

Before you start

To run the Node.js example (Node.js + onnxruntime, server side)

  • Simply run node app.js (a minimal app.js sketch follows below). This is what you should see:

[Screenshots: Node.js application paraphrasing output; Node.js runtime memory usage]
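For reference, a minimal app.js might look like the sketch below. It assumes the exported model sits next to the script as model.onnx and takes input_ids/attention_mask inputs, which is typical for Hugging Face text-model ONNX exports but not guaranteed; the file name and the pre-tokenized prompt are placeholders, and exports with a KV cache may require additional past_key_values inputs. Check session.inputNames for the actual names.

```js
// Minimal sketch: load an ONNX model with onnxruntime-node and run one forward pass.
// "model.onnx" and the input names below are assumptions; verify them against
// session.inputNames / session.outputNames for your export.
const ort = require('onnxruntime-node');

async function main() {
  const session = await ort.InferenceSession.create('./model.onnx');
  console.log('inputs:', session.inputNames);
  console.log('outputs:', session.outputNames);

  // Hypothetical pre-tokenized prompt; a real app would run a tokenizer first.
  const ids = BigInt64Array.from([1n, 2n, 3n, 4n]);
  const mask = BigInt64Array.from([1n, 1n, 1n, 1n]);

  const feeds = {
    input_ids: new ort.Tensor('int64', ids, [1, ids.length]),
    attention_mask: new ort.Tensor('int64', mask, [1, mask.length]),
  };

  // For a causal-LM export the main output is usually "logits"
  // with shape [batch, sequence, vocab_size].
  const results = await session.run(feeds);
  console.log(Object.keys(results));
}

main().catch(console.error);
```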

To run the web-browser demo (WASM-based in-browser inference)

  • Simply access web.html from a local server (for example, http://localhost:3000/web.html). Any static file server works, e.g. npx http-server -p 3000 run from the repository root.

This is what you should see:

[Screenshot: web browser memory usage while running the ONNX model via WASM]
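For reference, a minimal web.html might look like the sketch below. It loads onnxruntime-web from a CDN (the bundle fetches its matching .wasm binaries relative to the script URL by default) and creates a session with the WASM execution provider; model.onnx is a placeholder for the exported model served alongside the page.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Minimal sketch: onnxruntime-web from a CDN; the bundle resolves its
       .wasm binaries relative to this script URL by default. -->
  <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
</head>
<body>
  <pre id="out">loading model…</pre>
  <script>
    (async () => {
      // "model.onnx" is an assumed file name; serve it next to this page.
      const session = await ort.InferenceSession.create('./model.onnx', {
        executionProviders: ['wasm'], // force in-browser WASM inference
      });
      document.getElementById('out').textContent =
        'model loaded; inputs: ' + session.inputNames.join(', ');
    })().catch((e) => {
      document.getElementById('out').textContent = 'failed: ' + e;
    });
  </script>
</body>
</html>
```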

Citation

@misc{allal2024SmolLM,
      title={SmolLM - blazingly fast and remarkably powerful}, 
      author={Loubna Ben Allal and Anton Lozhkov and Elie Bakouch and Leandro von Werra and Thomas Wolf},
      year={2024},
}