---
library_name: peft
license: llama3.1
base_model:
- meta-llama/Llama-3.1-70B-Instruct
tags:
- Behavior
- HumanBehavior
- BehavioralScience
- FoundationModel
model-index:
- name: Be.FM-70B
  results: []
language:
- en
extra_gated_prompt: >-
  By submitting the access request, you accept the [Be.FM Terms of Use](https://docs.google.com/document/d/10n7ccfUAf89yQhx5u1lF45o70JgsOEYNu8bDxtRKbHA/edit?usp=sharing).
  Please simultaneously submit an access request via the [Google Form](https://forms.gle/DAvxJYReqg7midQn9).
extra_gated_fields:
  I confirm that I have read and accept the Be.FM Terms of Use as linked above: checkbox
  I confirm that I will simultaneously submit an access request via the Google Form as linked above: checkbox
---

## Overview

**Be.FM 70B** is an open foundation model for human behavior modeling. Built on Llama 3.1 70B and fine-tuned on diverse behavioral datasets, it is designed to enhance the understanding and prediction of human decision-making.

**Paper**: [Be.FM: Open Foundation Models for Human Behavior](https://arxiv.org/abs/2505.23058)

---

## Usage

Be.FM 70B is fine-tuned on behavioral data in an Alpaca-style instruction format. For best performance, prompts should pair a structured instruction with the relevant behavioral context (e.g., demographics, survey/experiment setup).

You can use the model with Hugging Face Transformers and PEFT on at least four 40GB+ GPUs with 8-bit quantization. For optimal performance, we recommend four A100-class GPUs with bf16 or fp16 support.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.1-70B-Instruct"
peft_model_id = "befm/Be.FM-70B"

# Load the tokenizer and the 8-bit quantized base model, sharded across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    load_in_8bit=True,
    device_map="auto",
)

# Attach the Be.FM adapter weights to the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
```

---

## Inference

For inference, you can use the following demo function (an end-to-end example follows the citation section below):

```python
import torch

def generate_response(model, tokenizer, system_prompt, user_prompt):
    # Compose the Alpaca-style prompt that Be.FM was fine-tuned on.
    user = f"Instruction: {user_prompt}\n\nResponse:"
    full_prompt = f"{system_prompt}\n\n{user}"
    inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

More examples can be found in the appendix of [our paper](https://arxiv.org/pdf/2505.23058).

---

## Citation, Terms of Use, and Feedback

```bibtex
@article{xie2025fm,
  title={Be.FM: Open Foundation Models for Human Behavior},
  author={Xie, Yutong and Li, Zhuoheng and Wang, Xiyuan and Pan, Yijun and Liu, Qijia and Cui, Xingzhi and Lo, Kuang-Yu and Gao, Ruoyi and Zhang, Xingjian and Huang, Jin and others},
  journal={arXiv preprint arXiv:2505.23058},
  year={2025}
}
```

By using this model, you agree to the [Be.FM Terms of Use](https://docs.google.com/document/d/10n7ccfUAf89yQhx5u1lF45o70JgsOEYNu8bDxtRKbHA/edit?usp=sharing).

We welcome your feedback on model performance as you apply Be.FM to your work. Please share it via the [feedback form](https://forms.gle/M4XJn9ervWzE3ujb9).
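---

## Example

A minimal end-to-end sketch of the prompt format described above. The system and instruction texts are illustrative placeholders (not prompts from the Be.FM paper), and the snippet assumes `model` and `tokenizer` have been loaded as shown in the Usage section and `generate_response` is defined as in the Inference section.

```python
# Hypothetical behavioral context; adapt to your own survey/experiment setup.
system_prompt = (
    "You are simulating a survey respondent with the following profile: "
    "age 34, college-educated, urban resident."
)
user_prompt = (
    "In an ultimatum game with a $10 stake, the proposer offers you $3. "
    "Do you accept or reject the offer? Answer and briefly explain your reasoning."
)

print(generate_response(model, tokenizer, system_prompt, user_prompt))
```

Note that `generate_response` decodes the full generated sequence, so the prompt is echoed at the start of the output; the model's answer follows the final `Response:` marker.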