Pedro Cabral

cabralski

AI & ML interests

AI/ML at Hapvida Notredame Intermédica. I love computer science.

Recent Activity

upvoted an article about 1 month ago

Transformers v5: Simple model definitions powering the AI ecosystem

liked a Space 3 months ago

enzostvs/deepsite

liked a model 8 months ago

madhurjindal/autonlp-Gibberish-Detector-492513457

Organizations

None yet

upvoted an article about 1 month ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

Dec 1, 2025 • 270

liked a Space 3 months ago

DeepSite v3

🐳

16.3k

Generate any application by Vibe Coding

liked a model 8 months ago

madhurjindal/autonlp-Gibberish-Detector-492513457

Text Classification • 67M • Updated May 14, 2025 • 134k • 66

liked a model 9 months ago

ronaldo-lage-pessoa/deberta-healthcare-pt-1024

Updated May 6, 2025 • 3

updated a dataset 9 months ago

cabralski/MedQA-ptBR

Viewer • Updated Apr 29, 2025 • 12.7k • 29

published a dataset 9 months ago

cabralski/MedQA-ptBR

Viewer • Updated Apr 29, 2025 • 12.7k • 29

New activity in cabralski/IndustryOR-PTBR 9 months ago

Improve dataset card: Add paper link, clarify title

#2 opened 9 months ago by nielsr

liked a model 11 months ago

stepfun-ai/GOT-OCR-2.0-hf

Image-Text-to-Text • 0.6B • Updated Jan 31, 2025 • 11k • 222

liked a model 12 months ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27, 2025 • 356k • 12.9k

liked a model about 1 year ago

microsoft/phi-4

Text Generation • 15B • Updated Nov 24, 2025 • 510k • 2.21k

upvoted a paper about 1 year ago

ORLM: Training Large Language Models for Optimization Modeling

Paper • 2405.17743 • Published May 28, 2024 • 3

updated a dataset about 1 year ago

cabralski/IndustryOR-PTBR

Viewer • Updated Apr 7, 2025 • 100 • 61

liked a dataset about 1 year ago

CardinalOperations/IndustryOR

Viewer • Updated Oct 11, 2025 • 100 • 278 • 19

reacted to singhsidhukuldeep's post with 🧠 about 1 year ago

Post

3747

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
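
The entropy-based patching described in the post can be illustrated with a small sketch. This is not BLT's implementation: the post mentions a learned entropy model, while the code below substitutes a toy bigram byte counter, and the function names (`train_bigram_counts`, `next_byte_entropy`, `entropy_patches`) and the threshold rule are assumptions made for illustration only. The idea shown is simply that a new patch begins wherever the predicted next-byte entropy exceeds a threshold, so harder-to-predict regions end up split into smaller patches.

```python
import math
from collections import Counter, defaultdict


def train_bigram_counts(corpus: bytes) -> dict[int, Counter]:
    """Count next-byte occurrences per preceding byte (toy stand-in for a learned entropy model)."""
    counts: dict[int, Counter] = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts


def next_byte_entropy(counts: dict[int, Counter], prev: int) -> float:
    """Shannon entropy, in bits, of the next-byte distribution given the previous byte."""
    dist = counts.get(prev)
    if not dist:
        # Unseen context: assume maximal uncertainty over the 256 byte values.
        return 8.0
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())


def entropy_patches(data: bytes, counts: dict[int, Counter], threshold: float) -> list[bytes]:
    """Split data into patches, starting a new patch where predicted entropy exceeds threshold."""
    if not data:
        return []
    patches = []
    start = 0
    for i in range(1, len(data)):
        # Byte i is "hard to predict" if the entropy given byte i-1 is high.
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches
```

Calling `entropy_patches(text, train_bigram_counts(corpus), threshold)` with a lower threshold yields more, shorter patches; concatenating the returned patches always reproduces the input bytes.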

3 replies

upvoted a paper about 1 year ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

reacted to julien-c's post with 🔥 about 1 year ago

Post

11300

After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team

29 replies
