Ai4Privacy has been working on this for the past year.
Today we're releasing the PII Masking 2M Series, the world's largest open source privacy masking dataset. (Again.)
🔢 2M+ synthetic examples
🌍 32 locales across Europe
🏷️ 98 entity types
🏥💬🏦💼📍 5 industry verticals: Health, Finance, Digital, Work, Location
✅ 1M+ entries freely available on Hugging Face
Every example is 100% synthetic. No real personal data. Built so you can train and evaluate PII detection models without the legal headaches.
Thank you for 15,000,000+ downloads across our datasets, models, and libraries. This one's for you. ❤️
#privacy #ai #opensource #nlp #gdpr #pii #huggingface #machinelearning