SindBERT, the Sailor: Charting the Seas of Turkish NLP
Event: PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM
arXiv:2510.21364v2 (announce type: replace)
Abstract: Transformer models have revolutionized NLP, yet many morphologically rich languages remain underrepresented in large-scale pre-training efforts. With SindBERT, we set out to chart the seas of Turkish NLP, providing the first large-scale RoBERTa-based encoder for Turkish. Trained from scratch on 312 GB of Turkish text (mC4, OSCAR23, Wikipedia), SindBERT is released in both base and large co
ArXiv CS.CL, 2026-06-02