langdetect is a Python module that detects the language of a string. It supports Python 2 and 3.
It supports the following languages:
af, ar, bg, bn, ca, cs, cy, da, de, el, en, es, et, fa, fi, fr, gu, he, hi, hr, hu, id, it, ja, kn, ko, lt, lv, mk, ml, mr, ne, nl, no, pa, pl, pt, ro, ru, sk, sl, so, sq, sv, sw, ta, te, th, tl, tr, uk, ur, vi, zh-cn, zh-tw
Install langdetect:
pip install langdetect
Detect the language:
>>> from langdetect import detect
>>> from langdetect import detect_langs
>>>
>>> detect("Hello World")
'en'
>>> detect("你好 世界")
'zh-cn'
>>> detect_langs("你好 世界")  # candidate languages with probabilities
[zh-cn:0.7142834493631727, ko:0.2857155059179153]
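Short strings often produce uncertain guesses: in the session above, "你好 世界" scores only about 0.71 for zh-cn. The sketch below shows one possible way to accept a result only when it is confident enough. The helper names `pick_language` and `detect_confident`, the 0.9 threshold, and the "unknown" fallback are assumptions made for illustration and are not part of langdetect. The sketch relies on `DetectorFactory.seed`, which fixes the library's otherwise non-deterministic results, and on `LangDetectException`, which is raised when the text has no usable features.

```python
def pick_language(candidates, min_prob=0.9, default="unknown"):
    """Return the most probable language code, or default if it is too uncertain.

    candidates: objects with .lang and .prob attributes, such as the items
    returned by langdetect.detect_langs(). This helper is hypothetical.
    """
    if not candidates:
        return default
    best = max(candidates, key=lambda c: c.prob)
    return best.lang if best.prob >= min_prob else default


def detect_confident(text, min_prob=0.9, default="unknown"):
    """Detect the language of text, falling back to default when unsure."""
    # Imported here so that pick_language works without langdetect installed.
    from langdetect import DetectorFactory, detect_langs
    from langdetect.lang_detect_exception import LangDetectException

    # The algorithm is randomized; a fixed seed makes results reproducible.
    DetectorFactory.seed = 0
    try:
        return pick_language(detect_langs(text), min_prob, default)
    except LangDetectException:
        # Raised for text with no detectable features, e.g. an empty string.
        return default
```

With langdetect installed, `detect_confident("some text")` returns either a language code from the list above or "unknown". The threshold is a trade-off: a higher value rejects more short or mixed-language inputs.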
langdetect source code: https://github.com/Mimino666/langdetect