* 1) Default mode, which attempts to cut the sentence into the most accurate segmentation and is suitable for text analysis;
* 2) Full mode, which scans out every word that can be formed from the sentence and is suitable for search engines.
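
The two modes can be compared side by side with a minimal sketch like the one below. It assumes jieba is installed and relies on the `cut_all` argument of `jieba.cut`; the helper name `segment_both_modes` is hypothetical and used only for illustration.

```python
def segment_both_modes(sentence):
    """Return (default_mode_words, full_mode_words) for one sentence."""
    # Imported lazily so that defining this helper does not require jieba.
    import jieba

    # cut_all=False selects the default (most accurate) mode.
    default_words = list(jieba.cut(sentence, cut_all=False))
    # cut_all=True selects full mode, which emits every word it can find.
    full_words = list(jieba.cut(sentence, cut_all=True))
    return default_words, full_words
```

Calling `segment_both_modes("我来到北京清华大学")` returns two lists of words. The default-mode list partitions the sentence, while the full-mode list may contain overlapping words.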
Usage
========
* Fully automatic installation: `easy_install jieba` or `pip install jieba`
* Semi-automatic installation: download the package from http://pypi.python.org/pypi/jieba/ , extract it, and run `python setup.py install`
Function 2): Add a custom dictionary
========
* Developers can specify their own custom dictionary to supplement jieba's built-in dictionary. jieba can identify new words by itself, but adding your own new words ensures a higher rate of correct segmentation.
* Usage: `jieba.load_userdict(file_name)  # file_name is the path of the custom dictionary`
* The dictionary format is the same as that of `dict.txt` and `analyse/idf.txt`: one word per line; each line has two parts separated by a space, first the word itself and then the word frequency
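
As a rough illustration of the format described above, the sketch below writes a small custom dictionary and hands it to `jieba.load_userdict`. The helper names (`write_userdict`, `read_userdict`, `load_into_jieba`) and the sample entries are hypothetical, not part of jieba; only `jieba.load_userdict` comes from the library.

```python
def write_userdict(path, entries):
    """Write (word, frequency) pairs, one per line, separated by a space."""
    with open(path, "w", encoding="utf-8") as f:
        for word, freq in entries:
            f.write(f"{word} {freq}\n")


def read_userdict(path):
    """Parse the file back into a list of (word, frequency) pairs."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # The frequency is the last space-separated field on the line.
            word, freq = line.rsplit(" ", 1)
            pairs.append((word, int(freq)))
    return pairs


def load_into_jieba(path):
    """Register the custom dictionary with jieba (requires jieba installed)."""
    import jieba  # imported lazily so the helpers above work without it

    jieba.load_userdict(path)
```

A typical flow would be `write_userdict("userdict.txt", [("云计算", 5), ("创新办", 3)])` followed by `load_into_jieba("userdict.txt")`. The file name and the frequencies here are made-up sample values.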