Jieba: Effective Chinese Text Segmentation
├── Introduction
│   └── Overview of Jieba
├── Setting Up the Environment
│   ├── Installing Jieba
│   └── Importing Libraries
├── Core Functionalities of Jieba
│   ├── Tokenization
│   ├── Adding Custom Words
│   └── Keyword Extraction
├── Practical Examples
│   └── Implementing Jieba in Text Processing
└── Conclusion
    └── Applications and Extensions

1. Introduction

Overview of Jieba

2. Setting Up the Environment

Installing Jieba

pip install jieba

Importing Libraries

import jieba

3. Core Functionalities of Jieba

Tokenization

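# Sample sentence: "Jieba is an open-source Python tool for Chinese word segmentation."
text = "結巴斷詞是中文斷詞的Python開源工具。"
# jieba.cut returns a generator, so wrap it in list() to see the tokens
tokens = jieba.cut(text)
print(list(tokens))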

Adding Custom Words

jieba.add_word('結巴斷詞')

Keyword Extraction