ckip_transformers.nlp.driver module
This module implements the CKIP Transformers NLP drivers.
class ckip_transformers.nlp.driver.CkipWordSegmenter(model_name: Optional[str] = 'ckiplab/bert-base-chinese-ws', tokenizer_name: Optional[str] = None)
Bases: ckip_transformers.nlp.util.CkipTokenClassification
The word segmentation driver.
- Parameters
model_name (str, optional, defaults to 'ckiplab/bert-base-chinese-ws') – The pretrained model name.
tokenizer_name (str, optional, defaults to model_name) – The pretrained tokenizer name.
__call__(input_text: List[str], *, max_length: Optional[int] = None) → List[List[str]]
Call the driver.
- Parameters
input_text (List[str]) – The input sentences. Each sentence is a string.
max_length (int, optional) – The maximum length of a sentence. It must not be longer than the maximum sequence length for this model (i.e. tokenizer.model_max_length).
- Returns
List[List[str]] – A list of lists of words (str), one inner list per input sentence.
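The call pattern above can be sketched as follows. This is a minimal illustration that assumes the constructor and `__call__` signatures documented here; the `segment` helper and its optional `driver` argument are hypothetical names introduced only for this example and are not part of the library.

```python
from typing import Callable, List, Optional

# Hypothetical alias: anything callable as driver(sentences, max_length=...).
Driver = Callable[..., List[List[str]]]


def segment(
    sentences: List[str],
    driver: Optional[Driver] = None,
    max_length: Optional[int] = None,
) -> List[List[str]]:
    """Segment each sentence into a list of words."""
    if driver is None:
        # Imported lazily so the helper can be exercised without the package.
        # With no arguments, the driver uses 'ckiplab/bert-base-chinese-ws'.
        from ckip_transformers.nlp.driver import CkipWordSegmenter

        driver = CkipWordSegmenter()
    # max_length is keyword-only in the documented __call__ signature.
    return driver(sentences, max_length=max_length)
```

Calling `segment(["傅達仁今將執行安樂死。"])` should return one list of words per input sentence. Passing a stand-in `driver` makes it possible to check the surrounding code without loading the pretrained model.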
class ckip_transformers.nlp.driver.CkipPosTagger(model_name: Optional[str] = 'ckiplab/bert-base-chinese-pos', tokenizer_name: Optional[str] = None)
Bases: ckip_transformers.nlp.util.CkipTokenClassification
The part-of-speech tagging driver.
- Parameters
model_name (str, optional, defaults to 'ckiplab/bert-base-chinese-pos') – The pretrained model name.
tokenizer_name (str, optional, defaults to model_name) – The pretrained tokenizer name.
__call__(input_text: List[List[str]], *, max_length: Optional[int] = None) → List[List[str]]
Call the driver.
- Parameters
input_text (List[List[str]]) – The input sentences. Each sentence is a list of strings (words).
max_length (int, optional) – The maximum length of a sentence. It must not be longer than the maximum sequence length for this model (i.e. tokenizer.model_max_length).
- Returns
List[List[str]] – A list of lists of POS tags (str), one inner list per input sentence.
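Because the tagger takes pre-segmented sentences, it is typically fed the output of CkipWordSegmenter. The sketch below chains the two drivers and pairs every word with its tag. It assumes the signatures documented here and that the tagger returns exactly one tag per input word; `pair_words_with_tags` and `tag_sentences` are hypothetical helper names, not library functions.

```python
from typing import List, Tuple


def pair_words_with_tags(
    words: List[List[str]], tags: List[List[str]]
) -> List[List[Tuple[str, str]]]:
    """Zip each sentence's words with its POS tags."""
    if len(words) != len(tags):
        raise ValueError("words and tags must cover the same number of sentences")
    paired = []
    for sentence_words, sentence_tags in zip(words, tags):
        # Assumption: the tagger emits one tag per word.
        if len(sentence_words) != len(sentence_tags):
            raise ValueError("each sentence needs exactly one tag per word")
        paired.append(list(zip(sentence_words, sentence_tags)))
    return paired


def tag_sentences(sentences: List[str]) -> List[List[Tuple[str, str]]]:
    """Segment raw sentences, tag the words, and pair the two results."""
    # Imported lazily so pair_words_with_tags can be used without the package.
    from ckip_transformers.nlp.driver import CkipPosTagger, CkipWordSegmenter

    words = CkipWordSegmenter()(sentences)  # List[List[str]]
    tags = CkipPosTagger()(words)  # List[List[str]]
    return pair_words_with_tags(words, tags)
```

Keeping the pairing step separate from the driver calls lets it be checked against hand-written word and tag lists before any model is loaded.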
class ckip_transformers.nlp.driver.CkipNerChunker(model_name: Optional[str] = 'ckiplab/bert-base-chinese-ner', tokenizer_name: Optional[str] = None)
Bases: ckip_transformers.nlp.util.CkipTokenClassification
The named-entity recognition driver.
- Parameters
model_name (str, optional, defaults to 'ckiplab/bert-base-chinese-ner') – The pretrained model name.
tokenizer_name (str, optional, defaults to model_name) – The pretrained tokenizer name.
__call__(input_text: List[str], *, max_length: Optional[int] = None) → List[List[ckip_transformers.nlp.util.NerToken]]
Call the driver.
- Parameters
input_text (List[str]) – The input sentences. Each sentence is a string.
max_length (int, optional) – The maximum length of a sentence. It must not be longer than the maximum sequence length for this model (i.e. tokenizer.model_max_length).
- Returns
List[List[NerToken]] – A list of lists of entities (NerToken), one inner list per input sentence.
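One way to consume the nested result is to group each sentence's entities by label. The sketch below assumes the signatures documented here and, as a further assumption not stated on this page, that each NerToken exposes `word` and `ner` attributes; `group_entities_by_type` and `chunk_sentences` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_entities_by_type(entities: Iterable) -> Dict[str, List[str]]:
    """Group the entity texts of a single sentence by their label."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        # Assumption: NerToken has `.word` (entity text) and `.ner` (label).
        grouped[entity.ner].append(entity.word)
    return dict(grouped)


def chunk_sentences(sentences: List[str]) -> List[Dict[str, List[str]]]:
    """Run the NER driver and group the entities of each sentence."""
    # Imported lazily so the grouping helper can be used without the package.
    from ckip_transformers.nlp.driver import CkipNerChunker

    ner_driver = CkipNerChunker()
    return [group_entities_by_type(ents) for ents in ner_driver(sentences)]
```

Unlike the POS tagger, the chunker takes raw sentence strings, so no prior word segmentation is needed before calling it.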