ckip_transformers.nlp.util module

This module implements the utilities for CKIP Transformers NLP drivers.

class ckip_transformers.nlp.util.CkipTokenClassification(model_name: str, tokenizer_name: Optional[str] = None)[source]

Bases: object

The base class for token classification tasks.

Parameters
  • model_name (str) – The pretrained model name (e.g. 'ckiplab/bert-base-chinese-ws').

  • tokenizer_name (str, optional, defaults to model_name) – The pretrained tokenizer name (e.g. 'bert-base-chinese').

__call__(input_text: Union[List[str], List[List[str]]], *, max_length: Optional[int] = None)[source]

Call the driver.

Parameters
  • input_text (List[str] or List[List[str]]) – The input sentences. Each sentence is either a string or a list of strings.

  • max_length (int) – The maximum length of the sentence. It must not be longer than the maximum sequence length for this model (i.e. tokenizer.model_max_length).
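
The two accepted shapes of input_text and the max_length constraint can be sketched as follows. This is only an illustration under stated assumptions: check_max_length is a hypothetical caller-side helper, not part of ckip_transformers, the value 512 stands in for tokenizer.model_max_length, and the list-of-strings form is assumed to hold already-segmented words.

```python
from typing import List, Optional, Union

# The two input shapes named in the signature of __call__.
InputText = Union[List[str], List[List[str]]]


def check_max_length(max_length: Optional[int], model_max_length: int) -> None:
    """Hypothetical guard (not a library function): reject a max_length
    that exceeds the model's maximum sequence length."""
    if max_length is not None and max_length > model_max_length:
        raise ValueError(
            f"max_length={max_length} exceeds model_max_length={model_max_length}"
        )


# Shape 1: each sentence is a plain string.
raw_sentences: InputText = ["傅達仁今將執行安樂死。"]

# Shape 2: each sentence is a list of strings (assumed here to be words).
pre_split: InputText = [["傅達仁", "今", "將", "執行", "安樂死", "。"]]

ASSUMED_MODEL_MAX_LENGTH = 512  # assumption standing in for tokenizer.model_max_length

check_max_length(None, ASSUMED_MODEL_MAX_LENGTH)  # None is accepted by the guard
check_max_length(128, ASSUMED_MODEL_MAX_LENGTH)   # within the limit
```

With a real driver, the same values would be passed as driver(raw_sentences, max_length=128); max_length is keyword-only because of the * in the signature.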

class ckip_transformers.nlp.util.NerToken(word: str, ner: str, idx: Tuple[int, int])[source]

Bases: tuple

A named-entity recognition token.

property word

str, the token word.

property ner

str, the NER-tag.

property idx

Tuple[int, int], the starting and ending indices of the token in the sentence.
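
A minimal sketch of how the three fields might be used together. To stay self-contained it defines a local stand-in with the same field names instead of importing the library class, and it assumes that idx is a half-open (start, end) pair of character offsets; the documentation above does not state whether the end index is inclusive.

```python
from typing import NamedTuple, Tuple


class NerToken(NamedTuple):
    """Local stand-in mirroring the documented fields (not the library class)."""
    word: str
    ner: str
    idx: Tuple[int, int]


sentence = "傅達仁今將執行安樂死。"
token = NerToken(word="傅達仁", ner="PERSON", idx=(0, 3))

# Fields are available by name ...
start, end = token.idx

# ... and, because the class is a tuple, by position or unpacking.
word, ner, idx = token

# Assumption: the end index is exclusive, so slicing recovers the word.
span = sentence[start:end]
```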

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

static __new__(_cls, word: str, ner: str, idx: Tuple[int, int])

Create a new instance of NerToken(word, ner, idx).

__repr__()

Return a nicely formatted representation string.
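
The tuple-related methods above behave like those of any named tuple. The sketch below uses a local stand-in with the same fields (an assumption made so the example runs without the library) to show __getnewargs__, copying, and __repr__.

```python
import copy
from typing import NamedTuple, Tuple


class NerToken(NamedTuple):
    """Local stand-in mirroring the documented fields (not the library class)."""
    word: str
    ner: str
    idx: Tuple[int, int]


token = NerToken("傅達仁", "PERSON", (0, 3))

# __getnewargs__ returns the contents as a plain tuple.
args = token.__getnewargs__()

# copy relies on those arguments to rebuild an equal NerToken.
clone = copy.copy(token)

# __repr__ shows the class name and each field by keyword.
text = repr(token)
print(text)  # NerToken(word='傅達仁', ner='PERSON', idx=(0, 3))
```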