o
�J�h�� � @ s6 d Z ddlZddlZddlZddlZddlmZ ddlmZm Z m
Z
mZmZm
Z
mZ ddlmZmZmZmZmZmZmZmZmZmZmZmZmZ ddlmZmZm Z m!Z! e!�"e#�Z$dZ%d Z&d
Z'G dd� d�Z(G d
d� de(�Z)dd� Z*dd� Z+dd� Z,dd� Z-dd� Z.de
e/ de/fdd�Z0e e�G dd� de��Z1dS )z�
Tokenization classes for Python tokenizers. For fast tokenizers (provided by HuggingFace's tokenizers library), see
tokenization_utils_fast.py
� N)�OrderedDict)�Any�Dict�List�Optional�Tuple�Union�overload� )
�ENCODE_KWARGS_DOCSTRING�'ENCODE_PLUS_ADDITIONAL_KWARGS_DOCSTRING�INIT_TOKENIZER_DOCSTRING�
AddedToken�
BatchEncoding�EncodedInput�EncodedInputPair�PreTokenizedInput�PreTokenizedInputPair�PreTrainedTokenizerBase� TextInput�
TextInputPair�TruncationStrategy)�PaddingStrategy�
TensorType�add_end_docstrings�loggingzspecial_tokens_map.jsonzadded_tokens.jsonztokenizer_config.jsonc @ sL e Zd ZdZdd� Zdd� Zdefdd�Zd ed
ee fdd�Z d
d� Z
dS )�Triez�
Trie in Python. Creates a Trie out of a list of words. The trie is used to split on `added_tokens` in one pass.
Loose reference: https://en.wikipedia.org/wiki/Trie
c G s"