Module tokenization
This module allows you to tokenize dictionaries for better results.
Functions
def build_2D_substitute_matrix(dictionary, alphabet, substitute_dict)
build_2D_substitute_matrix() initializes and fills a two-dimensional matrix (a dict of dicts) by browsing the dictionary.
 - dictionary (list): the input dictionary (after processing)
 - alphabet (list): the alphabet in use (from the input file or from the dictionary)
 - substitute_dict (dict): the substituted characters, indexed by their single substitution character
 - return (dict): the matrix representing the probability of one letter following another
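
As an illustration of the dict-of-dicts layout described above, here is a minimal sketch, not the module's actual code. The name sketch_build_matrix is hypothetical. It assumes the matrix stores raw counts of adjacent pairs (the real function may normalize these to probabilities) and that the keys of substitute_dict are treated as extra symbols.

```python
def sketch_build_matrix(dictionary, alphabet, substitute_dict):
    """Count how often each symbol is immediately followed by another."""
    # Assumption: substitution characters get their own rows and columns.
    symbols = list(alphabet) + [c for c in substitute_dict if c not in alphabet]
    # One inner dict per symbol, so a cell is reached with matrix[first][second].
    matrix = {a: {b: 0 for b in symbols} for a in symbols}
    for word in dictionary:
        # zip(word, word[1:]) yields every pair of adjacent characters.
        for first, second in zip(word, word[1:]):
            if first in matrix and second in matrix[first]:
                matrix[first][second] += 1
    return matrix


matrix = sketch_build_matrix(["abba", "abc"], ["a", "b", "c"], {})
print(matrix["a"]["b"])  # 2: "ab" occurs once in each word
```

Characters outside the symbol set are skipped here rather than raising an error; that choice is also an assumption.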
 
def check_tokenizable(dictionary)
check_tokenizable() checks whether the dictionary contains any word with a digit or an uppercase character.
 - dictionary (list): the input dictionary (after processing)
 - return (bool): False if any word contains a digit or an uppercase character, True otherwise
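
The check described above could be sketched as follows. This is an illustrative stand-in under the stated behavior, not the module's implementation, and the name sketch_check_tokenizable is hypothetical.

```python
def sketch_check_tokenizable(dictionary):
    """Return False if any word contains a digit or an uppercase character."""
    for word in dictionary:
        # str.isdigit() and str.isupper() are applied to one character at a time.
        if any(ch.isdigit() or ch.isupper() for ch in word):
            return False
    return True


ok = sketch_check_tokenizable(["hello", "world"])    # True
bad = sketch_check_tokenizable(["hello", "World1"])  # False
```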
 
def find_max(matrix, alphabet)
find_max() finds the most frequent character sequence.
 - matrix (dict): the matrix representing the probability of one letter following another
 - alphabet (list): the alphabet in use (from the input file or from the dictionary)
 - return (tuple): the most frequent sequence of consecutive characters
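
One way to picture the search is a scan over every cell of the matrix. The sketch below is hypothetical (sketch_find_max is not part of the module) and assumes the returned tuple is the pair (first, second) with the highest value, with ties resolved in favor of the pair encountered first.

```python
def sketch_find_max(matrix, alphabet):
    """Return the (first, second) pair with the highest value in the matrix."""
    best_pair = None
    best_value = -1
    for first in alphabet:
        for second in alphabet:
            value = matrix[first][second]
            # Strict comparison keeps the earliest pair when values are equal.
            if value > best_value:
                best_pair, best_value = (first, second), value
    return best_pair


example = {"a": {"a": 0, "b": 3}, "b": {"a": 1, "b": 0}}
pair = sketch_find_max(example, ["a", "b"])  # ("a", "b")
```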
 
def plot_2D_matrix(matrix, alphabet, filename)
plot_2D_matrix() plots the matrix as a diagram using matplotlib.
 - matrix (dict): the matrix representing the probability of one letter following another
 - alphabet (list): the alphabet in use (from the input file or from the dictionary)
 - filename (str): the name of the file to write the plot to
 - return (None)
 
 
def print_2D_matrix(matrix, alphabet)
print_2D_matrix() prints the matrix row by row.
 - matrix (dict): the matrix representing the probability of one letter following another
 - alphabet (list): the alphabet in use (from the input file or from the dictionary)
 - return (None)
 
 
def reverse_substitution(word, substitute_dict)
reverse_substitution() decodes a word from its substituted form back to a human-readable one.
 - word (str): the word to decode
 - substitute_dict (dict): the substituted characters, indexed by their single substitution character
 - return (str): the decoded word
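
Decoding can be pictured as expanding each substitution character back into the sequence it stands for. The sketch below is an assumption-laden illustration, not the module's code: sketch_reverse_substitution is a hypothetical name, substitute_dict is assumed to map one substitution character to a string (for example {"1": "th"}), and replacements are assumed to be free of cycles.

```python
def sketch_reverse_substitution(word, substitute_dict):
    """Expand substitution characters until only plain characters remain."""
    decoded = word
    # Loop because a replacement may itself contain substitution characters.
    # Assumption: no substitution expands, directly or indirectly, to itself.
    while any(ch in substitute_dict for ch in decoded):
        decoded = "".join(substitute_dict.get(ch, ch) for ch in decoded)
    return decoded


subs = {"1": "th", "2": "1e"}  # hypothetical mapping; "2" expands to "1e", then "the"
decoded = sketch_reverse_substitution("2n", subs)  # "then"
```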
 
 
def write_substitute_dictionary(dictionary, substitute_dict, filename)
write_substitute_dictionary() writes the dictionary to a file with substitutions.
 - dictionary (list): the input dictionary (after processing)
 - substitute_dict (dict): the substituted characters, indexed by their single substitution character
 - filename (str): the name of the file to open (write mode)
 - return (None)
 