Create a dictionary using Binary Search Tree in python

Am C
4 min readDec 3, 2022

Here we are implementing a dictionary. Each entity in the dictionary consists of a word and its meaning.
A file with a list of words and its meaning is given as input.

In order to improve the time complexity, a Binary Search Tree is used for the dictionary implementation. To learn more about binary search trees, visit this link.

The aim is to implement the dictionary using Binary Search Tree.

The data for the dictionary is read from the input file and is inserted into the Binary Search Tree (BST).

We can implement this by creating functions for each task mentioned below.

(Here I have explained only the important functions and classes and how it’s implemented.

The code for the project is available at: https://gitlab.com/amalragc/python-dictionary)

1. Create a dictionary by reading the data from insert_file.txt , one by one, and inserting them into the BST

sample insert_file.txt given below.

cat / small domesticated carnivorous mammal from cat family
camel / long-necked ungulate mammal
air / the invisible gaseous substance
sky / region of the atmosphere
dog / a domesticated carnivorous mammal
cell / smallest structural and functional unit of an organism

We can create a class called FileOps for implementing all the functions related to file operations. The first function we are creating under FileOps is, read_from_file. This function is used to read the contents of a file and return those contents.

class FileOps:
def read_from_file(filepath):
file_contents = None
with open(filepath) as file:
file_contents = file.readlines()
file.close()
return file_contents

Next, we need to create a Binary Search Tree for storing these words and meanings that we read from the file. For all the operations related to the Binary tree, we are creating a class called BinaryTreeOps.

class BinaryTreeOps:
def constructBST(keys):
root = None
for key in keys:
root = BinaryTreeOps.insert(root, key)
return root
def insert(root, key):
curr = root
parent = None
if root is None:
return Node(key)
while curr:
parent = curr
if key < curr.data:
curr = curr.left
else:
curr = curr.right
if key < parent.data:
parent.left = Node(key)
else:
parent.right = Node(key)
return root

2. For searching, we need to read the search word from seacrh_file.txt with the tag “SearchWord:”

sample seacrh_file.txt given below.

SearchWord: cat
SearchWord: sky
SearchWord: shake
SubString: ca
SearchWord: aana
SubString: at

Here, for reading the contents of seacrh_file.txt,we can use the FileOps.read_from_file function which we defined earlier.

For isolating each search word and substrings to be searched, we can define a new function in the class FileOps called process_search_prompts_file. This will isolate the word, and perform a search on that word. The search function is mentioned in the next point.

def process_search_prompts_file(file_contents, root):
for i in file_contents:
if(i.__contains__("SearchWord")):
word = FileOps.isolate_search_word(i)
r = BinaryTreeOps.search(root, word)
search_word_results.append(r)
if(i.__contains__("SubString:")):
word = FileOps.isolate_search_word(i)
substring_prompts.append(word)
def isolate_search_word(search_input):
search_input = search_input.replace(" ", "")
prompt = search_input.split(":")
word = prompt[1].replace(" ", "")
word = prompt[1].replace("\n", "")
return word

3. For searching, the search function should read each word and check whether the word is present in the BST dictionary or not.

For this, we can define a new function called search under the BinaryTreeOps class.

def search(root, key):
while root != None:
prompt = root.data.split("/")
word = prompt[0]
# if key > root.data:
if key > word:
root = root.right
# elif key < root.data:
elif key < word:
root = root.left
else:
meaning = root.data.split("/")
meaning = meaning[1]
result_word_and_meaning = "{} - {}".format(key, meaning)
return result_word_and_meaning
result_not_found = "{} - not found".format(key)
return result_not_found

If yes it should add the word and its meaning to output_file.txt. If not it should show the word name “not found” in place of the description.

For writing the results to file, we can define a new function called write_to_file under the class FileOps.

def write_to_file(filename, data):
with open(filename, 'a') as file:
for line in data:
file.write(line)
file.close()

4. Substring search: For a given sub-string, list all the words with the substrings in it.

The program should read the input from the file seacrh_file.txt (sample mentioned above) with the tag “SubString:” and list all the words starting with the given substring. The result should be stored in a file output_file.txt.

sample input: SubString: ca

For searching substrings present, we can define a new function called search_substring under the class FileOps. This will return all the words which contain the substring.

def search_substring(value: str):
words_containing_substring = []
for i in words_in_dictionary:
if i.__contains__(value):
words_containing_substring.append(i)
return words_containing_substring

Here I have explained only the important functions and classes and how it’s implemented.

The code for the project is available at: https://gitlab.com/amalragc/python-dictionary

Run the below command to execute the project.

python3 python_dictionary.py

--

--

Am C

hola. This is mostly just self talk. ig: @_amc.__