Friday, October 23, 2009

20.6 Trees




I l@ve RuBoard










20.6
Trees



Suppose we want to create an alphabetized list of
the words that appear in a file. We could use a linked list, but
searching a linked list is slow because we must check each element
until we find the correct insertion point. By using a data type
called a tree, we can reduce the number of
comparisons tremendously. A binary tree
structure
looks like Figure 20-12.

Each box is
called a node of the tree.


The box at the
top is the root and the boxes at the bottom are
the leaves.[1] Each node contains two pointers:
a left pointer and a right pointer, which point to the left and right
subtrees.


[1] Programming trees
are written with the root at the top and the leaves at the bottom.
Common sense tells you that this is upside down. In case you
haven't noticed, common sense has very little to do
with programming.




Figure 20-12. Tree search



The structure for a tree is:



class tree {
private:
class node {
public:
string data; // Word for this tree
private:
node *right; // Tree to the right
node *left; // Tree to the left
friend class tree;
};
public:
node *root; // Top of the tree (the root)

tree( ) {root = NULL;};
// ... Other member function
};


Trees are often used for storing a
symbol table (a list of variables used in a program). In this chapter
we will use a tree to store a list of words and then to print the
list alphabetically. The advantage of a tree over a linked list is
that searching a tree takes considerably less time.



In this example, each node stores a single word. The left subtree
stores all the words less than the current word, and the right
subtree stores all the words greater than the current word.



For example, Figure 20-13 shows how we descend the
tree to look for the word "orange."
We would start at the root,
"lemon." Because
"orange" >
"lemon," we would descend the right
link and go to "pear." Because
"orange" <
"pear," we descend the left link,
where we find "orange."




Figure 20-13. Tree



Recursion is extremely useful with trees. Our rules for recursion are
1) the function must make things simpler and 2) there must be some
endpoint.




The algorithm for inserting a word in a tree is:




  1. If this is a null tree (or subtree), create a one-node tree with this
    word.


  2. If this node contains the word, do nothing.


  3. Otherwise, enter the word in the left or right subtree, depending on
    the value of the word.



Does this algorithm satisfy our recursion rules? The function has two
definite endpoints:




  • A match is found.


  • We have a null node.



Otherwise, we enter the word into a subtree (which is simpler than
the whole tree).



To see how this works, consider what happens when we insert the word
"fig" into the tree. First we check
the word "fig" against
"lemon."
"Fig" is smaller, so we go to
"apple." Because
"fig" is bigger, we go to
"grape." Because
"fig" is smaller than
"grape," we try the left link. It
is NULL, so we create a new node.



The function to enter a value into a tree is:



void tree::enter_one(node *&tree_node, const string& word) 
{
int result; // Result of strcmp

// See if we have reached the end
if (tree_node == NULL) {
tree_node = new node;

tree_node->left = NULL;
tree_node->right = NULL;
tree_node->word = word;
}
if (tree_node->data == word)
return;

if (tree_node->data < word)
enter_one(tree_node->right, word);
else
enter_one(tree_node->left, word);
}


The function to start this process is:



void tree::enter(char *word) {
enter_one(root, word);
};


This function passes a pointer to the root of the tree to
enter_one. If the root is NULL,
enter_one creates the node. Because we are changing the
value of a pointer, we must pass a reference to the pointer.









    I l@ve RuBoard



    No comments: