TreeParser

TreeParser is a standalone Python program that helps teach grammatical Phrase Structure Rules (PSRs) and how they relate to sentence structure.

Linguists can also use the tool to produce grammatical trees for inclusion in a linguistics paper. The colours of nodes and lines may be set according to preference.

The visual tool allows the user to build up grammar PSRs and apply them to anything from a single phrase to a complete sentence. It supports several languages, each in a separate tab, with a large number of example phrases or clauses for each language. For right-to-left languages, the PSRs, the sentences and the tree structures are all displayed right to left.

Getting Started

Download and unzip TreeParser. TreeParser may be downloaded from here.

Start with a simple noun phrase NP. Sentences or phrases are entered as an interlinear display with tabs between each word. Below each word is its grammatical category, such as N for a noun, A for an adjective, AP for an adjective phrase, NP for a noun phrase, D for a determiner and so on. Grammatical categories are flexible and may differ depending on the grammatical model used. So the simple phrase "the little dog" may be typed in as:

the	little	dog
D	A	N
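As a rough sketch of what this interlinear format amounts to, the following Python snippet reads two tab-separated lines (words first, categories below) into word/category pairs. The function name read_interlinear and the strict two-line layout are assumptions made for illustration; this is not TreeParser's own input code.

```python
def read_interlinear(text):
    """Pair each word with the category typed below it.

    Assumes exactly two lines: tab-separated words, then tab-separated
    categories (an illustrative layout, not TreeParser's actual format).
    """
    word_line, category_line = text.strip("\n").split("\n")
    words = word_line.split("\t")
    categories = category_line.split("\t")
    if len(words) != len(categories):
        raise ValueError("each word needs exactly one category below it")
    return list(zip(words, categories))


pairs = read_interlinear("the\tlittle\tdog\nD\tA\tN")
print(pairs)
```

Raising an error when the two lines differ in length is a design choice of this sketch: a missing tab is easy to overlook and would otherwise shift every later category onto the wrong word.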

Now we enter a simple PSR for an English NP:

NP <- (D) (A)* N

The round brackets indicate that the element is optional, since "dog" on its own is a valid NP. The asterisk * indicates that the element may be repeated, because we know that "the hungry little brown dog" with three adjectives is also a valid NP in English. You will notice that the program immediately parses the phrase based on the PSR and produces a small tree diagram.
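One way to think about the optional and repeat notation is as a regular expression over category labels. The sketch below translates the right-hand side of a PSR into a Python regular expression and checks whether a sequence of categories fits it. The function names psr_to_regex and psr_matches are hypothetical, and this is only an illustration of the notation, not a description of how TreeParser parses internally.

```python
import re


def psr_to_regex(rhs):
    """Translate a PSR right-hand side such as '(D) (A)* N' into a regex.

    Each category label is followed by one space in the text being
    matched, so labels such as A and AP cannot be confused.
    """
    parts = []
    for element in rhs.split():
        m = re.fullmatch(r"\((.+)\)(\*?)", element)
        if m:
            # (X) is optional; (X)* may occur zero or more times.
            quantifier = "*" if m.group(2) else "?"
            parts.append(f"(?:{re.escape(m.group(1))} ){quantifier}")
        else:
            # A bare label is obligatory and occurs exactly once.
            parts.append(f"{re.escape(element)} ")
    return re.compile("".join(parts))


def psr_matches(rhs, categories):
    """Return True if the whole category sequence fits the PSR."""
    text = "".join(category + " " for category in categories)
    return psr_to_regex(rhs).fullmatch(text) is not None


print(psr_matches("(D) (A)* N", ["N"]))                      # "dog"
print(psr_matches("(D) (A)* N", ["D", "A", "A", "A", "N"]))  # "the hungry little brown dog"
print(psr_matches("(D) (A)* N", ["A", "D", "N"]))            # wrong order
```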

Now, we realize that this isn't quite correct. An adjective cannot be inserted directly into an NP but is itself the head of an adjective phrase AP. In the AP "very little" we see that it can also have a degree word Deg.

So to handle this correctly, we must enter a simple PSR for an English AP:

AP <- (Deg) A

and the NP rule must be modified to accommodate the AP:

NP <- (D) (AP)* N

This will now produce:

Simple corrected English noun phrase display
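To make the interaction of the AP and NP rules concrete, here is a minimal bottom-up sketch in Python. It applies the AP rule first and then the NP rule, each time replacing a matching run of nodes with a single parent node. The names rhs_to_regex and reduce_with_rule, the tuple-based tree and the fixed rule order are all assumptions for illustration; TreeParser's actual parsing strategy is not described here and may differ.

```python
import re


def rhs_to_regex(rhs):
    """Turn a PSR right-hand side into a regex over 'LABEL ' tokens."""
    parts = []
    for element in rhs.split():
        m = re.fullmatch(r"\((.+)\)(\*?)", element)
        if m:
            quantifier = "*" if m.group(2) else "?"
            parts.append(f"(?:{re.escape(m.group(1))} ){quantifier}")
        else:
            parts.append(f"{re.escape(element)} ")
    # The lookbehind forces a match to start at the beginning of a label.
    return re.compile(r"(?<![^ ])" + "".join(parts))


def reduce_with_rule(nodes, parent, rhs):
    """Replace each run of nodes matching rhs with one parent node.

    A node is a (label, content) tuple: content is the word for a
    terminal node and a list of child nodes otherwise.
    """
    text = "".join(label + " " for label, _ in nodes)
    # Map the character offset of each label to its index in nodes.
    index_at, offset = {}, 0
    for i, (label, _) in enumerate(nodes):
        index_at[offset] = i
        offset += len(label) + 1
    index_at[offset] = len(nodes)

    result, done = [], 0
    for m in rhs_to_regex(rhs).finditer(text):
        first, last = index_at[m.start()], index_at[m.end()]
        result.extend(nodes[done:first])
        result.append((parent, nodes[first:last]))
        done = last
    result.extend(nodes[done:])
    return result


nodes = [("D", "the"), ("Deg", "very"), ("A", "little"), ("N", "dog")]
nodes = reduce_with_rule(nodes, "AP", "(Deg) A")
nodes = reduce_with_rule(nodes, "NP", "(D) (AP)* N")
print(nodes)
```

Applying the rules from the smallest phrase upward keeps the sketch short; a general parser would need to decide the rule order itself and retry rules until nothing more can be reduced.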

The User Interface

Control Bar

The control bar consists of seven items.

The "+" button adds a new example sentence or phrase and the "-" button deletes the current one.
The "<-" button moves to the previous example sentence and the "->" goes to the next one.
The "Copy" and "RTF" buttons copy the tree diagram as either a JPG image or RTF object to the clipboard so that it may be pasted into other programs (see below).
The "Parsed" checkbox toggles display of the Parsed Sentence field (see below). The Sentence and Parsed Sentence fields may also be shown and hidden by dragging the splitter bar between them up and down.

Formatting the Tree Diagram

The font and text size of each of the terminal nodes may be changed by selecting a font and size from the formatting toolbar.

The overall orientation may be changed from left-to-right to right-to-left by checking the RTL checkbox.

Colours of each type of node or the lines may be changed by clicking one of the "Terminal Nodes", "Other Nodes" or "Lines" buttons and selecting a new colour.

Using the Tree Diagram in other programs

The tree diagram may be copied to the clipboard as either a JPG image or an RTF object, which may then be pasted into another program such as MS Word. To do this, press either the Copy or the RTF button on the Control Bar. Pasting the tree diagram as an RTF object has the advantage that the object can still be edited in the other program.

PSR Advanced Features

There are a few additional things to say about PSRs in TreeParser.

Alternation

The | (or) operator allows for one category or another, so specifying D|Q would allow either a determiner or a quantifier to occur at that place in the PSR. Alternation may be combined with optionality, so that

(WH|S'|PP)

optionally allows one of a WH question word, an embedded clause S' or a prepositional phrase PP.
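The alternation notation can be illustrated the same way as the optional and repeat brackets, by turning a single PSR element into a regular expression. In the sketch below the function names element_to_regex and element_accepts are hypothetical, and the approach is only an assumption about how such an element could be interpreted, not TreeParser's implementation.

```python
import re


def element_to_regex(element):
    """Translate one PSR element, e.g. "D|Q" or "(WH|S'|PP)", to a regex."""
    m = re.fullmatch(r"\((.+)\)(\*?)", element)
    body = m.group(1) if m else element
    # Each alternative is a category label; any one of them may appear.
    alternatives = "|".join(re.escape(label) for label in body.split("|"))
    unit = f"(?:(?:{alternatives}) )"
    if m is None:
        return unit  # obligatory: exactly one of the alternatives
    return unit + ("*" if m.group(2) else "?")


def element_accepts(element, categories):
    """Return True if the category sequence is allowed by the element."""
    text = "".join(category + " " for category in categories)
    return re.fullmatch(element_to_regex(element), text) is not None


print(element_accepts("D|Q", ["Q"]))             # a quantifier is allowed
print(element_accepts("(WH|S'|PP)", []))         # the element may be absent
print(element_accepts("(WH|S'|PP)", ["S'"]))     # an embedded clause
print(element_accepts("(WH|S'|PP)", ["WH", "PP"]))  # only one is allowed
```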

Grammatical Relations

Grammatical relations are represented in square brackets, as in this PSR for an English clause:

S <- NP[Su] VP (S')

The NP will still match any NP in the sentence but will be displayed as NP[Su] in the tree diagram.

Features

Other features may also be represented in square brackets. For example, to indicate that a PP is possessive, one might write it in a PSR as PP[Poss]:

NP <- (D|QP) (AP)* N (PP[Poss])
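The square-bracket notation for both grammatical relations and features separates what is matched from what is displayed: the category before the bracket is matched, and the full label appears in the tree. A minimal Python sketch of that split follows. The function names split_label and display_label are hypothetical and the code is an illustration under that reading of the notation, not TreeParser's own code.

```python
import re


def split_label(element):
    """Split 'NP[Su]' into ('NP', 'Su'); a plain 'VP' gives ('VP', None)."""
    m = re.fullmatch(r"([^\[\]]+)(?:\[([^\[\]]+)\])?", element)
    if m is None:
        raise ValueError(f"malformed PSR element: {element!r}")
    return m.group(1), m.group(2)


def display_label(element, category):
    """Return the label to show in the tree, or None if there is no match.

    Only the part before the bracket takes part in matching; the
    bracketed relation or feature is carried along for display.
    """
    base, _annotation = split_label(element)
    return element if base == category else None


print(split_label("NP[Su]"))
print(display_label("NP[Su]", "NP"))
print(display_label("PP[Poss]", "PP"))
```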

Conjunctions

Handling conjunctions requires breaking the rules governing the creation of PSRs, specifically that a PSR must have only one head and that head may not be a phrasal category. In the case of a simple conjunction such as the word "and" in the English NP "John and Bob", the PSR must actually consist of:

NP <- NP C NP

This is totally non-conventional. Likewise, for conjoining other constituents, there must be other similar PSRs for S, VP, AP, PP, AdvP, etc. TreeParser will handle these types of PSRs, but I feel it is a bit of a hack. If anyone has a better way of handling them, I would love to hear it.
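To show what such a self-referring rule does in practice, the following Python sketch repeatedly collapses a "phrase conjunction phrase" sequence into a single phrase node of the same category. The function name conjoin, the tuple-based nodes and the left-to-right strategy are assumptions for illustration only; they are not taken from TreeParser.

```python
def label_of(node):
    """A node is either a bare label or a (label, children) tuple."""
    return node if isinstance(node, str) else node[0]


def conjoin(nodes, phrase="NP", conj="C"):
    """Apply 'phrase <- phrase conj phrase' until it no longer matches."""
    nodes = list(nodes)
    i = 0
    while i + 2 < len(nodes):
        window = nodes[i:i + 3]
        if [label_of(n) for n in window] == [phrase, conj, phrase]:
            # Collapse the three nodes into one and stay at the same
            # position, so the new node can itself be conjoined again.
            nodes[i:i + 3] = [(phrase, window)]
        else:
            i += 1
    return nodes


# "John and Bob": both names have already been reduced to NP.
print(conjoin(["NP", "C", "NP"]))
# "John and Bob and Mary" nests to the left in this sketch.
print(conjoin(["NP", "C", "NP", "C", "NP"]))
```

The same function covers the other categories by passing a different phrase label, for example conjoin(nodes, phrase="VP").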

Parsed Sentence

Checking the "Parsed" checkbox on the Control Bar shows the Parsed Sentence field at the bottom of the display. This field stores the result of parsing the current sentence before the tree diagram is built. It is possible to edit this field and press the "Build" button to have TreeParser build the modified tree diagram. The parsed sentence may also be copied and pasted into the LingTree application in order to benefit from its formatting ability.

Currently TreeParser does not use the slash and backslash codes in its parsed sentence as can be done in LingTree. The expectation is that nodes that have no children are terminal nodes and will be coloured accordingly. Adding these codes may be considered in further development to increase the compatibility between these two applications.
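The rule that a node without children counts as a terminal node can be sketched in a few lines of Python. The Node class, the colour_for function and the colour names below are hypothetical and chosen only for illustration; they do not reflect TreeParser's internal data structures or default colours.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_terminal(self):
        # No slash or backslash codes are consulted: a node is
        # terminal simply because it has no children.
        return not self.children


def colour_for(node, terminal_colour="blue", other_colour="black"):
    """Pick a colour from the node's position in the tree alone."""
    return terminal_colour if node.is_terminal else other_colour


dog = Node("dog")
noun = Node("N", [dog])
print(colour_for(dog), colour_for(noun))
```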