
ItsdbTreebanking_ItsdbModeling


Training a Stochastic Model

If you have treebanked a profile and have Rob Malouf's maxent package installed (in particular the program estimate), then you can train a scoring model that pet (PetTop) can use.

Select the treebanked profile (left-click), or several profiles (click their radio buttons), and then select Trees | Train from the menus. It will prompt you for a filename to store the scoring model in. The tradition is to use a name like corpus-version.mem. You should have the grammar used for treebanking loaded into the LKB (LkbTop). Training is normally fairly fast.
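What gets written to that file is a maximum-entropy (log-linear) model: a set of feature weights estimated from the annotators' decisions, which the parser then uses to rank competing analyses. The following Python sketch is purely illustrative of that ranking step; the feature names, counts, and weights are invented, whereas real models use grammar-internal features extracted from the derivations.

```python
# Hypothetical feature weights, standing in for what training learns from the
# treebanked preferences (real .mem files use grammar-internal features).
weights = {
    "subj-head": 0.8,
    "head-comp": 0.3,
    "noun-noun-compound": -0.5,
}

def score(features):
    """Log-linear score of one candidate tree: sum of weight * feature count."""
    return sum(weights.get(f, 0.0) * count for f, count in features.items())

# Two candidate analyses of the same input, described by their feature counts.
candidates = {
    "tree-1": {"subj-head": 1, "head-comp": 2},
    "tree-2": {"noun-noun-compound": 1, "head-comp": 1},
}

# Rank candidates by score, highest first -- the same kind of ranking the
# parser reports once the model is loaded.
ranking = sorted(candidates, key=lambda t: score(candidates[t]), reverse=True)
print(ranking)  # ['tree-1', 'tree-2']
```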

The scoring model is referenced in cheap's grammar.set:

;;;
;;; scoring mechanism (fairly embryonic, for now)
;;;
sm := "hinoki-050417.mem".

Scoring

You can compare the ranking of a given profile with a treebanked gold standard (assuming the same test-suite and grammar). The ranking can be changed by changing the scoring model in the parser.

To compare: select the gold standard (middle click), then the profile to be scored as the current database (left click); make sure the current version of the grammar is loaded into the LKB.

Set Trees | Switches | Implicit Ranks and Trees | Switches | Result Equivalence, and then select Trees | Score.

 ;; score results in .data. against ground truth in .gold.  operates in
 ;; several slightly distinct modes: (i) using the implicit parse ranking in
 ;; the order of `results' or (ii) using an explicit ranking from the `score'
 ;; relation.  an orthogonal dimension of variation is (a) scoring by result
 ;; identifier (e.g. within the same profile or against one that is comprised
 ;; of identical results) vs. (b) scoring by derivation equivalence (e.g.
 ;; when comparing best-first parser output against a gold standard).
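As a conceptual illustration of (a) versus (b) above, here is a minimal sketch with invented result identifiers and derivation strings; it is not the [incr tsdb()] implementation, only the distinction it describes.

```python
# Gold standard: the result the annotator accepted for one item, given both
# by its result identifier and by its derivation (tree) string.
gold = {"result-id": 3, "derivation": "(S (NP kim) (VP sleeps))"}

# Ranked output from the profile being scored, best first.
ranked = [
    {"result-id": 7, "derivation": "(S (NP kim) (VP sleeps))"},
    {"result-id": 3, "derivation": "(NP (NP kim) (N sleeps))"},
]

# (a) scoring by result identifier: only meaningful when both profiles contain
#     literally the same results (same parser run, same numbering).
top_by_id = ranked[0]["result-id"] == gold["result-id"]            # False here

# (b) scoring by derivation equivalence: compare the trees themselves, e.g.
#     when a best-first parser produced its own result numbering.
top_by_deriv = ranked[0]["derivation"] == gold["derivation"]       # True here

print(top_by_id, top_by_deriv)
```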

To make the scoring faster, you should first do a thinning normalize on the gold profile used for comparison. This (implicitly) thins it to only those trees marked as good by the annotator, i.e. it removes all the dis-preferred trees. To get a 5-best comparison, play with the Scoring Beam value, as sketched below.
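For a sense of what an n-best comparison measures, here is a minimal sketch with an invented ranking: an item counts as correct when the gold tree appears anywhere within the top n results, n corresponding to the beam.

```python
def n_best_correct(ranked_derivations, gold_derivation, n=5):
    """True if the gold tree is among the top-n ranked results (beam of size n)."""
    return gold_derivation in ranked_derivations[:n]

# One invented item: the gold tree is ranked third by the model.
ranked = ["tree-a", "tree-b", "gold-tree", "tree-c", "tree-d", "tree-e"]

print(n_best_correct(ranked, "gold-tree", n=1))  # False: not exact-match at rank 1
print(n_best_correct(ranked, "gold-tree", n=5))  # True: within the 5-best beam
```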
