LexDbInternals
The LexDB uses a PostgreSQL database to provide a source of lexical items for client applications such as the LKB. This page provides some minimal documentation on the structure of the database.
The fld table stores the user-defined field definitions used in constructing the rev table below. The contents are set by the script install-lexdb. Field definitions cannot be altered once the LexDB has been created.
The rev table stores revisions of lexical items. The first four fields, which are hard-coded into every LexDB, have the following definitions:
Column | Type | Modifiers
---------------+--------------------------+-----------
name | text |
userid | text |
modstamp | timestamp with time zone |
dead | boolean |
Following these are the user-defined fields specific to a particular LexDB. These are obtained from the .fld file provided to the script install-lexdb. They are also stored in the public.fld table of the LexDB for later reference. The following are the user-defined fields used by the ERG LexDB:
type | text |
orthography | text |
keyrel | text |
altkey | text |
alt2key | text |
keytag | text |
altkeytag | text |
compkey | text |
ocompkey | text |
pronunciation | text |
complete | text |
semclasses | text |
preferences | text |
classifier | text |
selectrest | text |
jlink | text |
comments | text |
exemplars | text |
usages | text |
lang | text |
country | text |
dialect | text |
domains | text |
genres | text |
register | text |
confidence | real |
source | text |
The fields (name,userid,modstamp) provide the primary key. The field dead marks dead revisions.
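The effect of the composite key can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, simplified column types, and a single user-defined field; the entry name few_det and the helper add_revision are hypothetical and not part of the LexDB.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rev (
        name        TEXT,
        userid      TEXT,
        modstamp    TEXT,     -- timestamp with time zone in the real LexDB
        dead        INTEGER,  -- boolean in the real LexDB
        orthography TEXT,     -- one of the user-defined fields
        PRIMARY KEY (name, userid, modstamp)
    )
    """
)


def add_revision(name, userid, modstamp, dead=0, orthography=None):
    """Insert one revision; return False if the (name, userid, modstamp) key exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO rev VALUES (?, ?, ?, ?, ?)",
                (name, userid, modstamp, dead, orthography),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Two revisions of the same (hypothetical) entry by the same user are
# distinct rows because their modstamps differ.
first = add_revision("few_det", "foo", "2005-11-16 19:11:18+00", orthography="a few")
later = add_revision("few_det", "foo", "2005-11-16 19:12:16+00", orthography="a few")
# Re-using an existing (name, userid, modstamp) triple is rejected.
duplicate = add_revision("few_det", "foo", "2005-11-16 19:12:16+00")
```

Because modstamp is part of the key, editing an entry adds a new row rather than overwriting the old one, which is what makes rev a revision history.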
The dfn table stores the mapping used to construct TDL lexical entries from fields of a revision in rev (the entry can then be processed by the client application in the same manner as entries obtained from a textual TDL lexicon file). See LexDbFieldMappings.
Column | Type | Modifiers
--------+------+-----------
mode | text | not null
slot | text | not null
field | text | not null
path | text |
type | text |
The meta table stores miscellaneous configuration settings and other data.
A sample public.meta is shown below:
var | val
-----------------------+-------------------------------
lexdb-version | 4.80
supported-psql-server | 7.4
supported-psql-server | 8.0
supported-psql-server | 8.1
user | foo
user | bar
mod_time | 2005-11-16 19:12:16.798424+00
A sample private meta is shown below:
var | val
------------+-------------------------------
filter | true
build_time | 2005-11-16 19:12:32.446169+00
mod_time | 2005-11-16 19:11:18.955607+00
- user is set for each user for whom a private schema has been initialized;
- filter is an SQL WHERE-clause which determines which rev entries are visible to a user's lex table;
- mod_time stores the time at which data in the schema was last modified;
- build_time stores the time at which the (private schema) lex table was last rebuilt.
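Since a var such as user or supported-psql-server can occur on several rows, a client reading meta has to treat it as a multi-valued mapping. The following is a minimal sketch of that idea, using the rows of the sample public.meta above as literal data; the helper group_meta is hypothetical and not part of the LexDB.

```python
from collections import defaultdict

# Rows copied from the sample public.meta shown above.
PUBLIC_META = [
    ("lexdb-version", "4.80"),
    ("supported-psql-server", "7.4"),
    ("supported-psql-server", "8.0"),
    ("supported-psql-server", "8.1"),
    ("user", "foo"),
    ("user", "bar"),
    ("mod_time", "2005-11-16 19:12:16.798424+00"),
]


def group_meta(rows):
    """Collect (var, val) rows into a dict mapping each var to a list of vals."""
    grouped = defaultdict(list)
    for var, val in rows:
        grouped[var].append(val)
    return dict(grouped)


meta = group_meta(PUBLIC_META)
# meta["user"] lists every user with an initialized private schema.
```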
The PostgreSQL database is created and owned by the database user lexdb. This user is also the owner of the schema public. In order to make use of the LexDB, a client must log on as a separate user. The first time such a user connects to the LexDB, a private database schema is initialized.
A private schema contains private rev, rev_key and meta tables. When the user modifies lexical items (or creates new ones), the changes are stored in the private schema. When the user is happy with the changes, the new rev (and associated rev_key) entries are transferred to the public schema. (This requires a lexdb login authorization.)
A private schema also contains lex_cache and lex_key tables, and a lex view with the same structure as the rev table. The lex view contains full details of all entries directly accessible for lexical lookup (that is, for each named entry, the most recent revision to pass the filter). The lex_cache and lex_key tables contain cached details of these entries in order to allow efficient lookup.
Column | Type | Modifiers
-------------+--------------------------+-----------
name | text |
userid | text |
modstamp | timestamp with time zone |
orthography | text |
This table provides keys for the lookup of lexical items by component words. E.g., a revision with orthography 'a few' will be keyed on both 'a' and 'few'. Keys are in normalized (lower case) form as provided by the client application. (We do not use the PostgreSQL lower() function as it may differ from the equivalent function used in the client application.)
Column | Type | Modifiers
----------+--------------------------+-------------------------------------------------------------
name | text | not null
userid | text | not null default "current_user"()
modstamp | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
key | text | not null
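The keying of a multi-word orthography can be sketched as follows. This is only an illustration: in the LexDB the normalization is done by the client application, so Python's str.lower() and whitespace splitting are assumptions standing in for whatever the client actually does, and the function names and the entry name few_det are hypothetical.

```python
def orthography_keys(orthography):
    """Return one lower-cased key per whitespace-separated word."""
    # str.lower() is a stand-in for the client application's own normalization.
    return [word.lower() for word in orthography.split()]


def key_rows(name, userid, modstamp, orthography):
    """Build (name, userid, modstamp, key) rows, one per component word."""
    return [(name, userid, modstamp, key) for key in orthography_keys(orthography)]


rows = key_rows("few_det", "foo", "2005-11-16 19:12:16+00", "a few")
# rows holds two tuples for the same revision, keyed on 'a' and on 'few'.
```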
E.g., the lex view in the ERG LexDB has the following structure:
Column | Type | Modifiers
---------------+--------------------------+-----------
name | text |
userid | text |
modstamp | timestamp with time zone |
dead | boolean |
type | text |
orthography | text |
keyrel | text |
altkey | text |
alt2key | text |
keytag | text |
altkeytag | text |
compkey | text |
ocompkey | text |
pronunciation | text |
complete | text |
semclasses | text |
preferences | text |
classifier | text |
selectrest | text |
jlink | text |
comments | text |
exemplars | text |
usages | text |
lang | text |
country | text |
dialect | text |
domains | text |
genres | text |
register | text |
confidence | real |
source | text |
View definition:
SELECT rev_all.*
FROM lex_cache
JOIN rev_all USING (name, userid, modstamp);
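The selection rule described earlier for the lex view (for each named entry, the most recent revision to pass the filter) can be sketched in plain Python. This is an illustration under assumptions, not the LexDB implementation: the filter is modelled as a Python predicate rather than an SQL WHERE-clause, the dead flag is not modelled at all, and the function name and sample entries are hypothetical.

```python
from datetime import datetime, timezone


def visible_entries(revisions, passes_filter):
    """Map each entry name to its most recent revision that passes the filter."""
    latest = {}
    for rev in revisions:
        if not passes_filter(rev):
            continue
        current = latest.get(rev["name"])
        if current is None or rev["modstamp"] > current["modstamp"]:
            latest[rev["name"]] = rev
    return latest


def stamp(day, hour, minute, second):
    """Build a UTC timestamp in November 2005 for the sample data."""
    return datetime(2005, 11, day, hour, minute, second, tzinfo=timezone.utc)


# Hypothetical sample revisions.
revisions = [
    {"name": "few_det", "userid": "foo", "modstamp": stamp(16, 19, 11, 18), "orthography": "a few"},
    {"name": "few_det", "userid": "bar", "modstamp": stamp(16, 19, 12, 16), "orthography": "a few"},
    {"name": "dog_n", "userid": "foo", "modstamp": stamp(15, 9, 0, 0), "orthography": "dog"},
]

# A predicate that accepts everything plays the role of the filter 'true'.
lex = visible_entries(revisions, lambda rev: True)
# A narrower filter changes which revision of few_det is visible.
only_foo = visible_entries(revisions, lambda rev: rev["userid"] == "foo")
```

With the permissive filter, few_det resolves to the later revision by bar; restricting the filter to foo makes the earlier revision the visible one.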