StructureDataset CLDF dataset with data and supplements for Barlow “Loss of colexification of ‘hand’ and ‘five’ in Austronesian languages”
CLDF Metadata: StructureDataset-metadata.json
Sources: sources.bib
property | value |
---|---|
dc:conformsTo | CLDF StructureDataset |
dc:license | https://creativecommons.org/licenses/by/4.0/ |
dcat:accessURL | https://github.com/cldf-datasets/barlowhandandfive |
prov:wasDerivedFrom | |
prov:wasGeneratedBy | |
rdf:ID | barlowhandandfive |
rdf:type | http://www.w3.org/ns/dcat#Distribution |
Table values.csv
property | value |
---|---|
dc:conformsTo | CLDF ValueTable |
dc:extent | 6063 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Language_ID | string | References languages.csv::ID |
Parameter_ID | string | References parameters.csv::ID |
Value | string | |
Code_ID | string | References codes.csv::ID |
Comment | string | |
Source | list of string (separated by ;) | References sources.bib::BibTeX-key |
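Rows shaped like values.csv can be read with Python's standard csv module. The following is a minimal sketch that splits the semicolon-separated Source column into a list of BibTeX keys. The helper name `read_values` and all sample IDs, values, and keys are hypothetical placeholders, not data from this dataset.

```python
import csv
import io


def read_values(text):
    """Parse ValueTable-shaped CSV text; split Source into a list of keys."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Source holds BibTeX keys separated by ';'. An empty cell means no source.
        row["Source"] = [key.strip() for key in row["Source"].split(";") if key.strip()]
        rows.append(row)
    return rows


# Hypothetical rows for illustration only (not taken from the dataset).
SAMPLE_VALUES = """ID,Language_ID,Parameter_ID,Value,Code_ID,Comment,Source
v1,lang1,param1,yes,param1-yes,,key2020;key2021
v2,lang2,param1,no,param1-no,,
"""

values = read_values(SAMPLE_VALUES)
for row in values:
    print(row["ID"], row["Language_ID"], row["Value"], row["Source"])
```

To read the real file, the same function could be given the text of values.csv, for example via `open("values.csv", encoding="utf-8").read()`.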
Table forms.csv
property | value |
---|---|
dc:conformsTo | CLDF FormTable |
dc:extent | 2023 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Language_ID | string | A reference to a language (or variety) the form belongs to. References languages.csv::ID |
Parameter_ID | string | A reference to the meaning denoted by the form. References parameters.csv::ID |
Form | string | The written expression of the form. If possible the transcription system used for the written form should be described in CLDF metadata (e.g. via adding a common property dc:conformsTo to the column description using concept URLs of the GOLD Ontology (such as phonemicRep or phoneticRep) as values). |
Segments | list of string (separated by a space) | |
Comment | string | |
Source | list of string (separated by ;) | References sources.bib::BibTeX-key |
Contribution_ID | string | Key of the lexical dataset from which the form was taken. References contributions.csv::ID |
Glottocode_in_dataset | string | Glottocode assigned to the variety in the source dataset from which the form was selected |
Language_name_in_dataset | string | Name of the variety in the source dataset from which the form was selected |
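As an illustration of how FormTable rows relate to the notion of colexification, the sketch below groups forms by language and concept and reports whether any form is shared between two concepts. This is only one simple way to operationalize the idea, and not necessarily how the colexification parameters in ValueTable were coded. The function name, the parameter IDs `hand` and `five`, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict


def colexifying_languages(forms, concept_a="hand", concept_b="five"):
    """Map each Language_ID to True if some form is shared by both concepts."""
    by_language = defaultdict(lambda: defaultdict(set))
    for row in forms:
        by_language[row["Language_ID"]][row["Parameter_ID"]].add(row["Form"])
    return {
        lang: bool(concepts.get(concept_a, set()) & concepts.get(concept_b, set()))
        for lang, concepts in by_language.items()
    }


# Hypothetical rows for illustration only (IDs and forms are placeholders).
SAMPLE_FORMS = [
    {"Language_ID": "lang1", "Parameter_ID": "hand", "Form": "lima"},
    {"Language_ID": "lang1", "Parameter_ID": "five", "Form": "lima"},
    {"Language_ID": "lang2", "Parameter_ID": "hand", "Form": "tangan"},
    {"Language_ID": "lang2", "Parameter_ID": "five", "Form": "lima"},
]

status = colexifying_languages(SAMPLE_FORMS)
print(status)
```

Exact string comparison is a deliberate simplification here. Comparing the Segments column instead would be more robust to orthographic differences between source datasets.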
Table languages.csv
This table lists each language-level languoid in Glottolog 5.0 classified as Austronesian. Languages are roughly sorted by genealogy and then geography, more or less reflecting the spread of Austronesian languages from Taiwan to Polynesia. This sorting is reflected by the numbers given in the “Number” column.
property | value |
---|---|
dc:conformsTo | CLDF LanguageTable |
dc:extent | 1274 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Macroarea | string | |
Latitude | decimal ≥ -90 ≤ 90 | |
Longitude | decimal ≥ -180 ≤ 180 | |
Glottocode | string Regex: [a-z0-9]{4}[1-9][0-9]{3} | |
ISO639P3code | string Regex: [a-z]{3} | |
Number | integer | |
Melanesia | string Valid choices: yes no | Languages are classified as being in Melanesia if they are primarily spoken in Papua New Guinea (PG), Solomon Islands (SB), Vanuatu (VU), New Caledonia (NC), or the Western New Guinea provinces of Indonesia (ID). |
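The datatype constraints listed for languages.csv (regular expressions, numeric ranges, valid choices) can be checked row by row. The following is a minimal sketch of such a check. The function name and the sample rows are hypothetical, and the sample Glottocodes and ISO codes are placeholders rather than real identifiers.

```python
import re

# Patterns copied from the column descriptions; full-string matching is assumed.
GLOTTOCODE = re.compile(r"[a-z0-9]{4}[1-9][0-9]{3}")
ISO639_3 = re.compile(r"[a-z]{3}")


def language_row_problems(row):
    """Return the names of columns whose non-empty values violate the constraints."""
    problems = []
    if row.get("Glottocode") and not GLOTTOCODE.fullmatch(row["Glottocode"]):
        problems.append("Glottocode")
    if row.get("ISO639P3code") and not ISO639_3.fullmatch(row["ISO639P3code"]):
        problems.append("ISO639P3code")
    for column, low, high in (("Latitude", -90, 90), ("Longitude", -180, 180)):
        if row.get(column):
            try:
                in_range = low <= float(row[column]) <= high
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(column)
    if row.get("Melanesia") and row["Melanesia"] not in ("yes", "no"):
        problems.append("Melanesia")
    return problems


# Hypothetical rows for illustration only.
good_row = {"Glottocode": "abcd1234", "ISO639P3code": "abc",
            "Latitude": "-8.5", "Longitude": "160.0", "Melanesia": "yes"}
bad_row = {"Glottocode": "abcd0234", "ISO639P3code": "abc",
           "Latitude": "95", "Longitude": "160.0", "Melanesia": "maybe"}

print(language_row_problems(good_row))
print(language_row_problems(bad_row))
```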
Table contributions.csv
Forms for this study (i.e., words for the concepts ‘five’ and ‘hand’ in Austronesian languages) were taken from the four datasets listed in this table.
property | value |
---|---|
dc:conformsTo | CLDF ContributionTable |
dc:extent | 4 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Description | string | |
Contributor | string | |
Citation | string | |
Table parameters.csv
This dataset provides three kinds of parameters: 1) the two concepts ‘hand’ and ‘five’, with the corresponding forms listed in FormTable; 2) six parameters analyzing the colexification status of these two concepts in Austronesian languages, with values listed in ValueTable; and 3) one parameter replicating coding decisions about types of numeral systems, derived from Barlow (2023) but updated here to reflect changes in classification between Glottolog versions 4.6 and 5.0, with values also listed in ValueTable.
property | value |
---|---|
dc:conformsTo | CLDF ParameterTable |
dc:extent | 9 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Description | string | |
ColumnSpec | json | |
Table codes.csv
property | value |
---|---|
dc:conformsTo | CLDF CodeTable |
dc:extent | 31 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Parameter_ID | string | The parameter or variable the code belongs to. References parameters.csv::ID |
Name | string | |
Description | string | |
color | string | |
Table replacements.csv
This table lists coding decisions for “replacement events” for the words for ‘hand’ or ‘five’ in subgroups or single languages of the Austronesian family. For the concept ‘hand’, a row represents a probable loss of the inherited Proto-Austronesian form *qalima ‘hand’, whether in the individual history of a single language or in a protolanguage ancestral to multiple languages. For the concept ‘five’, a row represents a probable loss of the inherited Proto-Austronesian form *lima ‘five’.
Replacement events are identified using a relatively conservative approach: a replacement event is reconstructed to a protolanguage only if there is strong evidence for it and no apparent exceptions (such as a reflex of *qalima ‘hand’ found in one or more member languages of the given group).
property | value |
---|---|
dc:extent | 189 |
Name/Property | Datatype | Description |
---|---|---|
ID | string | Primary key |
Replacement_Group | string | Replacement events can also be identified using a more liberal approach: in some cases, replacement events can be reconstructed to higher-order protolanguages or to multiple protolanguages in an area, either when the apparent exceptions may be due to subsequent borrowing or when the “replacement event” could be viewed as a single areal spread across multiple languages or language groups. The “conservative” replacement events listed here are grouped into “liberal” events via matching values in the Replacement_Group column. If there is no discrepancy between the more conservative and the more liberal approaches, an event will be in a replacement group of its own. |
Subgroup | string | |
Comment | string | |
Source | string | |
Concept | string | References parameters.csv::ID |
Language_IDs | list of string (separated by a space) | References languages.csv::ID |
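The grouping of “conservative” events into “liberal” events via the Replacement_Group column can be sketched as follows. This is a minimal illustration, assuming that Language_IDs is whitespace-separated and that rows are available as dictionaries. The function name and all sample IDs are hypothetical.

```python
from collections import defaultdict


def liberal_events(rows):
    """Group conservative replacement events by their Replacement_Group value."""
    groups = defaultdict(lambda: {"event_ids": [], "language_ids": set()})
    for row in rows:
        group = groups[row["Replacement_Group"]]
        group["event_ids"].append(row["ID"])
        # Assumption: Language_IDs is a whitespace-separated list of language IDs.
        group["language_ids"].update(row["Language_IDs"].split())
    return dict(groups)


# Hypothetical rows for illustration only (not taken from replacements.csv).
SAMPLE_REPLACEMENTS = [
    {"ID": "r1", "Replacement_Group": "g1", "Concept": "hand", "Language_IDs": "lang1 lang2"},
    {"ID": "r2", "Replacement_Group": "g1", "Concept": "hand", "Language_IDs": "lang3"},
    {"ID": "r3", "Replacement_Group": "g2", "Concept": "five", "Language_IDs": "lang4"},
]

events = liberal_events(SAMPLE_REPLACEMENTS)
for group_id, group in events.items():
    print(group_id, group["event_ids"], sorted(group["language_ids"]))
```

In this sketch, `g1` merges two conservative events into one liberal event, while `g2` corresponds to an event that forms a replacement group of its own.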