Here you find a description of the text of the letters of Descartes, the Text-Fabric model in general, and the node types and features of this corpus in particular.
The corpus consists of letters, which are grouped in volumes. Letters are divided into paragraphs, some of which act as address, opener, closer, or postscriptum.
Letters may contain illustrations, symbols, and mathematical formulas.
We have added the concept of sentence.
A sentence is a piece of text within a paragraph that is terminated by a `.`.
Not all `.`s act as sentence terminators, though: in `Kal. Aprilis`, for example, it marks an abbreviation.
We have tried to exclude most of these cases.
The purpose of adding sentences was to have a convenient division within paragraphs. This division can be used to display manageable chunks of the corpus.
It can also be used to detect parallel passages, i.e. similar sentences in the correspondence.
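To make the idea concrete, here is a minimal sketch of full-stop-based sentence splitting that skips known abbreviations. It is not the procedure that was used to build the corpus: the function name `split_sentences` and the abbreviation list are assumptions made for illustration only.

```python
# Hypothetical abbreviation list; the exclusion rules actually used
# for the corpus are not documented here.
ABBREVIATIONS = {"Kal.", "etc."}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Split text into sentences at tokens ending in a full stop,
    unless the token is a known abbreviation."""
    sentences = []
    current = []
    for token in text.split():
        current.append(token)
        if token.endswith(".") and token not in abbreviations:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Trailing material without a terminating full stop
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Scripsi Kal. Aprilis. Vale."))
```

The `.` in `Kal.` does not end a sentence here, so the example yields two sentences rather than three.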
The Text-Fabric model views the text as a series of atomic units, called slots. In this corpus words are the slots.
On top of that, more complex textual objects can be represented as nodes. In this corpus we have node types for:
word, hi, figure, formula, sentence, head, opener, closer, postscriptum, address, p, page, letter, volume.
Note that slots are nodes themselves.
The type of every node is given by the feature `otype`.
Every node is linked to a subset of slots by `oslots`.
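As a toy illustration of this model (not the Text-Fabric API itself), the sketch below stores node types and slot links in plain dictionaries. The three words and the sentence node are made-up data, and the helper names `slots_of` and `text_of` are hypothetical; the dictionary names `otype` and `oslots` merely echo the two features described above.

```python
# Made-up miniature corpus: slots 1..3 are words, node 4 is a sentence.
words = ["Cogito", "ergo", "sum"]

# Every node has a type.
otype = {1: "word", 2: "word", 3: "word", 4: "sentence"}

# Every non-slot node is linked to a set of slots.
oslots = {4: {1, 2, 3}}


def slots_of(node):
    """Return the slots a node is linked to; a slot is linked to itself."""
    if otype[node] == "word":
        return {node}
    return oslots[node]


def text_of(node):
    """Join the words in the slots of a node, in slot order."""
    return " ".join(words[slot - 1] for slot in sorted(slots_of(node)))


print(text_of(4))
print(text_of(2))
```

In a real corpus these lookups are provided by Text-Fabric; the sketch only shows why slots, types, and slot links suffice to recover the text of any node.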
Nodes can be annotated with features. Relations between nodes can be annotated with edge features. See the table below.
Text-Fabric supports up to three customisable section levels. In this corpus we define them as: volume, letter, and p.
(Keep this under your pillow)
Node type word
This is the slot type, i.e. the nodes of type `word` are the slots.
Basic unit corresponding to a word plus attached non-word stuff such as punctuation, brackets, etc.
feature | values | description |
---|---|---|
trans | `quaestionem` | the string that makes up a word, without punctuation |
punc | `,` | non-word characters after a word, including white-space |
isitalic | `1` | indicates the word is in italics |
ismargin | `1` | indicates the word is in the margin |
issub | `1` | indicates the word is in subscript |
issup | `1` | indicates the word is in superscript |
typ | `empty` `formula` | indicates the kind of word |
- `typ`=`empty`: deliberately empty word, i.e. `trans` is empty or absent; however, `punc` may contain something, typically a space.
- the `is`*xxx* features have only one possible value: `1`. They can also be absent.
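The interplay of `trans` and `punc` can be sketched as follows. The word records are invented sample data represented as plain dictionaries, and `render` is a hypothetical helper, not a Text-Fabric function; in a loaded corpus the values would come from the features themselves.

```python
def render(words):
    """Concatenate trans and punc of each word; absent values count as empty."""
    return "".join((w.get("trans") or "") + (w.get("punc") or "") for w in words)


# Invented sample: the middle word is deliberately empty (typ=empty),
# so it has no trans, but its punc still contributes a space.
sample = [
    {"trans": "quaestionem", "punc": ", "},
    {"typ": "empty", "punc": " "},
    {"trans": "solvi", "punc": "."},
]

print(repr(render(sample)))
```

Because `punc` carries the white-space, no separator is inserted between words; the empty word accounts for the doubled space in the result.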
Node type hi
Stretches of text with special formatting.
This node type has no special features.
All words belonging to `hi` nodes have their special formatting recorded in the `is...` features, listed under word.
Nodes of this type may be nested. They may also overlap without proper nesting. However, as the data is generated from XML, it is likely that all overlap occurs in the form of nesting.
Node type figure
Figures come in two kinds: symbols and illustrations. They are represented by an image. These nodes have an empty slot, which links them to textual positions.
feature | values | description |
---|---|---|
typ | `symbol` `illustration` | the kind of image |
url | `cossic1.png` `AT1-101a.gif` | file name of the image |
Node type formula
Mathematical formulas in TeX notation. They are typeset by MathJax when displayed. Note that the Text-Fabric browser uses MathJax 3, while notebooks running in JupyterLab 3.5 still use MathJax 2.
The TeX code sits in the `trans` feature of a single slot with `typ`=`formula` that belongs to the `formula` node.
It is also contained, without the surrounding `$`s, in the feature `tex` of the `formula` node.
This gives you the opportunity to view the source code of formulas.
feature | values | description |
---|---|---|
notation | `TeX` | notation method of the formula |
tex | `A\over B` | TeX source code of a formula |
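The relation between the dollar-delimited code on the slot and the bare TeX source on the formula node can be sketched like this. The helper `strip_dollars` is hypothetical, and the assumption that the delimiters are plain leading and trailing `$` characters is inferred from the description above rather than checked against the data.

```python
def strip_dollars(trans):
    """Remove leading and trailing $ delimiters from a TeX string."""
    return trans.strip("$")


# Sample value modelled on the table above, with assumed $ delimiters
print(strip_dollars(r"$A\over B$"))
```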
Node type sentence
Sentence, i.e. a part of a paragraph terminated by a full stop.
Occurrences of `.` that are used for other purposes, e.g. in abbreviations and numbers, do not count as a full stop.
feature | values | description |
---|---|---|
n | `1` `2` | sequence number of a sentence within the paragraph |
Node type head
Contains a paragraph at the start of a letter, acting as a header line.
Node type opener
Contains paragraphs at the start of a letter, the salutation.
Node type closer
Contains paragraphs at the end of a letter, the sender.
Node type postscriptum
Contains paragraphs at the end of a letter, between closer and address, containing a postscript.
Node type address
Contains paragraphs at the end of a letter, after the closer, containing the address of the recipient.
Node type p
Section level 3.
Paragraph.
feature | values | description |
---|---|---|
n | `1` `2` | sequence number of a paragraph within the letter |
level | `2` `3` | level of a paragraph when it acts like a heading |
Node type page
Page in the printed edition.
feature | values | description |
---|---|---|
n | `1` `2` | sequence number of a page within the volume |
Node type letter
Section level 2.
Letter, identified by `id`.
Various metadata is attached to letters, such as senders, recipients, dates, and locations.
feature | values | description |
---|---|---|
id | `1049` | identifier of a letter |
alt_id | `AM1-005-002,AT,EJB010` | alternative identifiers of a letter |
alt_date | `1639` | alternative date of a letter |
cert | `recipientloc:cert=high,senderloc:cert=high` | indication of certitude per feature |
date | `1619-01-24` | date of a letter |
intermediary | `Plempius:Vopiscus-Fortunatus:1601-1671` | intermediary in the transmission of a letter |
language | `fr` `la` `nl` `fr la` | language identifier(s) of a letter |
resp | `recipientloc:resp=EJB,senderloc:resp=EJB` | indication of responsibility for the value of a feature (EJB = Erik-Jan Bos) |
recipient | `Beeckman:Isaac:1588-1637` | recipient of a letter |
recipientloc | `Middelburg, NL` | location of the recipient of a letter |
sender | `Descartes:Rene:1596-1650` | sender of a letter |
senderloc | `Egmond aan den Hoef, NL` | location of the sender of a letter |
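Several of these values pack more than one piece of information into a string. The sketch below unpacks two of them. The formats (`surname:forename:born-died` for persons, comma-separated `feature:key=value` items for `cert` and `resp`) are inferred from the example values in the table and may not cover every value in the corpus; the function names are hypothetical.

```python
def parse_person(value):
    """Split a person string such as 'Beeckman:Isaac:1588-1637'.

    Assumes exactly three colon-separated parts and a 'born-died' year range.
    """
    surname, forename, years = value.split(":")
    born, died = years.split("-")
    return {"surname": surname, "forename": forename, "born": born, "died": died}


def parse_annotations(value):
    """Turn 'recipientloc:cert=high,senderloc:cert=high' into a dict
    that maps each annotated feature to its value."""
    result = {}
    for item in value.split(","):
        feature, assignment = item.split(":", 1)
        _key, val = assignment.split("=", 1)
        result[feature] = val
    return result


print(parse_person("Descartes:Rene:1596-1650"))
print(parse_annotations("recipientloc:cert=high,senderloc:cert=high"))
```

Years are kept as strings on purpose, since nothing here guarantees that every date range in the corpus is a pair of plain integers.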
Node type volume
Section level 1.
Volume.
feature | values | description |
---|---|---|
n | `1` `2` | sequence number of a volume in the corpus |
When we say that a feature is absent for a node, we mean that the node has no value for that feature. For example, if the feature `isitalic` is absent for node `n`, then `F.isitalic.v(n)` results in the Python value `None`, not the string `'None'`.
In queries, you can test for absence by means of `#`: the template `word trans#` gives all words where the feature `trans` is absent.
See also search templates.
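In Python code, absence is best tested with `is None` rather than by truthiness or string comparison. The sketch below wraps that test in a hypothetical helper, `is_italic`. So that it runs without a loaded corpus, `F` is replaced by a small stand-in object that mimics only the `F.isitalic.v(n)` call; with a real corpus you would pass the `F` of the loaded dataset instead.

```python
from types import SimpleNamespace


def is_italic(F, n):
    """True if node n has a value for isitalic, False if the feature is absent."""
    return F.isitalic.v(n) is not None


# Stand-in for the feature API of a loaded corpus (an assumption for this
# sketch): node 1 is italic, every other node has no value.
italic_values = {1: 1}
F = SimpleNamespace(isitalic=SimpleNamespace(v=italic_values.get))

print(is_italic(F, 1))
print(is_italic(F, 2))
```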
The following text formats are defined (you can also list them with `T.formats`).
format | description |
---|---|
text-orig-full | the full text of all words |
layout-orig-full | the full text of all words, with special formatting indicating special characteristics of the text |
The formats with `text` result in strings that are plain text, without additional formatting.
The formats with `layout` result in pieces of HTML with CSS styles;
the richness of layout enables us to encode more information
than in the plain representation, e.g. italic words or marginal words.
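To give an impression of what a layout format adds, here is a rough sketch that wraps specially formatted words in an HTML `span`. It does not reproduce the HTML that Text-Fabric generates for this corpus: the function `render_layout`, the use of feature names as CSS class names, and the sample words are all assumptions for illustration.

```python
import html

# Word features that mark special formatting
FORMAT_FLAGS = ("isitalic", "ismargin", "issub", "issup")


def render_layout(words):
    """Render words as HTML, wrapping specially formatted words in a span
    whose (hypothetical) CSS classes are the names of the features that are set."""
    pieces = []
    for w in words:
        material = html.escape(w.get("trans", "") + w.get("punc", ""))
        classes = " ".join(flag for flag in FORMAT_FLAGS if w.get(flag))
        if classes:
            material = '<span class="' + classes + '">' + material + "</span>"
        pieces.append(material)
    return "".join(pieces)


# Invented sample: the second word is in italics
sample_words = [
    {"trans": "ut", "punc": " "},
    {"trans": "supra", "punc": ".", "isitalic": 1},
]

print(render_layout(sample_words))
```

A plain text format would render the same two words as `ut supra.`, losing the information that the second word is italic.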