Commit

Merge pull request #1824 from keymanapp/auto/developer-help-18.0.178-…
…alpha/TC-18.0.178

auto: Keyman for developer help deployment
keyman-status authored Jan 27, 2025
2 parents 8c1017f + 795b744 commit 846f7b0
Showing 4 changed files with 22 additions and 22 deletions.
@@ -23,16 +23,16 @@ the specification itself.
There are three ways - all of them optional - to extend and customize
the word-breaking rules themselves:

* If you need to prevent splits in very specific scenarios and/or add splits in other specific scenarios, you may specify [context-based rules](#rules) to obtain the desired behavior.
* If you need to prevent splits in very specific scenarios and/or add splits in other specific scenarios, you may specify [context-based rules](#toc-custom-word-breaking-rules) to obtain the desired behavior.
* If certain characters are not handled appropriately for their role in
your language, you may [map characters](#map) to different
your language, you may [map characters](#toc-character-property-remapping) to different
word-breaking character classes - including custom ones. This will
override the default property they are assigned by the default
implementation, with the new property applying for all word-breaking
rules.
* If the default word-breaking classes from the specification are
too general for certain aspects of your language, it is possible to
[define custom character classes](#define) for use in custom
[define custom character classes](#toc-defining-and-using-new-word-breaking-properties) for use in custom
rules.

## A first example
@@ -183,7 +183,7 @@ the potential boundary.
### Word-breaking property names
The names used in each array must be defined in one of the following places:
* https://unicode.org/reports/tr29/#Table_Word_Break_Property_Values
* `customProperties` - your [declaration of any custom property types](#define)
* `customProperties` - your [declaration of any custom property types](#toc-defining-and-using-new-word-breaking-properties)
* One of the special property types `"Other"`, `"sot"`, or `"eot"`:
* `Other`: a character without an associated word-breaking property value
* `sot`: "start of text" - a marker indicating the beginning of the string being word-broken
@@ -277,7 +277,7 @@ As noted at the top of the file:

### Redefining character properties

Of note from [our first example](#example):
Of note from [our first example](#toc-a-first-example):

```typescript
/*** Character class overrides for specific characters ***/
@@ -324,7 +324,7 @@ words (and/or names) in some languages. Default word-breaking behavior will spl
hyphenated words and names apart, but by changing the property of hyphens, it is
possible to disable this behavior.

Noting [rule WB6](#WB6) and WB7, the `MidLetter` class is designed to prevent
Noting [rule WB6](#toc-custom-word-breaking-rules) and WB7, the `MidLetter` class is designed to prevent
word-breaks from occurring when its characters lie directly between letters -
hence the property name. Assigning hyphens to this class can provide the
desired behavior.
@@ -364,7 +364,7 @@ the second example above. (After all, `can'` could be the end of a quoted
phrase in English - `'sure you can'` - in which case we might want the split
to occur.)

Revisiting [an earlier example](#example) and simplifying a little bit:
Revisiting [an earlier example](#toc-redefining-character-properties) and simplifying a little bit:

```typescript
/*** Definition of extra word-breaking rules ***/
@@ -440,7 +440,7 @@ included a couple of extra rules:
}
```

By replicating [WB6](#WB6) and WB7's structure and allowing `Hyphen` to match in the same
By replicating [WB6](#toc-custom-word-breaking-rules) and WB7's structure and allowing `Hyphen` to match in the same
position as `MidLetter` in the original rules, we can prevent word-breaking splits
after additional text has been typed after a `Hyphen`-property character. This does not
_replace_ the behavior of WB6 and WB7 - it merely _extends_ it to include the new property.
6 changes: 3 additions & 3 deletions developer/18.0/guides/lexical-models/advanced/word-breaker.md
@@ -13,9 +13,9 @@ However, in languages written in other scripts — especially East Asian
scripts like Chinese, Japanese, Khmer, Lao, and Thai — there are no obvious break in between words. For these languages, there must be special rules for determining when words start and stop. This is what a _word breaker function_ is responsible for. It is a little bit of code that looks at some text to determine where the words are.

You can customize the word breaker in three ways:
- If your language uses its writing system in an unconventional way (e.g., use spaces to separate words in Thai, Lao, Burmese, or Khmer), you can [override the script's default behaviour](#overrides)
- If the default word breaker creates **too many splits**, you can [choose which strings join words together](#join).
- If the default word breaker creates **not enough splits**, you must [create your own word breaker function](#custom).
- If your language uses its writing system in an unconventional way (e.g., use spaces to separate words in Thai, Lao, Burmese, or Khmer), you can [override the script's default behaviour](#toc-overriding-script-defaults)
- If the default word breaker creates **too many splits**, you can [choose which strings join words together](#toc-customize-joining-rules).
- If the default word breaker creates **not enough splits**, you must [create your own word breaker function](#toc-writing-a-custom-word-breaker-function).
- Alternatively, you may choose to [customize and extend the wordbreaker's behavior](./unicode-breaker-extension) by adding extra rules and changing how it treats specific characters.

## Overriding script defaults
20 changes: 10 additions & 10 deletions developer/18.0/reference/file-types/metadata.md
@@ -38,25 +38,25 @@ The kmp.json file is a base object described as

: `Object`

[`System`](#obj-system) object.
[`System`](#toc-the-system-object) object.

`options`

: `Object`

An [`Options`](#obj-options) object.
An [`Options`](#toc-the-options-object) object.

`startMenu`

: `Object`

[`Start Menu`](#obj-startMenu) object.
[`Start Menu`](#toc-the-start-menu-object) object.

`info`

: `Object`

[`Info`](#obj-info) object.
[`Info`](#toc-the-info-object) object.

`files`

@@ -69,13 +69,13 @@ The kmp.json file is a base object described as

: `Object`

Array of [`Keyboard`](#obj-keyboard) objects.
Array of [`Keyboard`](#toc-the-keyboard-object) objects.

`lexicalModels`

: `Object`

Array of [`LexicalModel`](#obj-lexicalModel) objects.
Array of [`LexicalModel`](#toc-the-lexicalmodel-object) objects.

### The System object

@@ -139,7 +139,7 @@ The `StartMenu` object is used by Keyman Desktop to install windows

: `Array`

An array of [Item](#obj-item) objects
An array of [Item](#toc-the-item-object) objects

### The Item Object

@@ -229,7 +229,7 @@ The `Keyboard` object describes an individual keyboard in the Keyman package. A

: `Array`

An array of [`Language`](#obj-language) objects linked to the keyboard.
An array of [`Language`](#toc-the-language-object) objects linked to the keyboard.

`displayFont`

@@ -247,7 +247,7 @@ The `Keyboard` object describes an individual keyboard in the Keyman package. A

: `Array`

An array of [`Example`](#obj-example) objects linked to the keyboard.
An array of [`Example`](#toc-the-example-object) objects linked to the keyboard.

### The Language object

@@ -342,4 +342,4 @@ The `LexicalModel` object describes an individual model in the Keyman package. A

: `Array`

An array of [`Language`](#obj-language) objects linked to the model.
An array of [`Language`](#toc-the-language-object) objects linked to the model.
2 changes: 1 addition & 1 deletion developer/18.0/reference/file-types/tsv.md
@@ -16,7 +16,7 @@ Details:
Spreadsheet programs such as Microsoft Excel and Google Sheets can
export into TSV format. TSVs can also be programmatically generated
from other data sources. For advanced users, see [File
Format](#file-format) for more details.
Format](#toc-file-format) for more details.

Distributed with lexical model:
: No. This is a development file and should not be distributed.
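
Note on the pattern across all four files: every replaced link target in this commit appears to be the text of the destination heading, lowercased, with runs of non-alphanumeric characters collapsed to hyphens and a `toc-` prefix added. The sketch below is only an inference from the links visible in the diff, not the help site's actual anchor generator; `tocAnchor` is a hypothetical helper name, and the handling of non-ASCII characters is an assumption.

```typescript
// Hypothetical helper (not part of the Keyman help site): derives a
// "#toc-..." anchor from heading text, following the pattern the links
// in this commit appear to use.
function tocAnchor(heading: string): string {
  const slug = heading
    .trim()
    .toLowerCase()
    // Assumption: any run of characters outside a-z and 0-9 becomes one hyphen.
    .replace(/[^a-z0-9]+/g, "-")
    // Drop hyphens left over at either end.
    .replace(/^-+|-+$/g, "");
  return `#toc-${slug}`;
}

// Headings that appear in this diff, checked against the new link targets.
const examples: Array<[string, string]> = [
  ["A first example", "#toc-a-first-example"],
  ["Overriding script defaults", "#toc-overriding-script-defaults"],
  ["The Item Object", "#toc-the-item-object"],
];

for (const [heading, expected] of examples) {
  console.log(heading, "->", tocAnchor(heading), tocAnchor(heading) === expected);
}
```

A check like this could be used when reviewing similar link updates: if a heading is renamed later, the derived anchor changes with it, so links written against the old heading text would need to be updated as well.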