auto: Keyman for developer help deployment #1824

Merged
@@ -23,16 +23,16 @@ the specification itself.
There are three ways - all of them optional - to extend and customize
the word-breaking rules themselves:

-* If you need to prevent splits in very specific scenarios and/or add splits in other specific scenarios, you may specify [context-based rules](#rules) to obtain the desired behavior.
+* If you need to prevent splits in very specific scenarios and/or add splits in other specific scenarios, you may specify [context-based rules](#toc-custom-word-breaking-rules) to obtain the desired behavior.
* If certain characters are not handled appropriately for their role in
-your language, you may [map characters](#map) to different
+your language, you may [map characters](#toc-character-property-remapping) to different
word-breaking character classes - including custom ones. This will
override the default property they are assigned by the default
implementation, with the new property applying for all word-breaking
rules.
* If the default word-breaking classes from the specification are
too general for certain aspects of your language, it is possible to
-[define custom character classes](#define) for use in custom
+[define custom character classes](#toc-defining-and-using-new-word-breaking-properties) for use in custom
rules.

## A first example
@@ -183,7 +183,7 @@ the potential boundary.
### Word-breaking property names
The names used in each array must be defined in one of the following places:
* https://unicode.org/reports/tr29/#Table_Word_Break_Property_Values
-* `customProperties` - your [declaration of any custom property types](#define)
+* `customProperties` - your [declaration of any custom property types](#toc-defining-and-using-new-word-breaking-properties)
* One of the special property types `"Other"`, `"sot"`, or `"eot"`:
* `Other`: a character without an associated word-breaking property value
* `sot`: "start of text" - a marker indicating the beginning of the string being word-broken
@@ -277,7 +277,7 @@ As noted at the top of the file:

### Redefining character properties

-Of note from [our first example](#example):
+Of note from [our first example](#toc-a-first-example):

```typescript
/*** Character class overrides for specific characters ***/
@@ -324,7 +324,7 @@ words (and/or names) in some languages. Default word-breaking behavior will spl
hyphenated words and names apart, but by changing the property of hyphens, it is
possible to disable this behavior.

-Noting [rule WB6](#WB6) and WB7, the `MidLetter` class is designed to prevent
+Noting [rule WB6](#toc-custom-word-breaking-rules) and WB7, the `MidLetter` class is designed to prevent
word-breaks from occurring when its characters lie directly between letters -
hence the property name. Assigning hyphens to this class can provide the
desired behavior.
@@ -364,7 +364,7 @@ the second example above. (After all, `can'` could be the end of a quoted
phrase in English - `'sure you can'` - in which case we might want the split
to occur.)

-Revisiting [an earlier example](#example) and simplifying a little bit:
+Revisiting [an earlier example](#toc-redefining-character-properties) and simplifying a little bit:

```typescript
/*** Definition of extra word-breaking rules ***/
@@ -440,7 +440,7 @@ included a couple of extra rules:
}
```

-By replicating [WB6](#WB6) and WB7's structure and allowing `Hyphen` to match in the same
+By replicating [WB6](#toc-custom-word-breaking-rules) and WB7's structure and allowing `Hyphen` to match in the same
position as `MidLetter` in the original rules, we can prevent word-breaking splits
after additional text has been typed after a `Hyphen`-property character. This does not
_replace_ the behavior of WB6 and WB7 - it merely _extends_ it to include the new property.
6 changes: 3 additions & 3 deletions developer/18.0/guides/lexical-models/advanced/word-breaker.md
@@ -13,9 +13,9 @@ However, in languages written in other scripts — especially East Asian
scripts like Chinese, Japanese, Khmer, Lao, and Thai — there are no obvious break in between words. For these languages, there must be special rules for determining when words start and stop. This is what a _word breaker function_ is responsible for. It is a little bit of code that looks at some text to determine where the words are.

You can customize the word breaker in three ways:
-- If your language uses its writing system in an unconventional way (e.g., use spaces to separate words in Thai, Lao, Burmese, or Khmer), you can [override the script's default behaviour](#overrides)
-- If the default word breaker creates **too many splits**, you can [choose which strings join words together](#join).
-- If the default word breaker creates **not enough splits**, you must [create your own word breaker function](#custom).
+- If your language uses its writing system in an unconventional way (e.g., use spaces to separate words in Thai, Lao, Burmese, or Khmer), you can [override the script's default behaviour](#toc-overriding-script-defaults)
+- If the default word breaker creates **too many splits**, you can [choose which strings join words together](#toc-customize-joining-rules).
+- If the default word breaker creates **not enough splits**, you must [create your own word breaker function](#toc-writing-a-custom-word-breaker-function).
- Alternatively, you may choose to [customize and extend the wordbreaker's behavior](./unicode-breaker-extension) by adding extra rules and changing how it treats specific characters.

## Overriding script defaults
20 changes: 10 additions & 10 deletions developer/18.0/reference/file-types/metadata.md
@@ -38,25 +38,25 @@ The kmp.json file is a base object described as

: `Object`

-[`System`](#obj-system) object.
+[`System`](#toc-the-system-object) object.

`options`

: `Object`

-An [`Options`](#obj-options) object.
+An [`Options`](#toc-the-options-object) object.

`startMenu`

: `Object`

-[`Start Menu`](#obj-startMenu) object.
+[`Start Menu`](#toc-the-start-menu-object) object.

`info`

: `Object`

-[`Info`](#obj-info) object.
+[`Info`](#toc-the-info-object) object.

`files`

@@ -69,13 +69,13 @@ The kmp.json file is a base object described as

: `Object`

-Array of [`Keyboard`](#obj-keyboard) objects.
+Array of [`Keyboard`](#toc-the-keyboard-object) objects.

`lexicalModels`

: `Object`

-Array of [`LexicalModel`](#obj-lexicalModel) objects.
+Array of [`LexicalModel`](#toc-the-lexicalmodel-object) objects.

### The System object

@@ -139,7 +139,7 @@ The `StartMenu` object is used by Keyman Desktop to install windows

: `Array`

-An array of [Item](#obj-item) objects
+An array of [Item](#toc-the-item-object) objects

### The Item Object

@@ -229,7 +229,7 @@ The `Keyboard` object describes an individual keyboard in the Keyman package. A

: `Array`

-An array of [`Language`](#obj-language) objects linked to the keyboard.
+An array of [`Language`](#toc-the-language-object) objects linked to the keyboard.

`displayFont`

@@ -247,7 +247,7 @@ The `Keyboard` object describes an individual keyboard in the Keyman package. A

: `Array`

-An array of [`Example`](#obj-example) objects linked to the keyboard.
+An array of [`Example`](#toc-the-example-object) objects linked to the keyboard.

### The Language object

@@ -342,4 +342,4 @@ The `LexicalModel` object describes an individual model in the Keyman package. A

: `Array`

-An array of [`Language`](#obj-language) objects linked to the model.
+An array of [`Language`](#toc-the-language-object) objects linked to the model.
2 changes: 1 addition & 1 deletion developer/18.0/reference/file-types/tsv.md
@@ -16,7 +16,7 @@ Details:
Spreadsheet programs such as Microsoft Excel and Google Sheets can
export into TSV format. TSVs can also be programmatically generated
from other data sources. For advanced users, see [File
-Format](#file-format) for more details.
+Format](#toc-file-format) for more details.

Distributed with lexical model:
: No. This is a development file and should not be distributed.
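
Every change in this diff swaps a hand-written anchor for one that appears to be derived from the target heading: a `toc-` prefix followed by the heading text in lowercase with hyphens between words. The sketch below reproduces that pattern for checking links by hand. `tocAnchor` is a hypothetical helper written for illustration, not part of Keyman or of the help site's tooling, and the real generator may treat punctuation or non-ASCII characters differently.

```typescript
// Hypothetical helper (an assumption, not the site's actual generator):
// derives a "#toc-..." anchor from heading text, matching the pattern
// visible in the links changed by this diff.
function tocAnchor(heading: string): string {
  const slug = heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, "");    // drop leading and trailing hyphens
  return `#toc-${slug}`;
}

// Headings taken from the files touched above.
console.log(tocAnchor("A first example"));            // #toc-a-first-example
console.log(tocAnchor("The Item Object"));            // #toc-the-item-object
console.log(tocAnchor("Overriding script defaults")); // #toc-overriding-script-defaults
```

Comparing the output against the new link targets is a quick way to spot a link whose heading has since been renamed.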