More TTS improvements #401

Merged 1 commit on Oct 22, 2024
75 changes: 54 additions & 21 deletions README.md
Expand Up @@ -61,6 +61,37 @@ feature request or issue reports join the [discord](https://discord.gg/YyzaPhAN6
and it should work. If you don't want to run it as admin, you can disable the mouse control in the params.ini
by setting `vision_mode_only` to `true`.

### TTS

D4 uses a third-party TTS engine called Tolk. Tolk has a feature that allows custom third-party TTS DLLs to be loaded.
D4 automatically loads the DLL, which actually just sends the text to another application rather than reading it aloud.
This is similar to having a Braille TTS application for D4.

To use TTS, you need to:

- Copy `saapi64.dll` to your D4 directory
- Enable `Use Screen Reader` and `3rd Party Screen Reader` in D4 Accessibility settings
- Set `use_tts` in your `params.ini` to either `full` or `mixed` (or via the [GUI](#GUI))
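
As a minimal sketch of the last step, the snippet below reads `use_tts` from INI text with Python's standard `configparser`. It assumes the key lives in the `[general]` section (as the Configs table suggests) and that `off` is the disabled value (as in the `UseTTSType` enum of this PR). The sample text and the `read_use_tts` helper are hypothetical and not part of d4lf.

```python
import configparser

# Hypothetical params.ini content; only the `use_tts` key comes from the README.
SAMPLE_INI = """
[general]
use_tts = mixed
"""

ALLOWED = {"full", "mixed", "off"}


def read_use_tts(ini_text: str) -> str:
    """Return the use_tts value from INI text, defaulting to 'off'."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback covers both a missing section and a missing key
    value = parser.get("general", "use_tts", fallback="off").strip().lower()
    if value not in ALLOWED:
        raise ValueError(f"use_tts must be one of {sorted(ALLOWED)}, got {value!r}")
    return value


print(read_use_tts(SAMPLE_INI))
```

Rejecting unknown values early keeps a typo in `params.ini` from silently falling back to image processing.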

#### Restrictions

Currently, `use_tts` enables one of two modes. In mixed mode, image processing is still used for item and affix
position detection, but TTS is used for everything text-related. This results in a small improvement in performance
and a major improvement in accuracy. In full mode, only TTS is used.

**The following is currently supported using use_tts=mixed:**

- Full item detection for all wearable items, e.g. armor, weapons, and accessories. Both in `vision_mode` and
`loot_filter`.
- Basic item detection for all? other items, e.g. only type + rarity

**The following is currently supported using use_tts=full:**

- Full item detection for all wearable items, e.g. armor, weapons, and accessories. Only in `loot_filter`.
- For everything else, mixed mode is used

We might also discontinue the pure image processing mode in the future, as TTS is easier to maintain.

### Configs

The config folder in `C:/Users/<WINDOWS_USER>/.d4lf` contains:
Expand All @@ -75,17 +106,18 @@ The config folder in `C:/Users/<WINDOWS_USER>/.d4lf` contains:
| [general] | Description |
|---------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| profiles | A set of profiles separated by comma. d4lf will look for these yaml files in config/profiles and in C:/Users/WINDOWS_USER/.d4lf/profiles |
| keep_aspects | - `all`: Keep all legendary items <br>- `upgrade`: Keep all legendary items that upgrade your codex of power <br>- `none`: Keep no legendary items based on aspect (they are still filtered!) |
| browser | Which browser to use to get builds, please make sure you pick an installed browser: chrome, edge or firefox are currently supported |
| check_chest_tabs | Which chest tabs will be checked and filtered for items in case chest is open when starting the filter. You need to buy all slots. Counting is done left to right. E.g. 1,2,4 will check tab 1, tab 2, tab 4 |
| full_dump | When using the import build feature, whether to use the full dump (e.g. contains all filter items) or not |
| handle_rares | - `filter`: Filter them based on your profiles <br>- `ignore`: Ignores all rares, vision mode shows them as blue and auto mode never junks or favorites them <br>- `junk`: Vision mode shows them always as red, auto mode always junks rares |
| handle_uniques | How to handle uniques that do not match any filter. This property does not apply to filtered uniques. All mythics are favorited regardless of filter. <br/>- `favorite`: Mark the unique as favorite and vision mode will show it as green (default)<br/>- `ignore`: Do nothing with the unique and vision mode will show it as green<br/>- `junk`: Mark any uniques that don't match any filters as junk and show as red in vision mode |
| run_vision_mode_on_startup | If the vision mode should automatically start when starting d4lf. Otherwise has to be started manually with the vision button or the hotkey |
| check_chest_tabs | Which chest tabs will be checked and filtered for items in case chest is open when starting the filter. You need to buy all slots. Counting is done left to right. E.g. 1,2,4 will check tab 1, tab 2, tab 4 |
| move_to_inv_item_type<br/>move_to_stash_item_type | Which types of items to move when using fast move functionality. Will only affect tabs defined in check_chest_tabs. You can select more than one option. <br>- `favorites`: Move favorites only <br>- `junk`: Move junk only <br>- `unmarked`: Only items not marked as favorite or junk <br>- `everything`: Move everything |
| hidden_transparency | The overlay will become transparent after not hovering it for a while. This can be changed by specifying any value between [0, 1] with 0 being completely invisible and 1 completely visible |
| keep_aspects | - `all`: Keep all legendary items <br>- `upgrade`: Keep all legendary items that upgrade your codex of power <br>- `none`: Keep no legendary items based on aspect (they are still filtered!) |
| mark_as_favorite | Whether to favorite matched items or not. Defaults to true |
| minimum_overlay_font_size | The minimum font size for the vision overlay, specifically the green text that shows which filter(s) are matching. Note: For small profile names, the font may actually be larger than this size but will never go below this size. |
| hidden_transparency | The overlay will become transparent after not hovering it for a while. This can be changed by specifying any value between [0, 1] with 0 being completely invisible and 1 completely visible |
| browser | Which browser to use to get builds, please make sure you pick an installed browser: chrome, edge or firefox are currently supported |
| full_dump | When using the import build feature, whether to use the full dump (e.g. contains all filter items) or not |
| move_to_inv_item_type<br/>move_to_stash_item_type | Which types of items to move when using fast move functionality. Will only affect tabs defined in check_chest_tabs. You can select more than one option. <br>- `favorites`: Move favorites only <br>- `junk`: Move junk only <br>- `unmarked`: Only items not marked as favorite or junk <br>- `everything`: Move everything |
| run_vision_mode_on_startup | If the vision mode should automatically start when starting d4lf. Otherwise has to be started manually with the vision button or the hotkey |
| use_tts | use TTS instead of OCR, see [TTS](#TTS) |

| [char] | Description |
|-----------|-----------------------------------|
Expand Down Expand Up @@ -423,45 +455,46 @@ in [assets/lang/enUS/uniques.json](assets/lang/enUS/uniques.json).

### Python Setup

- You can use [miniconda](https://docs.conda.io/projects/miniconda/en/latest/) or just plain python.
- You can use plain python or something like [miniconda](https://docs.conda.io/projects/miniconda/en/latest/).

Conda setup:
Python setup (windows, linux venv activation differs):

```bash
git clone https://github.com/aeon0/d4lf
cd d4lf
conda env create -f environment.yml
conda activate d4lf
python -m venv venv
venv\Scripts\activate
python -m pip install -r requirements.txt
python -m src.main
```

Python setup (windows, linux venv activation differs):
Conda setup:

```bash
git clone https://github.com/aeon0/d4lf
cd d4lf
python -m venv venv
venv\Scripts\activate
python -m pip install -r requirements.txt
conda env create -f environment.yml
conda activate d4lf
python -m src.main
```

### Formatting & Linting

Ruff is used for linting and auto formatting. You can run it with:
Just use pre-commit.

```bash
ruff format
pre-commit install
```

or directly via

```bash
ruff check
pre-commit run -a
```

Set up VS Code by using the ruff extension. Also turn on "trim trailing whitespaces" in the VS Code settings.

## Credits

- Icon based on: [CarbotAnimations](https://www.youtube.com/carbotanimations/about)
- Some of the OCR code is originally from [@gleed](https://github.com/aliig). Good guy.
- Some of the OCR code is originally from [@gleed](https://github.com/aliig). Good guy
- Names and textures for matching from [Blizzard](https://www.blizzard.com)
- Thanks to NekrosStratia for the initial idea and help with TTS mode
2 changes: 1 addition & 1 deletion src/__init__.py
Expand Up @@ -2,4 +2,4 @@

TP = concurrent.futures.ThreadPoolExecutor()

__version__ = "5.9.0alpha1"
__version__ = "5.9.0alpha2"
2 changes: 2 additions & 0 deletions src/config/__init__.py
Expand Up @@ -8,4 +8,6 @@ def get_base_dir(bundled: bool = False) -> Path:
return Path(__file__).parent.parent.parent


AFFIX_COMPARISON_CHARS = 60

BASE_DIR = get_base_dir(False)
36 changes: 21 additions & 15 deletions src/config/models.py
Expand Up @@ -29,17 +29,21 @@ class AspectFilterType(enum.StrEnum):
upgrade = enum.auto()


class ComparisonType(enum.StrEnum):
larger = enum.auto()
smaller = enum.auto()


class HandleRaresType(enum.StrEnum):
filter = enum.auto()
ignore = enum.auto()
junk = enum.auto()


class MoveItemsType(enum.StrEnum):
everything = enum.auto()
favorites = enum.auto()
junk = enum.auto()
unmarked = enum.auto()
class ItemRefreshType(enum.StrEnum):
force_with_filter = enum.auto()
force_without_filter = enum.auto()
no_refresh = enum.auto()


class LogLevels(enum.StrEnum):
Expand All @@ -50,21 +54,23 @@ class LogLevels(enum.StrEnum):
critical = enum.auto()


class MoveItemsType(enum.StrEnum):
everything = enum.auto()
favorites = enum.auto()
junk = enum.auto()
unmarked = enum.auto()


class UnfilteredUniquesType(enum.StrEnum):
favorite = enum.auto()
ignore = enum.auto()
junk = enum.auto()


class ComparisonType(enum.StrEnum):
larger = enum.auto()
smaller = enum.auto()


class ItemRefreshType(enum.StrEnum):
force_with_filter = enum.auto()
force_without_filter = enum.auto()
no_refresh = enum.auto()
class UseTTSType(enum.StrEnum):
full = enum.auto()
mixed = enum.auto()
off = enum.auto()


class _IniBaseModel(BaseModel):
Expand Down Expand Up @@ -279,7 +285,7 @@ class GeneralModel(_IniBaseModel):
"C:/Users/USERNAME/.d4lf/profiles/*.yaml",
)
run_vision_mode_on_startup: bool = Field(default=True, description="Whether to run vision mode on startup or not")
use_tts: bool = Field(default=False, description="Whether to use tts or not")
use_tts: UseTTSType = Field(default=UseTTSType.off, description="Whether to use tts or not")

@field_validator("check_chest_tabs", mode="before")
def check_chest_tabs_index(cls, v: str) -> list[int]:
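
The change of `use_tts` from `bool` to `UseTTSType` means the setting is now one of three named values. A minimal sketch of how a raw config string could be mapped onto that enum follows. `parse_use_tts` is a hypothetical helper, not code from this PR, and the enum is written with explicit values so the sketch also runs on Python versions without `enum.StrEnum`.

```python
import enum


class UseTTSType(str, enum.Enum):
    # Explicit values so the sketch also runs before Python 3.11;
    # the diff itself uses enum.StrEnum with enum.auto().
    full = "full"
    mixed = "mixed"
    off = "off"


def parse_use_tts(raw: str) -> UseTTSType:
    """Hypothetical helper: map a raw config string to UseTTSType."""
    try:
        return UseTTSType(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(member.value for member in UseTTSType)
        raise ValueError(f"use_tts must be one of: {allowed}") from None


mode = parse_use_tts("Mixed")
print(mode is UseTTSType.mixed)
```

In this sketch a value such as `true`, which was valid under the old `bool` field, is rejected. How d4lf itself treats such legacy values is not shown in the diff.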
4 changes: 2 additions & 2 deletions src/dataloader.py
Expand Up @@ -2,7 +2,7 @@
import logging
import threading

from src.config import BASE_DIR
from src.config import AFFIX_COMPARISON_CHARS, BASE_DIR
from src.config.loader import IniConfigLoader
from src.item.data.item_type import ItemType

Expand Down Expand Up @@ -67,5 +67,5 @@ def load_data(self):
with open(BASE_DIR / f"assets/lang/{IniConfigLoader().general.language}/uniques.json", encoding="utf-8") as f:
data = json.load(f)
for key, d in data.items():
self.aspect_unique_dict[key] = d["desc"][:60]
self.aspect_unique_dict[key] = d["desc"][:AFFIX_COMPARISON_CHARS]
self.aspect_unique_num_idx[key] = d["num_idx"]
28 changes: 27 additions & 1 deletion src/item/data/item_type.py
Expand Up @@ -33,11 +33,14 @@ class ItemType(Enum):
Wand = "wand"
# Custom Types
Compass = "compass"
Consumable = "consumable"
Gem = "gem"
Incense = "incense"
Material = "material"
Rune = "rune"
Sigil = "nightmare sigil"
Tribute = "Tribute"
TemperManual = "temper manual"
Tribute = "tribute"


def is_armor(item_type: ItemType) -> bool:
Expand All @@ -50,13 +53,36 @@ def is_armor(item_type: ItemType) -> bool:
]


def is_consumable(item_type: ItemType) -> bool:
return item_type in [
ItemType.Consumable,
ItemType.Elixir,
ItemType.Incense,
ItemType.TemperManual,
]


def is_mapping(item_type: ItemType) -> bool:
return item_type in [
ItemType.Compass,
ItemType.Sigil,
]


def is_jewelry(item_type: ItemType) -> bool:
return item_type in [
ItemType.Amulet,
ItemType.Ring,
]


def is_socketable(item_type: ItemType) -> bool:
return item_type in [
ItemType.Gem,
ItemType.Rune,
]


def is_weapon(item_type: ItemType) -> bool:
return item_type in [
ItemType.Axe,
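
To illustrate how the new category predicates might be used together, here is a small sketch on a reduced copy of `ItemType`. Only members visible in this diff are included, the value of `Elixir` is an assumption, and `categorize` is a hypothetical helper that does not exist in the PR.

```python
from enum import Enum


class ItemType(Enum):
    # Reduced copy for illustration; values are those shown in the diff,
    # except Elixir, whose value is an assumption.
    Elixir = "elixir"  # assumed value, not visible in the diff
    Compass = "compass"
    Consumable = "consumable"
    Gem = "gem"
    Incense = "incense"
    Material = "material"
    Rune = "rune"
    Sigil = "nightmare sigil"
    TemperManual = "temper manual"
    Tribute = "tribute"


def is_consumable(item_type: ItemType) -> bool:
    return item_type in [ItemType.Consumable, ItemType.Elixir, ItemType.Incense, ItemType.TemperManual]


def is_mapping(item_type: ItemType) -> bool:
    return item_type in [ItemType.Compass, ItemType.Sigil]


def is_socketable(item_type: ItemType) -> bool:
    return item_type in [ItemType.Gem, ItemType.Rune]


def categorize(item_type: ItemType) -> str:
    """Hypothetical helper: name the first category that matches."""
    checks = (("consumable", is_consumable), ("mapping", is_mapping), ("socketable", is_socketable))
    for label, predicate in checks:
        if predicate(item_type):
            return label
    return "other"


print(categorize(ItemType("nightmare sigil")))
print(categorize(ItemType("tribute")))
```

With `Tribute` now lowercased, a lookup by value such as `ItemType("tribute")` succeeds in this sketch, consistent with the other members.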
4 changes: 2 additions & 2 deletions src/item/descr/find_aspect.py
Expand Up @@ -2,6 +2,7 @@

import numpy as np

from src.config import AFFIX_COMPARISON_CHARS
from src.dataloader import Dataloader
from src.item.data.aspect import Aspect
from src.item.descr.text import clean_str, closest_match, find_number
Expand All @@ -22,8 +23,7 @@ def find_aspect(img_item_descr: np.ndarray, aspect_bullet: TemplateMatch, do_pre
concatenated_str = image_to_text(img_full_aspect, do_pre_proc=do_pre_proc).text.lower().replace("\n", " ")
cleaned_str = clean_str(concatenated_str)

# Note: If you adjust the [:45] it also needs to be adapted in the dataloader
cleaned_str = cleaned_str[:60]
cleaned_str = cleaned_str[:AFFIX_COMPARISON_CHARS]
found_key = closest_match(cleaned_str, Dataloader().aspect_unique_dict)
num_idx = Dataloader().aspect_unique_num_idx

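
The point of `AFFIX_COMPARISON_CHARS` is that the dataloader and `find_aspect` must cut their strings at the same length before matching. The sketch below shows that idea under stated assumptions: `difflib.get_close_matches` stands in for the project's `closest_match`, whose implementation is not part of this diff, and the keys and descriptions are made-up sample data.

```python
import difflib
from typing import Dict, Optional

AFFIX_COMPARISON_CHARS = 60  # value from src/config/__init__.py in this diff


def build_lookup(descriptions: Dict[str, str]) -> Dict[str, str]:
    """Truncate every reference description to the shared prefix length."""
    return {key: desc[:AFFIX_COMPARISON_CHARS] for key, desc in descriptions.items()}


def match_key(text: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the key whose truncated description is closest to the text prefix."""
    prefix = text[:AFFIX_COMPARISON_CHARS]
    # Stand-in for the project's closest_match; the cutoff is an arbitrary choice.
    hits = difflib.get_close_matches(prefix, list(lookup.values()), n=1, cutoff=0.6)
    if not hits:
        return None
    return next(key for key, value in lookup.items() if value == hits[0])


# Made-up sample data, not taken from uniques.json.
lookup = build_lookup({
    "aspect_a": "lucky hit: up to a 40% chance to deal lightning damage to nearby enemies and more text",
    "aspect_b": "gain increased movement speed for 3 seconds after killing an elite enemy and so on",
})
noisy = "lucky hit: up to a 4O% chance to deal lightnlng damage to nearby enemies plus trailing noise"
print(match_key(noisy, lookup))
```

Because both sides use the same constant, changing the prefix length in one place cannot leave the two out of sync, which the removed `[:60]` literals and their comment could not guarantee.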