text: Add README.md for vt transactions
Signed-off-by: Andres Taylor <[email protected]>
systay committed Nov 22, 2024
1 parent 030f025 commit 860e861
Showing 2 changed files with 92 additions and 0 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -5,6 +5,7 @@ The `vt` binary encapsulates several utility tools for Vitess, providing a compr
## Tools Included
- **`vt test`**: A testing utility using the same test files as the [MySQL Test Framework](https://github.com/mysql/mysql-server/tree/8.0/mysql-test). It compares the results of identical queries executed on both MySQL and Vitess (vtgate), helping to ensure compatibility.
- **`vt keys`**: A utility that analyzes query logs and provides information about queries, tables, joins, and column usage.
- **`vt transactions`**: A tool that analyzes query logs to identify transaction patterns and outputs a JSON report detailing these patterns.
- **`vt trace`**: A tool that generates execution traces for queries without comparing against MySQL. It helps analyze query behavior and performance in Vitess environments.
- **`vt summarize`**: A tool used to summarize or compare trace logs or key logs for easier human consumption.

@@ -116,6 +117,11 @@ This command generates a `keys-log.json` file that contains a detailed analysis
This command summarizes the key analysis, providing insight into which tables and columns are used across queries, and how frequently they are involved in filters, groupings, and joins.
[Here](https://github.com/vitessio/vt/blob/main/go/summarize/testdata/keys-summary.md) is an example summary report.

## Transaction Analysis with vt transactions
The `vt transactions` command analyzes query logs and identifies patterns of transactional queries.
It finds sequences of queries that form transactions and outputs a JSON report summarizing these patterns.
For usage details and guidance on reading the output, see the [vt transactions documentation](./go/transactions/README.md).

## Using `--backup-path` Flag

The `--backup-path` flag allows `vt test` and `vt trace` to initialize tests from a database backup rather than an empty database.
86 changes: 86 additions & 0 deletions go/transactions/README.md
@@ -0,0 +1,86 @@
# VT Transactions

The `vt transactions` command is a sub-command of the `vt` toolset. It analyzes query logs, identifies transaction patterns, and produces a JSON report summarizing these patterns.
This tool is particularly useful for understanding complex transaction behaviors, optimizing database performance, choosing a sharding strategy, and auditing transactional queries.

## Usage

The basic usage of `vt transactions` is:

```bash
vt transactions querylog.log > report.json
```

* `querylog.log`: The input query log file. This can be in various formats, such as SQL files, slow query logs, MySQL general query logs, or VTGate query logs.
* `report.json`: The output JSON file containing the transaction patterns.

### Supported Input Types

`vt transactions` supports different input file formats through the `--input-type` flag:
* Default: Assumes the input is a SQL file (including SQL scripts) or a slow query log.
* MySQL General Query Log: Use `--input-type=mysql-log` for MySQL general query logs.
* VTGate Query Log: Use `--input-type=vtgate-log` for VTGate query logs.

## Understanding the JSON Output

The output JSON file contains an array of transaction patterns, each summarizing a set of queries that commonly occur together within transactions. Here’s a snippet of the JSON output:

```json
{
  "query-signatures": [
    "update pos_reports where id = :0 set `csv`, `error`, intraday, pos_type, ...",
    "update pos_date_requests where cache_key = :1 set cache_value"
  ],
  "predicates": [
    "pos_date_requests.cache_key = ?",
    "pos_reports.id = ?"
  ],
  "count": 223
}
```

### Fields Explanation

* `query-signatures`: An array of generalized query patterns involved in the transaction. Placeholders like `:0`, `:1`, etc., represent variables in the queries.
* `predicates`: An array of predicates (conditions) extracted from the queries, generalized to identify patterns.
* `count`: The number of times this transaction pattern was observed in the logs.
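
To make the field descriptions concrete, here is a minimal Python sketch that ranks patterns by `count`. It assumes the report is a JSON array of objects shaped like the snippet above; the helper name `top_patterns` and the sample data (tables `orders`, `items`, `sessions`) are hypothetical and not taken from a real report.

```python
import json


def top_patterns(report_text, limit=5):
    """Return the most frequently observed transaction patterns first."""
    # Assumption: the report is a JSON array of pattern objects.
    patterns = json.loads(report_text)
    return sorted(patterns, key=lambda p: p["count"], reverse=True)[:limit]


# Hypothetical report with two patterns, used only for illustration.
sample_report = json.dumps([
    {
        "query-signatures": ["update sessions where id = :0 set last_seen"],
        "predicates": ["sessions.id = ?"],
        "count": 7,
    },
    {
        "query-signatures": [
            "update orders where id = :0 set status",
            "update items where order_id = :0 set shipped",
        ],
        "predicates": ["orders.id = 0", "items.order_id = 0"],
        "count": 40,
    },
])

for pattern in top_patterns(sample_report):
    print(pattern["count"], pattern["query-signatures"])
```

Sorting by `count` is only one way to prioritize; a real analysis might also weigh how many tables each pattern touches.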

### Understanding predicates

The `predicates` array lists the conditions used in the transactional queries, with variables generalized for pattern recognition.
* Shared Variables: If the same variable is used across different predicates within a transaction, it is assigned a numerical placeholder (e.g., `0`, `1`, `2`). Predicates that carry the same number refer to the same variable or value.
* Unique Variables: Variables that appear in only one predicate are represented with a `?`.

### Example Explained

Consider the following predicates array:

```json
{
  "predicates": [
    "timesheets.day = ?",
    "timesheets.craft_id = ?",
    "timesheets.store_id = ?",
    "dailies.day = 0",
    "dailies.craft_id = 1",
    "dailies.store_id = 2",
    "tickets.day = 0",
    "tickets.craft_id = 1",
    "tickets.store_id = 2"
  ]
}
```

* Shared Values: A value that appears in more than one predicate is assigned a numerical placeholder (`0`, `1`, `2`), indicating that the same variable or value is used in each of those predicates.
  * For example, `dailies.craft_id = 1` and `tickets.craft_id = 1` share the same variable or value (represented as `1`).
* Unique Values: A value that appears in only one predicate is represented with `?`, indicating a unique or less significant variable in the pattern.
  * For example, `timesheets.day = ?` represents a unique value for `day`.

This numbering helps identify the relationships between different predicates in the transaction patterns and can be used to optimize queries or understand transaction scopes.
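
The relationships described above can be extracted mechanically. The following Python sketch groups predicates by their shared placeholder. It assumes every predicate has the form `table.column = placeholder`, as in the example, and silently skips anything else; the name `group_shared` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: predicates look like "table.column = placeholder".
PREDICATE = re.compile(r"^(\w+)\.(\w+)\s*=\s*(\S+)$")


def group_shared(predicates):
    """Map each numeric placeholder to the columns that share it."""
    groups = defaultdict(list)
    for predicate in predicates:
        match = PREDICATE.match(predicate)
        if match is None:
            continue  # shape not covered by this sketch
        table, column, placeholder = match.groups()
        if placeholder != "?":  # "?" marks a value used only once
            groups[placeholder].append(f"{table}.{column}")
    return dict(groups)


predicates = [
    "timesheets.day = ?",
    "timesheets.craft_id = ?",
    "timesheets.store_id = ?",
    "dailies.day = 0",
    "dailies.craft_id = 1",
    "dailies.store_id = 2",
    "tickets.day = 0",
    "tickets.craft_id = 1",
    "tickets.store_id = 2",
]

for placeholder, columns in group_shared(predicates).items():
    print(placeholder, columns)
```

For the example array, placeholder `0` links `dailies.day` with `tickets.day`, `1` links the two `craft_id` columns, and `2` links the two `store_id` columns, while the `timesheets` predicates are left out because they use `?`.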

## Practical Use Cases

* Optimization: Identify frequently occurring transactions to optimize database performance.
* Sharding Strategy: When implementing horizontal sharding, it’s crucial to ensure that as many transactions as possible are confined to a single shard. The insights from `vt transactions` can help in choosing appropriate sharding keys for your tables to achieve this.
* Audit: Analyze transactional patterns for security audits or compliance checks.
* Debugging: Understand complex transaction behaviors during development or troubleshooting.
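
As one possible starting point for the sharding use case, the Python sketch below tallies which sets of tables appear together in a transaction, weighted by `count`. Tables that frequently co-occur are candidates for sharing a sharding key. It assumes the list of patterns has already been loaded from the report and that table names can be read from the `table.column` prefix of each predicate; the function name and the second sample pattern are hypothetical.

```python
from collections import Counter


def co_occurring_tables(patterns):
    """Rank sets of tables that appear in the same transaction pattern."""
    totals = Counter()
    for pattern in patterns:
        # Assumption: the table name is the part before the first dot.
        tables = {p.split(".", 1)[0] for p in pattern["predicates"] if "." in p}
        if len(tables) > 1:  # single-table transactions need no co-location
            totals[tuple(sorted(tables))] += pattern["count"]
    return totals.most_common()


patterns = [
    # Predicates and count taken from the output snippet earlier in this README.
    {
        "predicates": ["pos_date_requests.cache_key = ?", "pos_reports.id = ?"],
        "count": 223,
    },
    # Hypothetical single-table pattern, included to show that it is ignored.
    {"predicates": ["sessions.id = ?"], "count": 50},
]

for tables, weight in co_occurring_tables(patterns):
    print(weight, tables)
```

This only shows which tables travel together. Whether they can share a shard still depends on the predicates using a common value, which is what the numbered placeholders indicate.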
