Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tortoise queries with long JOIN aliases raise exceptions when using Postgres as the backend database #1902

Open
mudetz opened this issue Feb 28, 2025 · 0 comments

Comments

@mudetz
Copy link

mudetz commented Feb 28, 2025

Describe the bug
Hey guys, I'm using tortoise in production and I found a Postgres specific bug.
Postgres has a limitation where identifiers must be at most 63 character long. This means in practice that if you create a column name or alias which is longer than 63 bytes long, Postgres will truncate it.
This limit can be configured but it requires re-compiling Postgres which is not practical when using cloud-managed solutions.

Here's an example

SELECT 1 AS "my_very_very_very_long_unpractical_and_almost_comical_column_name_hehe";

returns

my_very_very_very_long_unpractical_and_almost_comical_column_nam
---
1

This is a problem for tortoise which relies on columns aliases to resolve field names and build related models when doing joins, or even if any field has a very long name.

To Reproduce
This is a kind of comical example that abuses the length of field names. But note, in production models tend to have descriptive names and it is not uncommon to require many joins when executing a queryset with .select_related(), which generates very long aliases by joining field names using double underscores.

# file models.py
import tortoise
import uuid


class Author(tortoise.models.Model):
	id = tortoise.fields.UUIDField(primary_key=True, default=uuid.uuid4)


class Book(tortoise.models.Model):
	id = tortoise.fields.UUIDField(primary_key=True, default=uuid.uuid4)
	author_model_relation_with_long_name: tortoise.fields.ForeignKeyRelation[Author] = tortoise.fields.ForeignKeyField(
		"models.Author"
	)


class Chapter(tortoise.models.Model):
	id = tortoise.fields.UUIDField(primary_key=True, default=uuid.uuid4)
	book_model_relation_with_long_name: tortoise.fields.ForeignKeyRelation[Book] = tortoise.fields.ForeignKeyField(
		"models.Book"
	)
#file main.py
import models
import tortoise

async def main():
	await tortoise.Tortoise.init(
		db_url="postgres://user:password@host/dbname",
		modules={"models": ["models"]},
	)
	await tortoise.Tortoise.generate_schemas()

	author = await models.Author.create()
	book = await models.Book.create(author_model_relation_with_long_name=author)
	chapter = await models.Chapter.create(book_model_relation_with_long_name=book)

	# Joins generate colums aliases with lenghth >63 chars
	queryset = models.Chapter.all().select_related(
		"book_model_relation_with_long_name",
		"book_model_relation_with_long_name__author_model_relation_with_long_name",
	)
	print(queryset.sql(), end="\n\n")


	# upon selecting, column aliases get truncated to 63 chars in Postgres response records
	# tortoise does not expect return aliases to be truncated
	join = await queryset.latest("id")  # BOOM!

tortoise.run_async(main())

For executing

python main.py

Which raises

IndexError: list index out of range

in

tortoise/backends/base/executor.py", line 120, in execute_select
    obj = model._init_from_db(**{k.split(".")[1]: v for k, v in related_items})

Note that the issue is having an alias which is very long

SELECT
    ....
    ,"chapter__book_model_relation_with_long_name__author_model_relation_with_long_name"."id" "chapter__book_model_relation_with_long_name__author_model_relation_with_long_name.id"
    ...
FROM ...

and the returned alias from Postgres does not include the dot . that tortoise needed to split the field name from the model reference alias.

Expected behavior
I expect this not to raise an exception and work as any other select with joins.

Additional context
AS tortoise generates the SQL to be executed, it knows in which order the rows will be returned from asyncpg. Tortoise could probably use the column index of each row and map that index to the long alias name instead of relying in the alias returned by Postgres.
I'm not sure how hard this is to implement. If you guys could provide any comments I could give a try to solve this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant