add running totals functionality #76
Codecov Report
@@ Coverage Diff @@
## master #76 +/- ##
==========================================
+ Coverage 92.97% 93.18% +0.20%
==========================================
Files 59 62 +3
Lines 3828 4106 +278
Branches 248 279 +31
==========================================
+ Hits 3559 3826 +267
- Misses 224 231 +7
- Partials 45 49 +4
I have now updated this PR to a working state. It works, but it still has some problems:
@PetrDlouhy I was thinking about this and would recommend a different approach if you're open to it. To speed up In the SO post, it's called Key Columns of the
Functionality
Differences from StackOverflow (SO)
Benefits
Drawbacks
Thoughts?
Some quick thoughts:
@PetrDlouhy thanks for the feedback:
Reservations with current RunningTotals
Agreed that signals make me pause a bit, but I don't think that's a blocker for this feature. In my eyes the biggest concern in the current architecture, is that One is - while it's been correctly identified a table Separately, looking at
Recommendations
As it stands, any organization like our own would not be able to adopt this feature because we rely on
@nitsujri That seems very reasonable, although I am not sure whether I will find enough time for this in the near future.
This discussion sounds interesting and good to me. I have been aware of this possibly becoming a performance issue. Question: What kind of scale are you seeing this become an issue at? (i.e. roughly how many legs do you have on a big account) I'm very much in favour of this being done in-database rather than in-django. I think there are a couple of sides of this:
I could also see this functionality being provided rather easily by a PostgreSQL materialised view, but that would be PostgreSQL-only (and
I just had a little play with the SQL required to generate balances for all accounts (which would be useful for implementing this as a materialised view). Not sure if this will be useful, but I need to run so I'll leave it here in case it is:
UPDATE: Ignore this old implementation, instead see the new version in source.
SELECT
    A.id AS account_id,
    L.*
FROM hordak_account A
INNER JOIN LATERAL
(
    SELECT
        L2.amount_currency AS balance_currency,
        COALESCE(SUM(L2.amount), 0.0) AS balance,
        MAX(L2.id) AS calculated_to_leg_id
    FROM hordak_account A2
    INNER JOIN public.hordak_leg L2 ON L2.account_id = A2.id
    WHERE A2.lft >= A.lft AND A2.rght <= A.rght AND A.tree_id = A2.tree_id
    GROUP BY L2.amount_currency
) L ON True;
The results will be unique on
EDIT 1: Related: Calculating the balances for a list of accounts is also slow because this is all calculated in Python/Django, and not directly in the database. If we had a database function for account balance calculations (
EDIT 2: A question occurs to me: Is a
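For readers less familiar with nested-set trees, here is a minimal pure-Python sketch of what the query above appears to compute: for each account, the sum of the legs of every account in its subtree (matched on `tree_id`, `lft` and `rght`), grouped by currency, together with the highest leg id included. The sample accounts and legs are made up for illustration, and `subtree_balances` is a hypothetical helper, not part of Hordak.

```python
from decimal import Decimal

# Hypothetical sample data in nested-set form: (id, tree_id, lft, rght).
# Account 1 is the root; accounts 2 and 3 are its children.
accounts = [
    (1, 1, 1, 6),
    (2, 1, 2, 3),
    (3, 1, 4, 5),
]

# Hypothetical legs: (leg_id, account_id, amount, currency).
legs = [
    (10, 2, Decimal("5.00"), "EUR"),
    (11, 3, Decimal("7.50"), "EUR"),
    (12, 3, Decimal("2.00"), "USD"),
    (13, 1, Decimal("1.00"), "EUR"),
]


def subtree_balances(accounts, legs):
    """Return {(account_id, currency): (balance, calculated_to_leg_id)}."""
    nodes = {acc_id: (tree_id, lft, rght) for acc_id, tree_id, lft, rght in accounts}
    result = {}
    for acc_id, (tree_id, lft, rght) in nodes.items():
        for leg_id, leg_account, amount, currency in legs:
            leg_tree, leg_lft, leg_rght = nodes[leg_account]
            # Same containment test as the WHERE clause: the leg's account
            # lies inside this account's (lft, rght) interval in the same tree.
            if leg_tree == tree_id and leg_lft >= lft and leg_rght <= rght:
                balance, max_leg = result.get((acc_id, currency), (Decimal("0"), 0))
                result[(acc_id, currency)] = (balance + amount, max(max_leg, leg_id))
    return result


balances = subtree_balances(accounts, legs)
for key in sorted(balances):
    print(key, balances[key])
```

As with the inner join in the SQL, an account whose subtree has no legs in a given currency simply produces no entry for that currency.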
@adamcharnock Our service has ~1M users (x2 Hordak accounts) and ~13M transactions (x2 legs). Some accounts have a much larger number of transactions than others; I would expect the maximum to reach 1M transactions per account. We are using PostgreSQL, so a PostgreSQL-only solution is not a problem for us.
That SQL function doesn't work very quickly for me. This is
Oof, 31 seconds. Ok, I'll take another look and see what I can do. Off-the-cuff thoughts:
I don't use any child accounts. I just have 2 accounts for every user, plus 3 internal accounts for the inbound/outbound transactions.
Ah. Yes, I see you are (reasonably) using the function from the comment above. I've improved this now and you can find the better version here: Once you've run this SQL, you should be able to do this:
That's 420ms for an account with 1 million legs. What do you see on your side? UPDATE: This is on an M2 MacBook. Also, if I just calculate the balance for the one account (not including children) then it shaves about 30-40% off the execution time. This is a decent win, but I think we'll get bigger gains from adding running totals, as per this PR. I'm copying this comment to the #126 PR too.
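To make the running-totals idea concrete, here is a rough sketch of one possible shape for it, not the implementation in this PR: store a balance together with the id of the last leg it covers, then add only the legs newer than that checkpoint when a balance is requested. The `leg` and `running_total` tables below are hypothetical and much simpler than Hordak's schema. The sketch uses SQLite from the standard library so it runs anywhere, keeps amounts as integer minor units in a single currency, and assumes leg ids only ever increase and that legs are never edited or deleted once checkpointed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables (not Hordak's real schema).
    CREATE TABLE leg (id INTEGER PRIMARY KEY, account_id INTEGER, amount INTEGER);
    CREATE TABLE running_total (
        account_id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL,
        calculated_to_leg_id INTEGER NOT NULL
    );
""")


def stored_total(conn, account_id):
    """Return the checkpointed (balance, calculated_to_leg_id), or (0, 0)."""
    row = conn.execute(
        "SELECT balance, calculated_to_leg_id FROM running_total WHERE account_id = ?",
        (account_id,),
    ).fetchone()
    return row if row else (0, 0)


def current_balance(conn, account_id):
    """Checkpointed balance plus only the legs added after the checkpoint."""
    balance, checkpoint = stored_total(conn, account_id)
    delta, max_id = conn.execute(
        "SELECT COALESCE(SUM(amount), 0), COALESCE(MAX(id), ?) "
        "FROM leg WHERE account_id = ? AND id > ?",
        (checkpoint, account_id, checkpoint),
    ).fetchone()
    return balance + delta, max_id


def refresh_running_total(conn, account_id):
    """Fold the newer legs into the stored running total."""
    balance, max_id = current_balance(conn, account_id)
    conn.execute(
        "INSERT OR REPLACE INTO running_total "
        "(account_id, balance, calculated_to_leg_id) VALUES (?, ?, ?)",
        (account_id, balance, max_id),
    )
    return balance


conn.executemany(
    "INSERT INTO leg (id, account_id, amount) VALUES (?, ?, ?)",
    [(1, 1, 500), (2, 1, -200), (3, 2, 999)],
)
refresh_running_total(conn, 1)  # checkpoint account 1 at leg 2
conn.execute("INSERT INTO leg (id, account_id, amount) VALUES (4, 1, 50)")
print(current_balance(conn, 1))  # only leg 4 is summed on top of the checkpoint
```

With a checkpoint in place, the cost of a balance lookup depends on the number of legs added since the last refresh rather than on the full history of the account.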
With a huge number of transactions, I started to have performance problems when calculating account totals.
I tried to start keeping running totals for the accounts, but I realized that the task is more complicated than I anticipated. I ended up optimizing the performance of the server instead, but I will probably need to implement this functionality eventually.
I am leaving the work-in-progress code here in case anyone is interested in finishing it.