After some early successes using `formulas` and `schedula` to calculate Excel LCAs, we have run into memory-usage trouble when processing larger sheets. Some analysis shows that most of the memory is held by the `dsp` property of the `formulas` model, which is a `schedula` object, and I was hoping for advice on what I could do to reduce memory usage there so that we can process larger spreadsheets.
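For context, the kind of analysis I mean can be done with only the stdlib; `deep_size` below is a hypothetical helper, not part of `formulas` or `schedula` (pympler's `asizeof` is a more careful third-party alternative):

```python
# Rough stdlib estimate of how much memory an object graph holds.
import gc
import sys

def deep_size(obj):
    """Approximate bytes held by obj and everything reachable from it."""
    seen = set()
    stack = [obj]
    total = 0
    while stack:
        o = stack.pop()
        if id(o) in seen:
            continue
        seen.add(id(o))
        total += sys.getsizeof(o)
        # Follow references; this also walks shared objects such as
        # classes and modules, so treat the result as an upper bound.
        stack.extend(gc.get_referents(o))
    return total
```

Comparing `deep_size(xl_model.dsp)` against the process RSS is what pointed me at the `dsp` graph (`xl_model` here stands for the loaded `formulas` model).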
In a successful run on a host with 512GB of RAM, from a 5M-cell Excel file, I am able to load the 2.2M cells needed for my LCA calculation, with 165GB of total memory used. With logging I added, I can see that about 2GB is required to load the sheet into OpenPyXL, an additional 125GB to create the cells in `formulas`, an additional 10GB to `.finish()` the `formulas` object, and an additional 24GB to `.compile()` this into a function. For more details see excelsior-successful-run.txt.
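The logging itself is nothing special; a minimal stdlib version of the per-stage measurement, assuming Linux (where `ru_maxrss` reports peak RSS in KiB), looks like:

```python
# Minimal per-stage memory logging; peak RSS via getrusage.
import logging
import resource

logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_peak_rss(stage):
    # On Linux ru_maxrss is in KiB (on macOS it is in bytes).
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    logging.info("%s: peak RSS %.2f GiB", stage, peak_kib / 2**20)

log_peak_rss("after OpenPyXL load")
```

Calling `log_peak_rss(...)` after each stage (OpenPyXL load, cell creation, `.finish()`, `.compile()`) gives the per-stage deltas quoted above.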
In an unsuccessful run on a host with 512GB of RAM, from a 9.4M-cell Excel file, my program is killed by the Linux OOM killer after creating 8.8M cells and consuming 498GB of RAM. Logging indicates 5GB is used to load the Excel file with OpenPyXL, after which memory usage climbs as `formulas` adds cells until it is exhausted. For more details see excelsior-failed-run.txt.
My code to load the data and compile the function looks like this:
Any advice is appreciated!