Large amount of time spent on cPickle.load() #15

Open
bramvdh91 opened this issue Jun 15, 2018 · 3 comments

@bramvdh91
Contributor

Profiling example.py shows that a lot of time is spent loading a cPickle object while compiling the output .txt files for a feeder. Users are usually not interested in the pickle files, which also take up a lot of space.

I would like to test whether the code can be sped up by writing only the variables of interest directly.
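
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The `load_many` function is a hypothetical stand-in workload, not code from this project, and it uses Python 3's `pickle` in place of `cPickle`.

```python
import cProfile
import io
import pickle
import pstats


def load_many(n_loads=50):
    """Hypothetical stand-in workload: repeatedly unpickle the same payload."""
    payload = pickle.dumps({"P": list(range(10000))})
    for _ in range(n_loads):
        pickle.loads(payload)


# Profile a single call of the workload.
profiler = cProfile.Profile()
profiler.runcall(load_many)

# Sort by cumulative time and keep the ten most expensive entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

In the report, the unpickling calls show up as their own rows, which makes it easy to see what share of the run time they account for.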

image

@bramvdh91 bramvdh91 self-assigned this Jun 15, 2018
@cprotopa
Contributor

One problem is that the files are re-loaded for every variable. Calling the output function only once for all variables would already reduce the time. I once implemented this in a local version; I could look for it, but perhaps you can easily implement it yourself. I'm not sure whether it conflicts with other parts of the code, though, in case there is a place where you can choose which variables you actually want as output.
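
A minimal sketch of the difference, under stated assumptions: each house is modelled here as a pickled dict keyed by variable name, and the variable names and helper functions are hypothetical, not the project's actual API. It uses Python 3's `pickle` in place of `cPickle`.

```python
import os
import pickle
import tempfile

VARIABLES = ["P", "Q", "QRad"]  # hypothetical output variable names


def collect_per_variable(paths, variables):
    """Slow pattern: every house file is unpickled once per variable."""
    loads = 0
    columns = {}
    for var in variables:
        columns[var] = []
        for path in paths:
            with open(path, "rb") as f:
                house = pickle.load(f)
            loads += 1
            columns[var].append(house[var])
    return columns, loads


def collect_once(paths, variables):
    """Faster pattern: every house file is unpickled exactly once."""
    loads = 0
    columns = {var: [] for var in variables}
    for path in paths:
        with open(path, "rb") as f:
            house = pickle.load(f)
        loads += 1
        for var in variables:
            columns[var].append(house[var])
    return columns, loads


with tempfile.TemporaryDirectory() as tmp:
    # Create five small pickled "houses" to read back.
    paths = []
    for i in range(5):
        path = os.path.join(tmp, "house_%d.p" % i)
        with open(path, "wb") as f:
            pickle.dump({var: [i, i + 1, i + 2] for var in VARIABLES}, f)
        paths.append(path)

    cols_slow, loads_slow = collect_per_variable(paths, VARIABLES)
    cols_fast, loads_fast = collect_once(paths, VARIABLES)

print(loads_slow, loads_fast)
```

Both functions return the same columns; the second simply performs one load per house instead of one load per house per variable.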

@xavfa

xavfa commented Nov 5, 2020

Hi, I am not sure I understand. I have split the output function (in feeder) into two: one for the data aggregation, which saves into self.dat (in the fee object) and is called in place of hou.pickle(), and another that writes the ASCII files directly and is called in place of output(). I didn't find any other use of pickle. For the 5 buildings in the example, it saves less than 4 seconds on my laptop (32 GB RAM); the overall process takes 86 seconds with the above modifications. I guess these modifications are the same as the ones you mentioned above, but since they are part of the fee object I am not sure whether there are impacts elsewhere.

@cprotopa
Contributor

cprotopa commented Nov 6, 2020

I'm not sure what you mean exactly, but here is how the feeder model currently works: all houses are generated and saved as pickled files. Afterwards, all houses are loaded one by one, and the output files required for Modelica IDEAS simulations are created (one file per variable, each file has one column of data per house).
Previously, the pickled files were reloaded separately for each variable; this has already been fixed.

Skipping the pickling entirely is possible, in the way you describe. Each house that is created is then not saved to a file but kept in a variable. Each newly simulated house is added to that variable, and at the end you write everything at once.
I have experimented with that, and it is indeed faster for a small number of houses. If you don't need to save the results of individual houses, it's a good approach. HOWEVER, you risk running into memory problems as soon as you go to feeders above ~150-200 houses (depending on your PC, of course). Since it doesn't scale well with size, we haven't implemented it, so that nobody risks running into issues.
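
As a rough illustration of the in-memory approach, here is a minimal sketch. It is not the project's code: `simulate_house`, `run_feeder_in_memory`, the variable names, and the dict-based house data are all hypothetical placeholders.

```python
import os
import random
import tempfile

VARIABLES = ["P", "Q"]  # hypothetical output variable names
N_STEPS = 4             # hypothetical number of time steps per house


def simulate_house(seed):
    """Hypothetical stand-in for simulating one house."""
    rng = random.Random(seed)
    return {
        var: [round(rng.uniform(0, 100), 1) for _ in range(N_STEPS)]
        for var in VARIABLES
    }


def run_feeder_in_memory(n_houses, out_dir):
    # Accumulate one column per house for each variable; nothing is pickled.
    data = {var: [] for var in VARIABLES}
    for i in range(n_houses):
        house = simulate_house(i)
        for var in VARIABLES:
            data[var].append(house[var])

    # Write everything at the end: one file per variable,
    # one row per time step, one column per house.
    paths = []
    for var, columns in data.items():
        path = os.path.join(out_dir, var + ".txt")
        with open(path, "w") as f:
            for row in zip(*columns):
                f.write(" ".join(str(v) for v in row) + "\n")
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    out_paths = run_feeder_in_memory(3, tmp)
    with open(out_paths[0]) as f:
        lines = f.read().splitlines()

print(lines)
```

Because `data` holds every variable of every house until the final write, memory use grows with the number of houses, which is the scaling concern described above.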
