tweetset_loader looks at every file in the folder, counts the lines in each, and prints a message at the console such as:
INFO:__main__:Counting tweets in 34 files.
INFO:__main__:191,631 total tweets
Following our documentation for loading to TweetSets creates other files in the folder that should not be counted, such as files containing the concatenated contents of all the tweet ID files. As a result, tweetset_loader counts lines in more files than necessary, which leads to a wildly inaccurate tweet count.
Since this is a back-end function, I would suggest simply making the message less specific, rather than spending effort to make it more precise. That would at least avoid giving the person invoking the load the impression that something is wrong.
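To illustrate the two options, here is a rough sketch, not the actual tweetset_loader code. The function names, the `tweet-ids-*.txt` glob pattern, and the message wording are all hypothetical; the real file naming in a dataset folder may differ. It shows counting lines only in files that match a pattern, and logging a deliberately less specific message.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def count_lines(folder, pattern="tweet-ids-*.txt"):
    """Count lines only in files matching `pattern` (a hypothetical naming scheme).

    Returns a (file_count, line_count) tuple.
    """
    files = sorted(p for p in Path(folder).glob(pattern) if p.is_file())
    total = 0
    for path in files:
        # Read in binary mode so that encoding problems cannot break the count.
        with path.open("rb") as handle:
            total += sum(1 for _ in handle)
    return len(files), total


def log_count(folder, pattern="tweet-ids-*.txt"):
    """Log a hedged message instead of claiming an exact tweet total."""
    file_count, total = count_lines(folder, pattern)
    # Hypothetical wording: reports lines, not tweets, and says "approximately".
    logger.info("Found %d files with approximately %s lines.", file_count, f"{total:,}")
    return file_count, total
```

Filtering by pattern makes the count more precise, while the reworded message alone is the lower-effort change suggested above. Either could be applied independently.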
Relevant code is here:
https://github.com/gwu-libraries/TweetSets/blob/master/tweetset_loader.py#L319-L322