Add dependents field on repositories stream #34
Comments
@laurentS - I have no objection to adding the |
Just to clarify, I'm talking about Happy to do this as a child stream if it makes more sense. As far as I can see, it would be a single request/record per repo, with 2 data fields to start with (but potentially more in the future). |
@laurentS - Thanks for clarifying the If you only want the count of dependents, I could see this being a property of
|
These are all great points!
|
@laurentS I'd love to revive this now that we have the GraphQL endpoints. I think we should aim to grab both dependents and dependencies. |
For the dependencies, you can use this: https://docs.github.com/en/graphql/overview/schema-previews#access-to-a-repositories-dependency-graph-preview (see how it can be used in https://github.com/simonw/til/blob/master/github/dependencies-graphql-api.md ). Alternatively, scrape the page; see dogsheep/github-to-sqlite#70 and the associated functions. |
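As a rough illustration of the GraphQL route mentioned above, here is a minimal sketch of how the request could be assembled. The helper name `build_dependencies_request` is hypothetical, the field selection is an assumption based on the dependency graph preview schema, and the preview header may no longer be needed; an `Authorization` header with a token would also have to be added before sending.

```python
# Preview media type listed for the dependency graph schema preview
# (assumption: it may have graduated from preview since).
DEPENDENCY_GRAPH_ACCEPT = "application/vnd.github.hawkgirl-preview+json"

# Assumed field selection; adjust to whatever the stream actually needs.
DEPENDENCIES_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    dependencyGraphManifests {
      nodes {
        filename
        dependencies {
          nodes {
            packageName
            requirements
          }
        }
      }
    }
  }
}
"""


def build_dependencies_request(owner: str, name: str) -> dict:
    """Hypothetical helper: return the pieces of a GraphQL POST request."""
    return {
        "url": "https://api.github.com/graphql",
        "headers": {"Accept": DEPENDENCY_GRAPH_ACCEPT},
        "json": {
            "query": DEPENDENCIES_QUERY,
            "variables": {"owner": owner, "name": name},
        },
    }
```

The returned dict only describes the request; passing it to an HTTP client (and paginating the connections) is left out of the sketch.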
We use the `dependents` count for a repository, which is currently fetched by grabbing the HTML page for the project (e.g. https://github.com/facebook/react/network/dependents) and parsing the HTML. As I write this ticket, the link above returns `7,878,702 Repositories` (and likewise for packages) and we grab these numbers.

Unfortunately, this info does not seem to exist anywhere in either the REST or GraphQL APIs.

@aaronsteers Would you have any objection to me adding a request for that page to the `repositories` stream, resulting in an extra field? Possibly behind some config option, as it is fairly download heavy (the page above weighs 187kB).

Maybe in the `post_process` method? Ideally, the data will eventually be available in one of the APIs, and this can then be dropped.
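
The scraping step described in this issue could look roughly like the sketch below. It only covers the parsing side and assumes the page text contains counters shaped like `7,878,702 Repositories`; the function name and the output field names are hypothetical, and the real page markup may differ or change.

```python
import re

# Assumed counter shape: a number with thousands separators, whitespace,
# then a label such as "Repositories" or "Packages".
_COUNT_RE = re.compile(r"(\d[\d,]*)\s+(Repositor(?:y|ies)|Packages?)")


def parse_dependents_counts(html: str) -> dict:
    """Extract dependent repository/package counts from the page HTML."""
    # Crude tag stripping; good enough to expose the visible text.
    text = re.sub(r"<[^>]+>", " ", html)
    counts = {}
    for number, label in _COUNT_RE.findall(text):
        if label.startswith("Repositor"):
            key = "dependent_repositories_count"  # hypothetical field name
        else:
            key = "dependent_packages_count"  # hypothetical field name
        # Keep the first match for each label only.
        counts.setdefault(key, int(number.replace(",", "")))
    return counts


sample = '<a class="btn-link"><svg></svg> 7,878,702\n Repositories</a>'
print(parse_dependents_counts(sample))
```

In the tap, something like this could be called on the fetched page and its result merged into the repository record, ideally behind a config option given the page size.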