doc: update doc for new name
Signed-off-by: Frankzhaopku <[email protected]>
frank-zsy committed May 11, 2021
1 parent 1a6bc89 commit 850054d
Showing 53 changed files with 170 additions and 436 deletions.
23 changes: 17 additions & 6 deletions README.md
@@ -1,24 +1,35 @@
# GitHub Analysis Report
# OpenDigger

[![apache2](https://img.shields.io/badge/license-Apache%202-blue)](LICENSE) [![ccby4](https://img.shields.io/badge/license-CC%20BY%204.0-blue)](LICENSE-CC-BY) [![slack](https://img.shields.io/badge/slack-join%20chat-green)](https://join.slack.com/t/x-github-analysis/shared_invite/zt-jate2dty-oCvEheSrI0fI2BckbR1ptQ)

GitHub analysis report is an open source analysis report project for GitHub initiated by [X-lab](https://x-lab.info), this project aims to combine the wisdom of global developers to jointly analyze and insight into GitHub's developer behavior data to help everyone better understand and participate in open source.
OpenDigger is an open source analysis report project for all open source data, initiated by [X-lab](https://x-lab.info). The project aims to combine the wisdom of developers worldwide to jointly analyze open source data and draw insights from it, helping everyone better understand and participate in open source.

## Report

We generate reports as static web pages for viewing. The following reports are currently available:

- [Global Study Report](http://opendigger-oss.x-lab.info/global-study.html)
- [Apache Software Foundation Study Report](http://opendigger-oss.x-lab.info/case-study-ASF.html)

## Data

We use [GHArchive](https://www.gharchive.org/) as our data source for GitHub logs and the data service is provided by self host [clickhouse](https://clickhouse.tech/) cluster in X-lab. For data details, please check the [data](https://github.com/X-lab2017/github-analysis-report/blob/master/docs/DATA.md) docs.
### GitHub Event Log

We use [GHArchive](https://www.gharchive.org/) as our data source for GitHub event logs. The data service is provided by a [ClickHouse](https://clickhouse.tech/) cluster cloud service. For data details, please check the [data](https://github.com/X-lab2017/open-digger/blob/master/docs/DATA.md) docs.

## Contributing guide

Please check the [contributing guide](https://www.x-lab.info/github-analysis-report/#/CONTRIBUTING) first if you want to be part of the report.
Please check the [contributing guide](http://www.x-lab.info/open-digger/#/CONTRIBUTING) first if you want to be part of the report.

## Architect & workflow

Please check the [architect](https://www.x-lab.info/github-analysis-report/#/architecture) and [workflow](https://www.x-lab.info/github-analysis-report/#/workflow) if you want to better understand the project.
Please check the [architect](https://www.x-lab.info/open-digger/#/architecture) and [workflow](https://www.x-lab.info/open-digger/#/workflow) if you want to better understand the project.

## Communication

Welcome to join our Slack workspace by clicking the Slack badge above if you want to communicate with us and learn more about the project.
You are welcome to join our Slack workspace by clicking the Slack badge above if you want to communicate with us and learn more about the project, or to join the WeChat group by scanning the following QR code.

![](./docs/assets/wechat-qrcode.png)

## License

14 changes: 7 additions & 7 deletions ROADMAP.md
@@ -1,20 +1,20 @@
# Roadmap of GitHub analysis report
# Roadmap of OpenDigger

## Technical

- [WIP] New data structure and docs. Est. 2021.3.25
- [WIP] New technical architecture of report project. Est. 2021.3.25
- [WIP] Basic automation workflow. Est. 2021.4.8
- [Done] New data structure and docs.
- [Done] New technical architecture of report project. Based on pure HTML and static generation.
- [Done] Basic automation workflow. Based on GitHub Actions and Hypertrons.
- [Long term] New analysis SQL components.
- [Long term] Stackoverflow data support.
- [Long term] NPM data support.
- [Long term] PyPI data support.

## Product

- [WIP] Online report prototype. Est. 2021.4.15
- [WIP] Online report service with WeChat payment. Est. 2021.4.29
- [Done] Online report prototype.
- [WIP] Online report service with WeChat payment. Est. 2021.5.29

## Community

- [WIP] Community governance handbook. Est. 2021.4.1
- [Done] Community governance handbook.
24 changes: 12 additions & 12 deletions docs/CONTRIBUTING.md
@@ -1,5 +1,5 @@
# Contributing Guide
We would love for you to contribute to `github-analysis-report` and help make it even better than it is today! As a contributor, here are the guidelines we would like you to follow:
We would love for you to contribute to `OpenDigger` and help make it even better than it is today! As a contributor, here are the guidelines we would like you to follow:

- [Submitting an Issue](#issue)
- [Submitting a Pull Request](#pr)
@@ -11,7 +11,7 @@ If you have any questions or feature requests, please feel free to [submit an is
Before you submit an issue, consider the following guidelines:

- Please search for related issues. Make sure you are not going to open a duplicate issue.
- Please specify what kind of issue it is and explain it in the title or content, e.g. `feature`, `bug`, `documentation`, `discussion`, `help wanted`... The issue will be tagged automatically by the robot of the project(`analysis-report-bot`).
- Please specify what kind of issue it is and explain it in the title or content, e.g. `feature`, `bug`, `documentation`, `discussion`, `help wanted`... The issue will be tagged automatically by the project's bot (`open-digger-bot`).

## <a name="pr"></a> Submitting a Pull Request

@@ -21,7 +21,7 @@ Before you submit your Pull Request, consider the following guidelines.

Be sure that an issue describes the problem you're fixing, or documents the design for the feature you'd like to add.

If you decide to fix an issue, please be sure to check the comment thread in case somebody is already working on a fix. If nobody is working on it at the moment, please leave a comment with `/self-assign` stating that you intend to work on it so other people don't accidentally duplicate your effort. `Analysis-report-bot` will set assignees of the issue to yourself automatically. Check [self_assign](https://www.x-lab.info/github-analysis-report/#/workflow?id=self_assign) to get more information.
If you decide to fix an issue, please be sure to check the comment thread in case somebody is already working on a fix. If nobody is working on it at the moment, please leave a comment with `/self-assign` stating that you intend to work on it so other people don't accidentally duplicate your effort. `open-digger-bot` will then assign the issue to you automatically. Check [self_assign](https://www.x-lab.info/open-digger#/workflow?id=self_assign) for more information.

```shell
/self-assign
@@ -31,14 +31,14 @@ If somebody claims an issue but doesn't follow up for more than two weeks, it's

### 2. Fork and clone the repository

Visit [X-lab2017/github-analysis-report][repo] repo and make your own copy of the repository by **forking** it.
Visit [X-lab2017/open-digger][repo] repo and make your own copy of the repository by **forking** it.

Then **clone** your copy of the repository to your local machine, for example:

```shell
# replace xxx with your own user name
$ git clone git@github.com:xxx/github-analysis-report.git
$ cd github-analysis-report
$ git clone git@github.com:xxx/open-digger.git
$ cd open-digger
```

### 3. Create a new branch
@@ -67,7 +67,7 @@ You are encouraged to use [angular commit-message-format][angular-commit-message
Keep your local repository updated with upstream repository by:

```shell
$ git remote add upstream git@github.com:X-lab2017/github-analysis-report.git
$ git remote add upstream git@github.com:X-lab2017/open-digger.git
$ git fetch upstream master
$ git rebase upstream/master
```
@@ -87,7 +87,7 @@ $ git push -f origin your-branch-name

### 7. Create a Pull Request

In GitHub, send a pull request to [`X-lab2017/github-analysis-report`][repo].
In GitHub, send a pull request to [`X-lab2017/open-digger`][repo].

The core team is monitoring for pull requests. We will review your pull request and either merge it, request changes to it, or close it with an explanation.

@@ -105,7 +105,7 @@ If we suggest changes then:

That's it! Thank you for your contribution!
You can refer to [workflow](https://www.x-lab.info/github-analysis-report/#/workflow?id=appendix) to see more information about the `PR` workflow with `SQL` files involved.
You can refer to [workflow](https://www.x-lab.info/open-digger/#/workflow?id=appendix) to see more information about the `PR` workflow with `SQL` files involved.
### 8. After your pull request is merged
@@ -135,10 +135,10 @@ After your pull request is merged, you can safely delete your branch and pull th
$ git pull --ff upstream master
```
[new-issue]: https://github.com/X-lab2017/github-analysis-report/issues/new
[new-issue]: https://github.com/X-lab2017/open-digger/issues/new
[issue-label]: https://github.com/X-lab2017/github-analysis-report/labels
[issue-label]: https://github.com/X-lab2017/open-digger/labels
[repo]: https://github.com/X-lab2017/github-analysis-report
[repo]: https://github.com/X-lab2017/open-digger
[angular-commit-message-format]: https://github.com/angular/angular.js/blob/master/DEVELOPERS.md#-git-commit-guidelines
36 changes: 14 additions & 22 deletions docs/DATA.md
@@ -1,42 +1,34 @@
# Data Description

## Data Source
## GitHub Event Log

The data source of this project mainly comes from [GH Archive](https://www.gharchive.org/) which is a project to record the public GitHub timeline, archive it and make it easily accessible for further analysis. Each archive contains JSON encoded events as reported by the GitHub API. The raw JSON data is showing below. There are 6 important data features in this data, namely `id`,`type`,`actor`,`repo`,`payload`,`created_at`.
### Data Source

![](./pic/GHArchive_raw_data.png)
The data comes from [GH Archive](https://www.gharchive.org/), a project that records the public GitHub timeline, archives it, and makes it easily accessible for further analysis. Each archive contains JSON-encoded events as reported by the GitHub API. The raw JSON data is shown below. Six fields in this data are important: `id`, `type`, `actor`, `repo`, `payload`, and `created_at`.

## Database
![](./assets/gharchive_raw_data.png)
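
As a rough illustration of the six fields listed above, the following sketch parses one JSON-encoded event and keeps only those fields. It assumes each event arrives as a single JSON string; the sample event and the `extract_fields` helper are hypothetical and exist only for this example, with made-up values.

```python
import json

# The six top-level fields named in the data source description.
EVENT_FIELDS = ("id", "type", "actor", "repo", "payload", "created_at")

# Hypothetical sample event for illustration only; all values are made up.
SAMPLE_LINE = json.dumps({
    "id": "1",
    "type": "PullRequestReviewCommentEvent",
    "actor": {"id": 100, "login": "example-user"},
    "repo": {"id": 200, "name": "example-org/example-repo"},
    "payload": {"action": "created"},
    "created_at": "2021-05-11T00:00:00Z",
    "public": True,
})


def extract_fields(line: str) -> dict:
    """Parse one JSON-encoded event and keep only the six fields of interest."""
    event = json.loads(line)
    # Missing fields become None instead of raising, so partial events still parse.
    return {name: event.get(name) for name in EVENT_FIELDS}


if __name__ == "__main__":
    print(extract_fields(SAMPLE_LINE))
```

Any other top-level key, such as `public` in the sample, is dropped by the helper.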

In order to meet the requirement for high-speed analysis among such big data, we parse the row data into well-defined structure and import it into [ClickHouse](https://clickhouse.tech/) server which is an open source column-oriented database management system capable of real time generation of analytical data reports using SQL queries. The Clickhouse database version is 20.5.2.7 in our server.
### Database

## Data Schema in Database
To support high-speed analysis of such a large dataset, we parse the raw data into a well-defined structure and import it into a [ClickHouse](https://clickhouse.tech/) server. ClickHouse is an open source column-oriented database management system capable of generating analytical data reports in real time using SQL queries. Our server runs ClickHouse version 20.8.7.15.

The database table offered by the `Clickhouse` server is showing in [data description](./csv/data_description.csv). You can find a table with 127 rows of features which were parsed from the raw GH Archive datasets. Check the data descriptions and what features you want to play with.
### Data Schema in Database

## User Guide for Database Service
The database table offered by the ClickHouse server is described in [data description](./assets/data_description.csv). It lists 120+ rows of features parsed from the raw GHArchive datasets. Check the data descriptions to find the features you want to work with.

For the detailed documentations for Clickhouse SQL usage, check out the [SQL reference](https://clickhouse.tech/docs/en/).
### User Guide for Database Service

## Examples
For detailed documentation on ClickHouse SQL usage, check out the [SQL reference](https://clickhouse.tech/docs/en/).

There are some examples for query data from the Click house database table. You can find more real examples from `sqls`.
### Examples

* The number of distinct repositories on GitHub
Here is an example of querying data from the ClickHouse database table. You can find more real examples in the study SQL components.

```
SELECT repo_id, sum(repo_size) AS sum_repo_size, COUNT(*) AS count
FROM {database}.{table}
WHERE type = 'PullRequestEvent' OR type='PullRequestReviewCommentEvent'
GROUP BY repo_id
```

* Pull Request review comment data from a organization
* Pull Request review comment data from an organization

```
SELECT actor_id,actor_login,repo_id,repo_name,issue_id, action, created_at
SELECT actor_id, actor_login, repo_id, repo_name, issue_id, action, created_at
FROM {database}.{table}
WHERE type='PullRequestReviewCommentEvent' AND repo_name LIKE '{org}/%'
ORDER BY created_at ASC
```
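
The `{database}`, `{table}`, and `{org}` placeholders in the query above have to be filled in before the query can run. A minimal sketch of one way to do that in Python follows. The `render_query` helper, the allow-list pattern, and the example names `github_log`, `events`, and `apache` are assumptions made for illustration and are not part of the project.

```python
import re

# Query template copied from the example above; placeholders are in braces.
QUERY_TEMPLATE = """\
SELECT actor_id, actor_login, repo_id, repo_name, issue_id, action, created_at
FROM {database}.{table}
WHERE type='PullRequestReviewCommentEvent' AND repo_name LIKE '{org}/%'
ORDER BY created_at ASC"""

# Conservative allow-list for substituted values (an assumption of this sketch):
# letters, digits, underscores, and hyphens only.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def render_query(database: str, table: str, org: str) -> str:
    """Fill the placeholders after checking each value against the allow-list."""
    for value in (database, table, org):
        if not _SAFE_NAME.fullmatch(value):
            raise ValueError(f"unsafe value for SQL template: {value!r}")
    return QUERY_TEMPLATE.format(database=database, table=table, org=org)


if __name__ == "__main__":
    # Hypothetical database, table, and organization names.
    print(render_query("github_log", "events", "apache"))
```

Plain string substitution is only reasonable here because every value is validated first. For values that come from users, the parameter binding offered by a database client is usually the safer choice.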

25 changes: 18 additions & 7 deletions docs/README.md
@@ -1,23 +1,34 @@
# GitHub Analysis Report
# OpenDigger

[GitHub analysis report](https://github.com/X-lab2017/github-analysis-report) is an open source analysis report project for GitHub initiated by [X-lab](https://x-lab.info), this project aims to combine the wisdom of global developers to jointly analyze and insight into GitHub's developer behavior data to help everyone better understand and participate in open source.
OpenDigger is an open source analysis report project for all open source data, initiated by [X-lab](https://x-lab.info). The project aims to combine the wisdom of developers worldwide to jointly analyze open source data and draw insights from it, helping everyone better understand and participate in open source.

## Report

We generate reports as static web pages for viewing. The following reports are currently available:

- [Global Study Report](http://opendigger-oss.x-lab.info/global-study.html)
- [Apache Software Foundation Study Report](http://opendigger-oss.x-lab.info/case-study-ASF.html)

## Data

We use [GHArchive](https://www.gharchive.org/) as our data source for GitHub logs and the data service is provided by self host [clickhouse](https://clickhouse.tech/) cluster in X-lab. For data details, please check the [data](https://www.x-lab.info/github-analysis-report/#/data) docs.
### GitHub Event Log

We use [GHArchive](https://www.gharchive.org/) as our data source for GitHub event logs. The data service is provided by a [ClickHouse](https://clickhouse.tech/) cluster cloud service. For data details, please check the [data](https://github.com/X-lab2017/open-digger/blob/master/docs/DATA.md) docs.

## Contributing guide

Please check the [contributing guide](https://www.x-lab.info/github-analysis-report/#/CONTRIBUTING) first if you want to be part of the report.
Please check the [contributing guide](http://www.x-lab.info/open-digger/#/CONTRIBUTING) first if you want to be part of the report.

## Architect & workflow

Please check the [architect](https://www.x-lab.info/github-analysis-report/#/architecture) and [workflow](https://www.x-lab.info/github-analysis-report/#/workflow) if you want to better understand the project.
Please check the [architect](https://www.x-lab.info/open-digger/#/architecture) and [workflow](https://www.x-lab.info/open-digger/#/workflow) if you want to better understand the project.

## Communication

Welcome to join our Slack workspace by clicking the Slack badge above if you want to communicate with us and learn more about the project.
You are welcome to join our Slack workspace by clicking the Slack badge above if you want to communicate with us and learn more about the project, or to join the WeChat group by scanning the following QR code.

![](./assets/wechat-qrcode.png)

## License

We use [Apache-2.0 license](https://github.com/X-lab2017/github-analysis-report/blob/master/LICENSE) for code part and [CC-BY-4.0 license](https://github.com/X-lab2017/github-analysis-report/blob/master/LICENSE-CC-BY) for report part, please make sure abide by the licenses when using the project.
We use the [Apache-2.0 license](https://github.com/X-lab2017/open-digger/blob/master/LICENSE) for the code and the [CC-BY-4.0 license](https://github.com/X-lab2017/open-digger/blob/master/LICENSE-CC-BY) for the reports. Please make sure to abide by the licenses when using the project.
4 changes: 2 additions & 2 deletions docs/_coverpage.md
@@ -1,6 +1,6 @@
# GitHub Analysis Report
# OpenDigger

> An open source collaborate report for GitHub
> An open source collaborative report for open source
- Full access to all records on GitHub
- Global collaboration
7 changes: 0 additions & 7 deletions docs/_sidebar.md
@@ -1,10 +1,3 @@
- [Report](/report.md)
- Case study
- [ASF](/case-study/ASF.md)
- [CNCF](/case-study/CNCF.md)
- [LF AI & Data](/case-study/LFAI.md)
- [Wuhan2020](/case-study/Wuhan2020.md)
- [Contributing Guide](/CONTRIBUTING.md)
- [Architecture](/architecture.md)
- [Workflow](/workflow.md)
- [Data Description](/data.md)
26 changes: 0 additions & 26 deletions docs/architecture.md

This file was deleted.

12 changes: 7 additions & 5 deletions docs/diagrams/architect.uml → docs/assets/architect.uml
@@ -1,22 +1,24 @@
@startuml

footer Architect of GitHub Analysis Report
footer Architect of OpenDigger

actor Developer as dev

node GitHub as github {
node "github-analysis-report" as repo {
node "open-digger" as repo {
folder ".github/hypertrons-components" as cwf {
file "Custom workflows"
}
folder "sqls" as sqls {
folder "global-study" as sqls {
file "All SQLs"
}
folder "case-study" as sqls {
file "All SQLs"
}
file "REPORT.md" as report
}
}

node "analysis-report-bot" as bot
node "open-digger-bot" as bot
note right: A GitHub app powered by Hypertrons

node "data service" as ds
File renamed without changes
File renamed without changes.
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added docs/assets/wechat-qrcode.png
1 change: 0 additions & 1 deletion docs/case-study/ASF.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/case-study/CNCF.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/case-study/LFAI.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/case-study/Wuhan2020.md

This file was deleted.
