Gsc charts migration #274

Closed

Conversation

se7entyse7en
Contributor

@se7entyse7en se7entyse7en commented Sep 3, 2019

This closes #235.

The first commit is simply a re-export of the overview dashboard. The second commit does the actual work.

One chart is not working, and it's blocked by: https://github.com/src-d/gitbase-spark-connector-enterprise/issues/124

screencapture-127-0-0-1-8088-superset-dashboard-1-2019-09-03-15_11_59


  • I have updated the CHANGELOG file according to the conventions in keepachangelog.com
  • This PR contains changes that do not require a mention in the CHANGELOG file

@dpordomingo
Contributor

I assumed this PR was about mostly changing queries, but it seems it's much more than that because reviewing the diff of 915bfa0, I saw there were many other changes and no clear mapping between old and new query.

@se7entyse7en
Contributor Author

> I assumed this PR was about mostly changing queries, but it seems it's much more than that because reviewing the diff of 915bfa0, I saw there were many other changes and no clear mapping between old and new query.

@dpordomingo, unfortunately, diffing import/export is a mess 😞. What I did in this PR was the following:

  1. a plain export/import of the current charts
  2. change the queries to work with gsc
  3. export the charts again.
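
To get a clearer old-to-new mapping than the raw export diff gives, the queries could be pulled out of both exports and compared by name. This is only a minimal sketch: it assumes (not verified against the actual export files) that every object carrying a query is a JSON object with a string `sql` field and either a `table_name` or a `slice_name`. The function names are made up for illustration.

```python
import difflib


def collect_queries(node, found=None):
    """Recursively map name -> SQL for every object that carries a query.

    Assumption: a query-bearing object has a string "sql" field plus a
    "table_name" or "slice_name"; anything else is only traversed.
    """
    if found is None:
        found = {}
    if isinstance(node, dict):
        name = node.get("table_name") or node.get("slice_name")
        sql = node.get("sql")
        if isinstance(name, str) and isinstance(sql, str) and sql.strip():
            found[name] = sql
        for value in node.values():
            collect_queries(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_queries(item, found)
    return found


def diff_queries(old_export, new_export):
    """Return unified-diff lines for every query that changed between exports."""
    old = collect_queries(old_export)
    new = collect_queries(new_export)
    report = []
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "").splitlines()
        after = new.get(name, "").splitlines()
        if before != after:
            report.extend(
                difflib.unified_diff(
                    before,
                    after,
                    fromfile=f"old/{name}",
                    tofile=f"new/{name}",
                    lineterm="",
                )
            )
    return report
```

Both arguments are the already-parsed exports (for example the result of `json.load` on the file before and after the second commit); printing the returned lines one per row gives a per-query diff that ignores all the other export noise.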

@se7entyse7en
Contributor Author

Given that https://github.com/src-d/gitbase-spark-connector-enterprise/issues/124 has been closed already, I'm gonna include the fix to the missing chart in this PR.

@se7entyse7en
Contributor Author

I'll wait for a new release of gsc. Also, unfortunately, there are conflicts on the charts 😢.

@se7entyse7en se7entyse7en changed the base branch from gsc-migration to master September 26, 2019 16:13
@se7entyse7en se7entyse7en changed the base branch from master to gsc-migration September 26, 2019 16:13
@se7entyse7en
Contributor Author

I updated this after the UI changes that have been included in master. Here's a screenshot:

screencapture-127-0-0-1-8088-superset-dashboard-1-2019-09-26-17_28_34

The missing chart will be fixed once this is released and we update the gsc version.

It would be great if you could give it a try locally. Here's the docker-compose.yml file that I used:

version: '3.2'
services:
  bblfsh:
    image: bblfsh/bblfshd:v2.14.0-drivers
    restart: unless-stopped
    privileged: true
    ports:
      - 9432:9432

  gitcollector:
    image: srcd/gitcollector:v0.0.3
    # wait for db
    command: ['/bin/sh', '-c', 'sleep 10s && gitcollector download']
    environment:
      GITHUB_ORGANIZATIONS: ${GITHUB_ORGANIZATIONS-}
      GITHUB_TOKEN: ${GITHUB_TOKEN-}
      # use main db
      GITCOLLECTOR_METRICS_DB_URI: postgresql://superset:superset@postgres:5432/superset?sslmode=disable
      GITCOLLECTOR_NO_UPDATES: 'true'
      GITCOLLECTOR_NO_FORKS: ${NO_FORKS-true}
      LOG_LEVEL: ${LOG_LEVEL-info}
    depends_on:
      - postgres
    volumes:
      - type: ${GITBASE_VOLUME_TYPE}
        source: ${GITBASE_VOLUME_SOURCE}
        target: /library
        consistency: delegated
    deploy:
      resources:
        limits:
          cpus: ${GITCOLLECTOR_LIMIT_CPU-0.0}

  ghsync:
    image: srcd/ghsync:v0.2.0
    entrypoint: ['/bin/sh']
    # wait for db to be created
    # we need to use something like https://github.com/vishnubob/wait-for-it
    # or implement wait in ghsync itself
    command: ['-c', 'sleep 10s && ghsync migrate && ghsync shallow']
    depends_on:
      - metadatadb
    environment:
      GHSYNC_ORGS: ${GITHUB_ORGANIZATIONS-}
      GHSYNC_TOKEN: ${GITHUB_TOKEN-}
      GHSYNC_POSTGRES_DB: metadata
      GHSYNC_POSTGRES_USER: metadata
      GHSYNC_POSTGRES_PASSWORD: metadata
      GHSYNC_POSTGRES_HOST: metadatadb
      GHSYNC_POSTGRES_PORT: 5432
      GHSYNC_NO_FORKS: ${NO_FORKS-true}
      LOG_LEVEL: ${LOG_LEVEL-info}

  gitbase:
    image: srcd/gitbase:v0.23.1
    restart: unless-stopped
    ports:
      - 3306:3306
    environment:
      BBLFSH_ENDPOINT: bblfsh:9432
      SIVA: ${GITBASE_SIVA}
      GITBASE_LOG_LEVEL: ${LOG_LEVEL-info}
    depends_on:
      - bblfsh
    volumes:
      - type: ${GITBASE_VOLUME_TYPE}
        source: ${GITBASE_VOLUME_SOURCE}
        target: /opt/repos
        read_only: true
        consistency: delegated
      - gitbase_indexes:/var/lib/gitbase/index
    deploy:
      resources:
        limits:
          cpus: ${GITBASE_LIMIT_CPU-0.0}
          memory: ${GITBASE_LIMIT_MEM-0}

  bblfsh-web:
    image: bblfsh/web:v0.11.3
    restart: unless-stopped
    command: -bblfsh-addr bblfsh:9432
    ports:
      - 9999:8080
    depends_on:
      - bblfsh

  redis:
    image: redis:5-alpine
    restart: unless-stopped
    ports:
      - 6379:6379
    volumes:
      - redis:/data

  postgres:
    image: postgres:10-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_USER: superset
    ports:
      - 5432:5432
    volumes:
      - postgres:/var/lib/postgresql/data

  metadatadb:
    image: postgres:10-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: metadata
      POSTGRES_PASSWORD: metadata
      POSTGRES_USER: metadata
    ports:
      - 5433:5432
    volumes:
      - metadata:/var/lib/postgresql/data

  gsc:
    image: srcd/gitbase-spark-connector-enterprise:v0.6.0-with-pg
    command:
      - start-thrift-server
    depends_on:
      - gitbase
      - bblfsh
    ports:
      # spark UI
      - 4040:4040
      # Thrift server
      - 10000:10000
    environment:
      - BBLFSH_HOST=bblfsh
      - BBLFSH_PORT=9432
      - GITBASE_SERVERS=gitbase:3306

  sourced-ui:
    image: srcd/sourced-ui:v0.7.0-gsc-migration-v0.0.1
    restart: unless-stopped
    environment:
      SYNC_MODE: ${GITBASE_SIVA}
      ADMIN_LOGIN: admin
      ADMIN_FIRST_NAME: admin
      ADMIN_LAST_NAME: admin
      ADMIN_EMAIL: [email protected]
      ADMIN_PASSWORD: admin
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_HOST: postgres
      POSTGRES_PORT: 5432
      REDIS_HOST: redis
      REDIS_PORT: 6379
      GSC_DB: default
      GSC_USER:
      GSC_PASSWORD:
      GSC_HOST: gsc
      GSC_PORT: 10000
      METADATA_DB: metadata
      METADATA_USER: metadata
      METADATA_PASSWORD: metadata
      METADATA_HOST: metadatadb
      METADATA_PORT: 5432
      BBLFSH_WEB_HOST: bblfsh-web
      BBLFSH_WEB_PORT: 8080
      SUPERSET_ENV: production
    ports:
      - 8088:8088
    depends_on:
      - postgres
      - metadatadb
      - redis
      - gitbase
      - bblfsh-web

  sourced-ui-celery:
    image: srcd/sourced-ui:v0.7.0-gsc-migration-v0.0.1
    restart: unless-stopped
    environment:
      SYNC_MODE: ${GITBASE_SIVA}
      ADMIN_LOGIN: admin
      ADMIN_FIRST_NAME: admin
      ADMIN_LAST_NAME: admin
      ADMIN_EMAIL: [email protected]
      ADMIN_PASSWORD: admin
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
      POSTGRES_HOST: postgres
      POSTGRES_PORT: 5432
      REDIS_HOST: redis
      REDIS_PORT: 6379
      GSC_DB: default
      GSC_USER:
      GSC_PASSWORD:
      GSC_HOST: gsc
      GSC_PORT: 10000
      METADATA_DB: metadata
      METADATA_USER: metadata
      METADATA_PASSWORD: metadata
      METADATA_HOST: metadatadb
      METADATA_PORT: 5432
      BBLFSH_WEB_HOST: bblfsh-web
      BBLFSH_WEB_PORT: 8080
      SUPERSET_ENV: celery
    depends_on:
      - postgres
      - metadatadb
      - redis
      - gitbase
      - sourced-ui

volumes:
  gitbase_repositories:
    external: false
  gitbase_indexes:
    external: false
  metadata:
    external: false
  postgres:
    external: false
  redis:
    external: false
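
A side note on the `sleep 10s` workarounds in the compose file above (the comments there already suggest something like wait-for-it): the fixed sleep could be replaced by polling the database port. The following is a minimal sketch of that idea, not what gitcollector or ghsync do today; `wait_for_port` and its parameters are hypothetical names.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts TCP connections, False on timeout.

    Note: an open port only means the server is listening; it does not
    guarantee that the database has finished initializing.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("metadatadb", 5432)` would block until the metadata database is reachable from inside the compose network, or give up after a minute.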

The image srcd/sourced-ui:v0.7.0-gsc-migration-v0.0.1 is simply the image built from this branch, while srcd/gitbase-spark-connector-enterprise:v0.6.0-with-pg is a locally built image of gsc, starting from v0.6.0, where I changed build.sbt to include the postgres driver at compile time (it was the fastest way I found to make it work). Here's the diff:

diff --git a/build.sbt b/build.sbt
index de48254..76542d9 100644
--- a/build.sbt
+++ b/build.sbt
@@ -14,6 +14,7 @@ lazy val root = (project in file(".")).
     libraryDependencies += sparkHiveThriftserver % Provided,
     libraryDependencies ++= Seq(
       mysql % Compile,
+      postgresql % Compile,
       enry % Compile,
       bblfsh % Compile,
       dns % Compile
diff --git a/project/Dependencies.scala b/project/Dependencies.scala
index b179143..8374949 100644
--- a/project/Dependencies.scala
+++ b/project/Dependencies.scala
@@ -6,6 +6,7 @@ object Dependencies {
   lazy val sparkHiveThriftserver = "org.apache.spark" %% "spark-hive-thriftserver" % "2.3.1"
   lazy val dns = "com.spotify" % "dns" % "3.1.5"
   lazy val mysql = "mysql" % "mysql-connector-java" % "8.0.16"
+  lazy val postgresql = "org.postgresql" % "postgresql" % "42.2.6"
   lazy val enry = "tech.sourced" % "enry-java" % "1.7.1"
   lazy val bblfsh = "org.bblfsh" % "bblfsh-client" % "1.10.1"
   lazy val dockerJava = "com.github.docker-java" % "docker-java" % "3.0.14"

IMHO using gsc is not feasible as-is for local mode. If you try refreshing the dashboard, many charts will time out, and in the meantime the application becomes very laggy and sometimes unresponsive. I'd like to know your experience too.

@smacker
Contributor

smacker commented Sep 27, 2019

I ran it locally on a very small org, go-chi. I confirm it is very slow and consumes a lot of resources. On such a small dataset the charts didn't time out, but it took ~5m to fully load the Overview dashboard. A force-refresh is a little faster (it looks like gitbase/spark/... could be caching data for some charts) at 3 minutes, which is still ridiculously slow.

@se7entyse7en
Contributor Author

See here.
