Refactor job footprint #277

Closed
wants to merge 59 commits into from
Commits
684cb5a
feat: change statistics render of metric plot to min/max/median
spacehamster87 May 8, 2024
786770f
Start to convert to new footprint layout
moebiusband73 Jun 28, 2024
aede5f7
Introduce adapted graphql schema
moebiusband73 Jun 28, 2024
97c807c
Add migration for footprint
moebiusband73 Jun 28, 2024
b3c1f39
Merge branch 'master' into Refactor-job-footprint
moebiusband73 Jun 28, 2024
130613b
Fix build errors
moebiusband73 Jun 28, 2024
bd89ce7
Extend schema and start Unit test implementation
moebiusband73 Jul 2, 2024
b059099
Add test for clusterConfig
moebiusband73 Jul 3, 2024
61eebc9
Rework initial commit
spacehamster87 Jul 3, 2024
1b70596
Fix and test subcluster Config
moebiusband73 Jul 4, 2024
1072d7b
Improve auth handling of rest apis used in frontend for compatibility
spacehamster87 Jul 4, 2024
614f694
fix typo in api url
spacehamster87 Jul 4, 2024
80c46be
Fix bugs and failed testcases
moebiusband73 Jul 4, 2024
ac9bba8
Restructure and simplify job repo
moebiusband73 Jul 4, 2024
9d47675
Restructure config frontend, add user jwt request
spacehamster87 Jul 4, 2024
3afe400
rename api userconfig to frontend, return json on api auth error
spacehamster87 Jul 5, 2024
63fb923
fix: fix api test router init
spacehamster87 Jul 5, 2024
be9df76
fix: setup user in api test config
spacehamster87 Jul 5, 2024
0a60433
Fix other apitest subtests
spacehamster87 Jul 5, 2024
11176da
Merge branch 'Refactor-job-footprint' into 264_user_api_access
spacehamster87 Jul 5, 2024
c6ede67
Add energy footprint
moebiusband73 Jul 5, 2024
a54acb8
Merge branch '264_user_api_access' into Refactor-job-footprint
moebiusband73 Jul 5, 2024
f1e341f
Initial commit for frontend refactor
spacehamster87 Jul 9, 2024
0240997
Merge branch '263_use_median_for_statsseries' into Refactor-job-footp…
spacehamster87 Jul 9, 2024
bf6b87d
Fix circular import after merge
spacehamster87 Jul 9, 2024
f1427d5
Add global metric list including graphQL query
moebiusband73 Jul 11, 2024
e8e3b15
Switch to Go 1.22 to get rid of global loop variable bug
moebiusband73 Jul 11, 2024
b64ce1f
Add LowerIsBetter Metric boolean. Upgrade dependencies.
moebiusband73 Jul 11, 2024
0adfb63
Update go version to 1.22 for Github test workflow
moebiusband73 Jul 11, 2024
a491289
Frontend refactor backend changes
spacehamster87 Jul 11, 2024
e14d6a8
fix: fix db migration to v8, changes key name to cpu_load
spacehamster87 Jul 11, 2024
68cf952
Merge branch 'Refactor-job-footprint' of https://github.com/ClusterCo…
spacehamster87 Jul 11, 2024
a8721dc
Regenerate gql after internal merge
spacehamster87 Jul 11, 2024
a07d167
Fix build error with updated prometheus client
moebiusband73 Jul 12, 2024
68a97dc
Add footprint to global metric list
moebiusband73 Jul 12, 2024
c61ffce
Make job query on metric stats generic
moebiusband73 Jul 12, 2024
0458675
Convert histogram query to json keys
moebiusband73 Jul 12, 2024
e348ec7
Fix bugs in stats.go
moebiusband73 Jul 12, 2024
01a4d33
Refactor: Archive workers and Tasks
moebiusband73 Jul 14, 2024
801607f
Refactor main
moebiusband73 Jul 16, 2024
b6f011c
Move footprint update task placeholder to taskmanager
moebiusband73 Jul 16, 2024
721b6b2
Change footprint variable from bool to string
moebiusband73 Jul 20, 2024
c2f72f7
Update go dependencies
moebiusband73 Jul 20, 2024
c4d93e4
Remove bugs in main init
moebiusband73 Jul 20, 2024
6a1cb51
Refactor svelte frontend
spacehamster87 Jul 22, 2024
e65100c
Add vscode @component comment to every svelte file, remove unused js …
spacehamster87 Jul 25, 2024
18369da
Fix small oversight. remove wip plot component
spacehamster87 Jul 26, 2024
3ca1127
Restructure frontend svelte file src folder
spacehamster87 Jul 26, 2024
c80d3a6
fix: errors in import paths
spacehamster87 Aug 1, 2024
ce9995d
fix: fix wrongly inserted gql request and import path error
spacehamster87 Aug 8, 2024
561fd41
fix: add accelerator scope to to-be archived scopes
spacehamster87 Aug 13, 2024
9b6db46
Refactor: Remove redundant code
moebiusband73 Aug 15, 2024
ba2f406
Extend sqlite db migration
moebiusband73 Aug 15, 2024
e1faba0
Update cluster json schema
moebiusband73 Aug 15, 2024
5c99f5f
Only add footprint columns if not 0
moebiusband73 Aug 15, 2024
d6a8889
Refactor: Reduce struct memory size
moebiusband73 Aug 15, 2024
5e074da
Resolve error in migration
moebiusband73 Aug 15, 2024
49e0a2c
fix: add compatibility for footprint metrics without config
spacehamster87 Aug 15, 2024
5535c57
Merge branch 'Refactor-job-footprint' of https://github.com/ClusterCo…
spacehamster87 Aug 15, 2024
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -7,7 +7,7 @@ jobs:
       - name: Install Go
         uses: actions/setup-go@v4
         with:
-          go-version: 1.20.x
+          go-version: 1.22.x
       - name: Checkout code
         uses: actions/checkout@v3
       - name: Build, Vet & Test
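The bump to Go 1.22 matches commit e8e3b15 ("Switch to Go 1.22 to get rid of global loop variable bug"). As a minimal sketch of what that bug is, and not code taken from this PR: before Go 1.22 a `for` loop reused one variable for all iterations, so closures created in the loop all saw the final value unless the variable was copied. The function name `collect` is hypothetical. The explicit copy below keeps the example correct on older toolchains; from Go 1.22 on it is redundant.

```go
package main

import "fmt"

// collect returns one closure per input value.
// The copy `v := v` is the classic pre-1.22 workaround: older toolchains
// shared a single loop variable across iterations, so without the copy
// every closure would have returned the last element.
func collect(values []int) []func() int {
	var fns []func() int
	for _, v := range values {
		v := v // redundant since Go 1.22, required before
		fns = append(fns, func() int { return v })
	}
	return fns
}

func main() {
	for _, fn := range collect([]int{1, 2, 3}) {
		fmt.Println(fn())
	}
}
```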
2 changes: 1 addition & 1 deletion Makefile
@@ -2,7 +2,7 @@ TARGET = ./cc-backend
 VAR = ./var
 CFG = config.json .env
 FRONTEND = ./web/frontend
-VERSION = 1.3.1
+VERSION = 1.4.0
 GIT_HASH := $(shell git rev-parse --short HEAD || echo 'development')
 CURRENT_TIME = $(shell date +"%Y-%m-%d:T%H:%M:%S")
 LD_FLAGS = '-s -X main.date=${CURRENT_TIME} -X main.version=${VERSION} -X main.commit=${GIT_HASH}'
58 changes: 42 additions & 16 deletions api/schema.graphqls
@@ -27,12 +27,7 @@ type Job {
tags: [Tag!]!
resources: [Resource!]!
concurrentJobs: JobLinkResultList

memUsedMax: Float
flopsAnyAvg: Float
memBwAvg: Float
loadAvg: Float

footprint: [FootprintValue]
metaData: Any
userData: User
}
@@ -45,7 +40,6 @@ type JobLink {
type Cluster {
name: String!
partitions: [String!]! # Slurm partitions
metricConfig: [MetricConfig!]!
subClusters: [SubCluster!]! # Hardware partitions/subclusters
}

@@ -61,9 +55,18 @@ type SubCluster {
flopRateSimd: MetricValue!
memoryBandwidth: MetricValue!
topology: Topology!
metricConfig: [MetricConfig!]!
footprint: [String!]!
}

type FootprintValue {
name: String!
stat: String!
value: Float!
}

type MetricValue {
name: String
unit: Unit!
value: Float!
}
@@ -102,6 +105,7 @@ type MetricConfig {
normal: Float
caution: Float!
alert: Float!
lowerIsBetter: Boolean
subClusters: [SubClusterConfig!]!
}

@@ -150,9 +154,10 @@ type MetricStatistics {
}

type StatsSeries {
mean: [NullableFloat!]!
min: [NullableFloat!]!
max: [NullableFloat!]!
mean: [NullableFloat!]!
median: [NullableFloat!]!
min: [NullableFloat!]!
max: [NullableFloat!]!
}
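To illustrate what the added `median` array carries next to `mean`, `min` and `max`, here is a minimal sketch of collapsing per-node series into one value per timestep. It is not the backend's implementation: the names `statsSeries`, `median` and `reduce` are hypothetical, and null handling (`NullableFloat`) is left out for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// statsSeries mirrors the four arrays of the StatsSeries schema type.
type statsSeries struct {
	Mean, Median, Min, Max []float64
}

// median returns the middle value of xs, or the mean of the two middle
// values for an even count. xs must be non-empty; it is not modified.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// reduce collapses series[node][timestep] into one statistic per timestep.
// The first node's length defines the number of timesteps.
func reduce(series [][]float64) statsSeries {
	var out statsSeries
	if len(series) == 0 {
		return out
	}
	for t := range series[0] {
		col := make([]float64, 0, len(series))
		for _, node := range series {
			if t < len(node) {
				col = append(col, node[t])
			}
		}
		lo, hi, sum := col[0], col[0], 0.0
		for _, v := range col {
			if v < lo {
				lo = v
			}
			if v > hi {
				hi = v
			}
			sum += v
		}
		out.Mean = append(out.Mean, sum/float64(len(col)))
		out.Median = append(out.Median, median(col))
		out.Min = append(out.Min, lo)
		out.Max = append(out.Max, hi)
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", reduce([][]float64{{1, 10}, {2, 20}, {9, 60}}))
}
```

With one outlier node (9 and 60 in the sample input) the median stays close to the typical nodes while the mean is pulled upward, which is the usual argument for plotting min/max/median.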

type MetricFootprints {
@@ -180,6 +185,19 @@ type NodeMetrics {
metrics: [JobMetricWithName!]!
}

type ClusterSupport {
cluster: String!
subClusters: [String!]!
}

type GlobalMetricListItem {
name: String!
unit: Unit!
scope: MetricScope!
footprint: String
availability: [ClusterSupport!]!
}

type Count {
name: String!
count: Int!
@@ -191,9 +209,15 @@ type User {
email: String!
}

input MetricStatItem {
metricName: String!
range: FloatRange!
}

type Query {
clusters: [Cluster!]! # List of all clusters
tags: [Tag!]! # List of all tags
globalMetrics: [GlobalMetricListItem!]!

user(username: String!): User
allocatedNodes(cluster: String!): [Count!]!
@@ -241,17 +265,14 @@ input JobFilter {

startTime: TimeRange
state: [JobState!]
flopsAnyAvg: FloatRange
memBwAvg: FloatRange
loadAvg: FloatRange
memUsedMax: FloatRange

metricStats: [MetricStatItem!]
exclusive: Int
node: StringInput
}

input OrderByInput {
field: String!
type: String!,
order: SortDirectionEnum! = ASC
}
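The `JobFilter` change replaces four fixed range fields with a generic `metricStats: [MetricStatItem!]` list. The following is a sketch of the intended semantics only, evaluated in memory: the Go types mirror the schema types `MetricStatItem`, `FloatRange` and `FootprintValue`, while the `Job` struct, the `matches` function, the inclusive bounds, the AND combination of filters and the metric names in `main` are assumptions for illustration. The backend itself evaluates such filters in its database query, which is not shown here.

```go
package main

import "fmt"

// FloatRange and MetricStatItem mirror the GraphQL input types.
type FloatRange struct{ From, To float64 }

type MetricStatItem struct {
	MetricName string
	Range      FloatRange
}

// FootprintValue mirrors the GraphQL type of the same name.
type FootprintValue struct {
	Name  string
	Stat  string
	Value float64
}

// Job is a hypothetical, reduced job record for this sketch.
type Job struct {
	ID        int
	Footprint []FootprintValue
}

// matches reports whether the job has, for every filter, a footprint
// entry with that metric name whose value lies in the inclusive range.
func matches(job Job, filters []MetricStatItem) bool {
	for _, f := range filters {
		found := false
		for _, fp := range job.Footprint {
			if fp.Name == f.MetricName && fp.Value >= f.Range.From && fp.Value <= f.Range.To {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	job := Job{ID: 1, Footprint: []FootprintValue{
		{Name: "flops_any", Stat: "avg", Value: 42},
		{Name: "mem_bw", Stat: "avg", Value: 80},
	}}
	filters := []MetricStatItem{{MetricName: "flops_any", Range: FloatRange{From: 10, To: 50}}}
	fmt.Println(matches(job, filters))
}
```

The point of the generic list is that a new footprint metric needs no schema change: a client adds another `MetricStatItem` instead of waiting for a dedicated filter field.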

@@ -270,9 +291,13 @@ input StringInput {
}

input IntRange { from: Int!, to: Int! }
input FloatRange { from: Float!, to: Float! }
input TimeRange { from: Time, to: Time }

input FloatRange {
from: Float!
to: Float!
}

type JobResultList {
items: [Job!]!
offset: Int
@@ -295,6 +320,7 @@ type HistoPoint {
type MetricHistoPoints {
metric: String!
unit: String!
stat: String
data: [MetricHistoPoint!]
}

33 changes: 33 additions & 0 deletions cmd/cc-backend/cli.go
@@ -0,0 +1,33 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package main

import "flag"

var (
	flagReinitDB, flagInit, flagServer, flagSyncLDAP, flagGops, flagMigrateDB, flagRevertDB, flagForceDB, flagDev, flagVersion, flagLogDateTime bool
	flagNewUser, flagDelUser, flagGenJWT, flagConfigFile, flagImportJob, flagLogLevel string
)

func cliInit() {
	flag.BoolVar(&flagInit, "init", false, "Setup var directory, initialize sqlite database file, config.json and .env")
	flag.BoolVar(&flagReinitDB, "init-db", false, "Go through job-archive and re-initialize the 'job', 'tag', and 'jobtag' tables (all running jobs will be lost!)")
	flag.BoolVar(&flagSyncLDAP, "sync-ldap", false, "Sync the 'user' table with ldap")
	flag.BoolVar(&flagServer, "server", false, "Start a server, continues listening on port after initialization and argument handling")
	flag.BoolVar(&flagGops, "gops", false, "Listen via github.com/google/gops/agent (for debugging)")
	flag.BoolVar(&flagDev, "dev", false, "Enable development components: GraphQL Playground and Swagger UI")
	flag.BoolVar(&flagVersion, "version", false, "Show version information and exit")
	flag.BoolVar(&flagMigrateDB, "migrate-db", false, "Migrate database to supported version and exit")
	flag.BoolVar(&flagRevertDB, "revert-db", false, "Migrate database to previous version and exit")
	flag.BoolVar(&flagForceDB, "force-db", false, "Force database version, clear dirty flag and exit")
	flag.BoolVar(&flagLogDateTime, "logdate", false, "Set this flag to add date and time to log messages")
	flag.StringVar(&flagConfigFile, "config", "./config.json", "Specify alternative path to `config.json`")
	flag.StringVar(&flagNewUser, "add-user", "", "Add a new user. Argument format: `<username>:[admin,support,manager,api,user]:<password>`")
	flag.StringVar(&flagDelUser, "del-user", "", "Remove user by `username`")
	flag.StringVar(&flagGenJWT, "jwt", "", "Generate and print a JWT for the user specified by its `username`")
	flag.StringVar(&flagImportJob, "import-job", "", "Import a job. Argument format: `<path-to-meta.json>:<path-to-data.json>,...`")
	flag.StringVar(&flagLogLevel, "loglevel", "warn", "Sets the logging level: `[debug,info,warn (default),err,fatal,crit]`")
	flag.Parse()
}
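The `-add-user` flag takes a compound argument of the form `<username>:[admin,support,manager,api,user]:<password>`. The code that consumes `flagNewUser` is not part of this file, so the following is only a sketch of how such an argument could be split and validated. The function name `parseNewUser` is hypothetical, and reading the middle part as a comma-separated list of roles is an assumption based on the help text.

```go
package main

import (
	"fmt"
	"strings"
)

// Roles listed in the -add-user help text.
var validRoles = map[string]bool{
	"admin": true, "support": true, "manager": true, "api": true, "user": true,
}

// parseNewUser splits "<username>:<roles>:<password>".
// SplitN with a limit of 3 keeps any further colons inside the password.
func parseNewUser(arg string) (username string, roles []string, password string, err error) {
	parts := strings.SplitN(arg, ":", 3)
	if len(parts) != 3 || parts[0] == "" {
		return "", nil, "", fmt.Errorf("expected <username>:<roles>:<password>, got %q", arg)
	}
	roles = strings.Split(parts[1], ",")
	for _, r := range roles {
		if !validRoles[r] {
			return "", nil, "", fmt.Errorf("unknown role %q", r)
		}
	}
	return parts[0], roles, parts[2], nil
}

func main() {
	user, roles, pw, err := parseNewUser("alice:admin,api:s3cr:et")
	fmt.Println(user, roles, pw, err)
}
```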
85 changes: 85 additions & 0 deletions cmd/cc-backend/init.go
@@ -0,0 +1,85 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package main

import (
	"fmt"
	"os"

	"github.com/ClusterCockpit/cc-backend/internal/repository"
	"github.com/ClusterCockpit/cc-backend/internal/util"
	"github.com/ClusterCockpit/cc-backend/pkg/log"
)

const envString = `
# Base64 encoded Ed25519 keys (DO NOT USE THESE TWO IN PRODUCTION!)
# You can generate your own keypair using the gen-keypair tool
JWT_PUBLIC_KEY="kzfYrYy+TzpanWZHJ5qSdMj5uKUWgq74BWhQG6copP0="
JWT_PRIVATE_KEY="dtPC/6dWJFKZK7KZ78CvWuynylOmjBFyMsUWArwmodOTN9itjL5POlqdZkcnmpJ0yPm4pRaCrvgFaFAbpyik/Q=="

# Some random bytes used as secret for cookie-based sessions (DO NOT USE THIS ONE IN PRODUCTION)
SESSION_KEY="67d829bf61dc5f87a73fd814e2c9f629"
`

const configString = `
{
    "addr": "127.0.0.1:8080",
    "archive": {
        "kind": "file",
        "path": "./var/job-archive"
    },
    "jwts": {
        "max-age": "2000h"
    },
    "clusters": [
        {
            "name": "name",
            "metricDataRepository": {
                "kind": "cc-metric-store",
                "url": "http://localhost:8082",
                "token": ""
            },
            "filterRanges": {
                "numNodes": {
                    "from": 1,
                    "to": 64
                },
                "duration": {
                    "from": 0,
                    "to": 86400
                },
                "startTime": {
                    "from": "2023-01-01T00:00:00Z",
                    "to": null
                }
            }
        }
    ]
}
`
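One detail of the default configuration worth a quick illustration is the open-ended `startTime` range with `"to": null`. The sketch below decodes a trimmed copy of the template into hypothetical Go structs (`config`, `cluster`, `intRange`, `timeRange` are not the backend's real types) and shows that a JSON `null` leaves a pointer field nil, which a consumer could read as "no upper bound".

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type intRange struct {
	From int `json:"from"`
	To   int `json:"to"`
}

// timeRange uses pointers so that a JSON null stays distinguishable
// from the zero time.
type timeRange struct {
	From *time.Time `json:"from"`
	To   *time.Time `json:"to"`
}

type cluster struct {
	Name         string `json:"name"`
	FilterRanges struct {
		NumNodes  intRange  `json:"numNodes"`
		Duration  intRange  `json:"duration"`
		StartTime timeRange `json:"startTime"`
	} `json:"filterRanges"`
}

type config struct {
	Addr     string    `json:"addr"`
	Clusters []cluster `json:"clusters"`
}

// sampleConfig is a trimmed copy of the template written by -init.
const sampleConfig = `{
  "addr": "127.0.0.1:8080",
  "clusters": [{
    "name": "name",
    "filterRanges": {
      "numNodes":  {"from": 1, "to": 64},
      "duration":  {"from": 0, "to": 86400},
      "startTime": {"from": "2023-01-01T00:00:00Z", "to": null}
    }
  }]
}`

// parseConfig decodes the JSON; keys without a matching field are ignored.
func parseConfig(raw string) (config, error) {
	var cfg config
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseConfig(sampleConfig)
	if err != nil {
		panic(err)
	}
	st := cfg.Clusters[0].FilterRanges.StartTime
	fmt.Println(cfg.Addr, st.From.Year(), st.To == nil)
}
```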

func initEnv() {
	if util.CheckFileExists("var") {
		fmt.Print("Directory ./var already exists. Exiting!\n")
		os.Exit(0)
	}

	if err := os.WriteFile("config.json", []byte(configString), 0o666); err != nil {
		log.Fatalf("Writing config.json failed: %s", err.Error())
	}

	if err := os.WriteFile(".env", []byte(envString), 0o666); err != nil {
		log.Fatalf("Writing .env failed: %s", err.Error())
	}

	if err := os.Mkdir("var", 0o777); err != nil {
		log.Fatalf("Mkdir var failed: %s", err.Error())
	}

	err := repository.MigrateDB("sqlite3", "./var/job.db")
	if err != nil {
		log.Fatalf("Initialize job.db failed: %s", err.Error())
	}
}