Commit

Merge remote-tracking branch 'origin/main' into biblioVino
fsteeg committed Feb 20, 2024
2 parents 5458e96 + 2b42c92 commit f05b812
Showing 197 changed files with 265,893 additions and 242,538 deletions.
10 changes: 2 additions & 8 deletions .github/workflows/build.yml
@@ -11,15 +11,9 @@ jobs:
uses: actions/setup-java@v1
with:
java-version: 1.8
- name: Install metafacture-core 5.7.0-rc1
- name: Install metafacture-fix 0.7.0
run: |
git clone https://github.com/metafacture/metafacture-core.git -b 5.7.0-rc1
cd metafacture-core
./gradlew publishToMavenLocal
cd ..
- name: Install metafacture-fix 0.6.0-rc3
run: |
git clone https://github.com/metafacture/metafacture-fix.git -b 0.6.0-rc3
git clone https://github.com/metafacture/metafacture-fix.git -b 0.7.0
cd metafacture-fix
./gradlew publishToMavenLocal
cd ..
7 changes: 7 additions & 0 deletions .gitignore
@@ -4,6 +4,7 @@ bin
.project
.classpath
.settings
/conf/maps/beacons/*.tsv
/conf/output/*
!/conf/output/test-output-*.json
!/conf/output/rpb-50*
@@ -15,4 +16,10 @@ RPB-Export_HBZ_SW.txt
RPB-Export_HBZ_Tit.txt
RPB-Export_HBZ_Tit_hbzIds.txt
RPB-Export_HBZ_Bio.txt
RPB-Export_HBZ_Ort.txt
RPB-Export_HBZ_Raum.txt
RPB-Export_HBZ_SWN.txt
RPB-Export_HBZ_Syst.txt
RPB-Export_HBZ_ZSS.txt
conf/RPBEXP/*.ZIP
nohup.out*
44 changes: 40 additions & 4 deletions README.md
@@ -21,13 +21,37 @@ cd rpb

## Deployment overview

The overall RPB system consists of 4 applications: RPB & BiblioVino (Java/Play, based on [NWBib](https://github.com/hbz/nwbib)), RPPD (Java/Play, based on [lobid-gnd](https://github.com/hbz/lobid-gnd)), and Strapi-RPB (JavaScript/React).

### RPB & BiblioVino

Source code: https://github.com/hbz/rpb & https://github.com/hbz/rpb/tree/biblioVino (https://github.com/hbz/rpb/pull/52)

| | Production | | Test | |
| --- | ---------- | --- | ---- | --- |
| Index alias | resources-rpb | http://weywot3.hbz-nrw.de:9200/resources-rpb/_search | resources-rpb-test | http://weywot3.hbz-nrw.de:9200/resources-rpb-test/_search |
| lobid-resources instance (port 1990) | quaoar1:~/git/lobid-resources-rpb | http://quaoar1:1990/resources/search | quaoar3:~/git/lobid-resources-rpb | http://quaoar3:1990/resources/search |
| rpb instance (port 1991) | quaoar1:~/git/rpb | https://rpb.lobid.org/search | quaoar3:~/git/rpb | http://test.rpb.lobid.org/search |
| BiblioVino (port 1992) | quaoar1:~/git/biblioVino | http://wein.lobid.org/search | quaoar3:~/git/biblioVino | http://test.wein.lobid.org/search |

### RPPD

Source code: https://github.com/hbz/lobid-gnd/tree/rppd (https://github.com/hbz/lobid-gnd/pull/361)

| | Production | | Test | |
| --- | ---------- | --- | ---- | --- |
| Index alias | gnd-rppd | http://weywot3.hbz-nrw.de:9200/gnd-rppd/_search | gnd-rppd-test | http://weywot3.hbz-nrw.de:9200/gnd-rppd-test/_search |
| rppd instance (port 1993) | quaoar1:~/git/rppd | https://rppd.lobid.org/search | quaoar3:~/git/rppd | http://test.rppd.lobid.org/search |

### Strapi-RPB

Source code: https://github.com/hbz/strapi-rpb

| | Production | | Test | |
| --- | ---------- | --- | ---- | --- |
| Admin UI | metadaten-nrw:/opt/strapi-rpb$ | https://rpb-cms.lobid.org/admin | test-metadaten-nrw:/opt/strapi-rpb$ | https://rpb-cms-test.lobid.org/admin |
| Search API | " | [https://rpb-cms.lobid.org/api/articles?populate=*](https://rpb-cms.lobid.org/api/articles?populate=*) | " | [https://rpb-cms-test.lobid.org/api/articles?populate=*](https://rpb-cms-test.lobid.org/api/articles?populate=*) |

## Transformation development

### Create lookup table
@@ -60,14 +84,14 @@ This attempts to import all data selected with the `PICK` variable to the API en
To reimport existing entries, these may need to be deleted first, e.g. for `articles/1` to `articles/5`:

```
curl --request DELETE http://test-metadaten-nrw.hbz-nrw.de:1339/api/articles/[1-5]
curl --request DELETE http://test-metadaten-nrw.hbz-nrw.de:1337/api/articles/[1-5]
```

After import they are available at e.g. http://test-metadaten-nrw.hbz-nrw.de:1339/api/articles?populate=*
After import they are available at e.g. http://test-metadaten-nrw.hbz-nrw.de:1337/api/articles?populate=*

Entries using the same path can be filtered, e.g. to get only volumes (`f36_=sbd`):

http://test-metadaten-nrw.hbz-nrw.de:1339/api/independent-works?filters[f36_][$eq]=sbd&populate=*
http://test-metadaten-nrw.hbz-nrw.de:1337/api/independent-works?filters[f36_][$eq]=sbd&populate=*

### Import SKOS data into strapi

@@ -119,13 +143,25 @@ sh validateJsonOutput.sh

This validates the resulting files against the JSON schemas in `test/rpb/schemas/`.

### Adding test data

During development, you'll sometimes want to add a record with specific fields or values to the test data, e.g. when handling new fields or fixing edge cases in the transformation. Due to the unusual encoding of the input data (`IBM437`), editing the files in a text editor may result in a faulty encoding. Instead, we can use the command line and append to the test data directly with `>>`.

E.g. to add the last record in `conf/RPB-Export_HBZ_Bio.txt` that contains `#82b` to `conf/RPB-Export_HBZ_Bio_Test.txt`:

```bash
cat conf/RPB-Export_HBZ_Bio.txt | grep -a '#82b' | tail -n 1 >> conf/RPB-Export_HBZ_Bio_Test.txt
```

The `-a` option is required because grep otherwise treats parts of these files as binary data and does not print the matching lines.
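
As an illustration of why copying raw bytes keeps the encoding intact, here is a minimal Java sketch; it is not part of the repository. It assumes records are separated by `\n`. The class name `AppendTestRecord`, the method `lastMatchingLine`, and the sample record are hypothetical. ISO-8859-1 is used only as a byte-preserving stand-in, so no IBM437 decoder is needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class AppendTestRecord {

    // Returns the bytes of the last line in `data` that contains `marker`
    // (without the line terminator), or null if no line matches.
    static byte[] lastMatchingLine(byte[] data, String marker) {
        // ISO-8859-1 maps every byte to exactly one char and back again,
        // so the original (IBM437) bytes survive the round trip unchanged.
        String[] lines = new String(data, StandardCharsets.ISO_8859_1).split("\n");
        for (int i = lines.length - 1; i >= 0; i--) {
            if (lines[i].contains(marker)) {
                return lines[i].getBytes(StandardCharsets.ISO_8859_1);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-ins for the export file and the test file.
        Path source = Files.createTempFile("RPB-Export_HBZ_Bio", ".txt");
        Path target = Files.createTempFile("RPB-Export_HBZ_Bio_Test", ".txt");
        // Sample record: 0x81 is not valid UTF-8 on its own.
        Files.write(source, new byte[] { '#', '8', '2', 'b', ' ', (byte) 0x81, '\n' });

        byte[] record = lastMatchingLine(Files.readAllBytes(source), "#82b");
        if (record != null) {
            // Append the record and a newline, like `>>` in the shell.
            Files.write(target, record, StandardOpenOption.APPEND);
            Files.write(target, new byte[] { '\n' }, StandardOpenOption.APPEND);
        }
        // Compare the two files byte by byte.
        System.out.println(Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(target)));
    }
}
```

The sketch never interprets the bytes as IBM437 or UTF-8, which is the same reason the `grep -a ... >>` pipeline is safe: the record is copied, not re-encoded.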

### Index creation

If you're not indexing into an existing lobid-resources index, make sure to create one with the proper index settings, e.g. to create `resources-rpb-20230623` from `quaoar3`:

```bash
unset http_proxy # for putting on weywot
sol@quaoar3:~/git/rpb$ curl -XPUT -H "Content-Type: application/json" weywot5:9200/resources-rpb-20230623?pretty -d @../lobid-resources/src/main/resources/alma/index-config.json
sol@quaoar3:~/git/rpb$ curl -XPUT -H "Content-Type: application/json" weywot5:9200/resources-rpb-20231130-1045?pretty -d @../lobid-resources-rpb/src/main/resources/alma/index-config.json
```

For testing, the real index name (e.g. `resources-rpb-20230623`) is aliased by `resources-rpb-test`, which is used by https://test.lobid.org/resources / http://test.rpb.lobid.org and in the transformation.
89 changes: 52 additions & 37 deletions app/controllers/nwbib/Application.java
@@ -53,6 +53,7 @@
import play.mvc.Http;
import play.mvc.Result;
import play.mvc.Results;
import play.twirl.api.HtmlFormat;
import views.html.browse_classification;
import views.html.browse_register;
import views.html.classification;
@@ -220,8 +221,8 @@ private static List<Pair<String, String>> cleanSortUnique(
* @param publisher Query for the resource publisher
* @param issued Query for the resource issued year
* @param medium Query for the resource medium
* @param nwbibspatial Query for the resource nwbibspatial classification
* @param nwbibsubject Query for the resource nwbibsubject classification
* @param rpbspatial Query for the resource rpbspatial classification
* @param rpbsubject Query for the resource rpbsubject classification
* @param from The page start (offset of page of resource to return)
* @param size The page size (size of page of resource to return)
* @param owner Owner filter for resource queries
@@ -238,10 +239,11 @@ private static List<Pair<String, String>> cleanSortUnique(
public static Promise<Result> search(final String q, final String person,
final String name, final String subject, final String id,
final String publisher, final String issued, final String medium,
final String nwbibspatial, final String nwbibsubject, final int from,
final String rpbspatial, final String rpbsubject, final int from,
final int size, final String owner, String t, String sort,
boolean details, String location, String word, String corporation,
String raw, String format) {
response().setHeader("Access-Control-Allow-Origin", "*");
String uuid = session("uuid");
if (uuid == null)
session("uuid", UUID.randomUUID().toString());
@@ -262,17 +264,30 @@ public static Promise<Result> search(final String q, final String person,
if (form.hasErrors())
return Promise.promise(
() -> badRequest(search.render(null, q, person, name, subject, id,
publisher, issued, medium, nwbibspatial, nwbibsubject, from, size,
publisher, issued, medium, rpbspatial, rpbsubject, from, size,
0L, owner, t, sort, location, word, corporation, raw)));
String query = form.data().get("q");
Promise<Result> result = okPromise(query != null ? query : q, person, name,
subject, id, publisher, issued, medium, nwbibspatial, nwbibsubject,
subject, id, publisher, issued, medium, rpbspatial, rpbsubject,
from, size, owner, t, sort, details, location, word, corporation, raw,
format.isEmpty() ? "html" : format);
cacheOnRedeem(cacheId, result, ONE_HOUR);
return result;
}

public static Promise<Result> searchSpatial(final String id, final int from, final int size,
final String format) {
return Promise.pure(found(routes.Application.search("", "", "", "", "", "", "", "",
"https://rpb.lobid.org/spatial#n" + id, "", from, size, "", "", "", false, "", "",
"", "", format)));
}

public static Promise<Result> showPl(String name, String db, int index, int zeilen, String s1) {
return Promise
.pure(ok("<head><meta http-equiv='Refresh' content='0; URL=https://rppd.lobid.org/"
+ HtmlFormat.escape(s1) + "'/></head>").as("text/html"));
}

/**
* @param id The resource ID.
* @param format The requested resource format (html, json).
@@ -401,8 +416,8 @@ public static Result download(final String t) {
"attachment; filename=" + filename);
try {
return ok(new URL(CONFIG
.getString(t.equals("Raumsystematik") ? "index.data.nwbibspatial"
: "index.data.nwbibsubject")).openStream());
.getString(t.equals("Raumsystematik") ? "index.data.rpbspatial"
: "index.data.rpbsubject")).openStream());
} catch (IOException e) {
e.printStackTrace();
return internalServerError(e.getMessage());
@@ -476,18 +491,18 @@ private static Result classificationResult(String t, String placeholder) {
private static Promise<Result> okPromise(final String q, final String person,
final String name, final String subject, final String id,
final String publisher, final String issued, final String medium,
final String nwbibspatial, final String nwbibsubject, final int from,
final String rpbspatial, final String rpbsubject, final int from,
final int size, final String owner, String t, String sort,
boolean details, String location, String word, String corporation,
String raw, String format) {
final Promise<Result> result = call(q, person, name, subject, id, publisher,
issued, medium, nwbibspatial, nwbibsubject, from, size, owner, t, sort,
issued, medium, rpbspatial, rpbsubject, from, size, owner, t, sort,
details, location, word, corporation, raw, format);
return result.recover((Throwable throwable) -> {
Logger.error("Could not call Lobid", throwable);
flashError();
return internalServerError(search.render("[]", q, person, name, subject,
id, publisher, issued, medium, nwbibspatial, nwbibsubject, from, size,
id, publisher, issued, medium, rpbspatial, rpbsubject, from, size,
0L, owner, t, sort, location, word, corporation, raw));
});
}
@@ -511,12 +526,12 @@ private static void cacheOnRedeem(final String cacheId,
static Promise<Result> call(final String q, final String person,
final String name, final String subject, final String id,
final String publisher, final String issued, final String medium,
final String nwbibspatial, final String nwbibsubject, final int from,
final String rpbspatial, final String rpbsubject, final int from,
final int size, String owner, String t, String sort, boolean showDetails,
String location, String word, String corporation, String raw,
String format) {
final WSRequest requestHolder = Lobid.request(q, person, name, subject, id,
publisher, issued, medium, nwbibspatial, nwbibsubject, from, size,
publisher, issued, medium, rpbspatial, rpbsubject, from, size,
owner, t, sort, location, word, corporation, raw);
return requestHolder.get().map((WSResponse response) -> {
Long hits = 0L;
@@ -549,7 +564,7 @@ static Promise<Result> call(final String q, final String person,

return format.equals("html")
? ok(search.render(s, q, person, name, subject, id, publisher, issued,
medium, nwbibspatial, nwbibsubject, from, size, hits, owner, t,
medium, rpbspatial, rpbsubject, from, size, hits, owner, t,
sort, location, word, corporation, raw))
: ok(new ObjectMapper().writerWithDefaultPrettyPrinter()
.writeValueAsString(Json.parse(s)))
@@ -572,8 +587,8 @@ private static void uncache(List<String> ids) {
* @param publisher Query for the resource publisher
* @param issued Query for the resource issued year
* @param medium Query for the resource medium
* @param nwbibspatial Query for the resource nwbibspatial classification
* @param nwbibsubject Query for the resource nwbibsubject classification
* @param rpbspatial Query for the resource rpbspatial classification
* @param rpbsubject Query for the resource rpbsubject classification
* @param from The page start (offset of page of resource to return)
* @param size The page size (size of page of resource to return)
* @param owner Owner filter for resource queries
@@ -588,14 +603,14 @@ private static void uncache(List<String> ids) {
*/
public static Promise<Result> facets(String q, String person, String name,
String subject, String id, String publisher, String issued, String medium,
String nwbibspatial, String nwbibsubject, int from, int size,
String rpbspatial, String rpbsubject, int from, int size,
String owner, String t, String field, String sort, String location,
String word, String corporation, String raw) {

String key = String.format(
"facets.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s", field, q,
person, name, id, publisher, location, word, corporation, raw, subject,
issued, medium, nwbibspatial, nwbibsubject, owner, t);
issued, medium, rpbspatial, rpbsubject, owner, t);
Result cachedResult = (Result) Cache.get(key);
if (cachedResult != null) {
return Promise.promise(() -> cachedResult);
@@ -624,9 +639,9 @@ public static Promise<Result> facets(String q, String person, String name,
Comparator<Pair<JsonNode, String>> sorter = (p1, p2) -> {
String t1 = p1.getLeft().get("key").asText();
String t2 = p2.getLeft().get("key").asText();
boolean t1Current = current(subject, medium, nwbibspatial, nwbibsubject,
boolean t1Current = current(subject, medium, rpbspatial, rpbsubject,
owner, t, field, t1, raw);
boolean t2Current = current(subject, medium, nwbibspatial, nwbibsubject,
boolean t2Current = current(subject, medium, rpbspatial, rpbsubject,
owner, t, field, t2, raw);
if (t1Current == t2Current) {
if (!field.equals(ISSUED_FIELD)) {
@@ -655,12 +670,12 @@ public static Promise<Result> facets(String q, String person, String name,
: queryParam(t, term);
String ownerQuery = !field.equals(ITEM_FIELD) ? owner //
: withoutAndOperator(queryParam(owner, term));
String nwbibsubjectQuery =
!field.equals(RPB_SUBJECT_FIELD) ? nwbibsubject //
: queryParam(nwbibsubject, term);
String nwbibspatialQuery =
!field.equals(NWBIB_SPATIAL_FIELD) ? nwbibspatial //
: queryParam(nwbibspatial, term);
String rpbsubjectQuery =
!field.equals(RPB_SUBJECT_FIELD) ? rpbsubject //
: queryParam(rpbsubject, term);
String rpbspatialQuery =
!field.equals(NWBIB_SPATIAL_FIELD) ? rpbspatial //
: queryParam(rpbspatial, term);
String rawQuery = !field.equals(COVERAGE_FIELD) ? raw //
: rawQueryParam(raw, term);
String locationQuery = !field.equals(SUBJECT_LOCATION_FIELD) ? location //
@@ -670,12 +685,12 @@ public static Promise<Result> facets(String q, String person, String name,
String issuedQuery = !field.equals(ISSUED_FIELD) ? issued //
: queryParam(issued, term);

boolean current = current(subject, medium, nwbibspatial, nwbibsubject,
boolean current = current(subject, medium, rpbspatial, rpbsubject,
owner, t, field, term, raw);
String routeUrl = routes.Application.search(q, person, name, subjectQuery,
id, publisher, issuedQuery, mediumQuery, nwbibspatialQuery,
nwbibsubjectQuery, from, size, ownerQuery, typeQuery,
sort(sort, nwbibspatialQuery, nwbibsubjectQuery, subjectQuery), false,
id, publisher, issuedQuery, mediumQuery, rpbspatialQuery,
rpbsubjectQuery, from, size, ownerQuery, typeQuery,
sort(sort, rpbspatialQuery, rpbsubjectQuery, subjectQuery), false,
locationQuery, word, corporation, rawQuery, "").url();

String result = String.format(
@@ -690,7 +705,7 @@ public static Promise<Result> facets(String q, String person, String name,
};

Promise<Result> promise = Lobid.getFacets(q, person, name, subject, id,
publisher, issued, medium, nwbibspatial, nwbibsubject, owner, field, t,
publisher, issued, medium, rpbspatial, rpbsubject, owner, field, t,
location, word, corporation, raw).map(json -> {
Stream<JsonNode> stream = StreamSupport.stream(
Spliterators.spliteratorUnknownSize(json.findValue("aggregation")
@@ -705,7 +720,7 @@ public static Promise<Result> facets(String q, String person, String name,
String labelKey = String.format(
"facets-labels.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s.%s",
field, raw, q, person, name, id, publisher, word, corporation,
subject, issued, medium, nwbibspatial, nwbibsubject, raw,
subject, issued, medium, rpbspatial, rpbsubject, raw,
field.equals(ITEM_FIELD) ? "" : owner, t, location);

@SuppressWarnings("unchecked")
@@ -723,22 +738,22 @@ public static Promise<Result> facets(String q, String person, String name,
return promise;
}

private static String sort(String sort, String nwbibspatialQuery,
String nwbibsubjectQuery, String subjectQuery) {
return (nwbibspatialQuery + nwbibsubjectQuery + subjectQuery).contains(",")
private static String sort(String sort, String rpbspatialQuery,
String rpbsubjectQuery, String subjectQuery) {
return (rpbspatialQuery + rpbsubjectQuery + subjectQuery).contains(",")
? ""
/* relevance */ : sort;
}

private static boolean current(String subject, String medium,
String nwbibspatial, String nwbibsubject, String owner, String t,
String rpbspatial, String rpbsubject, String owner, String t,
String field, String term, String raw) {
return field.equals(MEDIUM_FIELD) && contains(medium, term)
|| field.equals(TYPE_FIELD) && contains(t, term)
|| field.equals(ITEM_FIELD) && contains(owner, term)
|| field.equals(NWBIB_SPATIAL_FIELD) && contains(nwbibspatial, term)
|| field.equals(NWBIB_SPATIAL_FIELD) && contains(rpbspatial, term)
|| field.equals(COVERAGE_FIELD) && rawContains(raw, quotedEscaped(term))
|| field.equals(RPB_SUBJECT_FIELD) && contains(nwbibsubject, term)
|| field.equals(RPB_SUBJECT_FIELD) && contains(rpbsubject, term)
|| field.equals(SUBJECT_FIELD) && contains(subject, term);
}
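
The renamed `sort` helper in this hunk can be tried in isolation. The following is a minimal sketch that copies its logic into a standalone class; the class name `SortSketch` and the sample values (`"newest"`, the spatial URIs) are hypothetical and only serve to show the two branches.

```java
public class SortSketch {

    // Same logic as the helper in the diff: if any of the three queries
    // contains a comma (i.e. several values), the requested sort is dropped
    // and "" is returned, which the original code comments as "relevance".
    static String sort(String sort, String rpbspatialQuery,
            String rpbsubjectQuery, String subjectQuery) {
        return (rpbspatialQuery + rpbsubjectQuery + subjectQuery).contains(",")
                ? "" /* relevance */
                : sort;
    }

    public static void main(String[] args) {
        // Single value: the requested sort is kept.
        System.out.println(sort("newest", "spatial#n1", "", ""));
        // Two comma-separated values: falls back to the empty sort.
        System.out.println("[" + sort("newest", "spatial#n1,spatial#n2", "", "") + "]");
    }
}
```

Because the three query strings are simply concatenated before the check, a comma in any one of them is enough to switch to the empty (relevance) sort.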
