Fix DDC issues #8

Closed
1 of 3 tasks
stefandesu opened this issue Mar 4, 2021 · 7 comments

@stefandesu
Member

stefandesu commented Mar 4, 2021

There are some issues with our version of DDC German (in https://coli-conc.gbv.de/api/) and our tools that need to be addressed before coli-ana can work properly:

  • We need to get an updated version of DDC that includes all classes since some classes are missing in our version.
  • We need to check whether our conversion of DDC to JSKOS works properly (in particular, check if classes exist that include a : in their notation, for example T1--0901-T1--0905:07).
    • Note that the Norwegian version of DDC does not include these notations either. Could they be an addition to German DDC only?
  • jskos-tools' uriFromNotation method currently does not work correctly with the tables. For example, uriFromNotation("T1--0") returns http://dewey.info/class/T1--0/e23/ even though it should be http://dewey.info/class/1--0/e23/.
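
The expected behaviour in the last point can be sketched as follows. This is only an illustration, not the jskos-tools implementation: `ddcUriFromNotation` is a hypothetical name, and the URI shape `http://dewey.info/class/<notation>/e23/` is assumed from the example above.

```javascript
// Hypothetical helper, NOT the actual jskos-tools uriFromNotation.
// Assumption: DDC URIs look like http://dewey.info/class/<notation>/e23/
// and table notations drop their leading "T" inside the URI.
function ddcUriFromNotation(notation) {
  // "T1--0" -> "1--0"; notations without a table prefix stay unchanged
  const local = notation.replace(/^T(?=\d[ABC]?--)/, "")
  return `http://dewey.info/class/${local}/e23/`
}

console.log(ddcUriFromNotation("T1--0"))
console.log(ddcUriFromNotation("700"))
```

Only a leading `T` directly followed by a table number and `--` is stripped, so regular class numbers pass through untouched.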
@nichtich
Member

nichtich commented Mar 5, 2021

I created an issue at jskos-tools (gbv/jskos-tools#28), but I think the best workaround for notations starting with T is to remove the T on import, so T1--0 becomes 1--0. Number spans in tables also need to be modified, e.g. T1--0901-T1--0905:07 => 1--0901-0905. In summary (Perl syntax):

s/:\d+$//; # remove colon suffix
if ($_ =~ /^T(\d[ABC]?)--/) {
  $table = $1;
  $_ =~ s/T\d[ABC]?--//g; # remove all T...
  $_ = $table . "--" . $_; # add back removed table number
}

This matches the notation pattern used in existing DDC data in RDF.

@nichtich
Member

nichtich commented Mar 5, 2021

To clarify classes with colon suffixes: T1--0901-T1--0905:07 is actually T1-0901-0905 with multiple notations:

  • 1--0901-0905 (internally/fallback)
  • T1--0901-0905 (display, retrieved from backend database with full DDC data)
  • --0.901--------, --0.902--------... (coli-ana, depending on example)

@stefandesu
Member Author

To clarify classes with colon suffixes: T1--0901-T1--0905:07 is actually T1-0901-0905 with multiple notations:

  • 1--0901-0905 (internally/fallback)
  • T1--0901-0905 (display, retrieved from backend database with full DDC data)
  • --0.901--------, --0.902--------... (coli-ana, depending on example)

Could you clarify this further? I still don't understand this part. Referring to WebDewey Deutsch, they have different labels:

  • T1--0901-T1--0905 = Zeitabschnitte
  • T1--0901-T1--0905:07 = Museen, Sammlungen, Ausstellungen; Sammeln von Objekten

In the decomposition of 700.90440747471, both T1--0901-T1--0905 and T1--0901-T1--0905:07 are listed. The former, however, for some reason carries the suffix :0904, which confuses me even further (it seems to refer to the next line, T1--0904).

@stefandesu
Copy link
Member Author

In summary (Perl syntax):

Apart from removing the colon suffix (see my previous comment), this looks good. I could try porting it to JavaScript and adding it to coli-ana first (and if it works as expected, we could add it to jskos-tools).
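
A JavaScript port of the quoted Perl rule might look roughly like this. It is a sketch only: `normalizeTableNotation` and the `stripColonSuffix` option are hypothetical names, and the colon suffix is kept by default because its meaning is still under discussion in this thread.

```javascript
// Sketch of a JavaScript port of the Perl snippet; names are hypothetical.
function normalizeTableNotation(notation, { stripColonSuffix = false } = {}) {
  let result = notation
  if (stripColonSuffix) {
    // Perl: s/:\d+$//
    result = result.replace(/:\d+$/, "")
  }
  // Perl: if ($_ =~ /^T(\d[ABC]?)--/)
  const match = result.match(/^T(\d[ABC]?)--/)
  if (match) {
    const table = match[1]
    // remove every "T<table>--" prefix, including the one inside a span
    result = result.replace(/T\d[ABC]?--/g, "")
    // add the table number back once, without the leading "T"
    result = `${table}--${result}`
  }
  return result
}

console.log(normalizeTableNotation("T1--0"))
console.log(normalizeTableNotation("T1--0901-T1--0905:07", { stripColonSuffix: true }))
```

Notations that do not start with a table prefix are returned unchanged, so the function can be applied to every notation during import.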

@nichtich
Member

nichtich commented Mar 5, 2021

The workaround is implemented, but:

  • we'd better directly add the uri in convert.js instead of using uriFromNotation, so we only have two notations (one to display normally, one for the decomposition table) and look up classes via their URI.
  • I've re-introduced the colons for further inspection

@stefandesu
Member Author

  • we'd better directly add the uri in convert.js instead of using uriFromNotation, so we only have two notations (one to display normally, one for the decomposition table) and look up classes via their URI.

Yeah, I didn't add URIs there in order to save space in the database, but we can add the URIs during conversion.
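
As a rough illustration of the lookup-by-URI idea: the record layout below is an assumption, not the actual output of convert.js, and the URI is assumed to follow the `http://dewey.info/class/<notation>/e23/` pattern. The two notations are taken from the example discussed earlier in this thread.

```javascript
// Hypothetical converted records: each carries its URI plus two notations
// (one for normal display, one for the decomposition table).
const classes = [
  {
    uri: "http://dewey.info/class/1--0901-0905/e23/", // assumed URI shape
    notation: ["T1--0901-0905", "--0.901--------"],
  },
]

// Index the records once so that classes are looked up via their URI
const classesByUri = new Map(classes.map((cls) => [cls.uri, cls]))

function findClass(uri) {
  // returns null when no class with this URI is known
  return classesByUri.get(uri) ?? null
}

console.log(findClass("http://dewey.info/class/1--0901-0905/e23/").notation[0])
```

Storing the URI during conversion costs some space per record, but it removes the need to derive URIs from notations at lookup time.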

nichtich added this to the 0.2.0 milestone Mar 8, 2021
stefandesu added a commit that referenced this issue Mar 9, 2021
nichtich closed this as completed Mar 9, 2021
@nichtich
Member

nichtich commented Mar 9, 2021

Closed in favor of more specific #13.
