Extend the json-ld context #232

Closed
1 task done
jh-RLI opened this issue Dec 11, 2024 · 18 comments · Fixed by #234
Assignees
Labels
part: backend 🧱 Backend component priority: critical 🔥 Critical priority status: completed ✔️ Task has been completed type: feature 🛠 New feature or request

Comments

@jh-RLI
Contributor

jh-RLI commented Dec 11, 2024

Description of the issue

The oemetadata should be available as RDF data and as valid JSON-LD. A good first step is to complete the context definition available in the context.json file for each release. Once it is complete, we can use existing tools to convert the oemetadata.json format to the RDF format.

Ideas of solution

Complete the context.json file for v20 of the oemetadata json.
Add a script to convert to RDF.
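
To illustrate what the context does during conversion, here is a minimal sketch in plain Python. It is not a JSON-LD processor and not the project's conversion script: it only handles flat, string-valued keys, and the reduced `CONTEXT`, the function names `expand_iri` and `to_triples`, and the example subject IRI are all assumptions made for illustration. A real conversion should rely on an existing JSON-LD library.

```python
# Hypothetical, heavily reduced context; the real context.json is much larger.
CONTEXT = {
    "dct": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "title": {"@id": "dct:title", "@type": "xsd:string"},
    "description": {"@id": "dct:description", "@type": "xsd:string"},
}


def expand_iri(value, context):
    """Expand a compact IRI such as 'dct:title' using the prefixes in the context."""
    prefix, sep, suffix = value.partition(":")
    base = context.get(prefix)
    # Only expand when the prefix is a declared namespace string.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return value


def to_triples(subject, metadata, context):
    """Return (subject, predicate, literal, datatype) tuples for flat string keys."""
    triples = []
    for key, value in metadata.items():
        term = context.get(key)
        if not isinstance(term, dict) or "@id" not in term:
            continue  # keys without a mapping in the context are dropped
        predicate = expand_iri(term["@id"], context)
        datatype = expand_iri(term.get("@type", "xsd:string"), context)
        triples.append((subject, predicate, str(value), datatype))
    return triples


for triple in to_triples(
    "http://example.org/dataset/1",  # assumed subject IRI
    {"title": "Demo", "unknownKey": "x"},
    CONTEXT,
):
    print(triple)
```

The point of the sketch is that a key without an entry in the context (here `unknownKey`) produces no triple, which is why the context needs to be complete before converting.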

Workflow checklist

vismayajochem added a commit that referenced this issue Dec 12, 2024
vismayajochem added a commit that referenced this issue Dec 12, 2024
@Ludee Ludee added priority: high ⚡ High priority part: backend 🧱 Backend component type: feature 🛠 New feature or request status: active 🚧 Work in progress labels Dec 12, 2024
vismayajochem added a commit that referenced this issue Dec 19, 2024
vismayajochem added a commit that referenced this issue Dec 19, 2024
vismayajochem added a commit that referenced this issue Dec 19, 2024
@Ludee
Member

Ludee commented Jan 6, 2025

@vismayajochem
Collaborator

Two questions:
In the context.json file, "path" is defined as "@id" in line 3. Does that conflict with defining "path" again later on according to the metadata? Is it possible to use the same name for two different things?
The field "fields" appears twice in the metadata: once as a category, which corresponds to @nest, and once as a field that needs to be defined by an ontology term. How can that be resolved?

@vismayajochem
Collaborator

vismayajochem commented Jan 6, 2025

I could not find the following terms in any ontology, so they are still missing from the JSON-LD context:

The first two (@id and @context) don't need any further description.

  • @id
  • @context
  • Topics
  • isActive
  • address
  • latitude
  • longitude
  • resolutionValue
  • resolutionUnit
  • crs
  • allignment
  • nullable
  • primaryKey
  • decimalSeperator

@jh-RLI
Contributor Author

jh-RLI commented Jan 8, 2025

Regarding the question about duplicated keys: there is a concept in JSON-LD called aliases that we can use. The JSON-LD processor should still be able to map the properties between the context and the metadata even if you use an alias name in the context:

{
    "@context": {
        "nestedFields": {
            "@id": "@nest",
            "@type": "@id"
        },
        "schemaFields": {
            "@id": "csvw:column",
            "@type": "csvw:Column"
        }
    }
}

I found some more properties for the spatial fields (lat, long ...)

"latitude": {
    "@id": "geo:lat",
    "@type": "xsd:decimal"
},
"longitude": {
    "@id": "geo:long",
    "@type": "xsd:decimal"
},
"crs": {
    "@id": "geo:crs",
    "@type": "xsd:string"
}

Spatial resolution is trickier. There must be a vocabulary for it, but so far I have only found this one:
https://www.w3.org/TR/vocab-dcat-3/#Property:dataset_spatial_resolution
In the DCAT vocabulary they use "rdfs:Literal" as the parent for resolution; we could do the same so that it is at least mapped to something.

For "address" we could use https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#Location, but there might also be an "address" term in the geo vocabulary I mentioned above.

There is also a term for topics ("Topic of interest to a person") in the FOAF vocabulary:
http://xmlns.com/foaf/spec/#term_topic

Another option is https://www.w3.org/TR/vocab-dcat-3/#Property:record_primary_topic, but this one is aimed more at topics for catalogs of various datasets. We would need topics for datasets.

@jh-RLI
Contributor Author

jh-RLI commented Jan 8, 2025

Another solution would be https://www.w3.org/TR/json-ld11/#scoped-contexts but they seem to be more effort. I think the aliases should work for now.

@vismayajochem
Collaborator

For "fields" your solution seems to work. How should we do it for "path"?
For address in the geo ontology I only found http://purl.obolibrary.org/obo/BFO_0000040 "material entity", which is much broader than address. So I suggest using dc:location.

vismayajochem added a commit that referenced this issue Jan 9, 2025
@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

I think it would be okay to have a "path" and a "sourcePath", or am I missing something? :)

vismayajochem added a commit that referenced this issue Jan 9, 2025
@Ludee Ludee added the other: help wanted 🙋 Extra attention is needed label Jan 9, 2025
@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

I will add some code to your branch that will validate the context and check if it works with the oemetadata properties.
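
A rough idea of what such a check could look like, as a sketch in plain Python and not the code that was actually added to the branch: it collects every key used in a metadata document and reports the ones that have no term in the context. The function names `collect_keys` and `unmapped_keys` and the sample data are assumptions for illustration only.

```python
def collect_keys(node, found=None):
    """Recursively collect every object key used in a JSON document."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                continue  # do not descend into an embedded context
            found.add(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


def unmapped_keys(metadata, context):
    """Return metadata keys that are neither JSON-LD keywords nor context terms."""
    keys = collect_keys(metadata)
    return sorted(k for k in keys if not k.startswith("@") and k not in context)


# Hypothetical sample data, reduced to a few keys.
sample_context = {
    "name": {"@id": "ex:name"},
    "resources": {"@id": "ex:resource", "@container": "@set"},
    "path": "@id",
}
sample_metadata = {
    "@context": {"name": "ex:name"},
    "name": "demo",
    "resources": [{"path": "a.csv", "isActive": True}],
}

print(unmapped_keys(sample_metadata, sample_context))
```

This only checks that a term exists for each key; it does not check that the term definition itself is valid JSON-LD, which still needs a real processor.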

@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

I have also found a vocabulary for tables (there are also primary keys):
https://www.w3.org/TR/tabular-metadata/

By the way, I have found that it is much more successful to search for “xxxx vocabulary” instead of “xxx ontology” when searching for terms to add to the context.

@Ludee Ludee added priority: critical 🔥 Critical priority and removed priority: high ⚡ High priority labels Jan 9, 2025
@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

Well, writing the tests showed that our current approach of resolving the nested structures ("@nest") is not entirely correct. We need to describe the context in more detail. I found out that it is possible to specify collection properties such as the resources: [{}] property. Here we can't just use @nest, because it is an array of objects. In this case we would use:

        "resources": {
            "@container": "@set",
            "@id": "ex:resource"
        },

@set can be used for an unordered list of objects; @list would be an ordered list.

Some things are still missing, but I was able to make the JSON-LD processor validate the current context. This is how it currently looks (as I said, not complete yet).
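
To make the difference concrete, here is a small sketch of the RDF shape each container leads to. It is plain Python with hypothetical helper names (`set_triples`, `list_triples`) and simplified blank node labels, not output from a JSON-LD processor. As far as I understand the JSON-LD 1.1 spec, plain arrays are already unordered, and "@set" mainly forces the array form when compacting, while "@list" produces an RDF collection.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def set_triples(subject, predicate, values):
    """@set: one triple per value; the order carries no meaning in RDF."""
    return [(subject, predicate, value) for value in values]


def list_triples(subject, predicate, values):
    """@list: an rdf:first / rdf:rest chain of blank nodes ending in rdf:nil."""
    if not values:
        return [(subject, predicate, RDF + "nil")]
    nodes = [f"_:b{i}" for i in range(len(values))]  # simplified blank node labels
    triples = [(subject, predicate, nodes[0])]
    for i, value in enumerate(values):
        rest = nodes[i + 1] if i + 1 < len(values) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", value))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


for triple in set_triples("ex:dataset", "ex:resource", ["ex:r1", "ex:r2"]):
    print("set ", triple)
for triple in list_triples("ex:dataset", "ex:resource", ["ex:r1", "ex:r2"]):
    print("list", triple)
```

For resources, where the order of the entries should not matter, the flat "@set" shape is the simpler one to query.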

Note: I added the ex namespace for example values. I remove all "xx" placeholder values before validation.
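
Removing the placeholders can be sketched like this. The sketch assumes that a placeholder shows up either as the bare string "xx" or as "@id": "xx" inside a term definition; the actual context.json may mark them differently, and the function name `drop_placeholders` is hypothetical.

```python
def drop_placeholders(context, placeholder="xx"):
    """Return a copy of the context without terms that still map to the placeholder."""
    cleaned = {}
    for term, definition in context.items():
        if definition == placeholder:
            continue  # term maps directly to the placeholder string
        if isinstance(definition, dict) and definition.get("@id") == placeholder:
            continue  # expanded term definition with a placeholder "@id"
        cleaned[term] = definition
    return cleaned


# Hypothetical draft context with two unfinished terms.
draft = {
    "ex": "http://example.org/",
    "resources": {"@container": "@set", "@id": "ex:resource"},
    "nullable": {"@id": "xx"},
    "primaryKey": "xx",
}

print(drop_placeholders(draft))
```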

@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

@vismayajochem I have now added something for everything. It would be good to check again. Also, I sometimes use "ex": "http://example.org/" if I didn't find anything better (this is a better placeholder than "xx" :) ).

Don't forget to pull the changes first.

jh-RLI added a commit that referenced this issue Jan 9, 2025
jh-RLI added a commit that referenced this issue Jan 9, 2025
@jh-RLI
Contributor Author

jh-RLI commented Jan 9, 2025

Here is a full example of what the RDF version looks like.
result.json

vismayajochem added a commit that referenced this issue Jan 16, 2025
@vismayajochem
Collaborator

vismayajochem commented Jan 16, 2025

A few things I noticed while reviewing the context.json:

  • "FundingAgency" and "FundingAgencyLogo" currently refer to the same OEO term. Is that OK, or is it a problem?
  • The "@type": "csvw:Column" should be removed since "Column" isn't a literal.
  • Could we use dct:subject or dc:subject for "subject"? It does not seem an exact fit to me, but an acceptable one.
  • For "extend" I would suggest dct:spatial.
  • For "spatialResolutionValue" we could use dcat:spatialResolutionInMeters even though that's narrower (since it requires meters).
  • For "resolutionUnit" we could use qudt:Unit (QUDT - Quantities, Units, Dimensions, and Data Types Ontologies) with IRI: http://qudt.org/vocab/unit/
  • For "source" I would recommend using dct:source.
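
The first two points of the review above could be caught automatically. The following is a heuristic sketch in plain Python, not an existing tool: `duplicate_ids` reports IRIs shared by several terms, and `suspicious_types` flags "@type" values that are neither a JSON-LD keyword nor in an assumed datatype namespace (only "xsd:" here). The sample context and the "ex:funding" IRI are made up for illustration.

```python
from collections import defaultdict

# Keywords that JSON-LD 1.1 allows as "@type" in a term definition.
KEYWORD_TYPES = {"@id", "@vocab", "@json", "@none"}


def duplicate_ids(context):
    """Group terms by '@id' and return only the IRIs used by more than one term."""
    by_id = defaultdict(list)
    for term, definition in context.items():
        if isinstance(definition, dict) and "@id" in definition:
            by_id[definition["@id"]].append(term)
    return {iri: terms for iri, terms in by_id.items() if len(terms) > 1}


def suspicious_types(context, datatype_prefixes=("xsd:",)):
    """Return terms whose '@type' is neither a keyword nor a known datatype prefix."""
    flagged = {}
    for term, definition in context.items():
        if not isinstance(definition, dict):
            continue
        type_ = definition.get("@type")
        if type_ is None or type_ in KEYWORD_TYPES:
            continue
        if not type_.startswith(datatype_prefixes):
            flagged[term] = type_
    return flagged


# Hypothetical excerpt; "ex:funding" stands in for the shared OEO term.
review_context = {
    "FundingAgency": {"@id": "ex:funding"},
    "FundingAgencyLogo": {"@id": "ex:funding"},
    "schemaFields": {"@id": "csvw:column", "@type": "csvw:Column"},
    "latitude": {"@id": "geo:lat", "@type": "xsd:decimal"},
}

print(duplicate_ids(review_context))
print(suspicious_types(review_context))
```

Two terms sharing one IRI is not invalid JSON-LD, but both keys then collapse into the same RDF property, so the report is meant as a prompt for review and not as an error.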

@jh-RLI
Contributor Author

jh-RLI commented Jan 16, 2025

Thanks for the review! That sounds great. I will implement it that way. I'm not sure about the resolution in meters yet. As far as I know, most of the time it will be meters anyway, so I agree with your suggestion.

Ludee added a commit that referenced this issue Jan 16, 2025
Ludee added a commit that referenced this issue Jan 16, 2025
jh-RLI added a commit that referenced this issue Jan 17, 2025
jh-RLI added a commit that referenced this issue Jan 17, 2025
jh-RLI added a commit that referenced this issue Jan 17, 2025
@vismayajochem
Collaborator

vismayajochem commented Jan 21, 2025

If there is @nest there is no @id given. Is that correct or are they missing?

vismayajochem added a commit that referenced this issue Jan 21, 2025
@vismayajochem
Collaborator

I couldn't find a working URL for the sdo ontology. Does someone know this ontology and could provide the URL?
Also the URLs for the geo ontology http://www.georss.org/georss/ don't work for me and I don't get why.

vismayajochem added a commit that referenced this issue Jan 21, 2025
@vismayajochem
Collaborator

sc:isNullable does not exist in https://schema.org/, so we need an alternative for "nullable".

Ludee added a commit that referenced this issue Jan 22, 2025
Ludee added a commit that referenced this issue Jan 22, 2025
@Ludee
Member

Ludee commented Jan 23, 2025

@Ludee Ludee moved this to In Review in OEMetadata v2.0 Jan 23, 2025
Ludee added a commit that referenced this issue Jan 23, 2025
@Ludee Ludee moved this from In Review to Done in OEMetadata v2.0 Jan 23, 2025
@Ludee Ludee added status: completed ✔️ Task has been completed and removed other: help wanted 🙋 Extra attention is needed status: active 🚧 Work in progress labels Jan 23, 2025
3 participants