From 75bdb7f50306be9f8c5ca567791639ea60f9d613 Mon Sep 17 00:00:00 2001
From: Oktie Hassanzadeh
Date: Fri, 24 May 2024 15:23:25 -0400
Subject: [PATCH] Metadata2KG Round 2 instructions

---
 data/metadata2kg/round2/README.md     |   2 +-
 docs/tracks/metadata-to-kg-track.html | 251 ++++++++++++++------------
 2 files changed, 134 insertions(+), 119 deletions(-)

diff --git a/data/metadata2kg/round2/README.md b/data/metadata2kg/round2/README.md
index e8a387f..3e86bb1 100644
--- a/data/metadata2kg/round2/README.md
+++ b/data/metadata2kg/round2/README.md
@@ -1,6 +1,6 @@
 # Metadata to KG Track Round 2 Datasets
 
-In this round, a JSONL file is provided with each line representing a column in a table, along with table name, column name, and other columns in the same table. The goal is to map each such column to one "business glossary" item. We have also provided the metadata as well as the glossary in the form of an OWL ontology, to facilitate the mapping using ontology matching tools.
+In this round, a JSONL file is provided with each line representing a column in a table, along with table name, column name, and other columns in the same table. The goal is to map each such column to one "glossary" item. We have also provided the metadata as well as the glossary in the form of an OWL ontology, to facilitate the mapping using ontology matching tools.
 
 Sample data:
 - [Sample Metadata File in JSONL](r2_sample_metadata.jsonl)
diff --git a/docs/tracks/metadata-to-kg-track.html b/docs/tracks/metadata-to-kg-track.html
index c82183d..5854294 100644
--- a/docs/tracks/metadata-to-kg-track.html
+++ b/docs/tracks/metadata-to-kg-track.html
@@ -118,137 +118,152 @@

Round 1

 JSONL file with each line containing a mapping of a column ID to an array of DBpedia property URIs and scores, which will be sorted in descending order by score for evaluation. Round 1 data has one mapping for each column, which is the most relevant property it maps to (e.g., if the column is about movie directors, the correct
- mapping should be https://dbpedia.org/ontology/director). In Round 2, each column may map to more than one
- property/class, or no property/class at all. The provided evaluation script measures Hit@1 and Hit@5. Other
- measures may be added for final evaluation and in Round 2.

+ mapping should be https://dbpedia.org/ontology/director).

-
-
-

Round 2

-

Round 2 dataset will consist of metadata of a number of relational tables and - a custom ontology. Stay tuned. -

-
- -
-
-

Submission Instructions

-

Submission: Are you ready? Then, submit up to 4 result sets for the test - set using the Submission Form.

-
- - - -
-

Track Organizers

- +
+

Round 2

+

+ Round 2 dataset
+ consists of a select set of open data table metadata
+ that need to be mapped to a custom glossary (dictionary of term labels and descriptions).

-
+

+ Check out the README file
+ for input/output format, sample input/output, and an evaluation script. Note that the output of the mapping is a
+ JSONL file with each line containing a mapping of a column ID to an array of glossary items and scores,
+ which will be sorted in descending order by score for evaluation. Similar to Round 1 data, Round 2 data has one
+ mapping for each column,
+ which is the most relevant glossary item it maps to. We acknowledge that there may be more than one relevant
+ glossary item suitable for each column, which is why we use Hit@k scores for evaluation. We may use additional
+ scores that are not included in the evaluate.py script for the final ranking of the submissions.
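
The Hit@k scoring mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the track's evaluate.py: the function name `hit_at_k`, the in-memory dictionary shapes, and the sample IDs are all hypothetical, and the actual JSONL field names should be taken from the README. The sketch assumes one correct glossary item per column and counts a hit when that item appears among the k highest-scored predictions.

```python
from typing import Dict, List, Tuple

# Hypothetical shapes (not taken from the track's evaluate.py):
#   predictions: column ID -> list of (glossary item ID, score) pairs
#   gold:        column ID -> the single correct glossary item ID
Predictions = Dict[str, List[Tuple[str, float]]]


def hit_at_k(predictions: Predictions, gold: Dict[str, str], k: int) -> float:
    """Fraction of gold columns whose correct item is among the top-k predictions."""
    if not gold:
        return 0.0
    hits = 0
    for column_id, correct_item in gold.items():
        candidates = predictions.get(column_id, [])
        # Sort by score, highest first, then keep the k best candidates.
        ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
        top_k = [item for item, _score in ranked[:k]]
        if correct_item in top_k:
            hits += 1
    return hits / len(gold)


if __name__ == "__main__":
    # Made-up example data for illustration only.
    example_predictions = {
        "col_1": [("item_b", 0.4), ("item_a", 0.9)],
        "col_2": [("item_c", 0.7), ("item_d", 0.2)],
    }
    example_gold = {"col_1": "item_a", "col_2": "item_d"}
    print(hit_at_k(example_predictions, example_gold, k=1))  # 0.5
    print(hit_at_k(example_predictions, example_gold, k=5))  # 1.0
```

In the made-up data, `col_2` ranks its correct item second, so it misses at k=1 but counts at k=5. This is how a Hit@k score can credit a submission when more than one glossary item is plausible for a column.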

+ - - - + \ No newline at end of file