What scores does Table 1 in the paper use? #26
Comments
If I understand correctly, the data you are referring to also provides a "proposal set" of entities for each mention, marking one of the proposed entities as correct while the others are incorrect?
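For concreteness, evaluation against a proposal set of that kind can be sketched as below. The data layout and the `score` function are hypothetical and only illustrate the idea of picking the highest-scoring proposed candidate and checking it against the one marked correct; this is not the repository's evaluation code.

```python
from typing import Callable, List, Tuple

# Hypothetical layout of one evaluation example:
# (mention text, proposed candidate entities, the candidate marked as correct)
Example = Tuple[str, List[str], str]

def candidate_accuracy(examples: List[Example],
                       score: Callable[[str, str], float]) -> float:
    """Fraction of mentions whose highest-scoring proposed candidate
    is the one marked as correct."""
    correct = 0
    for mention, candidates, gold in examples:
        # pick the candidate the model scores highest for this mention
        predicted = max(candidates, key=lambda entity: score(mention, entity))
        correct += int(predicted == gold)
    return correct / len(examples) if examples else 0.0
```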
Well, I just used the AIDA CoNLL-YAGO dataset and prepared it for computing accuracy like this:

```python
out = []
with open('../conll_dataset/aida-yago2-dataset/AIDA-YAGO2-dataset.tsv') as f:
    index = 1
    me = []      # (mention, true entity) pairs of the current document
    ss = []      # tokens of the current document
    first = True
    for line in f:
        if line.startswith('-DOCSTART-'):
            if first:
                first = False
                continue
            # a new document starts: flush the previous one
            out.append([index, ' '.join(ss), list(set(me))])
            index += 1
            me = []
            ss = []
        else:
            line_spl = line.replace('\n', '').split('\t')
            ss.append(line_spl[0])
            if len(line_spl) > 4:
                if line_spl[1] == 'B':
                    # (mention text, Wikipedia title of the true entity)
                    me.append((line_spl[2], line_spl[4].replace('http://en.wikipedia.org/wiki/', '')))

# note: the last document in the file is never flushed,
# because a flush only happens on the next -DOCSTART- line
data = out
```

`data[0]` is like this:

```python
# [doc_id, doc_text, [(mention, true_entity) pairs]]
[1,
'EU rejects German call to boycott British lamb . Peter Blackburn BRUSSELS 1996-08-22 The European Commission said on Thursday it disagreed with German advice to consumers to shun British lamb until scientists determine whether mad cow disease can be transmitted to ...... ',
[('Loyola de Palacio', 'Loyola_de_Palacio'),
('Britain', 'United_Kingdom'),
('Germany', 'Germany'),
('European Commission', 'European_Commission'),
('France', 'France'),
('Europe', 'Europe'),
('BRUSSELS', 'Brussels'),
...
]]
```

and calculated accuracy like this:

```python
from functools import partial

import pandas as pd
from tqdm import tqdm_notebook

# tagger, en_tokenize, solve_model_probs, run, indices2title, type_oracle,
# trie, trie_index2indices_values and trie_index2indices_counts are assumed
# to be set up beforehand (see the notebook mentioned in the comments below).
results = []
for d in tqdm_notebook(data):
    # sentence (full text) of the target document
    sentence = d[1]
    # ts are the target mentions in the document
    ts = [str(t[0]) for t in d[2]]
    true_entities = [str(t[1]).replace('_', ' ') for t in d[2]]
    # tokenize the sentence using the target mentions;
    # model_probs is the output of the get_probs function from the notebook you added
    tokenize = partial(en_tokenize, ts=ts)
    sent_splits, model_probs = solve_model_probs(sentence, tagger, tokenize=tokenize)
    # predicted entities: the entity with the highest score for each mention
    pred_entities = run(ts, sent_splits, model_probs, indices2title, type_oracle,
                        trie, trie_index2indices_values, trie_index2indices_counts)
    # append results: true -> true entity, pred -> predicted entity
    results += [{'doc_id': d[0], 'mention': x, 'true': y, 'pred': z}
                for x, y, z in zip(ts, true_entities, pred_entities)]

df = pd.DataFrame(results)
matched = df['pred'] == df['true']
length = df['pred'].shape[0]
assert len(df['pred']) == len(df['true'])
accuracy = float(sum(matched)) / float(length)
```

Is this the correct way to calculate accuracy?
I'm stuck on this same part; the accuracy calculated this way is 0.7, though.
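In case it helps to see where the missing accuracy goes, here is a small debugging sketch over the `df` built in the snippet above (same column names; nothing here comes from the paper): per-document accuracy plus a look at the mismatched rows, where title-normalisation issues such as underscores vs. spaces or redirected titles tend to show up.

```python
# Accuracy per document: documents that drag the average down stand out quickly.
df['correct'] = df['pred'] == df['true']
per_doc = df.groupby('doc_id')['correct'].mean().sort_values()
print(per_doc.head(10))

# A sample of the mismatches; normalisation problems (underscores vs. spaces,
# redirects, capitalisation) are usually easy to spot here.
print(df.loc[~df['correct'], ['mention', 'true', 'pred']].head(20))
```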
In the paper, Table 1 (c) shows the entity linking scores, but how were they computed, especially the CoNLL scores?
For example, each mention comes with several candidate entities.
If the model simply predicts the highest-scoring entity for each mention, the false candidates are not needed to compute accuracy, but I don't know whether Table 1 used the false candidates or not (the two ways of counting are sketched below).
How did you compute the Table 1 (c) scores?
Paper: https://arxiv.org/pdf/1802.01021.pdf
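To make the distinction concrete: accuracy can be counted over all gold mentions, or only over mentions whose gold entity actually appears in the candidate set (which removes candidate-generation misses from the denominator). The sketch below shows both conventions on hypothetical (gold, candidates, predicted) triples; it is not a claim about which convention Table 1 (c) uses.

```python
from typing import List, Tuple

# Hypothetical triples: (gold entity, candidates proposed for the mention, predicted entity)
Triple = Tuple[str, List[str], str]

def linking_accuracies(triples: List[Triple]) -> Tuple[float, float]:
    """Return (accuracy over all mentions,
               accuracy over mentions whose gold entity is among the candidates)."""
    correct = 0
    solvable = 0           # mentions whose gold entity appears in the candidate set
    correct_solvable = 0
    for gold, candidates, predicted in triples:
        hit = int(predicted == gold)
        correct += hit
        if gold in candidates:
            solvable += 1
            correct_solvable += hit
    overall = correct / len(triples) if triples else 0.0
    restricted = correct_solvable / solvable if solvable else 0.0
    return overall, restricted
```

The gap between the two numbers separates errors where the model ranks a false candidate above the gold entity from cases where the gold entity was never proposed at all.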