t_res.utils.rel_e2e module
- t_res.utils.rel_e2e.rel_end_to_end(sent: str) → dict
Perform end-to-end entity linking on a single sentence using the REL API.
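A single request to REL might look like the minimal sketch below. It assumes the public REL endpoint at https://rel.cs.ru.nl/api and a JSON body made of the text plus an empty span list (REL's end-to-end mode). The names REL_API_URL, build_payload and query_rel, and the injectable opener argument, are illustrative and not part of t_res.

```python
import json
from urllib import request

# Assumption: the public REL endpoint; this page does not name the URL.
REL_API_URL = "https://rel.cs.ru.nl/api"


def build_payload(sent: str) -> dict:
    """Build the request body; an empty span list asks REL to detect mentions itself."""
    return {"text": sent, "spans": []}


def query_rel(sent: str, url: str = REL_API_URL, opener=request.urlopen):
    """POST one sentence to the API and return the decoded JSON response."""
    body = json.dumps(build_payload(sent)).encode("utf-8")
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # `opener` is injectable so the function can be exercised without network access.
    with opener(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Passing a stub as opener keeps the example testable offline; in normal use the default urllib opener sends the request.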
- t_res.utils.rel_e2e.get_rel_from_api(dSentences: dict, rel_end2end_path: str) → None
Use the REL API to perform end-to-end entity linking on every sentence in dSentences, storing the predictions at rel_end2end_path.
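The sketch below shows one way to link a batch of sentences while caching results on disk. The resume-from-file behaviour, the JSON storage format, and the names collect_rel_predictions and link_fn are assumptions made for illustration, not the confirmed behaviour of get_rel_from_api.

```python
import json
import os


def collect_rel_predictions(sentences: dict, out_path: str, link_fn) -> None:
    """Link each sentence with link_fn and cache the results as JSON at out_path."""
    results = {}
    if os.path.exists(out_path):
        # Assumption: resume from predictions stored by an earlier run.
        with open(out_path, encoding="utf-8") as f:
            results = json.load(f)
    for sent_id, text in sentences.items():
        key = str(sent_id)  # JSON object keys are always strings
        if key in results:
            continue  # already linked, skip the API call
        results[key] = link_fn(text)
        # Write after every sentence so partial progress survives an API failure.
        with open(out_path, "w", encoding="utf-8") as f:
            json.dump(results, f)
```

Here link_fn stands in for whatever function sends one sentence to the API; passing it as an argument keeps the caching logic independent of the network call.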
- t_res.utils.rel_e2e.match_wikipedia_to_wikidata(wiki_title: str, path_to_db: str) → str
Retrieve the Wikidata ID corresponding to a Wikipedia title.
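A title-to-ID lookup of this kind can be sketched with SQLite as below. The database schema (a table mapping with columns wikipedia_title and wikidata_id), the "NIL" fallback for unknown titles, and the function name title_to_wikidata are all hypothetical; the actual database layout is not described here.

```python
import sqlite3


def title_to_wikidata(wiki_title: str, path_to_db: str) -> str:
    """Look up the Wikidata ID for a Wikipedia title in a SQLite database."""
    conn = sqlite3.connect(path_to_db)
    try:
        # Hypothetical schema: mapping(wikipedia_title TEXT, wikidata_id TEXT).
        row = conn.execute(
            "SELECT wikidata_id FROM mapping WHERE wikipedia_title = ?",
            (wiki_title,),
        ).fetchone()
    finally:
        conn.close()
    # Assumption: unknown titles map to the placeholder "NIL".
    return row[0] if row else "NIL"
```

The parameterised query avoids quoting problems with titles that contain apostrophes or other special characters.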
- t_res.utils.rel_e2e.match_ent(pred_ents, start, end, prev_ann, gazetteer_ids)
Find the corresponding string and prediction information returned by REL for a specific gold standard token position in a sentence.
- Parameters:
pred_ents (list) – A list of lists, where each inner list corresponds to a token.
start (int) – The start character offset of the token in the gold standard.
end (int) – The end character offset of the token in the gold standard.
prev_ann (str) – The entity type of the previous token.
gazetteer_ids (set) – A set of entity IDs in the knowledge base.
- Returns:
- A tuple with three elements:
The entity type.
The entity link.
The entity type of the previous token.
- Return type:
tuple
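The offset-based matching described above can be sketched as follows. The layout of each inner list ([start, length, wikidata_id, entity_type]), the BIO-style "B-"/"I-" prefixes, the "O" label for unmatched tokens, and the name match_token are assumptions for illustration; the real structure of pred_ents may differ.

```python
def match_token(pred_ents, start, end, prev_ann, gazetteer_ids):
    """Return (type tag, link tag, updated previous type) for one gold token."""
    for ent_start, ent_len, wikidata_id, ent_type in pred_ents:
        # Keep only predictions whose link exists in the knowledge base.
        if wikidata_id not in gazetteer_ids:
            continue
        # The gold token must fall entirely inside the predicted mention.
        if start >= ent_start and end <= ent_start + ent_len:
            # Continue the entity if the previous token had the same type.
            prefix = "I-" if prev_ann == ent_type else "B-"
            return prefix + ent_type, prefix + wikidata_id, ent_type
    # No prediction covers this token.
    return "O", "O", ""


# Usage: "New York" predicted as one mention covering characters 0 to 8.
preds = [[0, 8, "Q60", "LOC"]]
first = match_token(preds, 0, 3, "", {"Q60"})       # token "New"
second = match_token(preds, 4, 8, first[2], {"Q60"})  # token "York"
```

Threading the third element of the result back in as prev_ann is what lets consecutive tokens of one mention receive "B-" and then "I-" tags.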
- t_res.utils.rel_e2e.postprocess_rel(rel_preds, dSentences, gold_tokenization, wikigaz_ids)
Retokenize the REL output for each sentence to match the gold standard tokenization.
- Parameters:
rel_preds (dict) – A dictionary containing the predictions using REL.
dSentences (dict) – A dictionary that maps a sentence ID to the text.
gold_tokenization (dict) – A dictionary that contains the tokenized sentence with gold standard annotations of entity type and link per sentence.
wikigaz_ids (set) – A set of Wikidata IDs of entities in the gazetteer.
- Returns:
A dictionary that maps a sentence ID to the REL predictions, retokenized as in the gold standard.
- Return type:
dict
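Retokenization of span-level predictions onto gold tokens might be organised as in the sketch below. The shape of the gold tokenization (a list of dicts with token, start and end keys per sentence), the output keys pred_type and pred_link, and the names retokenize and match_fn are hypothetical; match_fn stands in for a token matcher with the gazetteer already bound.

```python
def retokenize(rel_preds, gold_tokenization, match_fn):
    """Project span-level predictions onto gold tokens, sentence by sentence."""
    out = {}
    for sent_id, tokens in gold_tokenization.items():
        prev_ann = ""  # reset at every sentence boundary
        rows = []
        for tok in tokens:
            # match_fn returns (type tag, link tag, updated previous type).
            ent_type, link, prev_ann = match_fn(
                rel_preds.get(sent_id, []), tok["start"], tok["end"], prev_ann
            )
            rows.append(
                {"token": tok["token"], "pred_type": ent_type, "pred_link": link}
            )
        out[sent_id] = rows
    return out
```

Sentences with no stored predictions fall back to an empty list, so every gold token still gets a row.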
- t_res.utils.rel_e2e.store_rel(experiment: Experiment, dREL: dict, approach: str, how_split: str) → None
Store the REL results for a specific experiment, approach, and split, in the format required by the HIPE scorer.
- Parameters:
experiment (Experiment) – The experiment object containing the results path and dataset.
dREL (dict) – A dictionary mapping sentence IDs to REL predictions.
approach (str) – The approach used for REL.
how_split (str) – The type of split for which to store the results (e.g., originalsplit, Ashton1860).
- Returns:
None.
Note
This function saves a TSV file with the results in the CoNLL format required by the scorer.
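A writer for this kind of output could look like the sketch below. The column set follows the HIPE-2020 shared task layout as an assumption. The "_" filler for unused columns, the blank line between sentences, the input keys (token, pred_type, pred_link), and the name write_hipe_tsv are illustrative choices, not the confirmed format produced by store_rel.

```python
# Assumption: column names as used in the HIPE-2020 shared task files.
HIPE_COLUMNS = [
    "TOKEN", "NE-COARSE-LIT", "NE-COARSE-METO", "NE-FINE-LIT", "NE-FINE-METO",
    "NE-FINE-COMP", "NE-NESTED", "NEL-LIT", "NEL-METO", "MISC",
]


def write_hipe_tsv(predictions: dict, handle) -> None:
    """Write one tab-separated row per token to an open text handle."""
    handle.write("\t".join(HIPE_COLUMNS) + "\n")
    for tokens in predictions.values():
        for tok in tokens:
            row = dict.fromkeys(HIPE_COLUMNS, "_")  # "_" marks an unused column
            row["TOKEN"] = tok["token"]
            row["NE-COARSE-LIT"] = tok["pred_type"]
            row["NEL-LIT"] = tok["pred_link"]
            handle.write("\t".join(row[c] for c in HIPE_COLUMNS) + "\n")
        handle.write("\n")  # blank line separates sentences
```

Writing to a handle rather than a path lets the caller decide the file name, for example one built from the approach and split names.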