t_res.geoparser.pipeline.Pipeline

class t_res.geoparser.pipeline.Pipeline(recogniser: Optional[Recogniser] = None, ranker: Optional[Ranker] = None, linker: Optional[Linker] = None, resources_path: Optional[str] = None, experiments_path: Optional[str] = '../experiments')

Represents a pipeline that geoparses the entities in a text in three stages: Named Entity Recognition (NER), Ranking, and Linking.

Parameters:
  • recogniser (ner.Recogniser, optional) – The NER (Named Entity Recogniser) object to use in the pipeline. If None, a default Recogniser will be instantiated. For the default settings, see Notes below.

  • ranker (ranking.Ranker, optional) – The Ranker object to use in the pipeline. If None, the default Ranker will be instantiated. For the default settings, see Notes below.

  • linker (linking.Linker, optional) – The Linker object to use in the pipeline. If None, the default Linker will be instantiated. For the default settings, see Notes below.

  • resources_path (str, optional) – The path to your resources directory.

  • experiments_path (str, optional) – The path to the experiments directory. Default is '../experiments'.

Example

>>> # Instantiate the Pipeline object with a default setup
>>> pipeline = Pipeline()
>>> # Now you can use the pipeline for processing text or sentences
>>> text = "I visited Paris and New York City last summer."
>>> processed_data = pipeline.run(text)
>>> # Access the processed mentions in the document
>>> for mention in processed_data:
...     print(mention)

Note

  • The default settings for the Recogniser:

    ner.PretrainedRecogniser(
        model="Livingwithmachines/toponym-19thC-en",
    )
    
  • The default settings for the Ranker:

    ranking.Ranker(
        method="perfectmatch",
        resources_path=resources_path,
    )
    
  • The default settings for the Linker:

    linking.Linker(
        method="mostpopular",
        resources_path=resources_path,
    )
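
The fallback behaviour described in this note can be sketched as follows. This is a minimal illustration only: the `StandIn*` classes and `resolve_components` are hypothetical names made up for the sketch, not part of t_res, and the real constructor may resolve its defaults differently. It shows each component being replaced by its documented default only when None is passed, with `resources_path` forwarded to the default Ranker and Linker.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical stand-ins that only record the settings listed in the note above.
# They are NOT the t_res classes.
@dataclass
class StandInRecogniser:
    model: str


@dataclass
class StandInRanker:
    method: str
    resources_path: Optional[str]


@dataclass
class StandInLinker:
    method: str
    resources_path: Optional[str]


def resolve_components(
    recogniser: Optional[StandInRecogniser] = None,
    ranker: Optional[StandInRanker] = None,
    linker: Optional[StandInLinker] = None,
    resources_path: Optional[str] = None,
) -> Tuple[StandInRecogniser, StandInRanker, StandInLinker]:
    """Return the three components, building a default for each one left as None."""
    if recogniser is None:
        recogniser = StandInRecogniser(model="Livingwithmachines/toponym-19thC-en")
    if ranker is None:
        ranker = StandInRanker(method="perfectmatch", resources_path=resources_path)
    if linker is None:
        linker = StandInLinker(method="mostpopular", resources_path=resources_path)
    return recogniser, ranker, linker


# A caller-supplied component is kept as-is; the others fall back to defaults.
# "custom-method" and "path/to/resources" are placeholder values.
custom_ranker = StandInRanker(method="custom-method", resources_path=None)
components = resolve_components(ranker=custom_ranker, resources_path="path/to/resources")
print(components)
```

Under this reading, passing only `resources_path` gives the documented default setup, while any component passed explicitly is used unchanged.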
    
run(text: str, place_of_pub_wqid: Optional[str] = None, place_of_pub: Optional[str] = None) → Predictions

Runs the full pipeline on the given text: named entity recognition, candidate selection, and entity disambiguation.

run_candidate_selection(sentence_mentions: List[SentenceMentions], place_of_pub_wqid: Optional[str] = None, place_of_pub: Optional[str] = None) → Candidates

Runs the candidate selection step of the pipeline.

run_disambiguation(candidates: Candidates) → Predictions

Runs the entity disambiguation step of the pipeline.

run_text_recognition(text: str) → List[SentenceMentions]

Runs the named entity recognition step of the pipeline.
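
Judging by their signatures, the three step methods chain together: the output of recognition feeds candidate selection, whose output feeds disambiguation. The sketch below illustrates that data flow with toy stand-in functions. Everything in it is hypothetical (the `toy_*` functions, the gazetteer, the identifiers, and the scores), and it does not reproduce the behaviour of t_res or its `SentenceMentions`, `Candidates`, and `Predictions` types.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy gazetteer: surface form -> [(identifier, popularity score)].
# Identifiers and scores are invented for illustration only.
TOY_GAZETTEER: Dict[str, List[Tuple[str, int]]] = {
    "Paris": [("toy:paris-fr", 90), ("toy:paris-tx", 5)],
    "New York City": [("toy:nyc", 80)],
}

ToyCandidates = Dict[str, List[Tuple[str, int]]]


def toy_recognition(text: str) -> List[str]:
    """Stand-in for the recognition step: gazetteer names that occur in the text."""
    return [name for name in TOY_GAZETTEER if name in text]


def toy_candidate_selection(mentions: List[str]) -> ToyCandidates:
    """Stand-in for candidate selection: exact-match lookup of each mention."""
    return {mention: TOY_GAZETTEER.get(mention, []) for mention in mentions}


def toy_disambiguation(candidates: ToyCandidates) -> Dict[str, Optional[str]]:
    """Stand-in for disambiguation: keep the highest-scoring candidate, if any."""
    return {
        mention: max(options, key=lambda c: c[1])[0] if options else None
        for mention, options in candidates.items()
    }


def toy_run(text: str) -> Dict[str, Optional[str]]:
    """Chain the three toy steps in the order the method signatures suggest."""
    mentions = toy_recognition(text)
    candidates = toy_candidate_selection(mentions)
    return toy_disambiguation(candidates)


predictions = toy_run("I visited Paris and New York City last summer.")
print(predictions)
# {'Paris': 'toy:paris-fr', 'New York City': 'toy:nyc'}
```

Calling the step methods one at a time, rather than a single end-to-end call, is mainly useful when an intermediate result needs to be inspected or modified before the next step.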