MapReader

A computer vision pipeline for exploring and analyzing images at scale

classification_one_inch_maps_001
Tutorial: train/fine-tune PyTorch CV classifiers on historical maps (Fig: rail infrastructure around London as predicted by a MapReader model).

classification_plant_phenotype
Tutorial: train/fine-tune PyTorch CV classifiers on plant patches in images (plant phenotyping example).

classification_mnist
Tutorial: train/fine-tune PyTorch CV classifiers on whole MNIST images (not on patches/slices of those images).

MapReader paper

What is MapReader?

MapReader is an end-to-end computer vision (CV) pipeline for exploring and analyzing images at scale.

MapReader was developed in the Living with Machines project to analyze large collections of historical maps but is a generalisable computer vision pipeline which can be applied to any images in a wide variety of domains. See Gallery for some examples.

Refer to each tutorial/example in the use cases section for more details on MapReader's relevant functionalities for non-geospatial and geospatial images.

Overview

MapReader is a groundbreaking interdisciplinary tool that emerged from a specific set of geospatial historical research questions. It was inspired by methods in biomedical imaging and geographic information science, which were adapted for annotation and use by historians, for example in JVC and MapReader papers. The success of the tool subsequently generated interest from plant phenotype researchers working with large image datasets, and so MapReader is an example of cross-pollination between the humanities and the sciences made possible by reproducible data science.

MapReader has two main components: preprocessing/annotation and training/inference as shown in this figure:

MapReader pipeline
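The tutorials listed above distinguish between classifying patches (slices) of an image and classifying whole images. As a rough illustration of what slicing involves, the sketch below computes the pixel bounding boxes that tile an image into square patches. It is a generic, standard-library-only example and does not use MapReader's own API; the function name `patch_boxes` and its parameters are hypothetical.

```python
from typing import Iterator, Tuple

# (left, upper, right, lower) in pixels
Box = Tuple[int, int, int, int]


def patch_boxes(width: int, height: int, patch_size: int) -> Iterator[Box]:
    """Yield bounding boxes that tile a width x height image.

    Hypothetical helper for illustration only (not part of MapReader).
    Patches on the right and bottom edges are clipped to the image bounds,
    so they may be smaller than patch_size.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be a positive integer")
    for upper in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            right = min(left + patch_size, width)
            lower = min(upper + patch_size, height)
            yield (left, upper, right, lower)


if __name__ == "__main__":
    # A hypothetical 250 x 120 pixel image cut into 100-pixel patches.
    for box in patch_boxes(250, 120, 100):
        print(box)
```

Each box could then be passed to an image library's crop function to extract the patch. Clipping the edge patches is one possible design choice; padding them to full size would be another.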

MapReader provides a set of tools to:

Installation

Set up a conda environment

We recommend installation via Anaconda (refer to the Anaconda website and follow the instructions there).

conda create -n mr_py38 python=3.8
conda activate mr_py38

Method 1

pip install mapreader 

To work with geospatial images (e.g., maps):

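pip install "mapreader[geo]"

To make the mr_py38 environment available as a kernel in Jupyter Notebook:

python -m ipykernel install --user --name mr_py38 --display-name "Python (mr_py38)"

Alternatively, rasterio and fiona can be installed manually from conda-forge, followed by MapReader from GitHub:
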
# activate the environment
conda activate mr_py38

# install rasterio and fiona manually
conda install -c conda-forge rasterio=1.2.10
conda install -c conda-forge fiona=1.8.20

# install git
conda install git

# install MapReader
pip install git+https://github.com/Living-with-machines/MapReader.git

# open Jupyter Notebook (if you want to test/work with the notebooks in "examples" directory)
cd /path/to/MapReader 
jupyter notebook

Method 2

git clone https://github.com/Living-with-machines/MapReader.git 
cd /path/to/MapReader
pip install -v -e .

To work with geospatial images (e.g., maps):

cd /path/to/MapReader
pip install -e ".[geo]"

To make the mr_py38 environment available as a kernel in Jupyter Notebook:

python -m ipykernel install --user --name mr_py38 --display-name "Python (mr_py38)"
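
After either installation method, it can be useful to confirm what actually ended up in the active environment. The following is a minimal sketch using only the Python standard library; it assumes the package is registered under the distribution name `mapreader` (as in the pip command above) and that the geospatial dependencies of interest are `rasterio` and `fiona`. The helper names are hypothetical.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumed distribution name, taken from "pip install mapreader".
    print("mapreader:", installed_version("mapreader") or "not installed")
    # Geospatial dependencies named in the installation instructions.
    for module in ("rasterio", "fiona"):
        print(f"{module}:", "available" if has_module(module) else "missing")
```

Run this inside the activated mr_py38 environment. If rasterio or fiona is reported as missing, the geospatial extras were probably not installed.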

Use cases

Tutorials are organized in Jupyter Notebooks. Follow the hyperlinks on input type names ("Non-Geospatial" or "Geospatial") to read guidance specific to those image types.

How to contribute

We welcome contributions related to new applications, both with geospatial images (other maps, remote sensing data, aerial photography) and non-geospatial images (for example, other scientific image datasets).

How to cite MapReader

Please consider acknowledging MapReader if it helps you to obtain results and figures for publications or presentations, by citing:

Link: https://dl.acm.org/doi/10.1145/3557919.3565812

Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, and Katherine McDonough. 2022. MapReader: a computer vision pipeline for the semantic exploration of maps at scale. In Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities '22). Association for Computing Machinery, New York, NY, USA, 8–19. https://doi.org/10.1145/3557919.3565812

and in BibTeX:

@inproceedings{10.1145/3557919.3565812,
author = {Hosseini, Kasra and Wilson, Daniel C. S. and Beelen, Kaspar and McDonough, Katherine},
title = {MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale},
year = {2022},
isbn = {9781450395335},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3557919.3565812},
doi = {10.1145/3557919.3565812},
booktitle = {Proceedings of the 6th ACM SIGSPATIAL International Workshop on Geospatial Humanities},
pages = {8--19},
numpages = {12},
keywords = {supervised learning, historical maps, deep learning, digital libraries and archives, computer vision, classification},
location = {Seattle, Washington},
series = {GeoHumanities '22}
}

Credits and re-use terms

Digitized maps

MapReader can retrieve maps from NLS (National Library of Scotland) via webservers. For all the digitized maps (retrieved or locally stored), please note the re-use terms:

Warning: Use of the digitised maps for commercial purposes is currently restricted by contract. Use of these digitised maps for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and the re-use licence.

Metadata

We have provided some metadata files in mapreader/persistent_data. For all these files, please note the re-use terms:

Warning: Use of the metadata for commercial purposes is currently restricted by contract. Use of this metadata for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and the re-use licence.

Acknowledgements

This work was supported by Living with Machines (AHRC grant AH/S01179X/1) and The Alan Turing Institute (EPSRC grant EP/N510129/1). Living with Machines, funded by the UK Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration delivered by the Arts and Humanities Research Council (AHRC), with The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia, Exeter, and Queen Mary University of London.