{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Z8l3phE7O_fi" }, "source": [ "# Sample Inspector (Part I)\n", "## Looking at keywords over time\n", "\n", "In this notebook, we investigate how the digital sample compares to the 'universe' of book publications in the 19th century. Even though the Microsoft Digitised Books corpus is a rich collection, it remains unclear what it contains and what it leaves out. Especially when one is interested in a specific topic, like 'machines', knowing what content we don't have digital access to is critical if we want to make sense of findings built on such digital resources.\n", "\n", "To understand how the digital sample relates to the population of printed works, we compare the keywords in titles at these two levels. We show how to load and process data in Pandas, and we build a quick and efficient method to gauge and visualize the presence of a set of selected keywords in book titles over time." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zci-0phcO_fl" }, "outputs": [], "source": [ "# first we import all the libraries and tools required in the rest of this notebook\n", "%matplotlib inline\n", "import json\n", "from tqdm.notebook import tqdm\n", "import pandas as pd\n", "import numpy as np\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "from collections import Counter, defaultdict" ] }, { "cell_type": "markdown", "id": "9e850520", "metadata": {}, "source": [ "## Loading the data" ] }, { "cell_type": "markdown", "metadata": { "id": "znApsC3CO_fm" }, "source": [ "Using Pandas, we load the metadata on the BL Books collection and print the first three rows for inspection." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "vQEZ2G0fO_fn" }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>BL record ID</th><th>Type of resource</th><th>Name</th><th>Dates associated with name</th><th>Type of name</th><th>Role</th><th>All names</th><th>Title</th><th>Variant titles</th><th>Series title</th><th>...</th><th>Date of publication</th><th>Edition</th><th>Physical description</th><th>Dewey classification</th><th>BL shelfmark</th><th>Topics</th><th>Genre</th><th>Languages</th><th>Notes</th><th>BL record ID for physical resource</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>014602826</td><td>Monograph</td><td>Yearsley, Ann</td><td>1753-1806</td><td>person</td><td>NaN</td><td>More, Hannah, 1745-1833 [person] ; Yearsley, A...</td><td>Poems on several occasions [With a prefatory l...</td><td>NaN</td><td>NaN</td><td>...</td><td>1786</td><td>Fourth edition MANUSCRIPT note</td><td>NaN</td><td>NaN</td><td>Digital Store 11644.d.32</td><td>NaN</td><td>NaN</td><td>English</td><td>NaN</td><td>3996603</td></tr>\n",
"    <tr><th>1</th><td>014602830</td><td>Monograph</td><td>A, T.</td><td>NaN</td><td>person</td><td>NaN</td><td>Oldham, John, 1653-1683 [person] ; A, T. [person]</td><td>A Satyr against Vertue. (A poem: supposed to b...</td><td>NaN</td><td>NaN</td><td>...</td><td>1679</td><td>NaN</td><td>15 pages (4°)</td><td>NaN</td><td>Digital Store 11602.ee.10. (2.)</td><td>NaN</td><td>NaN</td><td>English</td><td>NaN</td><td>1143</td></tr>\n",
"    <tr><th>2</th><td>014602831</td><td>Monograph</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>The Aeronaut, a poem; founded almost entirely,...</td><td>NaN</td><td>NaN</td><td>...</td><td>1816</td><td>NaN</td><td>17 pages (8°)</td><td>NaN</td><td>Digital Store 992.i.12. (3.)</td><td>Dublin (Ireland)</td><td>NaN</td><td>English</td><td>NaN</td><td>22782</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>3 rows × 24 columns</p>\n",
"</div>
" ], "text/plain": [ " BL record ID Type of resource Name Dates associated with name \\\n", "0 014602826 Monograph Yearsley, Ann 1753-1806 \n", "1 014602830 Monograph A, T. NaN \n", "2 014602831 Monograph NaN NaN \n", "\n", " Type of name Role All names \\\n", "0 person NaN More, Hannah, 1745-1833 [person] ; Yearsley, A... \n", "1 person NaN Oldham, John, 1653-1683 [person] ; A, T. [person] \n", "2 NaN NaN NaN \n", "\n", " Title Variant titles \\\n", "0 Poems on several occasions [With a prefatory l... NaN \n", "1 A Satyr against Vertue. (A poem: supposed to b... NaN \n", "2 The Aeronaut, a poem; founded almost entirely,... NaN \n", "\n", " Series title ... Date of publication Edition \\\n", "0 NaN ... 1786 Fourth edition MANUSCRIPT note \n", "1 NaN ... 1679 NaN \n", "2 NaN ... 1816 NaN \n", "\n", " Physical description Dewey classification BL shelfmark \\\n", "0 NaN NaN Digital Store 11644.d.32 \n", "1 15 pages (4°) NaN Digital Store 11602.ee.10. (2.) \n", "2 17 pages (8°) NaN Digital Store 992.i.12. (3.) \n", "\n", " Topics Genre Languages Notes BL record ID for physical resource \n", "0 NaN NaN English NaN 3996603 \n", "1 NaN NaN English NaN 1143 \n", "2 Dublin (Ireland) NaN English NaN 22782 \n", "\n", "[3 rows x 24 columns]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "metadata_blb = pd.read_csv(\n", " \"https://bl.iro.bl.uk/downloads/e1be1324-8b1a-4712-96a7-783ac209ddef?locale=en\",\n", " dtype={\"BL record ID\": \"string\"},\n", " parse_dates=False,\n", ")\n", "metadata_blb.head(3)" ] }, { "cell_type": "markdown", "id": "57ceeaaf", "metadata": {}, "source": [ "## Parsing all of the titles for a year" ] }, { "cell_type": "markdown", "metadata": { "id": "5yVABbGdO_fn" }, "source": [ "Next, we process the titles in this dataframe: we create a `pd.Series` object in which we map a year to a long string that contains all the titles of books published in that year." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "xPmvT38pO_fo" }, "outputs": [], "source": [ "# make sure every title is a string (missing titles become empty strings)\n", "metadata_blb[\"title_str\"] = metadata_blb[\"Title\"].fillna(\"\").astype(str)\n", "# in case of a date range select the first year\n", "metadata_blb[\"Date of publication\"] = metadata_blb[\"Date of publication\"].apply(\n", "    lambda x: str(x).split(\"-\")[0]\n", ")\n", "# group all titles by year, i.e. for each year we concatenate all titles as one long string\n", "titles_by_year_bl = metadata_blb.groupby(\"Date of publication\")[\"title_str\"].apply(\n", "    \" \".join\n", ")\n", "# slice the Series to retain only titles that fall within the target period\n", "titles_by_year_bl = titles_by_year_bl[\"1800\":\"1899\"]" ] }, { "cell_type": "markdown", "metadata": { "id": "02B7JX07O_fo" }, "source": [ "These operations return a variable `titles_by_year_bl` of type `pd.Series`, in which the index refers to the year of publication and each value is the concatenation of all titles published in that year." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "bEbhDI4tO_fp", "outputId": "fe47eb43-7b83-4733-c2c8-4324d6028796" }, "outputs": [ { "data": { "text/plain": [ "Date of publication\n", "1800 Egbert; or, The Suicide. A tale [In verse.] So...\n", "1801 The Old Hag in a Red Cloak. A romance [In vers...\n", "1802 Elegy to the memory of the late Duke of Bedfor...\n", "1803 [Britons strike home.] Songs &c. in Britons st...\n", "1804 The Sports of the Genii [Etchings from drawing...\n", " ... \n", "1895 A Few Verses The Ghais o' Dennilair: a legend ...\n", "1896 Book of Word & Music of Humorous Songs, Burles...\n", "1897 Altenglische Spruchweisheit, alt- und mittelen...\n", "1898 A Vision. (Penny edition.) The Ballad of the W...\n", "1899 Hawthorn and Lavender: songs and madrigals We ...\n", "Name: title_str, Length: 100, dtype: object" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "titles_by_year_bl" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 54 }, "id": "a1CqyCB_O_fq", "outputId": "67311896-3585-41f3-e976-93a74518ac96" }, "outputs": [ { "data": { "text/plain": [ "'Egbert; or, The Suicide. A tale [In verse.] Songs, Chorusses, etc, in the new pantomime of Harlequin Tour; or, the Dominion of Fancy, as performed at the Theatre Royal, Covent-Garden, etc A Journey to'" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "titles_by_year_bl[\"1800\"][:200]" ] }, { "cell_type": "markdown", "id": "6b86f73f", "metadata": {}, "source": [ "## Loading catalogue metadata" ] }, { "cell_type": "markdown", "metadata": { "id": "PbesLukZO_fr" }, "source": [ "Then we load a .csv file which contains an export of the British Library Catalogue. The BL catalogue certainly won't give us a complete list of all books published in the 19th century, but it comes close to the universe of known printed works in this period. As a comparison and contextualisation tool, it helps us understand the contours and composition of the digital collection." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 426 }, "id": "wBtf_8q9O_fr", "outputId": "066716e5-3149-423b-bd04-5408fd11d90e" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3296: DtypeWarning: Columns (2,3,19) have mixed types.Specify dtype option on import or set low_memory=False.\n", " exec(code_obj, self.user_global_ns, self.user_ns)\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>BL record ID</th><th>Type of resource</th><th>BNB number</th><th>ISBN</th><th>Name</th><th>Dates associated with name</th><th>Type of name</th><th>Role</th><th>All names</th><th>Title</th><th>...</th><th>Publisher</th><th>Date of publication</th><th>Edition</th><th>Physical description</th><th>Dewey classification</th><th>BL shelfmark</th><th>Topics</th><th>Genre</th><th>Languages</th><th>Notes</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>000000004</td><td>Monograph</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>Carlbohm, Johan Arvid, printer [person]</td><td>Aabc [etc.] Jesus Vocales, eli äänelliset boks...</td><td>...</td><td>präntätty directörin J.A. Carlbohmin tykönä</td><td>1800</td><td>NaN</td><td>16 unnumbered pages, 17 cm (8°)</td><td>NaN</td><td>12976.aa.3</td><td>Writing ; Reading ; Writing--Alphabets--Primer...</td><td>NaN</td><td>Finnish</td><td>Finnish primer, beginning with the Lord's pray...</td></tr>\n",
"    <tr><th>1</th><td>000000006</td><td>Monograph</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>A che serve il Papa?</td><td>...</td><td>Tiberina</td><td>1889</td><td>NaN</td><td>32 pages, 14 cm</td><td>NaN</td><td>3900.aaa.20. (4.)</td><td>NaN</td><td>NaN</td><td>Italian</td><td>NaN</td></tr>\n",
"    <tr><th>2</th><td>000000007</td><td>Monograph</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>A. for Apple [An illustrated alphabet.]</td><td>...</td><td>Ward &amp; Lock</td><td>1894</td><td>NaN</td><td>NaN</td><td>NaN</td><td>12811.h.70</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>3 rows × 25 columns</p>\n",
"</div>
" ], "text/plain": [ " BL record ID Type of resource BNB number ISBN Name \\\n", "0 000000004 Monograph NaN NaN NaN \n", "1 000000006 Monograph NaN NaN NaN \n", "2 000000007 Monograph NaN NaN NaN \n", "\n", " Dates associated with name Type of name Role \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "\n", " All names \\\n", "0 Carlbohm, Johan Arvid, printer [person] \n", "1 NaN \n", "2 NaN \n", "\n", " Title ... \\\n", "0 Aabc [etc.] Jesus Vocales, eli äänelliset boks... ... \n", "1 A che serve il Papa? ... \n", "2 A. for Apple [An illustrated alphabet.] ... \n", "\n", " Publisher Date of publication Edition \\\n", "0 präntätty directörin J.A. Carlbohmin tykönä 1800 NaN \n", "1 Tiberina 1889 NaN \n", "2 Ward & Lock 1894 NaN \n", "\n", " Physical description Dewey classification BL shelfmark \\\n", "0 16 unnumbered pages, 17 cm (8°) NaN 12976.aa.3 \n", "1 32 pages, 14 cm NaN 3900.aaa.20. (4.) \n", "2 NaN NaN 12811.h.70 \n", "\n", " Topics Genre Languages \\\n", "0 Writing ; Reading ; Writing--Alphabets--Primer... NaN Finnish \n", "1 NaN NaN Italian \n", "2 NaN NaN NaN \n", "\n", " Notes \n", "0 Finnish primer, beginning with the Lord's pray... \n", "1 NaN \n", "2 NaN \n", "\n", "[3 rows x 25 columns]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "metadata_cat = pd.read_csv(\n", " \"https://bl.iro.bl.uk/downloads/e4bf0f74-2c64-4322-93c7-0dcc5e5246da?locale=en\",\n", " dtype={\"BL record ID\": \"string\"},\n", ")\n", "metadata_cat.head(3)" ] }, { "cell_type": "markdown", "id": "21e90580", "metadata": {}, "source": [ "### Parsing date ranges" ] }, { "cell_type": "markdown", "metadata": { "id": "8wBEOOCQO_ft" }, "source": [ "Values in the `'Date of publication'` column sometimes refer to a date range instead of a specific year. To simplify things, we take the first year as the date of publication. 
Admittedly, this choice is questionable, and we encourage you to craft other solutions depending on your research interests." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "zjX56boSO_ft", "outputId": "8efe5501-a2d5-4287-db05-b7311e9f1658" }, "outputs": [ { "data": { "text/plain": [ "array(['1800', '1889', '1894', ..., '1855-1875', '1887-1937', '1893-1963'],\n", " dtype=object)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "metadata_cat[\"Date of publication\"].unique()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ngRreO3KO_fv" }, "outputs": [], "source": [ "metadata_cat[\"date\"] = metadata_cat[\"Date of publication\"].apply(\n", "    lambda x: str(x).split(\"-\")[0]\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "DTi0a3ktRP8q" }, "source": [ "We do the same for the BL Books collection." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8T8z-e1SQlxx", "lines_to_next_cell": 2 }, "outputs": [], "source": [ "metadata_blb[\"date\"] = metadata_blb[\"Date of publication\"].apply(\n", "    lambda x: str(x).split(\"-\")[0]\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "XV5nDjJ1O_fv" }, "source": [ "With its 5.1 billion tokens and close to 50k titles, the Microsoft Digitised Books corpus is an impressive collection. However, it constitutes just 2.76% of the 19th-century works listed in the BL catalogue, as the cell below shows. To a historian, this amount of information may seem overwhelming, but one should realise that it is still a small empirical basis for making claims about cultural evolution in the 19th century." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "4mVzTHzZO_fv", "outputId": "c161592e-d1dc-41b5-822b-84ca931feb5a" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of 19thC works in the BL Books collection: 46518\n", "Number of 19thC works in the Catalogue: 1686158\n", "The sample size is 2.76 %\n" ] } ], "source": [ "# dates are stored as strings, so four-digit years compare correctly as text\n", "num_19thc_books_bl = len(\n", "    metadata_blb[(metadata_blb.date >= \"1800\") & (metadata_blb.date < \"1900\")]\n", ")\n", "num_19thc_books_cat = len(\n", "    metadata_cat[(metadata_cat.date >= \"1800\") & (metadata_cat.date < \"1900\")]\n", ")\n", "print(\"Number of 19thC works in the BL Books collection: \", num_19thc_books_bl)\n", "print(\"Number of 19thC works in the Catalogue: \", num_19thc_books_cat)\n", "print(\n", "    \"The sample size is\", round(num_19thc_books_bl / num_19thc_books_cat * 100, 2), \"%\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "m4wf7X4-O_fw" }, "source": [ "\n", "### Comparing our corpus to a wider collection\n", "\n", "As we did with the BL Books metadata, we concatenate the titles in the catalogue into one long string per year and slice the resulting `pd.Series` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "eJTMxlRHO_fw" }, "outputs": [], "source": [ "metadata_cat[\"Title\"] = metadata_cat[\"Title\"].astype(\n", "    str\n", ") # convert all titles to strings\n", "titles_by_year_cat = metadata_cat.groupby(\"date\")[\"Title\"].apply(\n", "    \" \".join\n", ") # group titles by year and join as one long string\n", "titles_by_year_cat = titles_by_year_cat[\"1800\":\"1899\"] # slice the series" ] }, { "cell_type": "markdown", "metadata": { "id": "bKlAbfYnO_fx" }, "source": [ "We established that the digital corpus constitutes only 2.76% of the population. To assess changes over time, we can visualize the sample size (as a percentage of the population) for each year." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 300 }, "id": "YdExY9hIO_fx", "outputId": "b7ef83e3-ed29-4294-e360-5bc00cad6d5e" }, "outputs": [], "source": [ "# number of titles per year in the catalogue (population) and in BL Books (sample)\n", "n_titles_cat = metadata_cat.groupby(\"date\")[\"Title\"].count()[\"1800\":\"1899\"]\n", "n_titles_bl = metadata_blb.groupby(\"date\")[\"title_str\"].count()[\"1800\":\"1899\"]\n", "# sample size per year, as a percentage of the population\n", "(n_titles_bl / n_titles_cat * 100).plot()" ] }, { "cell_type": "markdown", "metadata": { "id": "rgG7XOIjO_fx" }, "source": [ "In the cells below, we turn to analysing and comparing the content of the titles. We repeat the following steps for both the BL Books metadata and catalogue information.\n", "- We create a document-term matrix, where each row comprises the word counts for all titles in a specific year\n", "- We compute the total word counts per year\n", "\n", "We use the `CountVectorizer` object provided by `sklearn`, which takes a list of texts as input and converts it to a document-term matrix. The `min_df` argument discards words that appear in fewer than the given number of documents (here, years): we use `min_df=2` for BL Books and `min_df=10` for the catalogue, which keeps the matrix from becoming too large." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "t8OjeRALO_fy" }, "outputs": [], "source": [ "bl_counts = CountVectorizer(min_df=2) # keep words that occur in at least two years\n", "bl_dtm = bl_counts.fit_transform(\n", "    titles_by_year_bl\n", ") # document-term matrix with one row per year\n", "# sum row-wise, i.e. the total word count for each year\n", "totals_bl = bl_dtm.sum(axis=1)\n", "totals_bl = np.squeeze(np.array(totals_bl))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NdTqcmI6O_fy" }, "outputs": [], "source": [ "# same steps for the catalogue, with a higher threshold because it is much larger\n", "cat_counts = CountVectorizer(min_df=10)\n", "cat_dtm = cat_counts.fit_transform(titles_by_year_cat)\n", "totals_cat = cat_dtm.sum(axis=1)\n", "totals_cat = np.squeeze(np.array(totals_cat))" ] }, { "cell_type": "markdown", "metadata": { "id": "XeCkERUtO_fy" }, "source": [ "By amending the `terms` variable, you can select the keywords you want to investigate and compare between sample and population." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pzojViY3O_fz" }, "outputs": [], "source": [ "terms = {\"machinery\", \"machines\", \"machine\", \"engine\", \"engines\"}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "I-7VdpDqO_fz", "outputId": "f1a81614-2c7b-4283-e4e9-3a9f1ac10dca" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Plotting relative frequency for the following words in BL Books titles:\n", "\t ['machinery', 'machines', 'engines']\n", "\n", "Plotting relative frequency for the following words in BL Catalogue titles:\n", "\t ['machine', 'machines', 'engines', 'machinery', 'engine']\n" ] } ], "source": [ "# keep only the query terms that occur in each fitted vocabulary\n", "terms_bl = list(terms.intersection(bl_counts.vocabulary_))\n", "print(\n", "    \"Plotting relative frequency for the following words in BL Books titles:\\n\\t\", terms_bl\n", ")\n", "terms_cat = list(terms.intersection(cat_counts.vocabulary_))\n", "print()\n", "print(\n", "    \"Plotting relative frequency for the following words in BL Catalogue titles:\\n\\t\",\n", "    terms_cat,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "nsVIdxMmO_fz" }, "source": [ "In the code cell below, we first collect frequencies of the given query terms in the BL Books corpus (and sum all counts if more than one keyword is given). We divide the yearly absolute counts by the total number of words in titles to obtain yearly relative frequencies. We repeat the same procedure for the catalogue data.\n", "\n", "Lastly, we plot the relative frequencies for both the BL Books (blue) and catalogue (orange), slightly smoothing the timeline by showing a two-year rolling mean." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 334 }, "id": "rESidLpSO_fz", "outputId": "5712d9d4-5870-4f60-80ea-006d0d505307" }, "outputs": [], "source": [ "# column indices of the query terms in the BL Books document-term matrix\n", "idx_bl = [bl_counts.vocabulary_[term] for term in terms_bl]\n", "# sum the counts of all query terms per year (also works for a single term)\n", "counts_bl = np.asarray(bl_dtm[:, idx_bl].sum(axis=1)).ravel()\n", "# relative frequency per year, smoothed with a two-year rolling mean\n", "freq_bl = pd.Series(counts_bl / totals_bl, index=titles_by_year_bl.index)\n", "ax = freq_bl.rolling(2).mean().plot(label=\"BL Books\", legend=True)\n", "\n", "# repeat the same steps for the catalogue\n", "idx_cat = [cat_counts.vocabulary_[term] for term in terms_cat]\n", "counts_cat = np.asarray(cat_dtm[:, idx_cat].sum(axis=1)).ravel()\n", "# index by year so that both lines share the same x-axis\n", "freq_cat = pd.Series(counts_cat / totals_cat, index=titles_by_year_cat.index)\n", "freq_cat.rolling(2).mean().plot(ax=ax, label=\"BL Catalogue\", legend=True)" ] }, { "cell_type": "markdown", "metadata": { "id": "0hdpuD8pO_fz" }, "source": [ "## Fin." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JAYPUdi-O_f0" }, "outputs": [], "source": [] } ], "metadata": { "colab": { "name": "sample_inspector_i.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }