# Model inference 

Inference is the process of making predictions on unseen data. There are different approaches to carrying out inference, and the right one depends on the purpose of the model and how it will be used. The two main approaches are:
- 'real-time' single-item predictions, i.e. calling an API to predict a single example
- 'batch inference', i.e. running inference against a larger volume of data
 
Since we have a set of data that we want to augment with additional machine-generated labels, we will use the second approach, batch inference. Because we are only likely to run this batch prediction process occasionally, for example when we create a better-performing model, we won't spend much time worrying about how quick the inference process is.
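
The difference between the two approaches can be sketched without any model at all. In this minimal illustration, `predict_one` and `predict_batch` are hypothetical helpers and the keyword rule is only a stand-in for a real classifier; it is not how the fastai model in this notebook works.

```python
from typing import List

# Hypothetical stand-in for a trained classifier: a crude keyword rule.
NON_FICTION_HINTS = {"history", "manual", "report"}


def predict_one(title: str) -> str:
    """'Real-time' style: classify a single title per call."""
    words = set(title.lower().split())
    return "Non-fiction" if words & NON_FICTION_HINTS else "Fiction"


def predict_batch(titles: List[str]) -> List[str]:
    """'Batch' style: classify a whole collection in one pass."""
    return [predict_one(title) for title in titles]


single = predict_one("A history of the French Navy")
batch = predict_batch(["A history of the French Navy", "The haunted castle"])
```

With a real model, the batch path usually also groups items so they can be processed together, which is what fastai's `test_dl` and `get_preds` handle later in this notebook.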

In [None]:
!pip install fastai==2.5.2

Collecting fastai==2.5.2
  Downloading fastai-2.5.2-py3-none-any.whl (186 kB)


In the previous notebook we saved our model. We can load it using the `load_learner` function.

In [None]:
from fastai.text.all import *

If you don't have a saved model, you can download one by running the following cell:

In [None]:
!wget -O 20210928-model.pkl https://zenodo.org/record/5245175/files/20210928-model.pkl?download=1

--2021-11-02 19:34:35--  https://zenodo.org/record/5245175/files/20210928-model.pkl?download=1
Resolving zenodo.org (zenodo.org)... 137.138.76.77
Connecting to zenodo.org (zenodo.org)|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 158529715 (151M) [application/octet-stream]
Saving to: ‘20210928-model.pkl’


2021-11-02 19:34:43 (23.3 MB/s) - ‘20210928-model.pkl’ saved [158529715/158529715]



In [None]:
learn_class = load_learner("20210928-model.pkl", cpu=False)

## Trying some examples of made up books

To start with, let's call the `predict` method on some made-up book titles to see if it gives sensible answers:

In [None]:
learn_class.predict("A history of the French Navy")

('Non-fiction', tensor(1), tensor([0.0081, 0.9919]))

In [None]:
learn_class.predict("Communist Manifesto")

('Non-fiction', tensor(1), tensor([0.4674, 0.5326]))

These seem like sensible enough predictions. We can also see what information we get back from the `predict` method: the predicted label, the index of that label, and a tensor containing the probability for each label. We will likely want to keep these probabilities alongside our predictions.
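
As an illustration of keeping the confidence alongside the label, the returned values can be unpacked as follows. This sketch uses plain Python values copied from the "Communist Manifesto" output above in place of the tensors that `predict` returns, so it runs without fastai; the variable names are our own.

```python
# Values copied from the output above, with plain Python numbers
# standing in for the tensors returned by `predict`.
label, label_index, probabilities = ("Non-fiction", 1, [0.4674, 0.5326])

# The probability at the predicted index is the confidence for that label.
confidence = probabilities[label_index]

print(f"{label}: {confidence:.2%}")
```

A confidence this close to 0.5 suggests the model is unsure about this title.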

## Predicting against the full BL Microsoft books metadata 

We are now ready to run predictions against the full collection of metadata which contains all of the titles we want to have genre labels for.

In [None]:
full_metadata_url = (
    "https://bl.iro.bl.uk/downloads/e4bf0f74-2c64-4322-93c7-0dcc5e5246da?locale=en"
)

In [None]:
dtypes = {
    "BL record ID": "string",
    "Type of resource": "category",
    "Name": "category",
    "Role": "category",
    "Title": "string",
    "Country of publication": "category",
    "Place of publication": "category",
    "Publisher": "category",
    "Genre": "category",
    "Languages": "category",
}

In [None]:
df_full = pd.read_csv(full_metadata_url, low_memory=False, dtype=dtypes)

As a reminder, we can check how big this dataset is:

In [None]:
len(df_full)

1752078

Rows without a title can't be given a prediction, so we drop them before going further.

In [None]:
df_full = df_full[df_full.Title.notna()]

### Creating our test data

We need to make sure that our data is processed in the same way at inference time as it was during training. For example, our text needs to be tokenized in the same way. fastai makes this easy because we can use the `test_dl` method, which knows how to process data for our model. We just need to pass in the column containing our text.
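
To see why this matters, here is a minimal sketch of the general idea, not of fastai's internals. The `tokenize`, `build_vocab` and `numericalize` helpers are hypothetical; the point is that the vocab built at training time is reused unchanged at inference time.

```python
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign an integer id to every token seen during training."""
    vocab = {"<unk>": 0}  # id 0 is reserved for unknown tokens
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def numericalize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Convert text to ids using the *training* vocab."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]


# The vocab is built once, from the training data.
train_vocab = build_vocab(["A history of the French Navy"])

# At inference time we reuse the same tokenizer and the same vocab.
ids = numericalize("A History of Spain", train_vocab)
```

If inference used a different tokenizer or vocab, the same ids would refer to different tokens and the model's predictions would no longer be meaningful.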

In [None]:
titles = df_full.loc[:, "Title"]

In [None]:
learn_class.dls.num_workers = 0

In [None]:
%%time
test_data = learn_class.dls.test_dl(titles)

CPU times: user 30min 24s, sys: 1min 13s, total: 31min 38s
Wall time: 30min 19s


Once we have done this we can use the `get_preds` method to run predictions against all of our data.

In [None]:
%%time
predictions = learn_class.get_preds(dl=test_data)

CPU times: user 3min 53s, sys: 8.72 s, total: 4min 1s
Wall time: 17min


You can see that this took about 17 minutes, which isn't too long considering the size of our data. We might want to double-check that our predictions match the length of our original data. Let's start by calling `len` on `predictions`:

In [None]:
len(predictions)

2

You can see we get something back which has `len` 2. Let's have a look at this. 

In [None]:
predictions

(tensor([[0.0759, 0.9241],
         [0.1282, 0.8718],
         [0.9074, 0.0926],
         ...,
         [0.0986, 0.9014],
         [0.0675, 0.9325],
         [0.0834, 0.9166]]), None)

We can see that this is a tuple, with the first element containing the tensor we're interested in. Let's get the length of this. 

In [None]:
len(predictions[0])

1752072

In [None]:
assert len(predictions[0]) == len(df_full)
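
Because the result is a two-element tuple, it can also be unpacked directly. The sketch below uses a small stand-in tuple, with nested lists in place of the tensor and values copied from the output above; the names `probs` and `targets` are our own.

```python
# Stand-in for the tuple shown above: nested lists replace the tensor.
predictions = (
    [[0.0759, 0.9241], [0.1282, 0.8718], [0.9074, 0.0926]],
    None,
)

# Unpack the probabilities and the (absent) second element.
probs, targets = predictions

# len() on the tuple counts its two elements, not the rows.
n_elements = len(predictions)
n_rows = len(probs)
```

The second element is `None` here, which is what we would expect for data that has no labels to compare against.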

Since we only want the first element of our predictions `tuple` let's store it in a new variable `preds_tensor`. 

In [None]:
preds_tensor = predictions[0]

In [None]:
preds_tensor[0]

tensor([0.0759, 0.9241])

At the moment we only have the probabilities for each label, not the label names. We can get the label names from the vocab stored on our `dls` attribute.

In [None]:
learn_class.dls.vocab[1]

['Fiction', 'Non-fiction']

To make this data easier to work with, let's map our probabilities to this vocab. We'll first store the `argmax` value for each prediction, i.e. the index of the largest probability.

In [None]:
df_full["predicted_label"] = preds_tensor.numpy().argmax(1)

We can then create a dictionary to map our `0` and `1` labels to their text versions:

In [None]:
decode = dict(enumerate(learn_class.dls.vocab[1]))

In [None]:
decode

{0: 'Fiction', 1: 'Non-fiction'}

In [None]:
df_full.predicted_label = df_full.predicted_label.replace(decode)
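
The two steps above, taking the `argmax` and decoding it with the vocab, can be sketched in plain Python on the first three rows of predictions. This is only an illustration of the logic; the notebook itself uses NumPy and pandas for this.

```python
# First three rows of probabilities from the output above.
rows = [[0.0759, 0.9241], [0.1282, 0.8718], [0.9074, 0.0926]]
vocab = ["Fiction", "Non-fiction"]

# argmax: index of the largest probability in each row.
indices = [max(range(len(row)), key=row.__getitem__) for row in rows]

# Map each index to its text label.
decode = dict(enumerate(vocab))
labels = [decode[i] for i in indices]
```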

We'll create two new variables to store the probabilities for each of our labels.

In [None]:
import numpy as np

In [None]:
fiction_probs, non_fiction_probs = np.hsplit(preds_tensor.numpy(), learn_class.dls.c)

In [None]:
df_full["fiction_probs"] = fiction_probs
df_full["non_fiction_probs"] = non_fiction_probs
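
One detail worth knowing is that `np.hsplit` keeps the column dimension, so each piece has shape `(n, 1)` rather than `(n,)`. The sketch below shows this on a small stand-in array. It also shows column slicing, an alternative we are suggesting rather than something this notebook uses, which gives one-dimensional arrays with the same values.

```python
import numpy as np

# Small stand-in for `preds_tensor.numpy()`.
probs = np.array([[0.0759, 0.9241], [0.1282, 0.8718], [0.9074, 0.0926]])

# Splitting into 2 equal parts along the columns keeps two dimensions.
fiction_2d, non_fiction_2d = np.hsplit(probs, 2)

# Column slicing returns one-dimensional arrays instead.
fiction_1d = probs[:, 0]
non_fiction_1d = probs[:, 1]
```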

Let's take a quick look at how our new columns look:

In [None]:
df_full[["Title", "predicted_label", "fiction_probs", "non_fiction_probs"]].head(5)

| | Title | predicted_label | fiction_probs | non_fiction_probs |
|---|---|---|---|---|
| 0 | Aabc [etc.] Jesus Vocales, eli äänelliset bokstawit Consonantes Luku-merkit | Non-fiction | 0.075868 | 0.924132 |
| 1 | A che serve il Papa? | Non-fiction | 0.128236 | 0.871764 |
| 2 | A. for Apple [An illustrated alphabet.] | Fiction | 0.907428 | 0.092572 |
| 3 | Á Grãa Bretanha | Non-fiction | 0.262661 | 0.737339 |
| 4 | A quien me entiende [On the factious spirit of the Mexican press. Signed: Uno de tantos.] | Non-fiction | 0.479002 | 0.520998 |


This looks like a fairly reasonable format for storing our predictions. Let's save it as both a `json` and a `csv` file.

In [None]:
df_full.to_json("bl_books_w_genre.json")

In [None]:
df_full.to_csv("bl_books_w_genre.csv", index=False)
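
After saving, it can be worth reading a file back to check that the new columns survived the round trip. The sketch below does this with the standard library on a one-row, in-memory CSV so that it runs anywhere. The row is copied from the table above; with the real file, `pd.read_csv` would be the natural choice.

```python
import csv
import io

# One row copied from the table above, as plain Python values.
rows = [
    {
        "Title": "A che serve il Papa?",
        "predicted_label": "Non-fiction",
        "fiction_probs": 0.128236,
        "non_fiction_probs": 0.871764,
    }
]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

# Read the data back; CSV stores every value as text.
buffer.seek(0)
restored = list(csv.DictReader(buffer))
```

Note that the probabilities come back as strings, so they need converting to `float` before use.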

## Conclusion 

We now have a full set of predictions to work with. We might want to dig further into the potential weaknesses of our model, though, and try to improve on this initial model. We'll do that in the next sections.