Sharing our work
We have now gone through the process of creating a model that performs fairly well for our task. Although we originally created this model for our own needs, it is possible that others might benefit from our work. It therefore makes sense to share it so others can adapt or build on the model. Another reason to share the materials we used to build our model is so that others can interrogate our approach.
Sharing our model?
We’ll start by discussing how we can share our model. What does it mean to share our model? When we train our model we essentially have a few different components that fit together:
the model architecture (the number of layers, etc.)
the model weights (trainable and non-trainable parameters)
the training process (hyperparameters, learning rate, etc.)
When we share the artefacts of our work we mainly care about the first two components: for someone else to be able to use our model, we’ll usually need to think about how to share the architecture and the weights. In our work so far we’ve mainly used ‘off the shelf’ model architectures, namely AWD_LSTM for our fastai model and DistilBERT for our huggingface model. This means we could in theory just point people to those architectures and share the model weights with them. In practice, most machine learning frameworks have some way of making it easier to save and load all the necessary components of a model. We won’t dig too much into the specifics here since they quickly become framework specific. Instead we’ll focus on more general considerations around the process of sharing a model.
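As a rough illustration of what this looks like in practice, the sketch below shows one way of saving and loading each model. It assumes a trained fastai Learner called learn and a trained transformers model and tokenizer called model and tokenizer; the file and directory names are only placeholders.

```python
# A minimal sketch of saving and loading the trained models.
# Assumes a trained fastai Learner called `learn` and a trained
# transformers model/tokenizer pair called `model` and `tokenizer`;
# file and directory names are placeholders.

# fastai: export the Learner (architecture, weights and transforms) ...
learn.export("genre_classifier.pkl")

# ... and load it again elsewhere
from fastai.text.all import load_learner
learn = load_learner("genre_classifier.pkl")

# transformers: save the DistilBERT model and tokenizer to a directory ...
model.save_pretrained("genre-model")
tokenizer.save_pretrained("genre-model")

# ... and load them again elsewhere
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("genre-model")
tokenizer = AutoTokenizer.from_pretrained("genre-model")
```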
Model cards
We’ve discussed some of the necessary steps to sharing our model. However, if we stop at the point of sharing model files/weights we aren’t being super helpful. What are the model weights? Essentially a bunch of numbers. If this is all we share, without any context, it’s hard for anyone else who hasn’t been involved in developing the model to know:
what the model does
how to use it
what the possible labels are
whether the model is any good
when the model does not work well
what training data was used
This is a non-exhaustive list of things people may want to know. This information becomes even more important if our model is going to be used to make, or inform, decisions that impact humans. In our particular case we are predicting the genre of 19th-century books. This is maybe not as high stakes as a model approving or rejecting insurance claims, but it may still have an impact on people depending on how the model is used. For example, if the model is used to help with discovery, books which are mislabelled might be less discoverable. If the mistakes the model makes are not random, i.e. they are systematic in some way, this might mean some types of books are made less discoverable. For example, we have been concerned throughout this process with the impact of language on our model. If our model performs poorly on non-English book titles it is important people know this.
This is what a model card aims to help address. The Model Cards for Model Reporting [10] paper offers a framework for approaching this task. The paper gives some structure to addressing the things people may want, and more likely should, know about our model. We have tried to follow this structure in documenting our models. Whether we have done a good job is open to debate; however, we think that attempting to complete these model cards when sharing models is a worthwhile exercise. You can read the model card for our model here.
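As one sketch of how this documentation can travel with the model, the huggingface_hub library provides a ModelCard class that wraps a markdown model card and can save it alongside, or push it to the same repository as, the model files. The card text and repository id below are illustrative rather than our actual model card.

```python
from huggingface_hub import ModelCard

# Illustrative model card text loosely following the headings suggested by
# the Model Cards for Model Reporting paper; our real card is linked above.
content = """
# 19th-century book genre classifier

## Model description
Predicts the genre of a book from its title.

## Intended uses & limitations
Trained largely on English-language titles; performance on non-English
titles is likely to be weaker.

## Training data
Metadata from digitised 19th-century books (see the dataset card).
"""

card = ModelCard(content)
card.save("README.md")  # keep a local copy alongside the model files
# card.push_to_hub("your-username/genre-classifier")  # hypothetical repo id
```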
Sharing our data
Similar to how we shared our model using the 🤗 hub, we can also share our training data there. There are some nice advantages to this:
we make our data discoverable in an ecosystem where many people are already looking for datasets
we can give access to our dataset in one line of code (see the sketch below); this also helps ‘abstract’ any nuances about pre-processing etc. for the data
we are strongly encouraged to properly document our dataset (more on this below)
This dataset is available as blbooksgenre on the 🤗 hub.
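For example, the dataset can then be loaded with a single call, along the lines of the sketch below (the exact repository id and any configuration name should be checked against the dataset page on the hub).

```python
from datasets import load_dataset

# One-line access to the training data shared on the 🤗 hub.
# "blbooksgenre" follows the name given above; the exact repository id
# (and any configuration name) should be checked on the dataset page.
dataset = load_dataset("blbooksgenre")
print(dataset)
```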
Datasheets for datasets
We’ve agonised about our data a fair bit in the development of this model. It seems sensible then that we should also share the data we used to train our model. Whilst our training data is a bit more directly interpretable in comparison to our model (which in some sense is a bunch of numbers), there are still many things that our data won’t convey directly. For this reason it is important to also document our data. Datasheets for Datasets [11] provides a framework to help us do this documentation.
Completing this datasheet isn’t always easy, especially when the data you are working with has a complex provenance, as is often the case with GLAM collections. However, shifting from documenting individual items at the catalogue level to documenting collections at scale will require different approaches, and Datasheets for Datasets offers a useful starting point for tackling this challenge.
This video, Datasheets for Datasets from CACM on Vimeo, provides more background to the motivations and aims of datasheets.
GLAM-specific considerations?
We have shown some possible approaches to sharing our work. Most of this has been fairly generic, i.e. we didn’t focus a huge amount on anything GLAM-specific. Many of the things we said about making things more accessible would be relevant to other domains which want to make use of machine learning: a biologist wanting to use a machine learning model might face similar challenges to a librarian. However, there may be some things which are GLAM-specific, particularly around GLAM data.
We don’t offer much in the way of suggestions here. Model cards and datasheets both offer a very useful foundation. It probably makes sense to build on this work and go through the process of documenting GLAM collections for use in machine learning. Working through this process will hopefully begin to reveal common needs or extensions that might be required for GLAM data. For example, there may need to be more background on selection biases for digitised collections, or an explanation of metadata standards which might be familiar to GLAM staff but not to people working in NLP.
This is hopefully an area that will continue to develop as part of initiatives like ai4lam.
Demo apps
One other way in which we can make our models more useful for others is to create an application which allows people to use the model without having to go through a lot of steps to get it set up. This might sound like a lot of work, but there are a number of tools that make it easier to do. Two of these are Gradio and Streamlit.
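As an illustration of how little code such an app can need, the sketch below shows roughly what a Gradio demo for the fastai model could look like. It assumes the Learner was exported as genre_classifier.pkl (as in the earlier sketch) and is not the code behind the demo apps linked below.

```python
import gradio as gr
from fastai.text.all import load_learner

# Load the exported fastai Learner (filename is a placeholder).
learn = load_learner("genre_classifier.pkl")

def predict_genre(title):
    # learn.predict returns (decoded label, predicted index, probabilities)
    pred, pred_idx, probs = learn.predict(title)
    return f"{pred} (probability: {float(probs[pred_idx]):.2f})"

# A text box in, a text prediction out.
demo = gr.Interface(fn=predict_genre, inputs="text", outputs="text")
demo.launch()
```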
We created a simple demo app for the fastai model we trained in an earlier section: Training our first book genre classification model. This demo app is primarily intended to offer some ways of exploring the model.
We also created a demo app for our updated model.
More to come…
There are other potential demo apps related to this work which we might create after the release of this ‘book’.