# Contributing
Please see our Code of Conduct for policies on contributing. We also broadly follow the Turing Way Code of Conduct to encourage a pleasant experience contributing and collaborating on this project.
## Documentation
If you would only like to contribute to documentation, the easiest way to deploy it and see changes rendered with each edit is to run the following outside `docker`:
```console
$ git clone https://github.com/living-with-machines/lwmdb
$ cd lwmdb
$ poetry install --with dev --with docs
$ poetry run mkdocs serve --dev-addr=0.0.0.0:8080
```
!!! note

    The `--with dev` and `--with docs` options are currently included by default, but they may be set as optional in the future.
The documentation should also be available at `https://localhost:9000` when running via `docker compose`, but that copy does not auto-update as local changes are made. Port `8080` is specified in the example above to avoid conflict with a local `docker compose` run (which defaults to `0.0.0.0:9000`).
## Local `docker` test runs
### Local environment
Tests are built and run via `pytest` and `docker` using `pytest-django`. To run tests, ensure you have a local `docker` install, a local `git` checkout of `lwmdb`, and a build (see the install instructions for details).
Running locally with `local.yml` in a terminal (an example command follows the list below) deploys the site and this documentation:

- Site at `localhost:3000`
- Docs at `localhost:9000`
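A minimal sketch of such a run, assuming `local.yml` is the `docker compose` config in the repository root:

```console
$ docker compose -f local.yml up --build
```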
!!! note

    If there are issues starting the server, shutting it down and then starting up again may help.
### Running tests
To run tests, open another terminal and run `pytest` within the `django` `docker` container while `docker` is running.
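For example, a sketch of one such invocation, assuming the service is named `django` in `local.yml`:

```console
$ docker compose -f local.yml exec django pytest
```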
This prints a summary of test results like:
```
Test session starts (platform: linux, Python 3.11.3, pytest 7.3.1, pytest-sugar 0.9.7)
django: settings: config.test_settings (from ini)
rootdir: /app
configfile: pyproject.toml
plugins: pyfakefs-5.2.2, anyio-3.6.2, sugar-0.9.7, cov-4.0.0, django-4.5.2
collected 33 items / 1 deselected / 32 selected

 gazetteer/tests.py ✓                                      3% ▍
 lwmdb/tests/test_commands.py xx                            9% ▉
 mitchells/tests.py x✓                                    100% ██████████
 newspapers/tests.py ✓✓✓✓✓                                 28% ██▊
 lwmdb/utils.py ✓✓✓✓✓✓✓✓✓                                  56% █████▋
 lwmdb/tests/test_utils.py ✓✓✓✓✓✓✓✓✓✓✓✓✓                   97% █████████▊

------------ coverage: platform linux, python 3.11.3-final-0 ---------------
Name                                              Stmts   Miss  Cover
----------------------------------------------------------------------------
lwmdb/management/commands/connect.py                 10      3    70%
lwmdb/management/commands/createfixtures.py          42     30    29%
lwmdb/management/commands/fixtures.py               126     78    38%
lwmdb/management/commands/load_json_fixtures.py      20     11    45%
lwmdb/management/commands/loadfixtures.py            27      8    70%
lwmdb/management/commands/makeitemfixtures.py        78     62    21%
lwmdb/tests/test_commands.py                         15      2    87%
lwmdb/tests/test_utils.py                            25      7    72%
lwmdb/utils.py                                      120     48    60%
----------------------------------------------------------------------------
TOTAL                                               508    284    44%

8 files skipped due to complete coverage.

============================ slowest 3 durations ===========================
3.85s setup    gazetteer/tests.py::TestGeoSpatial::test_create_place_and_distance
1.06s call     lwmdb/tests/test_commands.py::test_mitchells
0.14s call     lwmdb/utils.py::lwmdb.utils.download_file

Results (6.74s):
      29 passed
       3 xfailed
       1 deselected
```
### Adding all expected failed tests
In the previous example, 29 tests passed, 3 failed as expected (hence `xfailed`) and 1 test was skipped (`deselected`). To see the details of which tests failed, add the `--runxfail` option.
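For example, a hedged invocation, assuming the same `docker compose` service name as above:

```console
$ docker compose -f local.yml exec django pytest --runxfail
```

This adds reports like the following to the output: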
```
...
    def __getattr__(self, name: str):
        """
        After regular attribute access, try looking up the name
        This allows simpler access to columns for interactive use.
        """
        # Note: obj.x will always call obj.__getattribute__('x') prior to
        # calling obj.__getattr__('x').
        if (
            name not in self._internal_names_set
            and name not in self._metadata
            and name not in self._accessors
            and self._info_axis._can_hold_identifiers_and_holds_name(name)
        ):
            return self[name]
>       return object.__getattribute__(self, name)
E       AttributeError: 'Series' object has no attribute 'NLP'

/usr/local/lib/python3.11/site-packages/pandas/core/generic.py:5989: AttributeError
-------------------------- Captured stdout call ----------------------------
Warning: Model mitchells.Issue is missing a fixture file and will not load.
Warning: Model mitchells.Entry is missing a fixture file and will not load.
Warning: Model mitchells.PoliticalLeaning is missing a fixture file and will not load.
Warning: Model mitchells.Price is missing a fixture file and will not load.
Warning: Model mitchells.EntryPoliticalLeanings is missing a fixture file and will not load.
Warning: Model mitchells.EntryPrices is missing a fixture file and will not load.

 lwmdb/tests/test_commands.py ⨯                             6% ▋
...
```
and summaries at the end of the report:
```
...
============================ slowest 3 durations ===========================
3.87s setup    gazetteer/tests.py::TestGeoSpatial::test_create_place_and_distance
1.07s call     lwmdb/tests/test_commands.py::test_mitchells
0.15s call     lwmdb/utils.py::lwmdb.utils.download_file
========================== short test summary info =========================
FAILED lwmdb/tests/test_commands.py::test_mitchells - AttributeError: 'Series' object has no attribute 'NLP'
FAILED lwmdb/tests/test_commands.py::test_gazzetteer - SystemExit: App(s) not allowed: ['gazzetteer']
FAILED mitchells/tests.py::MitchelsFixture::test_load_fixtures - assert 0 > 0

Results (6.90s):
      29 passed
       3 failed
         - lwmdb/tests/test_commands.py:9 test_mitchells
         - lwmdb/tests/test_commands.py:19 test_gazzetteer
         - mitchells/tests.py:18 MitchelsFixture.test_load_fixtures
       1 deselected
```
### Terminal Interaction
Adding the `--pdb` option opens an `ipython` shell at the point a test fails:
```
    def __getattr__(self, name: str):
        """
        After regular attribute access, try looking up the name
        This allows simpler access to columns for interactive use.
        """
        # Note: obj.x will always call obj.__getattribute__('x') prior to
        # calling obj.__getattr__('x').
        if (
            name not in self._internal_names_set
            and name not in self._metadata
            and name not in self._accessors
            and self._info_axis._can_hold_identifiers_and_holds_name(name)
        ):
            return self[name]
>       return object.__getattribute__(self, name)
E       AttributeError: 'Series' object has no attribute 'NLP'

/usr/local/lib/python3.11/site-packages/pandas/core/generic.py:5989: AttributeError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>
> /usr/local/lib/python3.11/site-packages/pandas/core/generic.py(5989)__getattr__()
   5987             ):
   5988                 return self[name]
-> 5989         return object.__getattribute__(self, name)
   5990
   5991     def __setattr__(self, name: str, value) -> None:

ipdb>
```
## Development
### Commits
#### Pre-commit
The `.pre-commit-config.yaml` file manages configurations to ensure the quality of each `git` commit. Ensure this works by installing `pre-commit` before making any `git` commits.
!!! note

    `pre-commit` is included in the `pyproject.toml` `dev` dependency group, so it's possible to run all `git` commands within a local `poetry` install of `lwmdb` without installing `pre-commit` globally.
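One way to install the commit hook, sketched here via the `poetry` environment:

```console
$ poetry run pre-commit install
```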
This will automatically download and install the dependencies specified in `.pre-commit-config.yaml` and then run all of those checks on each `git` commit.
You can also run all of these checks outside a commit with:
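```console
$ pre-commit run --all-files
```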
#### Commit messages
For `git` commit messages we try to follow the conventional commits spec, where commits are prefixed by categories:

- `fix`: something fixed
- `feat`: a new feature
- `doc`: documentation
- `refactor`: a significant rearrangement of code structure
- `test`: adding tests
- `ci`: continuous integration
- `chore`: something relatively small like updating a dependency
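For example, a hypothetical commit message following this spec (the subject line is illustrative, not from the project history):

```
feat: add publication date filter to newspapers view
```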
### App
Once `docker compose` is up, any local modifications should automatically be loaded into the local `django` `docker` container and immediately applied. This suits reloading web app changes (including `css` etc.) and writing and running tests. No additional `docker build` commands should be required unless you make very significant modifications, such as switching between `git` branches.
### Tests
#### Doctests
Including `docstrings` with example tests is an efficient way to add tests, document usage, and help ensure documentation stays consistent with code changes.
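A minimal sketch of the pattern, using a hypothetical function (not from `lwmdb`):

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in `text`.

    Example:
        >>> word_count("living with machines")
        3
    """
    return len(text.split())
```

Because `--doctest-modules` is enabled in the `pytest` config (see below), the `>>>` example runs as a test on every `pytest` invocation.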
#### Pytest Tests
We use `pytest` for tests, and its documentation is quite comprehensive. The `pytest-django` plugin is crucial to the test functionality as well.
#### Pytest Configuration
The config for running tests is shared between `pyproject.toml` and `lwmdb/tests/conftest.py`. The `pyproject.toml` section below provides automatic test configuration whenever `pytest` is run. An example config at the time of this writing:
```toml
[tool.pytest.ini_options]
DJANGO_SETTINGS_MODULE = "config.test_settings"
python_files = ["tests.py", "test_*.py"]
addopts = """
--cov=lwmdb
--cov-report=term:skip-covered
--pdbcls=IPython.terminal.debugger:TerminalPdb
--doctest-modules
--ignore=compose
--ignore=jupyterhub_config.py
--ignore=notebooks
--ignore=docs
--ignore=lwmdb/contrib/sites
-m "not slow"
--durations=3
"""
markers = [
    "slow: marks tests as slow (deselect with '-m \"not slow\"')"
]
```
- `--cov=lwmdb` specifies the path to measure coverage over (in this case the name of this project)
- `--cov-report=term:skip-covered` excludes files with full coverage from the coverage report
- `--pdbcls=IPython.terminal.debugger:TerminalPdb` enables the `ipython` terminal for debugging
- `--doctest-modules` indicates `doctests` are included in test running
- `--ignore` excludes folders from testing (eg: `--ignore=compose` skips the `compose` folder)
- `-m "not slow"` skips tests marked with `@pytest.mark.slow` (see the sketch below)
- `--durations=3` lists the duration of the 3 slowest running tests
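A minimal sketch of a hypothetical `slow`-marked test (not from `lwmdb`):

```python
import pytest


@pytest.mark.slow
def test_expensive_rebuild() -> None:
    """Deselected by default via `-m "not slow"`; run explicitly with `-m slow`."""
    assert sum(range(10_000)) == 49_995_000
```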
#### Example Tests
Within each `django` `app` in the project there is either a `tests.py` file or a `tests` folder, in which any file whose name begins with `test_` is included (like `test_commands.py`).
An example test from `mitchells/tests.py`:
```python
def test_download_local_mitchells_excel(caplog, mitchells_data_path) -> None:
    """Test downloading `MITCHELLS_EXCEL_URL` fixture.

    Note:
        `assert LOG in caplog.messages` is designed to work whether the file is
        downloaded or not to ease caching and testing
    """
    caplog.set_level(INFO)
    success: bool = download_file(mitchells_data_path, MITCHELLS_EXCEL_URL)
    assert success
    LOG = f"{MITCHELLS_EXCEL_URL} file available from {mitchells_data_path}"
    assert LOG in caplog.messages
```
The `mitchells_data_path` fixture is defined in `conftest.py` and returns a `Path` for the folder where raw `mitchells` data is stored prior to processing into `json`.
Fixtures in `pytest` work by automatically populating the arguments of any function whose name begins with `test_` with whatever is returned from the registered fixture functions of the same name. Here the `mitchells_data_path` `Path` object is passed to the `download_file` function along with `MITCHELLS_EXCEL_URL`. `download_file` returns a `bool` to indicate if the download was successful, which is then tested via the line:
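```python
assert success
```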
The lines involving `caplog` aid testing `logging`. The logging level is set to `INFO` to capture levels lower than the default `WARNING` level. This means the logging is captured and can be tested in the final lines:
```python
LOG = f"{MITCHELLS_EXCEL_URL} file available from {mitchells_data_path}"
assert LOG in caplog.messages
```
!!! note

    To ease using `python` `logging` and `django` logging features, we use our `log_and_django_terminal` wrapper to manage logs that might also need to be printed at the terminal alongside commands.
## Crediting Contributions
We use All Contributors via our semi-automated `.all-contributorsrc` file and the Citation File Format via `CITATION.cff` to help manage attributing contributions to both this code base and datasets we release for use with `lwmdb`. We endeavour to harmonise contributions from collaborators across Living with Machines, whose copious, interdisciplinary collaboration led to `lwmdb`.
### All Contributors
All Contributors is a service for managing credit for contributions to a `git` repository. `.all-contributorsrc` is a `json` file in the root directory of the `lwmdb` repository. It also specifies the design of what's rendered in `README.md` and in the contributors section of this documentation's introduction.
The `json` structure follows the All Contributors specification. Below is an example of this format:
```json
{
  "files": [
    "README.md"
  ],
  "imageSize": 100,
  "commit": false,
  "commitType": "docs",
  "commitConvention": "angular",
  "contributors": [
    {
      "login": "github-user-name",
      "name": "Person Name",
      "avatar_url": "https://avatars.githubusercontent.com/u/1234567?v=4",
      "profile": "http://www.a-website.org",
      "contributions": [
        "code",
        "ideas",
        "doc"
      ]
    },
    {
      "login": "another-github-user-name",
      "name": "Another Name",
      "avatar_url": "https://avatars.githubusercontent.com/u/7654321?v=4",
      "contributions": [
        "code",
        "ideas",
        "doc",
        "maintenance"
      ]
    }
  ],
  "contributorsPerLine": 7,
  "skipCi": true,
  "repoType": "github",
  "repoHost": "https://github.com",
  "projectName": "lwmdb",
  "projectOwner": "Living-with-machines"
}
```
The `contributions` component per user indicates the types of contribution. At present we consider these:

- `code`
- `ideas`
- `mentoring`
- `maintenance`
- `doc`

We aren't yet crediting other types of contribution, but may expand this in the future. For other contribution types provided by `allcontributors` by default, see the `emoji-key` table.
#### Adding credit, including types, via GitHub comments
For All Contributors, `git` accounts with at least `moderator` status on our `GitHub` repository should have permission to modify credit by posting a comment in the following form on an `lwmdb` `github` ticket:
```
@all-contributors
please add @github-user for code, ideas, planning.
please add @github-other-user for code, ideas, planning.
```
This should cause the all-contributors bot to indicate success:
```
@ModUserWhoPosted
I've put up a pull request to add @github-user! 🎉
I've put up a pull request to add @github-other-user! 🎉
```
or report errors:
```
This project's configuration file has malformed JSON: .all-contributorsrc. Error:: Unexpected token : in JSON at position 2060
```
### CITATION.cff
We also maintain a Citation File Format (CFF) file for citeable, academic credit for contributions via our `zenodo` registration. This helps automate the process of releasing an academically citeable Digital Object Identifier (DOI) for each release of `lwmdb`.
CFF supports Open Researcher and Contributor IDs (`orcid`), which eases automating academic credit for evolving contributions to academic work, even as individuals change academic positions.
For reference, a simplified example based on `cff-version` 1.2.0:
```yaml
cff-version: 1.2.0
title: Living With Machines Database
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Person
    family-names: Name
    orcid: 'https://orcid.org/0000-0000-0000-0000'
    affiliation: A UNI
  - given-names: Another
    family-names: Name
    orcid: 'https://orcid.org/0000-0000-0000-0001'
    affiliation: UNI A
identifiers:
  - type: doi
    value: 10.5281/zenodo.8208204
repository-code: 'https://github.com/Living-with-machines/lwmdb'
url: 'https://livingwithmachines.ac.uk/'
license: MIT
```
## Troubleshooting
### Unexpected `lwmdb/static/css/project.css` changes
At present (see issue #110 for updates), running `docker compose` is likely to truncate the last line of `lwmdb/static/css/project.css`, which can then appear as a local change in a `git` checkout:
```console
$ git status
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   lwmdb/static/css/project.css
```
This should be automatically fixed via `pre-commit`, and if necessary you can run `pre-commit` directly to clean up that issue outside of a `git` commit, as sketched below. Given how frequently this may occur, it is safest to simply leave it until committing a change.
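A hedged sketch of such a direct run, assuming a standard hook such as `end-of-file-fixer` is configured in `.pre-commit-config.yaml`:

```console
$ pre-commit run end-of-file-fixer --files lwmdb/static/css/project.css
```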