Developer guidelines

Från Wikimedia
Hoppa till navigering Hoppa till sök

Denna sida är på engelska av transparensskäl samt då den använda terminologin oftast är på engelska. Undersidor kan ibland vara på svenska.

The default language for new projects should, whenever appropriate, be Python 3.

What to include

Log files should not be included in the repositories. When relevant, these may instead be uploaded as attachments to Phabricator tasks.

Input data may be included in the repository depending on scope. E.g. a .json file containing a mapping of parish names to Wikidata may be appropriate whereas all of the metadata for a batch upload may not be. If needed, such files can instead be stored on Google Drive.

Processed data which can be recreated by re-running the code should never be part of a repository. Output that is expensive to recreate, doesn't take too much space and may be useful in the future can be stored on Google Drive.

Documentation

Documentation ensures the code is understandable by other members of the team, an external audience, as well as by the coder themselves a year later.

All new public functions and classes are expected to be documented. It is recommended to also document function parameters. A style for parameter documentation has not yet been selected. Python documentation is expected to follow PEP 257.

Code-style and linting

Linting ensures code consistency and reduces the need for nitpicking during code-review. Basic linting is expected to have been done by the person responsible for a PR. Even better is to have linting be part of the CI process. For python, the code style is expected to follow PEP8. Helpful tools are flake8 and pydocstyle.

Gerrit, Github and Phabricator

For MediaWiki development gerrit should always be used. For other projects, Github may often be more appropriate. For one-use-only repositories (such as code for a specific batch upload) Github is recommended. Repositories in Github can either be under the user account or the Wikimedia-Sverige account. When creating new repos https://github.com/Wikimedia-Sverige/.github/generate. This uses our template which adds code of conduct, standard license and generic .gitignore.

In order to comply with our Phabricator guidelines, issue tracking for any new repositories must be done in Phabricator. To connect the two, Github commit messages should include a link to the Phabricator task, and the commit/Pull request should be linked to from the task. If the patch is up for review the #patch-for-review tag should also be added to the task. To make the issue handling system discoverable, a link to the relevant Phabricator board/project should be added to the README file. It may also be advisable to disable issues in that repository (Settings → Options → Features).

For existing repositories already containing Github issues: new issues should instead be filed as Phabricator tasks. Existing issues should get an equivalent Phabricator task when work commences on them (or if they are included in planning). When created, it should be clear from the task, which the old issue was and from the issue, which task that is now dealing with it.

For repositories which are temporary forks of another Github repository or where there is extensive collaboration outside of Wikimedia, issue tracking may continue in Github.

Licensing

All code produced by Wikimedia Sverige should be explicitly released under an appropriate OSI-approved license[1]. When expanding on an existing codebase the license used should ensure compatibility with the remaining codebase. In particular when developing for MediaWiki GPL 2-or-later should be used for compatibility. When the choice is free MIT is the preferred license of Wikimedia Sverige, in part due to its popularity and its compatibility with e.g. GPL without enforcing copyleft.

In practice

  • Add a LICENSE-file to the root of the repo. Suitable templates an be found in e.g. this repo. If any components are not covered by this license then add a separate LICENSE-file to the root of each such directory.
  • For php: Add a PHPDoc @license <license> comment to the top of each file, above the use-statement and below any potential namespace-statement.[2]
  • For Python: A header in each file is not necessary but might be appropriate if the codebase already makes use of this. The header would look something like the following (C) <Name>, <YEAR>. Depending on the license this can be followed by the GPL paste or something like Distributed under the terms of the <license> license..

Code of Conduct

All code developed by WMSE is covered by the WMF's code of conduct for Wikimedia technical spaces independently on which space it is hosted on. In cases where we are not already mandated by the WMF to carry a CODE_OF_CONDUCT.md file one should still be added to the root of the repository with the following content:

The development of this software is covered by a [Code of Conduct](https://www.mediawiki.org/wiki/Special:MyLanguage/Code_of_Conduct). This applies even if the software is hosted on a space not explicitly listed in the linked Code of Conduct.

Code-review and task branching

The implementation of code-review varies between our different projects, but the following are general recommendations.

To make code-review possible, and to keep the commit history cleaner, tasks should be developed in separate branches. These are then merged into the master branch via a gerrit review or a Github pull request (both hereafter PR). The scope of a branch often corresponds to a single Phabricator task (e.g. new functionality, changed functionality, a bug). When merged into master, it is most often appropriate to squash each PR into a single commit.

Whenever possible, PRs should be independent of each other. This way they can be developed in parallel and the review/merge of one is less likely to impact the other.

Code-review happens either on the PR level or, in special cases, before deployment. Unless otherwise decided the review is done by the project manager/senior developer.

Making code-review happen

Due to limited resources, having every PR reviewed is not always realistic. The following should be taken in consideration when determining whether code-review is necessary:

  • The size of the project. The time and effort put into code-review should be proportional to those put into the whole project.
  • The scope of the project. Projects with a more general scope, built with re-usability in mind, should be prioritized for code-review, even if they are currently being maintained/used by one developer/specific application.
  • The people involved. Projects actively maintained by several developers should be prioritized for code-review.

Testing

Testing makes it easier to update existing code, to deal with edge cases and to deploy with confidence. It can also help when writing the product code, as it makes you think more about what it does and how.

Unit testing should be implemented in new code. It is not necessary to introduce in older code that's assumed to be working. While not necessary, it may be a good idea to use test driven development.

Integration testing is not required at this point. This can partially be mitigated using documentation in e.g. Phabricator describing the expected output/behaviour given some real-life input examples.

Tests should ideally be run as part of the CI process.

Sub-pages

Footnotes

  1. Note that CC0 is not an appropriate license for code since it is incompatible with e.g. GPL.
  2. See e.g. this example file.