140 changes: 100 additions & 40 deletions CONTRIBUTING.md
@@ -1,7 +1,76 @@
How to contribute
-----------------
This document describes the workflow for contributing to the openml-python package.
If you are interested in connecting a machine learning package with OpenML (i.e.,
writing an openml-python extension) or want to find other ways to contribute, see [this page](https://openml.github.io/openml-python/master/contributing.html#contributing).

Scope of the package
--------------------

The scope of the OpenML Python package is to provide a Python interface to
the OpenML platform which integrates well with Python's scientific stack, most
notably [numpy](http://www.numpy.org/), [scipy](https://www.scipy.org/) and
[pandas](https://pandas.pydata.org/).
To reduce opportunity costs and demonstrate the usage of the package, it also
implements an interface to the most popular machine learning package written
in Python, [scikit-learn](http://scikit-learn.org/stable/index.html).
Thereby it will automatically be compatible with many machine learning
libraries written in Python.

We aim to keep the package as light-weight as possible and we will try to
keep the number of potential installation dependencies as low as possible.
Therefore, the connection to other machine learning libraries such as
*pytorch*, *keras* or *tensorflow* should not be done directly inside this
package, but in a separate package using the OpenML Python connector.
More information on OpenML Python connectors can be found [here](https://openml.github.io/openml-python/master/contributing.html#contributing).

Reporting bugs
--------------
We use GitHub issues to track all bugs and feature requests; feel free to
open an issue if you have found a bug or wish to see a feature implemented.

It is recommended to check that your issue complies with the
following rules before submitting:

- Verify that your issue is not currently being addressed by other
[issues](https://github.com/openml/openml-python/issues)
or [pull requests](https://github.com/openml/openml-python/pulls).

- Please ensure all code snippets and error messages are formatted in
appropriate code blocks.
See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks).

- Please include your operating system type and version number, as well
as your Python, openml, scikit-learn, numpy, and scipy versions. This information
can be found by running the following code snippet:
```python
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
import openml; print("OpenML", openml.__version__)
```
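
If one of these packages is not installed, the snippet above stops at the first failing import. As a minimal sketch of a more forgiving variant, the version information could be gathered with the standard library's `importlib.metadata` instead. The helper name `collect_versions` and the list of distribution names are assumptions made for illustration and are not part of openml-python:

```python
import platform
import sys
from importlib import metadata

# Distribution names as assumed to be published on PyPI; adjust if yours differ.
PACKAGES = ("numpy", "scipy", "scikit-learn", "openml")


def collect_versions(packages=PACKAGES):
    """Return a dict with platform, Python, and package versions for a bug report."""
    report = {"Platform": platform.platform(), "Python": sys.version}
    for name in packages:
        try:
            # Reads the installed distribution's metadata without importing it.
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue gives the same information as the one-liners above, with missing packages reported explicitly instead of raising an `ImportError`.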

Determine what contribution to make
-----------------------------------
Great! You've decided you want to help out. Now what?
All contributions should be linked to issues on the [GitHub issue tracker](https://github.com/openml/openml-python/issues).
In particular for new contributors, the *good first issue* label should help you find
issues which are suitable for beginners. Resolving these issues allows you to start
contributing to the project without much prior knowledge. Your assistance in this area
will be greatly appreciated by the more experienced developers as it helps free up
their time to concentrate on other issues.

If you encounter a particular part of the documentation or code that you want to improve,
but there is no related open issue yet, open one first.
This is important since it lets you get feedback or pointers from experienced contributors before you start.

To let everyone know you are working on an issue, please leave a comment that states you will work on the issue
(or, if you have the permission, *assign* yourself to the issue). This avoids double work!

General git workflow
--------------------

The preferred workflow for contributing to openml-python is to
fork the [main repository](https://github.com/openml/openml-python) on
GitHub, clone, check out the branch `develop`, and develop on a new
branch. Steps:
@@ -114,6 +183,10 @@ First install openml with its test dependencies by running
```bash
$ pip install -e .[test]
```
from the repository folder.
Then configure pre-commit through
```bash
$ pre-commit install
```
This will install dependencies to run unit tests, as well as [pre-commit](https://pre-commit.com/).
To run the unit tests, and check their code coverage, run:
```bash
@@ -141,51 +214,38 @@ If you want to run the pre-commit tests without doing a commit, run:
```
Make sure to do this at least once before your first commit to check your setup works.

Executing a specific unit test can be done by specifying the module, test case, and test.
To obtain a hierarchical list of all tests, run

```bash
$ pytest --collect-only

<Module 'tests/test_datasets/test_dataset.py'>
  <UnitTestCase 'OpenMLDatasetTest'>
    <TestCaseFunction 'test_dataset_format_constructor'>
    <TestCaseFunction 'test_get_data'>
    <TestCaseFunction 'test_get_data_rowid_and_ignore_and_target'>
    <TestCaseFunction 'test_get_data_with_ignore_attributes'>
    <TestCaseFunction 'test_get_data_with_rowid'>
    <TestCaseFunction 'test_get_data_with_target'>
  <UnitTestCase 'OpenMLDatasetTestOnTestServer'>
    <TestCaseFunction 'test_tagging'>
```
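
Each level of this listing corresponds to one part of a pytest node ID, in which the module path, test case, and test name are joined by `::`. As a minimal sketch, the selection could also be scripted from Python. The helpers `build_node_id` and `run_test` are hypothetical names used for illustration and are not part of openml-python or pytest:

```python
import subprocess
import sys


def build_node_id(module, test_case=None, test=None):
    """Join a module path, test case, and test name into a pytest node ID."""
    parts = [module]
    if test_case is not None:
        parts.append(test_case)
    if test is not None:
        parts.append(test)
    return "::".join(parts)


def run_test(module, test_case=None, test=None):
    """Run the selected tests in a subprocess and return pytest's exit code."""
    node_id = build_node_id(module, test_case, test)
    # Uses the current interpreter so the active environment's pytest is picked up.
    return subprocess.call([sys.executable, "-m", "pytest", node_id])


if __name__ == "__main__":
    print(build_node_id(
        "tests/test_datasets/test_dataset.py",
        "OpenMLDatasetTest",
        "test_get_data",
    ))
```

Passing only the module, or the module and the test case, yields the shorter node IDs that select a whole module or a whole test case.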

You may then run a specific module, test case, or unit test respectively:
```bash
$ pytest tests/test_datasets/test_dataset.py
$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest
$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data
```

Happy testing!

New contributor tips
--------------------

A great way to start contributing to openml-python is to pick an item
from the list of [Good First Issues](https://github.com/openml/openml-python/labels/Good%20first%20issue)
in the issue tracker. Resolving these issues allows you to start
contributing to the project without much prior knowledge. Your
assistance in this area will be greatly appreciated by the more
experienced developers as it helps free up their time to concentrate on
other issues.

Documentation
-------------

We are glad to accept any sort of documentation: function docstrings,
reStructuredText documents, tutorials, etc.
reStructuredText documents live in the source code repository under the
doc/ directory.

158 changes: 17 additions & 141 deletions doc/contributing.rst
@@ -2,158 +2,34 @@

.. _contributing:


============
Contributing
============

Contribution to the OpenML package is highly appreciated. Currently,
there is a lot of work left on implementing API calls,
testing them and providing examples to allow new users to easily use the
OpenML package. See the :ref:`issues` section for open tasks.

Please mark yourself as a contributor in a GitHub issue if you start working on
something to avoid duplicate work. If you're part of the OpenML organization,
you can use GitHub's assign feature; otherwise, you can just leave a comment.

.. _scope:

Scope of the package
====================

The scope of the OpenML Python package is to provide a Python interface to
the OpenML platform which integrates well with Python's scientific stack, most
notably `numpy <http://www.numpy.org/>`_ and `scipy <https://www.scipy.org/>`_.
To reduce opportunity costs and demonstrate the usage of the package, it also
implements an interface to the most popular machine learning package written
in Python, `scikit-learn <http://scikit-learn.org/stable/index.html>`_.
Thereby it will automatically be compatible with many machine learning
libraries written in Python.

We aim to keep the package as light-weight as possible and we will try to
keep the number of potential installation dependencies as low as possible.
Therefore, the connection to other machine learning libraries such as
*pytorch*, *keras* or *tensorflow* should not be done directly inside this
package, but in a separate package using the OpenML Python connector.

.. _issues:

Open issues and potential todos
===============================

We collect open issues and feature requests in an `issue tracker on GitHub <https://github.com/openml/openml-python/issues>`_.
The issue tracker contains issues labeled *Good first issue*, which are
suitable for beginners. We also maintain a somewhat up-to-date
`roadmap <https://github.com/openml/openml-python/issues/410>`_ which
contains longer-term goals.

.. _how_to_contribute:

How to contribute
=================

There are many ways to contribute to the development of the OpenML Python
connector and OpenML in general. We welcome all kinds of contributions,
especially:

* Source code which fixes an issue, improves usability or implements a new
  feature.
* Improvements to the documentation, which can be found in the ``doc``
  directory.
* New examples - current examples can be found in the ``examples`` directory.
* Bug reports - if something doesn't work for you or is cumbersome, please
  open a new issue to let us know about the problem.
* Use the package and spread the word.
* `Cite OpenML <https://www.openml.org/cite>`_ if you use it in a scientific
  publication.
* Visit one of our `hackathons <https://meet.openml.org/>`_.
* Check out how to `contribute to the main OpenML project <https://github.com/openml/OpenML/blob/master/CONTRIBUTING.md>`_.

Contributing code
~~~~~~~~~~~~~~~~~

Our guidelines on code contribution can be found in `this file <https://github.com/openml/openml-python/blob/master/CONTRIBUTING.md>`_.

.. _installation:

Installation
============

Installation from GitHub
~~~~~~~~~~~~~~~~~~~~~~~~

The package source code is available from
`github <https://github.com/openml/openml-python>`_ and can be obtained with:

.. code:: bash

    git clone https://github.com/openml/openml-python.git


Once you have cloned the package, change into the new directory.
If you are a regular user, install with

.. code:: bash

    pip install -e .

If you are a contributor, you will also need to install test dependencies

.. code:: bash

    pip install -e ".[test]"


Testing
=======

From within the directory of the cloned package, execute:

.. code:: bash

    pytest tests/

Executing a specific test can be done by specifying the module, test case, and test.
To obtain a hierarchical list of all tests, run

.. code:: bash

    pytest --collect-only

.. code:: bash

    <Module 'tests/test_datasets/test_dataset.py'>
      <UnitTestCase 'OpenMLDatasetTest'>
        <TestCaseFunction 'test_dataset_format_constructor'>
        <TestCaseFunction 'test_get_data'>
        <TestCaseFunction 'test_get_data_rowid_and_ignore_and_target'>
        <TestCaseFunction 'test_get_data_with_ignore_attributes'>
        <TestCaseFunction 'test_get_data_with_rowid'>
        <TestCaseFunction 'test_get_data_with_target'>
      <UnitTestCase 'OpenMLDatasetTestOnTestServer'>
        <TestCaseFunction 'test_tagging'>


To run a specific module, add the module name, for instance:

.. code:: bash

    pytest tests/test_datasets/test_dataset.py

To run a specific unit test case, add the test case name, for instance:

.. code:: bash

    pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest

To run a specific unit test, add the test name, for instance:

.. code:: bash

    pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data

Happy testing!

Contribution to the OpenML package is highly appreciated in all forms.
In particular, a few ways to contribute to openml-python are:

* A direct contribution to the package, by means of improving the
  code, documentation or examples. To get started, see `this file <https://github.com/openml/openml-python/blob/master/CONTRIBUTING.md>`_
  with details on how to set up your environment to develop for openml-python.

* A contribution to an openml-python extension. An extension package allows OpenML to interface
  with a machine learning package (such as scikit-learn or keras). These extensions
  are hosted in separate repositories and may have their own guidelines.
  For more information, see the :ref:`extensions` below.

* Bug reports. If something doesn't work for you or is cumbersome, please
  open a new issue to let us know about the problem.
  See `this section <https://github.com/openml/openml-python/blob/develop/CONTRIBUTING.md#reporting-bugs>`_.

* `Cite OpenML <https://www.openml.org/cite>`_ if you use it in a scientific
  publication.

* Visit one of our `hackathons <https://meet.openml.org/>`_.

* Contribute to another OpenML project, such as `the main OpenML project <https://github.com/openml/OpenML/blob/master/CONTRIBUTING.md>`_.

.. _extensions:

Connecting new machine learning libraries
=========================================