vaex-ml: package centred around machine learning related tasks #254
JovanVeljanoski merged 36 commits into master
maartenbreddels left a comment
Awesome! I'll fix Travis.
packages/vaex-ml/vaex/ml/__pycache__/ should not go in.
  'vaex-astro==0.4',
- 'vaex-arrow==0.3'
+ 'vaex-arrow==0.3',
+ 'vaex-ml==0.4'
I don't think vaex-ml should be installed by default when you install vaex, like vaex-ui. What do you think?
Especially since vaex-ml now still depends on numba.
Why not? I thought it would be good for it to come by default. Otherwise, I am afraid it will go unnoticed or add extra complexity.
For production environments, people can of course choose what to install.
@@ -0,0 +1 @@
+!coverage.py: This is a private format, don't read it directly!{"lines":{}}
\ No newline at end of file
I don't think this file should go into the repo.
Agreed! I tried to exclude it. I will check the .gitignore file again.
Same for the __pycache__/-like things.
bacb96b to 1ed152d
548aff4 to 9acd38c
…aring vaex.ml to pure sklearn
…to test similarity of results up to a certain precision.
…th the binaries, we do run on windows/osx
🎉 yeah!
This is a big PR in which we are introducing `vaex-ml`, a `vaex` package centred around machine learning related tasks and applications. The following describes the contents:

- `vaex.ml.transformations`: methods related to preprocessing: scalers, categorical encoders, PCA
- `vaex.ml.cluster`: provides an efficient KMeans clustering algorithm
- `vaex.ml.ui`: provides means to construct an `ipywidget` for nearly any transformer in this package
- `vaex.ml.xgboost`: a binding to the `xgboost` library
- `vaex.ml.lightgbm`: a binding to the `lightgbm` library
- `vaex.ml.catboost`: a binding to the `catboost` library
- `vaex.ml.sklearn`: a binding to the `scikit-learn` library. At the moment, only the estimators are supported.
- `vaex.ml.incubator`: a module housing various machine learning models. The bindings in the incubator are considered experimental and are under testing. The API, implementation or support may change without notice.
- `vaex.ml.datasets`: contains datasets for experimentation and training. Currently contains the titanic and the iris classical datasets. It also contains methods for replicating the iris dataset such that it contains a total of 10^9 samples, creating a "big data" example.
- `vaex.ml.generate`: module that auto-generates an alternative API for the transformers and ML models.
- `vaex.ml.pipeline`: provides a pipeline object for the `vaex-ml` transformers and estimators
- `vaex.ml.state`: methods for serialisation of `vaex` objects
- `vaex.ml.linear_model`: provides an implementation of linear models that operate on a grid (binned data) instead of individual samples.

Everything comes with a full suite of tests handled by `pytest`.

NB: The contents above and this description itself may change slightly as `vaex-ml` is integrated with `vaex`.
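
To illustrate the idea behind a linear model that operates on a grid rather than on individual samples, here is a minimal pure-Python sketch. It is not the `vaex.ml.linear_model` implementation; the function names `bin_samples` and `fit_line_on_grid` are hypothetical and chosen only for this example. The samples are first reduced to per-bin counts and sums, and a weighted least-squares line is then fitted to the bin centres, so the fitting cost depends on the number of bins rather than the number of rows.

```python
def bin_samples(xs, ys, x_min, x_max, n_bins):
    """Aggregate samples into equal-width bins along x.

    Returns the bin centres, the number of samples per bin and the
    sum of y per bin. Hypothetical helper, not part of vaex.
    """
    width = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    y_sums = [0.0] * n_bins
    for x, y in zip(xs, ys):
        # Clamp so that x == x_max falls into the last bin.
        i = min(int((x - x_min) / width), n_bins - 1)
        counts[i] += 1
        y_sums[i] += y
    centers = [x_min + (i + 0.5) * width for i in range(n_bins)]
    return centers, counts, y_sums


def fit_line_on_grid(centers, counts, y_sums):
    """Weighted least squares for y = slope * x + intercept.

    Each bin contributes its centre as x, its mean y as the target and
    its count as the weight, so only the binned statistics are needed.
    """
    n = sum(counts)
    sx = sum(c * x for x, c in zip(centers, counts))
    sy = sum(y_sums)
    sxx = sum(c * x * x for x, c in zip(centers, counts))
    sxy = sum(x * s for x, s in zip(centers, y_sums))
    denom = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Synthetic data on the line y = 2x + 1, reduced to 8 bins before fitting.
xs = [i / 1024 for i in range(1024)]
ys = [2.0 * x + 1.0 for x in xs]
centers, counts, y_sums = bin_samples(xs, ys, 0.0, 1.0, 8)
slope, intercept = fit_line_on_grid(centers, counts, y_sums)
```

The fitted coefficients are close to, but not exactly, those of the underlying line, because every sample is treated as if it sat at its bin centre. Using more bins reduces that discretisation error at the cost of a larger grid.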