Conversation
```cpp
values[0].v_handle = const_cast<DLTensor*>(&(tblobs[0].dltensor()));

// scalar param
type_codes[1] = kDLFloat;
```
@yzhliu Since I need to pass a double param to the op func generated by TVM, I cannot use the Call function defined in TVMOpModule. I moved the logic of preparing TVMArgs from Call up into the MXNet op's FCompute function, and added an independent CallEx in TVMOpModule that just invokes the kernel. We can discuss changing the API to cater to more use cases.
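The argument-packing idea described above can be sketched in Python. This is a conceptual mock only: `pack_args` and the type-code constants are hypothetical stand-ins for the C++ TVMArgs machinery, not TVM's actual API.

```python
# Conceptual sketch of preparing mixed tensor/scalar arguments for a
# TVM-generated kernel, as done in the op's FCompute before CallEx.
# Type codes and the helper name are hypothetical stand-ins.
K_ARRAY_HANDLE = 7   # hypothetical code for a DLTensor* argument
K_DL_FLOAT = 2       # hypothetical code for a floating-point scalar

def pack_args(tensors, scalars):
    """Build parallel (values, type_codes) lists, tensors first."""
    values, type_codes = [], []
    for t in tensors:
        values.append(t)              # would be a DLTensor* handle in C++
        type_codes.append(K_ARRAY_HANDLE)
    for s in scalars:
        values.append(float(s))       # scalar param passed by value
        type_codes.append(K_DL_FLOAT)
    return values, type_codes

values, type_codes = pack_args(tensors=["<dltensor0>"], scalars=[0.5])
```

The point of splitting Call into "pack" and "invoke" steps is that callers with unusual argument lists (such as a trailing double scalar) can build TVMArgs themselves and hand the finished list to CallEx.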
Makefile

```makefile
LDFLAGS += -L$(ROOTDIR)/lib -ltvm_runtime -Wl,-rpath,'$${ORIGIN}'

TVM_USE_CUDA := OFF
TVM_OP_CUDA_ARCH := NONE
```
Any particular reason you are introducing a second set instead of using the arch set variable we already have? In which use case would these two differ?
Reverted. Thanks for the suggestion.
```cpp
  __macro$(__VA_ARGS__, int64_t); \
  __macro$(__VA_ARGS__, bool)

#define IMPLEMENT_WORKLOAD_VALUE_FOR_TYPE(__op$, __typ$) \
```
could you please add a comment to this macro to clarify?
```python
acc_type = {'float16': 'float32', 'float32': 'float64', 'float64': 'float64',
            'int8': 'int32', 'int32': 'int64', 'int64': 'int64', 'bool': 'int64'}
for hybridize in [False, True]:
```
Would using https://docs.python.org/3.7/library/itertools.html#itertools.product help the readability of the code and make it less nested?
Thanks for the suggestion. Will consider refactoring it in the following PRs.
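As a sketch of the reviewer's suggestion, nested parameter sweeps can be flattened with `itertools.product`; the loop variables below are illustrative, not the test suite's actual ones.

```python
import itertools

# Nested version: one indentation level per swept parameter.
results_nested = []
for hybridize in [False, True]:
    for dtype in ['float32', 'int64', 'bool']:
        for keepdims in [True, False]:
            results_nested.append((hybridize, dtype, keepdims))

# Flattened with itertools.product: one loop, one indentation level,
# iterating the same combinations in the same order.
results_flat = []
for hybridize, dtype, keepdims in itertools.product(
        [False, True], ['float32', 'int64', 'bool'], [True, False]):
    results_flat.append((hybridize, dtype, keepdims))
```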
marcoabreu left a comment:

Approve for build system
- Add np.equal implemented using tvmop
- Fix setting DLDataType conversion for boolean ndarray
- Add equal_gpu
- Fix inputs with different ndims
- Fix copying boolean ndarrays across devices
- Refactor binary logic op impl by tvm
- Add more logic ops
- Refactor TVMOpModule::Call to CallEx
- Add binary scalar logic op expr and schedule
- Add binary scalar logic ops
- Add free functions for logic ops
- Rebase with master to fix SetDLTensor bug
- Fix pylint
- Add sum op for boolean ndarrays using tvm op module
- Add sum boolean gpu compute
- Add bool type support to boolean_mask
- Boolean indexing working
- Clean up
- Fix merge
- Sync Makefile
- Rebase
- Add boolean indexing test
- Fix sanity
- Fix gpu and add autograd test
- Rebase
- Fix test for windows
- Fix tests
- Try to fix cuda arch missing error in ci
- Fix ci
- Fix windows build
- Try to fix cmake
- Fix cmake
- Fix
- Revert config.mk
Description
This PR adds logic ops that use `np.bool_` as their output tensors' `dtype`.

Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Comments
Follow-up work includes:
- `ndarray` boolean indexing

Thanks @yzhliu @hzfan for the help on debugging.
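For reference, the NumPy semantics these boolean-dtype ops and the boolean-indexing follow-up mirror can be shown with stock NumPy (this is plain NumPy behavior, not the MXNet/TVM implementation from this PR):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 0, 3])

# np.equal produces an output tensor with boolean dtype.
mask = np.equal(a, b)
assert mask.dtype == np.bool_

# Summing a boolean ndarray accumulates into an integer type;
# the PR's acc_type table maps 'bool' to 'int64'.
total = mask.sum(dtype=np.int64)

# Boolean indexing selects the elements where the mask is True.
selected = a[mask]
```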