[Python API] Fix: Create constant for strings and fix segfault #33574
[Python API] Fix: Create constant for strings and fix segfault #33574KarSri7694 wants to merge 25 commits intoopenvinotoolkit:masterfrom
Conversation
This reverts commit bfb1a1c.
- Now it will make "" instead of "None"
…crash but gives a warning when shared_memory = true and data is string
|
Hello, I would request maintainers to review this code and please tell me if any more changes are required. The code currently handles shared_memory = True, by giving a warning using std::cerr, is this correct way? |
p-wysocki
left a comment
There was a problem hiding this comment.
Overall LGTM, thank you for your contribution! I added a few comments before approval.
| std::cerr << "Warning: Creating a String Constant with shared memory is not supported. " | ||
| "Data will be copied." | ||
| << std::endl; |
There was a problem hiding this comment.
<iostream> is a big header which we don't really need included in our bindings. Instead of std::cerr please use OPENVINO_WARN macro. Example usage: https://github.com/openvinotoolkit/openvino/blob/master/src/bindings/python/src/pyopenvino/graph/node_factory.cpp#L92
| } | ||
| // convert NumPy array to flattened list of strings | ||
| std::string* tensor_data = tensor.data<std::string>(); | ||
| // std::vector<std::string> strings; |
There was a problem hiding this comment.
| // std::vector<std::string> strings; |
|
Hey @p-wysocki , I will resolve the issue by today |
There was a problem hiding this comment.
Pull request overview
This PR fixes a segmentation fault when creating constants from NumPy arrays containing strings. It adds support for string-based NumPy arrays in the Python API constructor and converts None values to empty strings.
Changes:
- Added string array handling logic in the C++ constant constructor to detect and process string/object dtype arrays
- Implemented conversion of
Nonevalues to empty strings and type conversion for mixed-type object arrays - Added comprehensive test coverage for string constants including edge cases (empty arrays, None values, Unicode, mixed types)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/bindings/python/src/pyopenvino/graph/ops/constant.cpp | Implements string array detection and conversion logic in the constant constructor |
| src/bindings/python/tests/test_graph/test_constant.py | Adds 8 test cases covering various string constant scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| auto flat_array = array.reshape({-1}); | ||
|
|
||
| // copy data | ||
| for (int i = 0; i < flat_array.size(); ++i) { |
There was a problem hiding this comment.
Use size_t or py::ssize_t instead of int for the loop counter. The flat_array.size() returns a size type, and using int could cause issues with large arrays or sign comparison warnings.
| for (int i = 0; i < flat_array.size(); ++i) { | |
| for (py::ssize_t i = 0; i < flat_array.size(); ++i) { |
| # Note: Warning goes to stderr, check if it appears | ||
| captured = capfd.readouterr() | ||
| # The warning should be printed to stderr | ||
| assert "Warning" in captured.err or len(captured.err) == 0 # May not always capture in test |
There was a problem hiding this comment.
This assertion doesn't reliably verify the warning behavior. The condition or len(captured.err) == 0 makes the test pass even when no warning is produced, defeating the purpose of testing the warning. Consider using pytest's warns context manager or making the assertion more strict.
| } | ||
| // convert NumPy array to flattened list of strings | ||
| std::string* tensor_data = tensor.data<std::string>(); | ||
| // std::vector<std::string> strings; |
There was a problem hiding this comment.
Remove commented-out code. This appears to be leftover from development and should be cleaned up.
| // std::vector<std::string> strings; |
- OPENVINO_WARN outputs to stderr and is not visible when calling from python, so this method is now used. It is still kept in code and can be used while debugging when ENABLE_OPENVINO_DEBUG flag is set to ON
|
Hey @p-wysocki, I have made the necessary changes, please review. I have added both OPENVINO_WARN and PyErr_WarnEx. I have also updated the test_constant.py to detect updated RuntimeWarning using pytest.warns() |
| if (shared_memory) { | ||
| OPENVINO_WARN( | ||
| "Creating a String Constant with shared memory is not supported. Data will be copied."); | ||
| PyErr_WarnEx(PyExc_RuntimeWarning, | ||
| "Creating a String Constant with shared memory is not supported. Data will be copied.", | ||
| 1); | ||
| } |
There was a problem hiding this comment.
| if (shared_memory) { | |
| OPENVINO_WARN( | |
| "Creating a String Constant with shared memory is not supported. Data will be copied."); | |
| PyErr_WarnEx(PyExc_RuntimeWarning, | |
| "Creating a String Constant with shared memory is not supported. Data will be copied.", | |
| 1); | |
| } | |
| if (shared_memory) { | |
| OPENVINO_WARN( | |
| "Creating a String Constant with shared memory is not supported. Data will be copied."); | |
| } |
-The test currently does not assert the statements caught by capfd, this is because the logs are getting supressed in windows build and i cannot find a way to fix it
|
Hello, |
|
@KarSri7694 I see you attempted it with |
|
Linters failed: tests/test_graph/test_constant.py:419:121: E501 line too long (154 > 120 characters)
[4.75, 4.5, -5.25, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, -0.0, -0.1, -0.2, -0.3, -0.4, -0.5, -0.6, -0.7, -0.8, -0.9, -1, 448, 512],
^
tests/test_graph/test_constant.py:538:121: E501 line too long (153 > 120 characters)
[4.75, 4.5, 5.25, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, -0.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 448, 512, np.nan],
^
tests/test_graph/test_constant.py:552:121: E501 line too long (160 > 120 characters)
target = [4.0, 4.0, 4.0, 0.0, 0.125, 0.25, 0.25, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 512, 512, np.nan]
^
tests/test_graph/test_constant.py:661:121: E501 line too long (154 > 120 characters)
[4.75, 4.5, -5.25, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, -0.0, -0.1, -0.2, -0.3, -0.4, -0.5, -0.6, -0.7, -0.8, -0.9, -1, 448, 512],
^
tests/test_graph/test_constant.py:722:121: E501 line too long (153 > 120 characters)
[4.75, 4.5, 5.25, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, -0.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 448, 512, np.nan],
^
tests/test_graph/test_constant.py:737:121: E501 line too long (160 > 120 characters)
target = [4.0, 4.0, 4.0, 0.0, 0.125, 0.25, 0.25, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 512, 512, np.nan]
^
tests/test_graph/test_constant.py:1029:5: F841 local variable 'captured' is assigned to but never used
captured = capfd.readouterr()
^You can run linters by going to the Python directory and running |
I looked into using pytest.warns, but there is a compatibility issue:
To use pytest.warns as requested, I would need to revert the implementation to use PyErr_WarnEx. To keep OPENVINO_WARN, I must use capfd to verify the log output. Please let me know which implementation you prefer. |
Will fix it |
|
Hey @p-wysocki, I have implemented the |
p-wysocki
left a comment
There was a problem hiding this comment.
LGTM, please take a look at my last nit. Thank you for your contribution!
|
|
||
| #include "openvino/core/shape.hpp" | ||
| #include "openvino/runtime/tensor.hpp" | ||
| #include "openvino/util/log.hpp" |
There was a problem hiding this comment.
Which utility is this required for? PyErr_WarnEx is used, but it's a pybind util. If it's not necessary, please remove it.
There was a problem hiding this comment.
Previously I was using OPENVINO_WARN, then switched to PyErr_WarnEx, but forgot to remove the header, Will remove its no longer required.
|
I have made the required changes and removed log header. |
|
Hey @p-wysocki , |
| return std::make_shared<ov::op::v0::Constant>( | ||
| Common::object_from_data<ov::op::v0::Constant>(array, shared_memory)); |
There was a problem hiding this comment.
The return statement creates a Constant from another Constant object returned by object_from_data. Since object_from_data already returns an ov::op::v0::Constant, wrapping it in another std::make_shared<ov::op::v0::Constant>() is redundant and may cause issues. Return the result directly instead.
| return std::make_shared<ov::op::v0::Constant>( | |
| Common::object_from_data<ov::op::v0::Constant>(array, shared_memory)); | |
| return Common::object_from_data<ov::op::v0::Constant>(array, shared_memory); |
This PR fixes #23611
Related tests have also been added to test_constant.py.