Hello Marabou Team,
This ticket reports (and proposes fixes for) some bugs that arose during an attempt to integrate a generative flow model into Marabou.
Since the integration of the generative flow model did not succeed, we also hope to receive some advice on how to integrate the neural network(s). In particular, most valuable to us would be how to fix the problem in Bug 2.
Our goal is to verify a property on a combination of two neural networks, one flow model and one classifier. While the
exact verification property is irrelevant, it is important to note that the overall goal is to wire the outputs of the
flow network into the inputs of the classifier network.
To this end, we have tried to integrate two different architectures of flow models into Marabou, one that contains convolutions and one that does not have any convolutions.
We could not succeed with either of the flow models. The problems, along with potential fixes for some of them, are described below.
I am aware that some of the bug descriptions are not very simple; it was unfortunately not possible to simplify them further.
However, I can provide better and more detailed descriptions wherever helpful.
All neural networks and scripts referenced in this ticket can be found in the attachment.
Bugreport.zip
Setup:
I use Python 3.10 on Ubuntu 22.04.
Everything in this ticket refers to the latest available PIP package of Marabou (https://pypi.org/project/maraboupy/).
With that said, I also built Marabou from source and tried out the Gurobi backend.
However, all bugs reported here also occurred with the manual installation of Marabou.
I report the bugs based on the PIP installation of Marabou to avoid concerns regarding an incorrect installation.
I also already applied the patch from the ticket on mulEquations;
that bug seems to have been fixed on the main branch, but not yet in the PIP package of Marabou.
Without Convolutions
This section focuses on the flow models that have no convolutions in them.
The first bug (1) reports the simple problem that the Marabou parser expects the constant input to be at the second position.
The second bug (2) reports that wiring the output of the flow model into the input of the classifier results in
assignments to the variables that do not satisfy the constraints.
1. Parsing SimpleFlow - Constant input expected to be second parameter
Marabou assumes that in an ONNX mul node, the variable input is the first parameter and the constant input is always the second parameter.
However, this is not always the case; an example is initial_simple_flow.onnx.
Steps to reproduce the error:
- Run the script Bug1.py; make sure the network referenced in the script is in the respective directory.
- Expected behavior: the network is parsed without error.
- Actual behavior: the following error message appears:
  line 1057, in mulEquations multiple = self.constantMap[inputName2] KeyError: '/conditioner/layers.2/Add_output_0'
As a workaround, the inputs of all mul nodes in initial_simple_flow.onnx are swapped wherever the order is wrong, and the result is saved as a new network named simple_flow_swp_inputs.onnx.
2. Verification of Network with multiple inputs (possibly incorrect SAT witness)
We use the script stitch_side_by_side.py to stitch the neural networks simple_flow_swp_inputs.onnx and
the classifier classifier.onnx into one new onnx file named merged_sideways_swp_inputs.onnx. Consequently, merged_sideways_swp_inputs.onnx has two inputs and two outputs.
Please view the merged_sideways_swp_inputs.onnx network, for example on netron for a better understanding.
This subsection focuses only on the merged_sideways_swp_inputs.onnx neural network.
On a high level, we observe the following bug:
Pushing a tensor through the flow model in merged_sideways_swp_inputs.onnx gives a
different output depending on whether the output of the flow model is wired into the classifier or not.
To show this, the script bug2.py first runs a tensor through the inputs of the neural network without rewiring anything and saves the outputs.
Then, after wiring the output of the flow into the input of the classifier, it runs the same tensor through the flow
model and observes different assignments for the outputs of the flow model.
In both cases, the outputs of the flow model are unambiguously determined by its inputs, so they should not differ, yet they do.
To make sure that the bug is not on our side, we saved the input queries before and after rewiring and confirmed that the
additional equations correctly capture the rewiring as expected.
To be more precise, the script bug2.py implements the following procedure:
1. Parse the network merged_sideways_swp_inputs.onnx in Marabou.
2. Set the input values in the parsed network all to 0 (both for flowonnx::MatMul_0 and clfonnx::MatMul_0).
3. Solve the network.
4. Save the value of the output node flow80, which is unambiguously determined by the all-zero input to flowonnx::MatMul_0. This output will serve as the reference.
5. Parse the network merged_sideways_swp_inputs.onnx in Marabou again, creating a new instance.
6. Set the input variables of the input node flowonnx::MatMul_0 to zero. Set very high and very low bounds (+-100000) on the input of the classifier clfonnx::MatMul_0.
7. Set the input variables of the input node clfonnx::MatMul_0 equal to the output variables of the flow model flow80.
8. Solve the network.
9. Save the values of the output node flow80.
10. Compare the values of the output node flow80 from step 4 with the values of the output node flow80 from step 9.
Expected behavior: The output values should be the same, because in both cases the input values to the flow model are the same.
Actual behavior: The output values differ. In particular, they differ only in the first 50 values and are equal in the last 50 values of the vector.
2.1 Potential workaround failed + Bug report for flatten/concat
Speculating that Marabou might not be able to handle multiple inputs, we had also created a network named flatten_concat_merged_network.onnx
that is semantically equivalent to merged_sideways_swp_inputs.onnx, with the only difference that the inputs and outputs
are merged together, making use of additional nodes to mask and concatenate the relevant parts of the input/output.
For a deeper understanding, view both networks with netron.
This, however, did not solve the problem from above, resulting in the same behaviour as before.
The script bug2.1.py shows the exact same error as bug2.py.
On a side note: the network flatten_concat_merged_network.onnx cannot be parsed at all. This error can be reproduced using the bug1.py script.
The bug seems to occur because Marabou handles flatten nodes differently than ONNX, as documented in the flatten function of the ONNX parser in Marabou.
As a quick-and-dirty fix/workaround, we hardcoded the axis in the method concatEquations(..) to axis=1, despite the network's actual concat axis being 0.
With Convolutions
In this section, only the two neural networks conv_with_error.onnx and conv_without_error.onnx are relevant, all other neural networks can be disregarded.
Both networks are very similar to each other. The only difference is that conv_with_error.onnx contains one more
convolutional node than conv_without_error.onnx, which causes the error in Bug 4 to occur.
Unfortunately, this bug report could not be simplified by extracting only the trouble-making convolutional layer into a new network.
The final convolutional layer by itself does not trigger the bug; the error occurs only in the context of the full network.
3. Bug/feature support: Broadcasting in mul and sub nodes
Parsing both models conv_with_error.onnx and conv_without_error.onnx fails.
For example, when parsing conv_without_error.onnx Marabou will crash with an IndexError in the mulEquations() method.
This is because of mul and sub nodes in the neural network that rely on broadcasting.
We changed the methods mulEquations() and subEquations() to also support broadcasting and attached our implementation
of these methods in the script named mul-sub-equations-suggestion.py.
We tested the changes on multiple neural networks successfully and continued to work with these changes in the following.
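For illustration only, the shape handling in such a fix can follow NumPy's broadcast_to, which implements the same multidirectional broadcasting rules as ONNX Mul/Sub (the shapes below are hypothetical, not taken from the attached networks):

```python
import numpy as np

# Sketch of the broadcasting support suggested for mulEquations()/
# subEquations(): broadcast the constant to the variable tensor's shape
# before emitting one equation per element, instead of indexing the
# constant with the variable's flat index (which raises an IndexError).
var_shape = (1, 3, 4, 4)                       # e.g. an NCHW activation
constant = np.arange(3.0).reshape(1, 3, 1, 1)  # per-channel constant

# np.broadcast_to applies ONNX/NumPy multidirectional broadcasting.
multipliers = np.broadcast_to(constant, var_shape).flatten()
print(multipliers.shape)  # one coefficient per output variable: (48,)
```

With the constant expanded this way, the existing per-element equation generation can proceed unchanged, pairing multipliers[i] with the i-th variable.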
4. Potentially Incorrect SAT witness
The script bug4.py compares for each of the neural networks conv_with_error.onnx and conv_without_error.onnx the
output of Marabou with the respective output of the onnx library for the same input.
The script shows that Marabou produces the same outputs as the ONNX runtime for conv_without_error.onnx, but its outputs
differ from the ONNX runtime outputs for the other network, conv_with_error.onnx.
Comparing the input queries of both networks (with kdiff3) and manually re-calculating the first equation of the additional
convolutional node in the erroneous network conv_with_error.onnx shows that the assignments computed by Marabou for
conv_with_error.onnx do not satisfy the constraints in the query.