-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Description
Author: Kaushal Kumar
Pre-read for this issue: #16797
Please describe the end goal of this project
This document covers the Rule Matching for incoming search requests in depth. After reading this document, reader will have sound understanding of the following
- How the request attributes are extracted ?
- How exactly does the matching algorithm works on the extracted attributes ?
- What are the new constructs and how they work with each other ?
Assumptions
- Rule attributes (cardinality) will be limited in nature at feature level and aggregated count.
- When creating Rules the control will first go to the feature to do the
- Cardinality validation
- Attributes validation
- When creating Rules the control will first go to the feature to do the
- Each feature will define their own attributes which the feature will be responsible for validating.
New Rule insertion process
Rule schema
{
"attribute1": ["value*"],
"attribute2": ["value*"],
"label": "fjagjag9243421_425285",
"updatedAt": "12-03-2024T18:00:23Z",
"feature": "WLM"
}
New rule insertion works pretty much how insertion in compressed trie works. But given each attribute can take N(this will be limited e,g; 5 at max) number there is a chance that two or more rules could share one of the values in their respective attribute values list.
In this case whichever rule was created or modified last will override the label for that attribute in the trie.
Attribute extraction process
Given that the Rule attributes are specific to a feature, feature will provide a construct to extract the attributes. For example WLM feature would use threadContext to retrieve the authN info and request itself to retrieve the indices.
Rule Matching Process
In a nutshell the rule matching will be per attribute trie (prefix tree) based matching. Hence ultimately we will be dealing with a list of values per attribute. How this per attribute based list is calculated, we will understand that with the help of some examples

lets say above picture captures a subpart of the in-memory rules Trie. Now following scenarios can occur while calculating possible labels for an attribute value.
Scenario - 1
When the search for the attribute value ends on a node which has a label
search_value: log_q3_2024
In this case the resulting label would be abc.
Scenario - 2
When the search ends on a node which doesn’t have a label then there are following sub scenarios
- There is an empty prefix match for this value i,e the search ends at the root node which holds the empty value ( e,g;
order_q3_2024)- In this case we assign nothing and downstream systems can treat this as default behavior
- There is a partial match, then it should return top 10 results from the subtree. The order is decided by depth of the node.
- for example for search string
logs_q2_2024, in this case it would return[abc, pqr]
- for example for search string
Now the above run down shows how the labels are decided per attribute value but given the request carries multiple values and rule also contains multiple attributes, how to calculate the final value. Now we will select the one which is occurring in all attribute based return label lists .
Lets say for a feature based label evaluation we are considering two attributes and corresponding labels as per the above algorithm are [abc, xyz] and [abc,pqr] now the resulting label here will be abc for this request.
Even with multiple lists following scenarios can occur
- All lists have nothing in common ==>> In this case the request will be treated with a default label
- All lists have only one value in common ==>> This will be the resultant label for the request
- All lists have multiple values in common ==>> In this case to decide the resulting label. we will consider the depth as the tie breaker i,e; label with the min depth will be selected.
Low Level Design
Following UML and interaction diagrams shows the new constructs, their relationships and how they interact with each other.


InMemoryRuleEngine will save the label in ThreadContext which downstream transport actions can consume.
Supporting References
Issues
Related component
Search
Metadata
Metadata
Assignees
Labels
Type
Projects
Status