Skip to content

Workload updates to test and support OpenSearch Approximation Framework#655

Closed
prudhvigodithi wants to merge 7 commits intoopensearch-project:mainfrom
prudhvigodithi:main
Closed

Workload updates to test and support OpenSearch Approximation Framework#655
prudhvigodithi wants to merge 7 commits intoopensearch-project:mainfrom
prudhvigodithi:main

Conversation

@prudhvigodithi
Copy link
Copy Markdown
Member

Description

  • Workload Updates to Test Approximation Queries.
  • Updated Workloads with New Queries for Approximation Testing.
  • New Iterations and Queries Added to Test Approximation Framework.

Issues Resolved

Testing

  • New functionality includes testing

[Describe how this change was tested]

Backport to Branches:

  • 6
  • 7
  • 1
  • 2
  • 3

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@prudhvigodithi prudhvigodithi changed the title Workload Updates to Test Approximation Queries Workload Updates to Test and Support OpenSearch Approximation Framework Jun 4, 2025
@prudhvigodithi
Copy link
Copy Markdown
Member Author

Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
@prudhvigodithi
Copy link
Copy Markdown
Member Author

Not able to reproduce the lint failures on my local, @IanHoang @gkamat anything I'm missing?

 for file in http_logs/operations/default.json http_logs/test_procedures/default.json nyc_taxis/operations/default.json nyc_taxis/test_procedures/default.json pmc/operations/default.json pmc/test_procedures/default.json; do
    if [[ $file == *.json ]]; then
      echo "Checking $file"
      j2lint "$GITHUB_WORKSPACE/$file" --extensions json --json
    fi
  done
Checking http_logs/operations/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}
Checking http_logs/test_procedures/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}
Checking nyc_taxis/operations/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}
Checking nyc_taxis/test_procedures/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}
Checking pmc/operations/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}
Checking pmc/test_procedures/default.json
{
  "ERRORS": [],
  "WARNINGS": []
}

@IanHoang
Copy link
Copy Markdown
Collaborator

IanHoang commented Jun 4, 2025

@prudhvigodithi Have you tried running them individually? The errors show up for me when I run it

j2lint ./http_logs/operations/default.json --extensions json --json
{
  "ERRORS": [
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 4,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 5,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 11,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 12,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 19,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 20,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 27,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 28,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 35,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 36,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 43,
      "line": "      \"bulk-size\": {{bulk_size | default(5000)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 44,
      "line": "      \"ingest-percentage\": {{ingest_percentage | default(100)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 45,
      "line": "      \"conflicts\": \"{{conflicts | default('random')}}\",",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 46,
      "line": "      \"on-conflict\": \"{{on_conflict | default('update')}}\",",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 47,
      "line": "      \"conflict-probability\": {{conflict_probability | default(25)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 48,
      "line": "      \"recency\": {{recency | default(0)}},",
      "severity": "LOW"
    },
    {
      "id": "S1",
      "message": "A single space should be added between Jinja2 curly brackets and a variable name: {{ ethernet_interface }}",
      "filename": "default.json",
      "line_number": 92,
      "line": "              \"gte\": \"now-{{'15-05-1998' | days_ago(now)}}d/d\",",
      "severity": "LOW"
    }
  ],
  "WARNINGS": []
}

I'm pretty sure all the workloads do not adhere to lint standards but we should start following the ones where severity == HIGH.

I'll update the workflow to ignore low severity lint issues

@prudhvigodithi
Copy link
Copy Markdown
Member Author

Thanks @IanHoang now I was able to see, but looks like the errors are not related to my changed? can you confirm ?

@OVI3D0
Copy link
Copy Markdown
Member

OVI3D0 commented Jun 4, 2025

I'm pretty sure all the workloads do not adhere to lint standards but we should start following the ones where severity == HIGH.

I'll update the workflow to ignore low severity lint issues

Can the low priority issues be turned into warnings?

@prudhvigodithi prudhvigodithi force-pushed the main branch 2 times, most recently from ee0d8a5 to 5a56cbb Compare June 5, 2025 02:29
Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
@prudhvigodithi
Copy link
Copy Markdown
Member Author

prudhvigodithi commented Jun 5, 2025

Yes I can confirm the lint test failures are not related to the PR changes. For most of the existing workloads the formatting is not properly done.

I have fixed with this PR, please take a look.

Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
@prudhvigodithi prudhvigodithi self-assigned this Jun 5, 2025
@prudhvigodithi prudhvigodithi changed the title Workload Updates to Test and Support OpenSearch Approximation Framework Workload Updates to test and support OpenSearch Approximation Framework Jun 5, 2025
@prudhvigodithi prudhvigodithi changed the title Workload Updates to test and support OpenSearch Approximation Framework Workload updates to test and support OpenSearch Approximation Framework Jun 5, 2025
@getsaurabh02 getsaurabh02 moved this from Todo to In Progress in Performance Roadmap Jun 5, 2025
Copy link
Copy Markdown
Collaborator

@IanHoang IanHoang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prudhvigodithi Seeing the changes to default values in pre-existing operations, it might be better to split up the actions in this PR into two separate PRs (will help with roll-backs if needed in the future).

First PR focuses on adding new operations to workloads while second PR focuses on proposing modifications to existing default values. See comment regarding changing pre-existing default values in range operation in http_logs.

"warmup-iterations": {{ range_warmup_iterations or warmup_iterations | default(100) | tojson }},
"iterations": {{ range_iterations or iterations | default(100) | tojson }},
"target-throughput": {{ range_target_throughput or target_throughput | default(1) | tojson }},
"target-throughput": {{ range_target_throughput or target_throughput | default(2) | tojson }},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prudhvigodithi was this arbitrarily changed? Seems like it was done to match the range operations added below.

@OVI3D0 brought up a good point offline: if this isn't backported to previous PRs, there will be a discrepancy when comparing range operations across OpenSearch versions. We should also be careful about making changes to default values in pre-existing operations for legacy workloads.

"warmup-iterations": {{ desc_sort_size_warmup_iterations or warmup_iterations | default(200) | tojson }},
"iterations": {{ desc_sort_size_iterations or iterations | default(100) | tojson }},
"target-throughput": {{ desc_sort_size_target_throughput or target_throughput | default(0.5) | tojson }},
"target-throughput": {{ desc_sort_size_target_throughput or target_throughput | default(2) | tojson }},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above for range

"warmup-iterations": {{ asc_sort_tip_amount_warmup_iterations or warmup_iterations | default(50) | tojson }},
"iterations": {{ asc_sort_tip_amount_iterations or iterations | default(100) | tojson }},
"target-throughput": {{ asc_sort_tip_amount_target_throughput or target_throughput | default(0.5) | tojson }},
"target-throughput": {{ asc_sort_tip_amount_target_throughput or target_throughput | default(2) | tojson }},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above for range in http_logs

@prudhvigodithi
Copy link
Copy Markdown
Member Author

I assume the lint errors are fixed with this PR #662. Let me close this and open 2 PR's

  • 1st PR for adding new operations to workloads.
  • 2nd PR for proposing modifications to existing default values.

@prudhvigodithi
Copy link
Copy Markdown
Member Author

Here is the 1st PR #674 for adding new operations to workloads.
@IanHoang @OVI3D0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done
Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants