[BUG] synonym_graph filter fails with word_delimiter_graph when using whitespace or classic tokenizer in synonym_analyzer – similar to #16263 #18037

@nupurjaiswal

Description

Describe the bug

I'm encountering a bug similar to #16263 when configuring analyzers that use both word_delimiter_graph and synonym_graph. I'm currently migrating from Solr to OpenSearch 2.19 and hit a limitation with the synonym_graph filter when it references a custom synonym_analyzer (whitespace tokenizer).

When I define a simple synonym analyzer using the whitespace tokenizer (i.e., no_split_synonym_analyzer) and point the synonym_graph filter at it, everything works as expected.

However, the moment I add any additional filters such as word_delimiter_graph, asciifolding, or hunspell, I encounter the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Token filter [custom_word_delimiter] cannot be used to parse synonyms"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Token filter [custom_word_delimiter] cannot be used to parse synonyms"
  },
  "status": 400
}

Use Case
In our Solr configuration, we handle synonym normalization for terms like:

covid, covid-19, covid 19

skydiving, sky diving, sky-diving

handheld, hand-held

This works seamlessly in Solr, even when filters like WordDelimiterGraphFilterFactory or Hunspell are part of the chain.

We want to achieve similar behavior in OpenSearch, using a synonym_graph filter together with a custom analyzer that includes:

word_delimiter_graph (with preserve_original or catenate_all)

asciifolding (with preserve_original)

hunspell

and a pattern_replace filter

Sample Config (Works):

"analyzer": {
  "test_analyzer": {
    "type": "custom",
    "tokenizer": "whitespace",
    "filter": [
      "lowercase",
      "custom_synonym_graph-replacement_filter"
    ]
  },
  "no_split_synonym_analyzer": {
    "type": "custom",
    "tokenizer": "whitespace"
  }
}
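
The working behavior can be checked with the _analyze API. This is an illustrative request only (the index name test-index and the sample text are assumptions, not from my actual setup); with the working config above it returns the expanded synonym graph without error:

```json
GET /test-index/_analyze
{
  "analyzer": "test_analyzer",
  "text": "covid-19"
}
```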

Sample Config (Fails):
When adding custom_word_delimiter, asciifolding, or hunspell to the same analyzer:

"test_analyzer": {
  "type": "custom",
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    "custom_word_delimiter",
    "custom_hunspell_stemmer",
    "custom_synonym_graph-replacement_filter"
  ]
}

Results in:

Token filter [custom_word_delimiter] cannot be used to parse synonyms

It would be great if OpenSearch could enhance the synonym_graph behavior to allow more flexible use of filters alongside a custom synonym_analyzer, especially word_delimiter_graph, which is commonly used in language-normalization pipelines.

A similar issue was resolved in #16263; perhaps this one can be handled in a similar fashion.
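
For anyone hitting the same error in the meantime, one possible workaround (a sketch only, untested on 2.19, and it changes how split tokens and synonyms interact, so results may not match the Solr behavior) is to move synonym_graph ahead of word_delimiter_graph in the chain, so that no synonym-incompatible filter precedes the synonym filter:

```json
"test_analyzer": {
  "type": "custom",
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    "custom_synonym_graph-replacement_filter",
    "custom_word_delimiter",
    "custom_hunspell_stemmer"
  ]
}
```

The error is raised while building the synonym filter's internal analysis chain from the filters that come before it, so reordering avoids the check, at the cost of synonyms being applied before word splitting.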

Related component

Indexing

To Reproduce

Create Mapping

{
  "settings": {
    "analysis": {
      "char_filter": {
        "custom_pattern_replace": {
          "type": "pattern_replace",
          "pattern": "[({.,\\[\\]“”/})]",
          "replacement": " "
        }
      },
      "filter": {
        "custom_ascii_folding": {
          "type": "asciifolding",
          "preserve_original": true
        },
        "custom_pattern_replace_filter": {
          "type": "pattern_replace",
          "pattern": "(-)",
          "replacement": " ",
          "all": true
        },
        "custom_synonym_graph-replacement_filter": {
          "type": "synonym_graph",
          "synonyms": [
            "laptop, notebook",
            "covid, covid-19, covid 19",
            "skydiving, sky diving, sky-diving",
            "handheld, hand-held"
          ],
          "synonym_analyzer": "no_split_synonym_analyzer"
        },
        "custom_word_delimiter": {
          "type": "word_delimiter_graph",
          "generate_word_parts": true,
          "catenate_all": true,
          "split_on_numerics": false,
          "split_on_case_change": false
        },
        "custom_hunspell_stemmer": {
          "type": "hunspell",
          "locale": "en_US"
        }
      },
      "analyzer": {
        "test_analyzer": {
          "type": "custom",
          "char_filter": [
            "custom_pattern_replace"
          ],
          "tokenizer": "whitespace",
          "filter": [
            "custom_ascii_folding",
            "lowercase",
            "custom_word_delimiter",
            "custom_hunspell_stemmer",
            "custom_synonym_graph-replacement_filter",
            "custom_pattern_replace_filter",
            "flatten_graph"
          ]
        },
        "no_split_synonym_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace"
        }
      }
    }
  }
}
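
A minimal reproduction, stripped down to just the two interacting filters (asciifolding, hunspell, and the char_filter are not needed to trigger the error; the index name is arbitrary):

```json
PUT /synonym-bug-test
{
  "settings": {
    "analysis": {
      "filter": {
        "custom_word_delimiter": {
          "type": "word_delimiter_graph",
          "catenate_all": true
        },
        "custom_synonym_graph-replacement_filter": {
          "type": "synonym_graph",
          "synonyms": ["covid, covid-19, covid 19"],
          "synonym_analyzer": "no_split_synonym_analyzer"
        }
      },
      "analyzer": {
        "test_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "custom_word_delimiter",
            "custom_synonym_graph-replacement_filter"
          ]
        },
        "no_split_synonym_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace"
        }
      }
    }
  }
}
```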

Error:

{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Token filter [custom_word_delimiter] cannot be used to parse synonyms"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "Token filter [custom_word_delimiter] cannot be used to parse synonyms"
    },
    "status": 400
}

Expected behavior

  • The synonym_graph filter should accept a custom synonym_analyzer that uses the whitespace or classic tokenizer, even when the main analyzer chain contains filters such as word_delimiter_graph, asciifolding, or hunspell.
  • It should not throw an error when a custom synonym_analyzer is provided.
  • Currently, this works only when the synonym_analyzer uses the standard tokenizer alongside other filters.
  • It should also work with the whitespace or classic tokenizer, allowing more flexibility.
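
For comparison, per the behavior described above, changing only the synonym_analyzer to use the standard tokenizer avoids the error (everything else in the settings unchanged), although the standard tokenizer then splits terms like "covid-19" before synonym matching:

```json
"no_split_synonym_analyzer": {
  "type": "custom",
  "tokenizer": "standard"
}
```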

Additional Details

Host/Environment (please complete the following information):

  • OpenSearch version: 2.19


    Labels

    Indexing, bug
