Update Fuzzy Query docs to clarify default behavior re max_expansions #30819
jtibshirani merged 2 commits into elastic:6.2 from wpbonelli:patch-1
Stating that the Fuzzy Query generates "all possible" matching terms is misleading, given that the query's default behavior is to generate a maximum of 50 matching terms. Maybe it's worthwhile to use one of those Warning boxes somewhere towards the top of the page to clarify this? I suspect many users will read "generates all possible matching terms" and immediately begin using the query, expecting it to generate all possible terms by default (having done exactly that myself).
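To make the default concrete, here is a minimal sketch of a fuzzy query request body with `max_expansions` spelled out instead of left implicit. The helper name `build_fuzzy_query`, the field `user`, and the value `ki` are hypothetical and only for illustration; the default of 50 is the figure quoted in the description above.

```python
import json


def build_fuzzy_query(field, value, max_expansions=50, fuzziness="AUTO"):
    """Build a fuzzy query body that states max_expansions explicitly.

    The default of 50 mirrors the default described in this PR; pass a
    larger value if more matching terms should be considered.
    """
    return {
        "query": {
            "fuzzy": {
                field: {
                    "value": value,
                    "fuzziness": fuzziness,
                    "max_expansions": max_expansions,
                }
            }
        }
    }


# Hypothetical field and value, used only to show the shape of the body.
body = build_fuzzy_query("user", "ki")
print(json.dumps(body, indent=2))
```

Writing the parameter out, even at its default, makes it visible to readers of the query that the expansion is capped rather than exhaustive.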
Pinging @elastic/es-search-aggs
Thank you @w-bonelli for the PR, and I’m sorry this review is coming so late.
I agree that this wording is confusing and I think your edit helps. Instead of a warning box, what do you think about also updating the end of the paragraph to mention max_expansions? We could add something like ‘The final query is executed using up to max_expansions matching terms’.
Lastly, would you be able to sign the contributor license agreement? The build check is currently failing.
Thanks @jtibshirani for your reply, I've signed the contributor agreement and updated the PR per your suggestion. Incidentally, I think a single line like this could also work: The Either seems sufficient clarification though.
It looks good to me -- I'll merge this and cherry-pick it to the right branches. I prefer the current wording since it's more precise: we don't first generate