[ML] Copy more settings when creating DF analytics destination index #91546
Merged
edsavage merged 4 commits into elastic:main on Nov 14, 2022
Currently, when a data frame analytics job is created, only two settings from the source index are copied to the auto-created destination index: index.number_of_shards and index.number_of_replicas.

To cater for slightly more complex source indices, this PR also copies or merges additional settings from the source indices to the destination index: index.analysis, index.similarity and index.mapping.

For the index.mapping settings, when multiple source indices are involved, the settings are merged in a similar manner to index.number_of_shards and index.number_of_replicas, i.e. by taking the maximum value of each setting across all source indices.

For index.similarity, when merging multiple indices, the similarity objects must be identical; otherwise an exception is thrown.

index.analysis consists of the sub-objects index.analysis.filter and index.analysis.analyzer, each of which may in turn contain multiple named filter and analyzer objects. The merge procedure here throws an exception if identically named objects differ in content; otherwise all filter and analyzer objects are copied over to the destination index.
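The merge rules described above can be illustrated with a small sketch. This is not the code from this PR (which is Java, inside Elasticsearch); it is a minimal Python model of the three rules. It assumes each source index's settings are given as a plain dict, that the max-merged settings are integer-valued, and that a source which does not define `index.similarity` is simply skipped. The function names (`merge_max`, `merge_similarity`, `merge_analysis`, `merge_destination_settings`) are hypothetical.

```python
from typing import Any, Dict, List

Settings = Dict[str, Any]

# Flat settings merged by taking the maximum value across sources.
# Assumption: every matching setting is integer-valued.
MAX_MERGED = ("index.number_of_shards", "index.number_of_replicas", "index.mapping.")
# Groups of named objects under index.analysis that are combined by name.
ANALYSIS_GROUPS = ("index.analysis.filter", "index.analysis.analyzer")


def merge_max(sources: List[Settings]) -> Settings:
    """Keep the largest value seen for each max-merged setting."""
    merged: Settings = {}
    for settings in sources:
        for key, raw in settings.items():
            if key.startswith(MAX_MERGED):
                value = int(raw)
                merged[key] = max(merged.get(key, value), value)
    return merged


def merge_similarity(sources: List[Settings]) -> Settings:
    """Require all sources that define index.similarity to agree exactly."""
    merged = None
    for settings in sources:
        similarity = settings.get("index.similarity")
        if similarity is None:
            continue  # assumption: sources without the setting are ignored
        if merged is not None and merged != similarity:
            raise ValueError("index.similarity differs between source indices")
        merged = similarity
    return {} if merged is None else {"index.similarity": merged}


def merge_analysis(sources: List[Settings]) -> Settings:
    """Union named filters/analyzers; fail if a shared name has different content."""
    merged: Settings = {}
    for settings in sources:
        for group in ANALYSIS_GROUPS:
            for name, body in settings.get(group, {}).items():
                target = merged.setdefault(group, {})
                if name in target and target[name] != body:
                    raise ValueError(f"{group}.{name} differs between source indices")
                target[name] = body
    return merged


def merge_destination_settings(sources: List[Settings]) -> Settings:
    """Combine the three rules into one settings dict for the destination index."""
    return {**merge_max(sources), **merge_similarity(sources), **merge_analysis(sources)}


if __name__ == "__main__":
    source_a = {
        "index.number_of_shards": 1,
        "index.mapping.total_fields.limit": 1000,
        "index.analysis.filter": {"my_stop": {"type": "stop"}},
    }
    source_b = {
        "index.number_of_shards": 3,
        "index.mapping.total_fields.limit": 2000,
        "index.analysis.filter": {"my_stop": {"type": "stop"}},
    }
    print(merge_destination_settings([source_a, source_b]))
```

In this model, two sources that both define a filter named `my_stop` merge cleanly only when the two definitions are equal; changing one of them to a different filter type would raise `ValueError`, mirroring the exception the description mentions.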
Collaborator
Pinging @elastic/ml-core (Team:ML)
Collaborator
Hi @edsavage, I've created a changelog YAML for you.
…icsearch into transforms_merge_settings
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 15, 2022
* main: (163 commits)
  [DOCS] Edits frequent items aggregation (elastic#91564)
  Handle providers of optional services in ubermodule classloader (elastic#91217)
  Add `exportDockerImages` lifecycle task for exporting docker tarballs (elastic#91571)
  Fix CSV dependency report output file location in DRA CI job
  Fix variable placeholder for Strings.format calls (elastic#91531)
  Fix output dir creation in ConcatFileTask (elastic#91568)
  Fix declaration of dependencies in DRA snapshots CI job (elastic#91569)
  Upgrade Gradle Enterprise plugin to 3.11.4 (elastic#91435)
  Ingest DateProcessor (small) speedup, optimize collections code in DateFormatter.forPattern (elastic#91521)
  Fix inter project handling of generateDependenciesReport (elastic#91555)
  [Synthetics] Add synthetics-* read to fleet-server (elastic#91391)
  [ML] Copy more settings when creating DF analytics destination index (elastic#91546)
  Reduce CartesianCentroidIT flakiness (elastic#91553)
  Propagate last node to reinitialized routing tables (elastic#91549)
  Forecast write load during rollovers (elastic#91425)
  [DOCS] Warn about potential overhead of named queries (elastic#91512)
  Datastream unavailable exception metadata (elastic#91461)
  Generate docker images and dependency report in DRA ci job (elastic#91545)
  Support cartesian_bounds aggregation on point and shape (elastic#91298)
  Add support for EQL samples queries (elastic#91312)
  ...

# Conflicts:
#	x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/downsample/RollupShardIndexer.java
Fixes #89795