Change ES Field Behaviour #787

Closed
1 of 4 tasks
Jotschi opened this issue Jun 6, 2019 · 2 comments

Jotschi commented Jun 6, 2019

Outline

It would be more efficient to exclude content fields from the search document by default. Only fields which have a dedicated ES mapping should be added to the index.

Variations

I - Add exclude setting per field / schema

Add an extra flag per field / schema which can be used to exclude that field (or all documents of that schema) from the search index.

II - Add mesh.yml setting

Add a setting in the mesh.yml which controls the behaviour (e.g. exclude by default). This would remove all default ES mappings.

III - Only add fields which have custom mapping

We could add only those fields to the index which have a custom Elasticsearch mapping defined.

Problems:

  • This would break existing installations
  • Should this mapping override the default mapping? Would this remove the default property and change the document structure (e.g. fields.name.raw -> fields.name)?

IV - Combine variants

  1. Exclude schemas from ES
    Add extra flag to disable ES handling for certain schemas.

  2. Add extra setting to control ES mapping mode

  • search.mappingMode = "dynamic|strict"

dynamic mode (default)

field.stringField.elasticsearch contains:

{
    "raw" : {
      "index" : true,
      "type" : "keyword"
    }
}

The old mapping:

{
  "type" : "text",
  "index" : true,
  "analyzer" : "trigrams",
  "fields" : {
    "raw" : {
      "index" : true,
      "type" : "keyword"
    }
  }
}

strict mode

field.stringField.elasticsearch contains:

{
  "index" : true,
  "type" : "keyword"
}

Turns into

{
  "index" : true,
  "type" : "keyword"
}
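
The two modes above can be sketched as a small function. This is only an illustration of the proposal, not the Mesh implementation: `effective_mapping` and `DEFAULT_STRING_MAPPING` are hypothetical names, and the behaviour for a missing custom mapping in dynamic mode (fall back to the plain default) is an assumption.

```python
from copy import deepcopy
from typing import Optional

# Default part of the string-field mapping, taken from "The old mapping" above.
DEFAULT_STRING_MAPPING = {
    "type": "text",
    "index": True,
    "analyzer": "trigrams",
}


def effective_mapping(custom: Optional[dict], mode: str = "dynamic") -> Optional[dict]:
    """Return the ES mapping for one string field, or None to skip the field.

    `custom` stands for the content of field.stringField.elasticsearch.
    """
    if mode == "strict":
        # Strict: the custom mapping is used as-is; no default is added.
        # A null or empty mapping means the field is not mapped at all.
        return deepcopy(custom) if custom else None
    if mode == "dynamic":
        # Dynamic: keep the default mapping and nest the custom mapping
        # under "fields" (this yields e.g. fields.name.raw).
        mapping = deepcopy(DEFAULT_STRING_MAPPING)
        if custom:
            mapping["fields"] = deepcopy(custom)
        return mapping
    raise ValueError(f"unknown mapping mode: {mode!r}")
```

With the `raw` example from above, dynamic mode reproduces the old mapping, while strict mode returns the keyword mapping unchanged.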

Questions:

  • How should we detect empty mappings? What should happen when the mapping is empty?
  • What happens with fields which have no ES mapping?
    --> Fields with a null or empty mapping should not be indexed at all.
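
One possible way to apply that rule when building the search document is sketched below. The helper names (`has_mapping`, `build_search_fields`) and the flat dict layout are assumptions for illustration; the sketch treats a null mapping and an empty mapping identically.

```python
from typing import Any, Dict, Mapping, Optional


def has_mapping(mapping: Optional[Mapping[str, Any]]) -> bool:
    """A mapping counts as present only if it is neither None nor empty."""
    return bool(mapping)


def build_search_fields(
    values: Mapping[str, Any],
    mappings: Mapping[str, Optional[Mapping[str, Any]]],
) -> Dict[str, Any]:
    """Keep only those field values whose schema field has an ES mapping."""
    return {
        name: value
        for name, value in values.items()
        if has_mapping(mappings.get(name))
    }


values = {"name": "Ford", "body": "a long text", "teaser": "short"}
mappings = {"name": {"type": "keyword", "index": True}, "body": {}, "teaser": None}
print(build_search_fields(values, mappings))
```

Fields that are missing from the mapping table are dropped as well, which matches the "exclude by default" idea from the outline.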

Motivation

Reducing the amount of data in the docs will speed up indexing and reduce both the memory footprint and disk usage. Removing the default trigram mapping will also improve this situation.

Questions

Removing the default mapping might confuse new users who expect the field to be searchable. Maybe we should add a searchable flag to the schema field?

Tasks

  • Remove default mapping for fields (e.g. trigram mapping)
  • Exclude fields from doc and mapping which don't have a dedicated ES setting
  • Find a way to handle field nesting (e.g. .raw field)
  • Update tests
@Jotschi Jotschi transferred this issue from gentics/mesh-incubator Jul 2, 2019
@Jotschi Jotschi self-assigned this Jul 3, 2019

Jotschi commented Jul 3, 2019

Implemented in #788

Jotschi commented Aug 27, 2019

Added III in 0.40.0

@Jotschi Jotschi closed this as completed Aug 27, 2019