Elasticsearch

Intro about internals ("С чего начинается Elasticsearch", in English: "Where Elasticsearch begins")

Elasticsearch is a document-oriented store that uses Lucene as its indexing engine.

Each shard is a Lucene index. To control the number of shards per index:

PUT _template/all
{
  "template": "*",
  "settings": {
    "number_of_shards": 1
  }
}
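
The same request can be prepared from a script. The following is only a minimal sketch using the Python standard library: the address http://localhost:9200 and the helper name build_template_request are assumptions for illustration, and the request is built but not sent. Note that the "template" field is the older form of this API; newer Elasticsearch versions expect "index_patterns" instead, so check the version in use.

```python
import json
import urllib.request

# Assumption: a local, unsecured node. Adjust for a real cluster.
ES_URL = "http://localhost:9200"


def build_template_request(number_of_shards: int) -> urllib.request.Request:
    """Build (but do not send) the PUT _template/all request from the notes."""
    body = {
        "template": "*",  # match every index, as in the snippet above
        "settings": {"number_of_shards": number_of_shards},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/_template/all",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_template_request(1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a running node: urllib.request.urlopen(req)
```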

Node types

  • data node

    • hot (SSD is better)

    • warm (HDD is enough)

    • cold (HDD is enough)

  • coordinating node

  • master node

    • only one master can be active at a time

    • the master manages the cluster topology

      • creates new indices

      • allocates shards

      • moves shards and joins them if necessary

    • knows the full cluster state

    • a node is master-eligible when node.master: true

Each Elasticsearch instance is a node. For nodes to join the same cluster:

  • all nodes must run the same version

  • cluster.name must be identical on every node
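
The two conditions above can be expressed as a small check. This is an illustrative sketch only: NodeConfig and can_form_cluster are hypothetical names, the version strings are made up, and the real join logic in Elasticsearch involves more than these two fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical, simplified view of one node's settings."""
    cluster_name: str  # value of cluster.name
    version: str       # Elasticsearch version string


def can_form_cluster(nodes: list[NodeConfig]) -> bool:
    """Return True if all nodes share one cluster.name and one version."""
    if not nodes:
        return False
    names = {n.cluster_name for n in nodes}
    versions = {n.version for n in nodes}
    return len(names) == 1 and len(versions) == 1


a = NodeConfig(cluster_name="logs", version="7.10.2")
b = NodeConfig(cluster_name="logs", version="7.10.2")
c = NodeConfig(cluster_name="metrics", version="7.10.2")

print(can_form_cluster([a, b]))  # True
print(can_form_cluster([a, c]))  # False: cluster.name differs
```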

How to control the number of replicas

PUT /_settings
{
  "index": {
    "number_of_replicas": someVal
  }
}
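
To see what a given replica setting costs, it helps to count shard copies: every primary shard gets number_of_replicas additional copies. A minimal arithmetic sketch follows (the function name total_shard_copies is an assumption for illustration):

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Primaries plus all their replicas for one index."""
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return number_of_shards * (1 + number_of_replicas)


print(total_shard_copies(1, 2))  # 3: one primary and two replicas
print(total_shard_copies(5, 1))  # 10: five primaries, one replica each
```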

Deleting data from a node

A deletion is first applied only on the primary shard. After the primary shard has flushed and committed the change, an internal request is sent to apply it to the replicas.

Cluster health status

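  • green - all primary and replica shards are assigned

  • yellow - some replica shards are lost (unassigned). All primaries are available, so the cluster is fully operational, but with reduced redundancy

  • red - at least one primary shard is lost (unassigned). The cluster is broken or part of the data is not available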
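
The status rules can be sketched as a function over shard copies. This is a simplified model, assuming the usual definitions (red when any primary is unassigned, yellow when only replicas are unassigned); ShardCopy and cluster_health are hypothetical names, not part of any Elasticsearch client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCopy:
    """Hypothetical, simplified description of one shard copy."""
    primary: bool   # True for a primary, False for a replica
    assigned: bool  # True if the copy is allocated to a live node


def cluster_health(shards: list[ShardCopy]) -> str:
    """Derive green / yellow / red from the state of all shard copies."""
    if any(s.primary and not s.assigned for s in shards):
        return "red"     # some data cannot be served at all
    if any(not s.assigned for s in shards):
        return "yellow"  # all data is served, but a replica is missing
    return "green"


shards = [
    ShardCopy(primary=True, assigned=True),
    ShardCopy(primary=False, assigned=False),  # replica without a node
]
print(cluster_health(shards))  # yellow
```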

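num(data nodes) >= num(replicas) + 1, because a replica is never allocated on the same node as its primary.

The term "replica" applies to shards: a replica is a copy of a primary shard.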

Fault tolerance

Split-brain problem. To prevent it, a master may be elected only by a quorum of candidates:

NUMBER_OF_CANDIDATES = TOTAL_NUMBER_OF_NODES / 2 + 1
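
A worked example of the quorum formula, as a small sketch. It assumes the division is integer division, and the function names quorum and has_quorum are made up for illustration. The point is that when the cluster splits into two parts, at most one part can reach the quorum, so two masters cannot be elected at once.

```python
def quorum(total_nodes: int) -> int:
    """Minimum number of candidates needed to elect a master."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    return total_nodes // 2 + 1  # integer division is assumed


def has_quorum(partition_size: int, total_nodes: int) -> bool:
    """True if a network partition of this size may elect a master."""
    return partition_size >= quorum(total_nodes)


for n in (1, 2, 3, 4, 5):
    print(n, "nodes -> quorum", quorum(n))

# A 3-node cluster split into 2 + 1: only the larger side can elect a master.
print(has_quorum(2, 3), has_quorum(1, 3))  # True False

# A 4-node cluster split into 2 + 2: neither side can, so no split brain.
print(has_quorum(2, 4))  # False
```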
