UPDATE: This tutorial is based on Solr 5. If you want to use Solr 8, we strongly recommend to use our recent blog entry to set up Solrcloud 8 on Amazon EC2
NOTE: There is a French version to this tutorial, which you’ll find on the second half of this blog entry.
In this tutorial, we’ll be setting up a Solrcloud cluster on Amazon EC2.
We’ll be using Solr 5.1, the embedded Jetty, Zookeeper 3.4.6 on Debian 7 instances.
This tutorial explains step by step how to reach this objective.
We’ll be installing a set of 3 machines, with 3 shares and 2 replicas per shard, which gives us a total of 9 shards.
We’ll also be installing a Zookeeper ensemble of 3 machines.
This architecture will be flexible enough to allow for a fail-over of one or two machines, depending on whether we’re at the indexing phase or at the querying phase:
- Indexing: a machine can fail without impacting the cluster (the zookeeper ensemble of 3 machines allows for one machine down). The updates are successfully broadcasted to the machines still running.
- Querying: two machines can fail without impacting the cluster. Since each machine hosts 3 shards, a search query can be processed without problems, the only constraints being a slower response time due to the higher load on the remaining machine.