is now

What is Elasticsearch?

Elasticsearch is highly scalable, broadly distributed open-source full text search and analytics engine. You can in very near real-time search, store and index big volume of data. It internally use Apache Lucene for indexing and storing data. Below are few use cases for it.

  • Product search for e-commerce website
  • Collecting application logs and transaction data for analyzing it for trends and anomalies.
  • Indexing instance metrics(health, stats) and doing analytics, creating alerts for instance health on regular interval.
  • For analytics/ business-intelligence applications

Elasticsearch basic concepts

We will be using few terminologies while talking about Elasticsearch. Let's see basic building blocks of Elasticsearch.

Near real-time

Elasticsearch is near real-time. What it means is that the time (latency) between the indexing of document and its availability for searching.


It is a collection of one or multiple nodes (servers) that together holds the entire data and provide you the ability to indexing and searching the cluster for data.


It is a single server that is part of your cluster. It can store data, participate in indexing and searching and overall cluster management. Node could have four different flavours i.e. master, htttp, data, coordinating/client nodes.


An index is collection of similar kind/characteristics of documents. It is identified by name(all lowercase) and is refer to by name to perform indexing, search, update and delete operations against documents.


It is a single unit of information that can be indexed.

Shards and Replicas

Single index can store billions of documents which can lead to storage taking up TB's of space. Single server could exceed its limitation to store such a massive information or performing search operation on that data. To solve this problem, Elasticsearch sub-divide your index into multiple units called shards.

Replication is important primarily to have high availability in case of node/shard failure and to allow to scale out your search throughput. By default Elasticsearch have 5 shards and 1 replicas which could be configured at the time of creating index.

Installing Elasticsearch

Elasticsearch requiresJava to run. As of writing this article Elasticsearch 6.2.X+ requires at least Java 8.

Installing Java 8
// Installing Open JDK
sudo apt-get install openjdk-8-jdk
// Installing Oracle JDK
sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
sudo apt-get -y install oracle-java8-installer
Installing Elasticsearch with tar file

curl -L -O

tar -xvf elasticsearch-6.2.4.tar.gz
Installing Elasticsearch with package manager
// import the Elasticsearch public GPG key into apt:
wget -qO - | sudo apt-key add -

//Create the Elasticsearch source list
echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-6.x.list
sudo apt-get update
sudo apt-get -y install elasticsearch
Configuring Elasticsearch cluster

Configuration file location if you have downloaded the tar file

vi /[YOUR_TAR_LOCATION]/config/elasticsearch.yml

Configuration file location if you used package manager to install Elasticsearch

vi /etc/elasticsearch/elasticsearch.yml
Cluster Name

Use some descriptive name for cluster. Elasticsearch node will use this name to form and join cluster. lineofcode-prod
Node name

To uniquely identify node in the cluster ${HOSTNAME}
Custom attributes to node

Adding a rack to node to logically group the nodes placed on same data center/ physical machine

node.attr.rack: us-east-1
Network host

Node will bind to this hostname or IP address and advertise this host to other nodes in the cluster. [_VPN_HOST_, _local_]
Elasticsearch does not come with authentication and authorization. So, it is suggested to never bind network host property to public IP address.
Cluster finding settings

To find and join a cluster, you need to know at least few other hostname or IP addresses. This could easily be set by proeprty.

Changing the http port

You can configure the port number on which Elasticsearch is accessible over HTTP with http.port property.

Configuring JVM options (Optional for local/test)

You need to tweak JVM options as per your hardware configuration. It is advisable to allocate half the memory of total server available memory to Elasticsearch and rest will be taken up by Lucene and Elasticsearch threads.

// For example if your server have eight GB of RAM then set following property as

Also, to avoid performance hit let elasticsearch block the memory with bootstrap.memory_lock: true property.

Elasticsearch uses concurrent mark and sweep GC and you can change it to G1GC with following configurations.

Starting Elasticsearch
sudo service elasticsearch restart

TADA! Elasticsearch is up and running on your local.

To have a production grade setup, I would recommend to visit following articles.

Digitalocean guide to setup production elasticsearch

Elasticsearch - Fred Thoughts

We have learnt about What is Apache Ignite?, Setting up Apache Ignite and few quick examples in last few posts. In this post, we will deep dive into Apache Ignite core Ignite classes and discuss about following internals.

  • Core classes
  • Lifecycle events
  • Client and Server mode
  • Thread pools configurations
  • Asynchronous support in Ignite
  • Resource injection

Core classes

Whenever you will be interacting with Apache Ignite in application, you will always encounter Ignite interface and Ignition class. Ignition is the main entry point to create a Ignite node. This class provides various methods to start a grid node in the network topology.

// Starting with default configuration
Ignite igniteWithDefaultConfig = Ignition.start();

// Ignite with Spring configuration xml file
Ignite igniteWithSpringCfgXMLFile = Ignition.start("/path_to_spring_configuration_xml.xml");

// ignite with java based configuration
IgniteConfiguration icfg = ...;
Ignite igniteWithJavaConfiguration = Ignition.start(icfg);

There are also other useful methods in Ignition class which we will discuss below. Ignite interface provide control over node. It has various methods to interact as data-grid, service-grid, compute-grid, schedular and many more.

Lifecycle events

Apache Ignite provides four LifecyleEvents i.e. BEFORE_NODE_START, AFTER_NODE_START, BEFORE_NODE_STOP and AFTER_NODE_STOP. It provide hook to tap these events. You need to implement LifecycleBean and set the implementation in the ignite configuration.

class IgniteLifecycleEventListener implements LifecycleBean {

    public void onLifecycleEvent(LifecycleEventType evt) throws IgniteException {
        String message;
        switch (evt) {
            case BEFORE_NODE_START:
                message = "before_node_start event is called!";
            case AFTER_NODE_START:
                message = "after_node_start event is called!";
            case BEFORE_NODE_STOP:
                message = "before_node_stop event is called!";
            case AFTER_NODE_STOP:
                message = "after_node_stop event is called!";
                message = "Unknown event";

Client and Server mode

Apache Ignite node can be run in client or server mode. Server nodes participates in Computing, Caching, data grid, service grid etc. and client nodes are way to interact with server nodes to have near time caching, transaction, computing, service grid functionality. You need to explicitly define the client and server mode.



Thread pool configurations

System thread pool

It processes all cache related operations except SQL and some other queries and also handles computing cancellation tasks.

//By default it has size equals to max(8, total_no_of_cores)

Public thread pool

All computations are received by processed in this thread pool.

//By default it has size equals to max(8, total_no_of_cores)

Queries pool

Handles the SQL queries and SCAN operation executed across the cluster.

//By default it has size equals to max(8, total_no_of_cores)

Services Pool

Handles service-grid calls.

//By default it has size equals to max(8, total_no_of_cores)

Striped Pool

Accelerate basic caching operations and transactions by spreading execution on multiples stripes that don't contend with each other.

//By default it has size equals to max(8, total_no_of_cores)

Data stream pool

Used in data streaming.

//By default it has size equals to max(8, total_no_of_cores)

Custom thread pool

You can define your own custom thread pools. These are used in compute grid. For example, you want to run another task from compute grid task and you also want to avoid the deadlocks. This could be done with custom thread pools synchronously.

IgniteConfiguration icfg = ...;
icfg.setExecutorConfiguration(new ExecutorConfiguration("myCustomThreadPool").setSize(16));
class InternalTask implements IgniteRunnable {
    private static final long serialVersionUID = 5169676352276118235L;
    public void run() {
        System.out.println("Internal task executed!");

class OuterTask implements IgniteRunnable {
    private static final long serialVersionUID = 602712410415356484L;

    private Ignite ignite;
    public void run() {
        System.out.println("Ignite Outer task!");
        ignite.compute().withExecutor("myCustomThreadPool").run(new InternalTask());

// Ignite main example class
IgniteConfiguration icfg = defaultIgniteCfg("custom-thread-pool-grid");
icfg.setExecutorConfiguration(new ExecutorConfiguration("myCustomThreadPool").setSize(16));
try (Ignite ignite = Ignition.start(icfg)) {
    ignite.compute().run(new OuterTask());

Asynchronous support in Ignite

Ignite API comes with synchronous and asynchronous support. Asynchronous calls return IgniteFuture or one of its implementations. You can call the blocking get method to get value or can add listener(IgniteInClosure) which will get executed as soon as the IgniteFuture has the result.

IgniteCompute compute = ignite.compute();
IgniteFuture fut = compute.callAsync(() -> "Hello from Callable");
//blocking call
String result = fut.get();
//added listener to future which will get executed as soon as future has result.
fut.listener(f -> System.out.println(f.get());
If the IgniteFuture is already have the result from asynchronous operation by the time IgniteInClosure is passed to listen or chain method, then it will be executed synchronously with the caller thread. Otherwise closure will get executed when the asynchronous operation finishes. The closure will be called in system thread pool for asynchronous cache related operations or public thread pool in case of compute operations. So, it is recommended(at least avoid) calling cache/ compute related operations from the closure to avoid deadlocks due to thread starvations.

Resource Injection

Ignite support dependency injection of pre-defined resources which could be used in the task, jo, closure or SPI. It supports both field and method based injection.

IgniteRunnable task = new IgniteRunnable() {
    private static final long serialVersionUID = 787726700536869271L;

    private transient Ignite ignite;
    public void run() {
        System.out.println("Hello Gaurav Bytes from: " +;

In the above example code, we have used @IgniteInstanceResource annotation to inject current Ignite instance in the IgniteRunnable object. There are other pre-defined resources that you can inject in the jobs, tasks, closures and SPI.

Resource Name Description
@IgniteInstanceResource Injects current instance of Ignite API
@CacheNameResource Injects the grid-cache name provided by the CacheConfiguration.getName()
@CacheStoreSessionResource Injects the CacheStoreSession instance
@LoadBalancerResource Injects the ComputeLoadBalancer instance for load-balancing
@SpringApplicationContextResource Injects the Spring's ApplicationContext

Apart from this, there are few other resources like TaskContinuousMapperResource, TaskSessionResource, SpringResource, ServiceResource and JobContextResource.

In this article, we will show few examples on using Apache Ignite as Compute Grid, Data Grid, Service Grid and executing SQL queries on Apache Ignite. These are basic examples and use the basic api available. There will be few posts in near future which explains the available API in Compute Grid, Service Grid and Data Grid.

Ignite SQL Example

Apache Ignite comes with JDBC Thin driver support to execute SQL queries on the In memory data grid. In the example below, we will create tables, insert data into tables and get data from tables. I will assume that you are running Apache Ignite on your local environment otherwise please read setup guide for running Apache Ignite server.

Creating Tables
try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://");
     Statement stmt = conn.createStatement();) {
    //line 1
    stmt.executeUpdate("CREATE TABLE City (id LONG PRIMARY KEY, name VARCHAR) WITH \"template=replicated\"");

    //line 2
    stmt.executeUpdate("CREATE TABLE Person (id LONG, name VARCHAR, city_id LONG, PRIMARY KEY (id, city_id)) WITH \"backups=1, affinityKey=city_id\"");

    stmt.executeUpdate("CREATE INDEX idx_city_name ON City (name)");

    stmt.executeUpdate("CREATE INDEX idx_person_name ON Person (name)");

In line 1, we are creating a City table with CacheMode as replicated which means it will be replicated on whole cluster. There are three possible values for CacheMode which is LOCAL, REPLICATED and PARTITIONED. We will discuss about this later in detail.

In line 2, we are creating Person table. You might have noticed affinityKey being used. The purpose of affinityKey is to collate the data together.

Inserting data in tables
try (PreparedStatement stmt = conn.prepareStatement("INSERT INTO City (id, name) VALUES (?, ?)")) {

    stmt.setLong(1, 1L);
    stmt.setString(2, "Forest Hill");

    stmt.setLong(1, 2L);
    stmt.setString(2, "Denver");

    stmt.setLong(1, 3L);
    stmt.setString(2, "St. Petersburg");

try (PreparedStatement stmt = conn.prepareStatement("INSERT INTO Person (id, name, city_id) VALUES (?, ?, ?)")) {

    stmt.setLong(1, 1L);
    stmt.setString(2, "John Doe");
    stmt.setLong(3, 3L);

    stmt.setLong(1, 2L);
    stmt.setString(2, "Jane Roe");
    stmt.setLong(3, 2L);

    stmt.setLong(1, 3L);
    stmt.setString(2, "Mary Major");
    stmt.setLong(3, 1L);

    stmt.setLong(1, 4L);
    stmt.setString(2, "Richard Miles");
    stmt.setLong(3, 2L);
Querying data from tables
try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://");
     Statement stmt = conn.createStatement()) {
    try (ResultSet rs = stmt.executeQuery("SELECT, FROM Person p, City c WHERE p.city_id =")) {
        while (
            System.out.println(rs.getString(1) + ", " + rs.getString(2));

You can find the full example code here.

Ignite Compute Grid Example

In this example, we will use Ignite's compute grid to fetch data.

try (Ignite ignite = Ignition.start(defaultIgniteCfg("cache-reading-compute-engine"))) {
    long cityId = 1;

    ignite.compute().affinityCall("SQL_PUBLIC_CITY", cityId, new IgniteCallable<List<String>>() {
        private static final long serialVersionUID = -131151815825938052L;

        private Ignite currentIgniteInstance;

        public List<String> call() throws Exception {
            List<String> names = new ArrayList<>();
            IgniteCache<BinaryObject, BinaryObject> personCache = currentIgniteInstance.cache("SQL_PUBLIC_PERSON").withKeepBinary();
            IgniteBiPredicate<BinaryObject, BinaryObject> filter = (BinaryObject key, BinaryObject value) -> {
                return key.hasField("CITY_ID") && key.<Long>field("CITY_ID") == cityId;

            ScanQuery<BinaryObject, BinaryObject> query = new ScanQuery<>(filter);

            try (QueryCursor<Entry<BinaryObject, BinaryObject>> cursor = personCache.query(query)) {
                Iterator<Entry<BinaryObject, BinaryObject>> itr = cursor.iterator();

                while (itr.hasNext()) {
                    Entry<BinaryObject, BinaryObject> cache =;

            return names;

In this example, we are getting list of person residing in same city. We are calling compute grid on SQL_PUBLIC_CITY cache to query with affinitykey cityId and the IgniteCallable task. In the IgniteCallable task, we have @IgniteInstanceResource which will be injected by the Ignite server running this task.

Ignite Data example

This example will usage of Ignite as in memory data grid.

try (Ignite ignite = Ignition.start(defaultIgniteCfg("ignite-data-grid"))) {
    IgniteCache personCache = ignite.getOrCreateCache("personCache");
    for (int i = 0; i < 10; i++) {
        personCache.put(i, "Gaurav " + i);
    for (int i = 0; i < 10; i++) {

Ignite Service grid example

interface TimeService extends Service {
    public LocalDateTime currentDateTime();
static class TimeServiceImpl implements TimeService {
    private static final long serialVersionUID = 3977097368864906176L;

    public void cancel(ServiceContext ctx) {
        System.out.println("Service is cancelled!");

    public void init(ServiceContext ctx) throws Exception {
        System.out.println("Service is initialized!");

    public void execute(ServiceContext ctx) throws Exception {
        System.out.println("Service is deployed!");

    public LocalDateTime currentDateTime() {

try (Ignite ignite = Ignition.start(defaultIgniteCfg("ignite-service-grid"))) {"timeServiceImpl", new TimeServiceImpl());
    TimeService timeService ="timeServiceImpl");
    System.out.println("Current time is: " + timeService.currentDateTime());

If you want to deploy some service on grid than it should implement Service interface. Also, service grid deployments are not zero deployments. You need to put the compiled jars to the Ignite server instance and than need to restart the instance as well.

In this post, we will discuss about setting up Apache Ignite.


You can download the Apache Ignite from its official site. You can download the binary, sources, Docker or Cloud images and maven. There is also a third party support from GridGain.

Steps for binary installation

This is pretty straightforward installation. Download the binary from website. You can optionally setup installation path as IGNITE_HOME. To run Ignite as server, you need to run below command on terminal.

/bin/ignite.bat // If it is Windows
/bin/ //if it is Linux

The above command will run the Ignite with default configuration file under $IGNITE_HOME/config/default-config.xml, you can pass your own configuration file with following command

/bin/ config/ignite-config.xml

Steps for building from sources

If you are likely to build everything from sources, than follow the steps listed below.

# Unpack the source package
$ unzip -q apache-ignite-{version}
$ cd apache-ignite-{version}-src
# Build In-Memory Data Fabric release (without LGPL dependencies)
$ mvn clean package -DskipTests
# Build In-Memory Data Fabric release (with LGPL dependencies)
$ mvn clean package -DskipTests -Prelease,lgpl
# Build In-Memory Hadoop Accelerator release
# (optionally specify version of hadoop to use)
$ mvn clean package -DskipTests -Dignite.edition=hadoop [-Dhadoop.version=X.X.X]

Steps for maven

You just need to add the maven dependencies to make it work in your project. Ignite has many integration support with other libraries and almost all of them are optional. The only mandatory one is ignite-core. You can add ignite-spring for configuring Ignite with Spring XML like configurations and ignite-indexing for SQL querying.


You can download the docker image or Cloud AMI from this link.

This is an introduction series to Apache Ignite. We will discuss about Apache Ignite, its features, usage as in-memory data grid, compute grid, distributed caching, near real-time caching and persistence distributed database.

What is Ignite?

  • It is in-memory compute platform.
  • It is in-memory data grid.
  • Durable, strongly consistent and highly available.
  • Providing option to run SQL like queries on cache (Providing JDBC API to support this).

Durable memory

Apache Ignite is memory-centric platform based on durable memory architecture. It allows you to store and processing data on in-memory(RAM) and on disk (If Ignite Native persistence is enabled). When the Ignite native persistence is enabled, it will treat disk as superset of data, which is cable of surviving crash and restarts.

In-memory features

RAM is always treated as first memory tier, all the processing happens there. It has following characteristics.

  • Off-heap based: All the data and indexes are stored outside of Java heap which helps in processing petabytes of data.
  • Since all data and indexes are off-heap based, it removes noticeable GC pauses since application code is only source possible for pause-the-world events.
  • It has predictable memory usage. You can configure memory usage with MemoryConfiguration
  • It uses memory as efficient as possible and runs defragmentation routines in the background.
  • Data and indexes on disk and in-memory are stored as same page format which improved the performance and avoids unnecessary data format conversion.

Persistence features

Here are few high-level persistence features.

  • Persistence is optional to disk. You can enable or disable it.
  • It provides data resiliency. If persistence is enabled, full dataset will be stored on physical disk and you can survives cluster restarts, crashes.
  • It can execute SQL queries on full dataset.
  • Cluster restarts are instantaneous. In-memory data will be cached automatically.

In this post, we will use Spring security to handle form based authentication. You can also read my previous posts on Basic Authentication and Digest Authentication.

Technologies/ Frameworks used

Spring Boot, Spring Security, Thymeleaf, AngularJS, Bootstrap

Adding depedencies in pom.xml

In the example, we will use Spring Boot, Spring Security, Undertow and thymeleaf and will add their starters as shown below.


Spring Security Configurations

We will extend WebSecurityConfigurerAdapter class which is a convenient base class to create WebSecurityConfigurer.

public class SecurityConfig extends WebSecurityConfigurerAdapter {
  protected void configure(HttpSecurity http) throws Exception {
            .antMatchers("/static/**", "/", "/index", "/bower_components/**").permitAll()
  public UserDetailsService userDetailsService() {
    InMemoryUserDetailsManager manager = new InMemoryUserDetailsManager();
    return manager;
  SpringSecurityDialect securityDialect() {
    return new SpringSecurityDialect();

@EnableWebSecurity annotation enables the Spring Security. We have overridden the configure method and configured the security. In the above code, we have disabled the csrf request support (By default it is enabled). We are authorizing all the requests to /index, /,/static folder and sub-folders, bower_components folder and its sub-folder accessible without authentication but all other should be authenticated. We are referring /login as our login page for authentication.

In the above code snippet, we are also registering the UserDetailsService. When we enable web-security in Spring, it expects a bean of type UserDetailsService which is used to get UserDetails. For example purpose, I am using InMemoryUserDetailsManager provided by the Spring.

MVC configuration

public class MvcConfig extends WebMvcConfigurerAdapter {
  public void addViewControllers(ViewControllerRegistry registry) {

In the above configurations, we are registering ViewController and setting their names. This is all configuration that we need to do to enable Spring Security. You can find the full working project including the html files on Github.

In this post, we will externalize the properties used in the application in a property file and will use PropertyPlaceHolderConfigurer to resolve the placeholder at application startup time.

Java Configuration for PropertyPlaceHolderConfigurer

public class AppConfig {

  public PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer() {
    PropertySourcesPlaceholderConfigurer propertySourcesPlaceholderConfigurer = new PropertySourcesPlaceholderConfigurer();
    propertySourcesPlaceholderConfigurer.setLocations(new ClassPathResource(""));
    return propertySourcesPlaceholderConfigurer;

We created object of PropertySourcesPlaceholderConfigurer and set the Locations to search. In this example we used ClassPathResource to resolve the properties file from classpath. You can use file based Resource which need absolute path of the file.

DBProperties file

public class DBProperties {
  private String userName;
  private String password;
  private String url;

  //getters for instance fields

We used @Value annotation to resolve the placeholders.

Testing the configuration

public class Main {
  private static final Logger logger = Logger.getLogger(Main.class.getName());
  public static void main(String[] args) {
    try (ConfigurableApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class, DBProperties.class);) {
      DBProperties dbProperties = context.getBean(DBProperties.class);"This is dbProperties: " + dbProperties.toString());

For testing, we created object of AnnotationConfigApplicationContext and got DBProperties bean from it and logged it using Logger. This is the simple way to externalize the configuration properties from framework congfiguration. You can also get the full example code from Github.