Apache NiFi Installation on Ubuntu

Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable along several dimensions of quality of service, such as loss-tolerant versus guaranteed delivery, low latency versus high throughput, and priority-based queuing. NiFi provides fine-grained data provenance for all data received, forked, joined, cloned, modified, sent, and ultimately dropped upon reaching its configured end-state.

NiFi can be downloaded from the NiFi Downloads Page. There are two packaging options available: a “tarball” that is tailored more to Linux and a zip file that is more applicable for Windows users.


Configuration Best Practices

If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well tuned for the needs of an I/O-intensive application like NiFi. For all of these areas, your distribution’s requirements may vary. Use these sections as advice, but consult your distribution-specific documentation for how best to achieve these recommendations.

Maximum File Handles

NiFi may have a very large number of file handles open at any one time. Increase the limits by editing /etc/security/limits.conf to add something like:

* hard nofile 50000

* soft nofile 50000
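To confirm that the new limits are in effect, the current shell's values can be inspected with ulimit. This is a minimal sketch; it assumes you have logged out and back in after editing the file, since changes to limits.conf typically apply only to new login sessions. The variable names are illustrative.

```shell
# Read the soft and hard open-file limits of the current shell session.
soft_nofile=$(ulimit -Sn)
hard_nofile=$(ulimit -Hn)

# Either value may be the word "unlimited" instead of a number.
echo "soft nofile: ${soft_nofile}"
echo "hard nofile: ${hard_nofile}"
```

If the values still show the old defaults, the session was probably started before the edit.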


Maximum Forked Processes

NiFi may be configured to generate a significant number of threads. To increase the allowable number, edit /etc/security/limits.conf to add something like:

* hard nproc 10000

* soft nproc 10000
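A quick way to double-check the entries is to read them back from the file. The sketch below defines a hypothetical helper, limit_value, which is not part of NiFi or of any system tool. It only looks at wildcard (*) entries in the one file it is given, so per-user entries and files under /etc/security/limits.d are not considered.

```shell
# Hypothetical helper: print the value of the last "* <type> <item> <value>"
# entry in a limits.conf-style file. Commented-out lines are ignored because
# their first field is not a bare "*".
limit_value() {
  awk -v k="$2" -v i="$3" \
    '$1 == "*" && $2 == k && $3 == i { v = $4 } END { if (v != "") print v }' "$1"
}

conf=/etc/security/limits.conf
if [ -r "$conf" ]; then
  for item in nofile nproc; do
    for kind in hard soft; do
      echo "${kind} ${item}: $(limit_value "$conf" "$kind" "$item")"
    done
  done
fi
```

An empty value after the colon means no wildcard entry was found for that limit.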


Increase the number of TCP socket ports available

This is particularly important if your flow will be setting up and tearing down a large number of sockets in a small period of time.

sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
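To see what the kernel is currently using, the range can be read back from /proc. The following is a small sketch; port_count is an illustrative helper name, not a system command.

```shell
# Illustrative helper: number of ports in an inclusive range "low high".
port_count() {
  echo $(( $2 - $1 + 1 ))
}

# The kernel exposes the current local port range as "low<whitespace>high".
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
  read -r low high < "$range_file"
  echo "local port range: ${low}-${high} ($(port_count "$low" "$high") ports)"
fi
```

Note that sysctl -w changes only the running kernel. To keep the setting across reboots, the usual approach is to add a line such as net.ipv4.ip_local_port_range = 10000 65000 to /etc/sysctl.conf.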


Installing NiFi

Download the NiFi package and extract it in your home folder.
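The extraction step can be sketched as follows. This assumes the tarball has already been downloaded from the NiFi Downloads Page; NIFI_ARCHIVE and extract_nifi are placeholder names chosen for this example, and the default archive path is hypothetical, so substitute the file you actually downloaded.

```shell
# Placeholder path (an assumption): point this at the tarball you downloaded.
NIFI_ARCHIVE="${NIFI_ARCHIVE:-$HOME/nifi-bin.tar.gz}"

# Illustrative helper: unpack a gzip-compressed tarball into a directory,
# creating the directory first if it does not exist.
extract_nifi() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

if [ -f "$NIFI_ARCHIVE" ]; then
  extract_nifi "$NIFI_ARCHIVE" "$HOME"
  echo "Extracted $NIFI_ARCHIVE into $HOME"
fi
```

The archive unpacks into its own top-level directory, which becomes the NiFi installation directory referred to in the rest of this guide.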


Configuration

NiFi provides several different configuration options. The most important properties are those in the nifi.properties file.
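As an illustration, a single property can be read from nifi.properties with a few lines of shell. This is only a sketch: get_nifi_property is a hypothetical helper, and NIFI_HOME is assumed to point at the directory where NiFi was extracted.

```shell
# Assumption: NIFI_HOME is the extracted NiFi directory; adjust as needed.
NIFI_HOME="${NIFI_HOME:-$HOME/nifi}"

# Hypothetical helper: print the value of the first "key=value" line whose
# key matches exactly. Everything after the first "=" is the value.
get_nifi_property() {
  awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

props="$NIFI_HOME/conf/nifi.properties"
if [ -r "$props" ]; then
  echo "HTTP port: $(get_nifi_property "$props" nifi.web.http.port)"
fi
```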

Starting NiFi

Use a Terminal window to navigate to the directory where NiFi was installed. To run NiFi in the foreground, run bin/nifi.sh run. This will leave the application running until the user presses Ctrl-C. At that time, it will initiate shutdown of the application.

To run NiFi in the background, instead run bin/nifi.sh start. This will initiate the application to begin running.

To check the status and see if NiFi is currently running, execute the command bin/nifi.sh status. NiFi can be shut down by executing the command bin/nifi.sh stop.
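If you would rather not change into the installation directory each time, the commands above can be wrapped in a small shell function. This is a convenience sketch only; nifi_ctl is a made-up name, and NIFI_HOME is assumed to be the directory where NiFi was extracted.

```shell
# Assumption: NIFI_HOME is the extracted NiFi directory; adjust as needed.
NIFI_HOME="${NIFI_HOME:-$HOME/nifi}"

# Made-up wrapper: pass any arguments straight through to bin/nifi.sh.
nifi_ctl() {
  "$NIFI_HOME/bin/nifi.sh" "$@"
}

# Typical background lifecycle, using the subcommands described above:
#   nifi_ctl start
#   nifi_ctl status
#   nifi_ctl stop
```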


Installing as a service

To install the application as a service, navigate to the installation directory in a Terminal window and execute the command bin/nifi.sh install to install the service with the default name nifi. To specify a custom name for the service, execute the command with an optional second argument that is the name of the service. For example, to install NiFi as a service with the name dataflow, use the command bin/nifi.sh install dataflow.

Once installed, the service can be started and stopped using the appropriate commands, such as sudo service nifi start and sudo service nifi stop. Additionally, the running status can be checked via sudo service nifi status.

After NiFi Has Started

Now that NiFi has been started, we can bring up the User Interface (UI) in order to create and monitor our dataflow. To get started, open a web browser and navigate to http://localhost:8080/nifi. The port can be changed by editing the nifi.properties file in the NiFi conf directory, but the default port is 8080.
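Changing the port can be done by hand in a text editor, or scripted along the following lines. This is a sketch under assumptions: set_nifi_http_port is a hypothetical helper, NIFI_HOME is assumed to be the extracted NiFi directory, port 9090 is just an example value, and the installation is assumed to serve the UI over plain HTTP as described here. The property edited is nifi.web.http.port.

```shell
# Hypothetical helper: rewrite the nifi.web.http.port line in a properties
# file. The result is written back with cat so that the original file keeps
# its ownership and permissions.
set_nifi_http_port() {
  props=$1
  port=$2
  tmp_file=$(mktemp) || return 1
  if sed "s/^nifi\.web\.http\.port=.*/nifi.web.http.port=${port}/" "$props" > "$tmp_file"; then
    cat "$tmp_file" > "$props"
  fi
  rm -f "$tmp_file"
}

# Example usage (left commented out so nothing is modified by accident):
#   set_nifi_http_port "$NIFI_HOME/conf/nifi.properties" 9090
```

NiFi reads nifi.properties at startup, so restart it after the change and then browse to the new port.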
