Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable along several dimensions of quality of service, such as loss-tolerant versus guaranteed delivery, low latency versus high throughput, and priority-based queuing. NiFi provides fine-grained data provenance for all data received, forked, joined, cloned, modified, sent, and ultimately dropped upon reaching its configured end-state.
NiFi can be downloaded from the NiFi Downloads Page. There are two packaging options available: a “tarball” that is tailored more to Linux and a zip file that is more applicable for Windows users.
Configuration Best Practices
If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well tuned for the needs of an IO intensive application like NiFi. For all of these areas, your distribution’s requirements may vary. Use these sections as advice, but consult your distribution-specific documentation for how best to achieve these recommendations.
Maximum File Handles
NiFi can have a very large number of file handles open at any one time. Increase the limits by editing /etc/security/limits.conf to add something like:
* hard nofile 50000
* soft nofile 50000
Maximum Forked Processes
NiFi may be configured to generate a significant number of threads. To increase the allowable number, edit /etc/security/limits.conf to add something like:
* hard nproc 10000
* soft nproc 10000
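The two limits above can be checked from a shell before starting NiFi. The following is a minimal sketch, assuming a bash-compatible shell; report_limit is a hypothetical helper name and not part of NiFi. Changes to /etc/security/limits.conf typically apply only to new login sessions, so run it after logging in again as the user that will start NiFi.

```shell
# Hypothetical helper: print the soft and hard limit for one resource.
# $1 = ulimit flag letter (n = open files, u = user processes)
# $2 = human-readable label
report_limit() {
  soft=$(ulimit -S"$1" 2>/dev/null) || soft=unknown
  hard=$(ulimit -H"$1" 2>/dev/null) || hard=unknown
  echo "$2: soft=$soft hard=$hard"
}

report_limit n "open files"       # compare against the nofile values above
report_limit u "user processes"   # compare against the nproc values above
```

If the reported values are lower than the ones configured, the distribution may apply limits through a different mechanism, so consult its documentation.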
Increase the number of TCP socket ports available
This is particularly important if your flow will be setting up and tearing down a large number of sockets in a short period of time.
sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
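To see what the setting above buys, the size of a port range can be computed directly. This is a small sketch; port_count is a hypothetical helper, and the first range shown is only a common Linux default, so your system may differ. Note that a value set with sysctl -w does not survive a reboot; on many distributions it can be made persistent by adding the same key and value to /etc/sysctl.conf, but check your distribution's documentation.

```shell
# Hypothetical helper: count the ports in a "low high" range string.
port_count() {
  set -- $1                 # split the string into $1 (low) and $2 (high)
  echo $(( $2 - $1 + 1 ))   # the range is inclusive at both ends
}

port_count "32768 60999"    # a common Linux default range, prints 28232
port_count "10000 65000"    # the range set above, prints 55001

# On a Linux host the live range can be read from /proc, for example:
#   port_count "$(cat /proc/sys/net/ipv4/ip_local_port_range)"
```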
Download and extract the NiFi package in your home folder
NiFi provides several different configuration options. The most important properties are those in the nifi.properties file.
Use a terminal window to navigate to the directory where NiFi was installed. To run NiFi in the foreground, run
bin/nifi.sh run
This will leave the application running until the user presses Ctrl-C. At that time, it will initiate shutdown of the application.
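The navigate-then-run step above can be wrapped in a small shell function so that the command works from any directory. This is a sketch only: nifi_cmd is a hypothetical helper, and the $HOME/nifi path in the usage comment is an assumed install location, not a NiFi default.

```shell
# Hypothetical helper: run a nifi.sh subcommand from the install directory.
# $1 = NiFi install directory, remaining arguments = nifi.sh subcommand
nifi_cmd() {
  nifi_home=$1
  shift
  if [ ! -x "$nifi_home/bin/nifi.sh" ]; then
    echo "bin/nifi.sh not found or not executable under $nifi_home" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  (cd "$nifi_home" && bin/nifi.sh "$@")
}

# Usage, assuming NiFi was extracted to $HOME/nifi:
#   nifi_cmd "$HOME/nifi" run
```

Checking for the script first gives a clear error message when the path is wrong, instead of a generic "not found" from the shell.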
To check the status and see if NiFi is currently running, execute the command
bin/nifi.sh status
NiFi can be shut down by executing the command
Installing as a service
To install the application as a service, navigate to the installation directory in a terminal window and execute the command
bin/nifi.sh install
This installs the service with the default name nifi. To specify a custom name for the service, execute the command with an optional second argument that is the name of the service. For example, to install NiFi as a service with the name dataflow, use the command
bin/nifi.sh install dataflow
Once installed, the service can be started and stopped using the appropriate commands, such as
sudo service nifi start
sudo service nifi stop
Additionally, the running status can be checked via
sudo service nifi status
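As a sketch of how the optional service name carries through to the control commands, the function below only prints the commands for a given name rather than running them, since installing a service requires root privileges. service_cmds is a hypothetical helper; it assumes the service is managed through the service command, as in the examples above.

```shell
# Hypothetical helper: print (do not run) the install and control commands
# for a NiFi service. $1 = service name, defaulting to nifi.
service_cmds() {
  name=${1:-nifi}
  if [ "$name" = "nifi" ]; then
    echo "bin/nifi.sh install"          # default name needs no argument
  else
    echo "bin/nifi.sh install $name"
  fi
  for action in start stop status; do
    echo "sudo service $name $action"
  done
}

service_cmds dataflow
```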
After NiFi Has Started
Now that NiFi has been started, we can bring up the User Interface (UI) in order to create and monitor our dataflow. To get started, open a web browser and navigate to
http://localhost:8080/nifi
The default port is 8080. It can be changed by editing the nifi.properties file in the NiFi conf directory.
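To confirm which port the UI is listening on, the value can be read from nifi.properties. The sketch below assumes the port is stored under the key nifi.web.http.port and that the UI is served over plain HTTP as described above; get_prop is a hypothetical helper, and the relative path in the usage comment assumes the current directory is the NiFi installation directory.

```shell
# Hypothetical helper: print the value of one key from a properties file.
# $1 = path to the properties file, $2 = exact key name
get_prop() {
  awk -F= -v key="$2" '
    $1 == key {
      sub(/^[^=]*=/, "")   # drop "key=", keep the value even if it has "="
      print
      exit
    }
  ' "$1"
}

# Usage from the NiFi installation directory:
#   port=$(get_prop conf/nifi.properties nifi.web.http.port)
#   echo "http://localhost:${port:-8080}/nifi"
```

Matching the key exactly, rather than with a regular expression, avoids treating the dots in the property name as wildcards.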