Apache NiFi Data Flow

Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable along several dimensions of quality[…]

RDDs Transformations and Actions in Apache Spark

RDDs – Resilient Distributed Datasets: An RDD is the fundamental unit of data in Spark, a distributed collection of elements spread across cluster nodes that can be operated on in parallel. RDDs are immutable, but a new RDD can be generated by transforming an existing one. Two types of operations: Transformation: Transformations build a new RDD (Resilient Distributed Dataset) from a previous RDD with[…]
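To make the transformation/action split concrete, here is a minimal sketch in plain Python. It is a toy model, not the Spark API: the MiniRDD class and its method bodies are hypothetical and only borrow the names map, filter, collect, and count from Spark. It assumes the usual Spark behaviour that transformations are lazy (they record work and return a new dataset) while actions run the recorded chain and return a result.

```python
from typing import Callable, Iterable, Iterator, List


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical; not the Spark API)."""

    def __init__(self, source: Callable[[], Iterator]):
        # Store a recipe for producing elements, not the elements themselves.
        self._source = source

    @classmethod
    def from_list(cls, items: Iterable) -> "MiniRDD":
        data = tuple(items)  # frozen copy, so the dataset cannot change later
        return cls(lambda: iter(data))

    # Transformations: return a new MiniRDD and do no work yet.
    def map(self, fn: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (fn(x) for x in self._source()))

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(lambda: (x for x in self._source() if pred(x)))

    # Actions: run the recorded chain and return a plain Python result.
    def collect(self) -> List:
        return list(self._source())

    def count(self) -> int:
        return sum(1 for _ in self._source())


numbers = MiniRDD.from_list([1, 2, 3, 4, 5])
squares = numbers.map(lambda x: x * x)                 # transformation
even_squares = squares.filter(lambda x: x % 2 == 0)    # transformation

print(even_squares.collect())  # action: the chain is evaluated here
print(numbers.collect())       # the original dataset is unchanged
```

Each transformation hands back a new object and leaves its parent untouched, which mirrors the immutability described above. Nothing is computed until collect or count is called.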

Apache NiFi Installation on Ubuntu

Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable along several dimensions of quality of service, such as[…]

Cloudera certification guidelines for Hadoop Professionals

Become a certified big data professional. Demonstrate your expertise with the most sought-after technical skills. Big data success requires professionals who can prove their mastery with the tools and techniques of the Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years. At Cloudera, we’re drawing on our[…]

Installing Ubuntu – VMware Player – Windows

To install Ubuntu on VMware, we first install VMware on Windows. Intel VT: Intel VT (Virtualization Technology) is the company’s hardware assistance for processors running virtualization platforms. Several Intel CPUs come with Intel Virtualization Technology (VT). Formerly known as Vanderpool, this technology enables a CPU to act as if you have several independent computers, in[…]