Talend Training Chennai
Geoinsyssoft Pvt Ltd provides Talend training in Chennai. Talend is a provider of open source data integration (ETL) software. Its main product is Talend Open Studio, an open source data integration project based on Eclipse RCP. It primarily supports ETL-oriented implementations and is available for on-premises deployment as well as in a software-as-a-service (SaaS) delivery model.
Talend Open Studio is mainly used for integration between operational systems, as well as for ETL (Extract, Transform, Load) for Business Intelligence and Data Warehousing, and for migration.
Talend Big Data was designed to simplify the development, integration, and management of big data by removing the need for users to learn, write or maintain complicated Hadoop or Spark code. Talend provides native and optimized code generation to load, transform, enrich, and cleanse data inside Hadoop without additional storage or computing expense.
Talend Data Integration is an open and scalable data integration and data quality solution for integrating, cleansing and profiling all corporate data. The product features over 900 prebuilt components to connect various data sources. It also offers collaboration and management tools.
Master Data Management
Talend Master Data Management was created to help companies consolidate data across their businesses, such as product and customer data, in order to create a single “version of the truth”.
Talend Application Integration provides a common set of application and data integration tools to build a service-oriented architecture, and connect, mediate, and manage services in real-time.
Talend Integration Cloud is a secure and managed integration Platform-as-a-Service for connecting, cleansing and sharing cloud and on-premises data.
Talend Data Preparation is a free, open source data cleansing application that can be used for data discovery, visualization and enrichment.
Module 1: Introduction to Talend
1. Overview of the concept of Data Warehouse.
2. Dimensions, Hierarchy, Facts
3. DW models: Star and Snowflake schemas
4. Explain Talend and how it works
5. Explain Talend Open Studio and its usefulness
6. Explain metadata and repository
Module 2: Components and Jobs
Types of Components
1. Basic Components – Overview
2. Component Properties
3. Database connectivity components
4. Explain how to create a new job
5. Create a delimited file and explain the whole process behind it
6. Use metadata and explain it
7. Explain concept of propagation.
8. Explain data integration schema
9. Use tFilterRow and string filters in job creation
10. Create a delimited input file
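As a rough illustration of what reading a delimited file and filtering its rows involves, here is a minimal sketch in plain Java. It is not the code that Talend Open Studio generates. The class name DelimitedFilterSketch, the three-column layout (id;name;age) and the sample rows are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch in plain Java; not code generated by Talend.
class DelimitedFilterSketch {

    // Splits one delimited line into trimmed fields (no support for quoted fields).
    static String[] parseLine(String line, String delimiter) {
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    // Keeps only the rows whose column at ageIndex is an integer >= minAge.
    // Assumes that column always holds a valid integer; bad data would throw.
    static List<String[]> filterByMinAge(List<String> lines, String delimiter,
                                         int ageIndex, int minAge) {
        List<String[]> kept = new ArrayList<>();
        for (String line : lines) {
            String[] row = parseLine(line, delimiter);
            if (row.length > ageIndex && Integer.parseInt(row[ageIndex]) >= minAge) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data in the assumed id;name;age layout.
        List<String> lines = Arrays.asList("1;Asha;34", "2;Ravi;17", "3;Meena;52");
        for (String[] row : filterByMinAge(lines, ";", 2, 18)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In a Talend job the same idea would be expressed graphically, with the schema and the filter condition set in component properties rather than written by hand.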
Module 3: Schema and Aggregation
1. Explain job design and its features, such as Edit Schema
2. Explain tMap and T merge
3. How to aggregate data
4. Define tReplicate and explain how it works
5. Use tLogRow and explain how it works
6. Define tMap properties
7. Lab Exercises
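To make the aggregation topic concrete, the following is a minimal sketch of a group-by-and-sum step in plain Java. It only illustrates the concept and is not Talend's implementation. The class name AggregateSketch and the (region, amount) row layout are assumptions for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch in plain Java; not code generated by Talend.
class AggregateSketch {

    // Sums the amount (row[1]) for each group key (row[0]).
    // LinkedHashMap keeps groups in the order they are first seen.
    static Map<String, Double> sumByKey(String[][] rows) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (String[] row : rows) {
            totals.merge(row[0], Double.parseDouble(row[1]), Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the assumed (region, amount) layout.
        String[][] rows = {
            {"north", "10.5"},
            {"south", "4"},
            {"north", "2.5"}
        };
        for (Map.Entry<String, Double> e : sumByKey(rows).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```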
Module 4: Data Source Connectivity
1. Data extracted from source
2. Database source and target (MySQL/Oracle/PostgreSQL)
3. Create connection
4. Import/create schema or metadata
1. Explain functions and how to call and use them
2. Define routines
3. Explain XML file and how it is used in Talend
4. Use data formatting functions and explain how they work
5. Define type casting.
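In Talend, a routine is a Java class whose static methods can be called from component expressions. The sketch below shows what such helper methods might look like, covering a formatting function and a safe string-to-integer conversion. The class name CustomStringRoutines and both method names are hypothetical, and the package declaration that a real routine would carry is left out so the example stays self-contained.

```java
// Hypothetical sketch of routine-style helpers; names are illustrative only.
class CustomStringRoutines {

    // Formats text so each word starts with an upper-case letter.
    // Returns null for null input and an empty string for blank input.
    static String toTitleCase(String input) {
        if (input == null) {
            return null;
        }
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return trimmed;
        }
        StringBuilder out = new StringBuilder();
        for (String word : trimmed.split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(Character.toUpperCase(word.charAt(0)))
               .append(word.substring(1).toLowerCase());
        }
        return out.toString();
    }

    // Converts a String to an int, returning fallback when the value
    // is null or not a valid integer.
    static int toIntOrDefault(String value, int fallback) {
        if (value == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("hELLO talend  world"));
        System.out.println(toIntOrDefault(" 42 ", -1));
        System.out.println(toIntOrDefault("abc", -1));
    }
}
```

Returning a fallback value instead of throwing is one design choice among several; a job that must reject bad rows would route them to a reject flow instead.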
Module 6: Transformation
1. Context variable
2. Parameterization in ETL
3. Use tRowGenerator and explain with an example
4. Explain sorting with an example
5. Define aggregator
6. Publish data using t flow
7. Explain how we can run a job in a loop
8. Other main components on palette
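The idea behind context variables and parameterization is that environment-specific values are kept outside the job logic. The sketch below illustrates this in plain Java using java.util.Properties. It is not how Talend stores or loads contexts. The class name ContextSketch and the keys db_host, db_port and db_name are assumptions for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch in plain Java; not Talend's context mechanism.
class ContextSketch {

    // Loads key=value pairs, standing in for one environment's context.
    static Properties loadContext(String text) {
        Properties ctx = new Properties();
        try {
            ctx.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ctx;
    }

    // Builds a JDBC URL from the context, with defaults for host and port.
    static String jdbcUrl(Properties ctx) {
        return "jdbc:mysql://" + ctx.getProperty("db_host", "localhost")
                + ":" + ctx.getProperty("db_port", "3306")
                + "/" + ctx.getProperty("db_name");
    }

    public static void main(String[] args) {
        // Made-up values for a "production" context.
        Properties prod = loadContext("db_host=prod-db\ndb_name=sales\n");
        System.out.println(jdbcUrl(prod));
    }
}
```

Switching environments then means supplying a different set of values, while the job logic stays unchanged.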
Module 7: Hadoop Connectivity TOS BD Edition
1. How to start Thrift Server
2. How the ETL tool connects to Hadoop
3. Define ETL method
4. How Hive can be implemented
5. How to import data into Hive, with an example
6. How to partition in Hive, with an example
7. Why the customer table cannot be overwritten
8. ETL component
9. Comparison between Hive and Pig
10. Loading data into the demo customer
11. ETL tool
12. Parallel execution
Module 8: Use Cases / Case Studies
1. Data integration and performance improvement
2. Sentiment analysis with Twitter Dataset
3. Log stream analysis using Apache web logs
4. ETL offloading with Hadoop Ecosystem
5. Recommendation modeling using Apache Spark as ETL