Hadoop Ecosystem

1. Java-based filesystem (HDFS)

2. Oozie Bundle: packages multiple coordinator and workflow jobs and manages the lifecycle of those jobs

2.1. Sqoop: connects non-Hadoop stores (RDBMS) to Hadoop

2.2. Sqoop: moves data between an RDBMS and Hadoop in both directions

3. Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) that specify a sequence of actions to execute. The Workflow job has to wait
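
To illustrate the DAG idea in node 3, here is a minimal Python sketch. It is not Oozie's API: the action names (`import_data`, `clean`, `aggregate`, `report`) are hypothetical, and the standard-library `graphlib` module stands in for a workflow engine that runs an action only after the actions it depends on have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "import_data": set(),
    "clean": {"import_data"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}

def execution_order(dag):
    """Return one valid order in which the actions can run."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is why a workflow must be acyclic.
    return list(TopologicalSorter(dag).static_order())

order = execution_order(workflow)
print(order)
```

Because every action here has a single path of prerequisites, only one order is valid; a wider graph would allow several.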

4. Hive

4.1. SQL-like querying

4.2. A MapReduce combiner can pre-aggregate map output to optimize reducer performance

4.3. Structured data warehousing

4.4. Partition columns instead of indexes
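
Node 4.4 can be made concrete with a small sketch. By default Hive stores each partition as a subdirectory named `column=value` under the table's directory, so a filter on the partition column only needs to read the matching directories. The paths, the `dt` column, and the helper names below are hypothetical, chosen for illustration only.

```python
# Hypothetical partition directories for a table partitioned by `dt`.
partitions = [
    "/warehouse/logs/dt=2024-01-01",
    "/warehouse/logs/dt=2024-01-02",
    "/warehouse/logs/dt=2024-01-03",
]

def partition_value(path, column):
    """Extract the value of `column` from a `column=value` path segment."""
    for segment in path.split("/"):
        name, sep, value = segment.partition("=")
        if sep and name == column:
            return value
    return None

def prune(paths, column, wanted):
    """Keep only the directories whose partition value matches the filter."""
    return [p for p in paths if partition_value(p, column) == wanted]

selected = prune(partitions, "dt", "2024-01-02")
print(selected)
```

The point of the design is that the filter is resolved from directory names alone, before any data file is opened.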

5. Pig

5.1. Scripting for Hadoop

6. HBase

6.1. Non-relational

6.2. Column store

6.3. Transactional lookups
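
As a rough picture of the HBase nodes above, the data model can be thought of as a map from row key to column (`family:qualifier`) to timestamped versions, with a lookup returning the newest version. The `TinyTable` class below is a toy in-memory stand-in written for this outline, not the HBase client API, and the row keys and column names are made up.

```python
from collections import defaultdict

class TinyTable:
    """Toy model: row key -> column -> {timestamp: value}."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        # Each write adds a new version of the cell under its timestamp.
        self.rows[row][column][ts] = value

    def get(self, row, column):
        # Return the newest version of the cell, or None if it is absent.
        versions = self.rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

table = TinyTable()
table.put("user#42", "info:name", "Ada", ts=1)
table.put("user#42", "info:name", "Ada L.", ts=2)
print(table.get("user#42", "info:name"))
```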

7. Flume

7.1. Log collector

7.2. Integrates into Hadoop

8. Oozie

8.1. Links jobs

8.1.1. Workflow processing

8.2. Coordinator jobs are recurrent Oozie Workflow jobs that are triggered by time and data availability.

9. Avro

9.1. Data parsing

9.2. Binary data serialization

9.3. RPC

9.4. Language-neutral

9.5. Optional codegen

9.6. Schema evolution

9.7. Untagged data

9.8. Dynamic typing
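
The "binary data serialization" and "untagged data" nodes can be sketched as follows. Avro's binary encoding writes field values in schema order with no field names or type tags, so the reader needs the schema to decode the bytes. Integers use a zigzag variable-length encoding, and strings are a length followed by UTF-8 bytes. This is a simplified sketch: the list-of-tuples schema is an assumption made for brevity (real Avro schemas are JSON), only `long` and `string` are handled, and the function names are hypothetical rather than part of any Avro library.

```python
def zigzag_varint(n):
    """Encode a signed 64-bit integer as a zigzag variable-length value."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes map to small unsigned numbers
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_record(schema, record):
    """Write field values in schema order, with no names or tags."""
    out = bytearray()
    for name, kind in schema:
        value = record[name]
        if kind == "long":
            out += zigzag_varint(value)
        elif kind == "string":
            data = value.encode("utf-8")
            out += zigzag_varint(len(data)) + data
        else:
            raise ValueError(f"unsupported type in this sketch: {kind}")
    return bytes(out)

# Simplified stand-in for a schema: ordered (field name, type) pairs.
schema = [("id", "long"), ("name", "string")]
encoded = encode_record(schema, {"id": 64, "name": "hi"})
print(encoded.hex())
```

Note that the field names `id` and `name` do not appear anywhere in the encoded bytes, which is what "untagged" means here.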

10. Mahout

10.1. Machine learning

10.2. Applied to MapReduce (MR)

11. Sqoop

11.1. Auto-generates Java InputFormat code for data access

12. MapReduce

12.1. Distributed compute

12.2. Maps query onto nodes

12.3. Reduces aggregated results into answers
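
The map and reduce steps listed above can be sketched in plain Python with a word count. This is a single-process illustration under simplifying assumptions, not the Hadoop API: the input splits and function names are hypothetical, each split stands in for the data on one node, and an optional combiner step pre-aggregates each node's map output before it reaches the reducer.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Locally sum the pairs from one node so less data is shuffled."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_phase(all_pairs):
    """Sum the values for each key across all nodes."""
    totals = defaultdict(int)
    for key, value in all_pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical input: one string per node's split.
splits = ["a b a", "b b c"]
combined = [combine(map_phase(split)) for split in splits]
result = reduce_phase(pair for node in combined for pair in node)
print(result)
```

With the combiner, the first node sends two pairs instead of three; the final counts are the same either way because addition is associative.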

13. Ambari

13.1. Cluster deployment and admin

13.2. Driven by Hortonworks

14. ZooKeeper

14.1. Coordinator of shared state between apps

14.2. Naming, configuration, and synchronization services

15. YARN

15.1. Cluster management

15.2. Introduced in Hadoop 2

15.3. Resource manager

15.4. Job scheduler

16. BigTop

16.1. Packages the Hadoop ecosystem

16.2. Tests Hadoop ecosystem packages

17. Related Apache Ecosystems

18. HDFS

18.1. Distributed storage

19. Spark

20. Impala

20.1. SQL query engine

20.2. Query data stored in HDFS and HBase

20.3. Real time

21. Cascading

21.1. Higher-level abstraction over MapReduce (MR)

21.2. Creates a Flow that assembles MapReduce jobs