Learn and Understand the Concept of Hadoop Distributed File System

Hadoop experts are going to share detailed information about the Hadoop Distributed File System (HDFS), the file system that comes with the Hadoop big data framework. It is a distributed file system that builds on the advantages of conventional distributed file systems (DFS) and the Google File System (GFS). HDFS is a self-healing system written in Java that can store and handle big data (terabytes or petabytes) regardless of format and schema, offering high throughput, scalability, and reliability while running on large clusters of commodity machines.

Concept of Hadoop Distributed File System

Fig: Basic architecture of HDFS

HDFS is intended to address most of the complications and issues found in traditional distributed file systems. Its major characteristics are massive data storage, high throughput, a single-writer/multiple-reader model, fault tolerance, use of commodity machines, scalability, file system management, and more.

Important terminologies used for HDFS

To understand how HDFS works, you need to know the key terms used by Hadoop professionals:

DataNode

A DataNode is an affordable commodity machine that stores large amounts of data. It is often called the workhorse of the file system because it executes the commands issued by the NameNode, for instance the physical creation, deletion, and replication of a block. It also performs the low-level I/O operations requested by HDFS clients and supports data pipelining, that is, forwarding data to another DataNode in the same cluster.
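The pipelining idea can be pictured with a small Python sketch. This is only a toy model under stated assumptions, not Hadoop code: the class `ToyDataNode`, its `receive` method, and the block id `blk_1` are hypothetical names invented for illustration. Each node stores the block and then forwards it to the next node in the chain.

```python
class ToyDataNode:
    """Hypothetical in-memory stand-in for a DataNode (not a Hadoop API)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes held by this node

    def receive(self, block_id, data, downstream):
        """Store the block, then forward it to the next node in the pipeline.

        Returns the names of the nodes that stored the block, in order.
        """
        self.blocks[block_id] = data
        if downstream:
            next_node, rest = downstream[0], downstream[1:]
            return [self.name] + next_node.receive(block_id, data, rest)
        return [self.name]


# Three nodes form one pipeline; the client only talks to the first one.
nodes = [ToyDataNode("dn1"), ToyDataNode("dn2"), ToyDataNode("dn3")]
path = nodes[0].receive("blk_1", b"hello hdfs", nodes[1:])
```

In this sketch the client hands the data to the first node only, and the copies on the other nodes appear because each node passes the block along.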

NameNode

The NameNode stores and handles the metadata of the file system, such as file size, block location, directory hierarchy, and permissions. It is an efficient, reliable machine with a lot of RAM, which allows quick and easy access to this metadata, and it also supports persistence of that metadata.

It manages the file system with the help of a transaction log, known as the edits log, in which it records changes. The NameNode also receives regular heartbeats and block reports from all DataNodes in the cluster, which it uses to monitor the health of the HDFS cluster and make sure the system is functioning properly.
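Heartbeat tracking can be sketched in a few lines of Python. This is a simplified illustration rather than the NameNode's actual logic: `HeartbeatMonitor`, its methods, and the 30-second timeout are all assumptions made for the example, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Hypothetical tracker for the last heartbeat seen from each DataNode."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # DataNode id -> timestamp of latest heartbeat

    def record_heartbeat(self, datanode_id, now):
        """Remember when this DataNode last reported in."""
        self.last_seen[datanode_id] = now

    def stale_nodes(self, now):
        """Return the ids of nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30)  # illustrative value only
monitor.record_heartbeat("dn1", now=0)
monitor.record_heartbeat("dn2", now=0)
monitor.record_heartbeat("dn1", now=25)  # dn1 keeps reporting, dn2 goes silent
stale = monitor.stale_nodes(now=40)
```

A node that stops sending heartbeats ends up in the stale list, which is the point at which a real cluster would start re-replicating the blocks that node held.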

Blocks

Blocks are the smallest writable units on a file system or disk. HDFS uses large blocks, with a default size of 64 MB (the default in Hadoop 1.x; later versions default to 128 MB). Splitting files into blocks makes it possible to store very large files, and data sets that reach terabytes or petabytes, across the commodity machines of a cluster, while offering high throughput when the stored data is accessed.
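The arithmetic behind block splitting is easy to work through. The Python sketch below is only an illustration: the helper `split_into_blocks` is a hypothetical name, and it assumes the 64 MB default mentioned above. It shows how a 200 MB file would be divided.

```python
MB = 1024 * 1024


def split_into_blocks(file_size_bytes, block_size_bytes=64 * MB):
    """Return the sizes of the blocks a file of the given size would occupy."""
    if file_size_bytes < 0 or block_size_bytes <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    full_blocks, remainder = divmod(file_size_bytes, block_size_bytes)
    sizes = [block_size_bytes] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block holds only the leftover bytes
    return sizes


blocks = split_into_blocks(200 * MB)
```

With these numbers the file takes three full 64 MB blocks plus one final block of 8 MB. The last block is only as large as the data left over.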

Secondary NameNode

Hadoop developers sometimes also call it the CheckPointNode or HelperNode. It is a separate, highly reliable machine with a lot of CPU power and RAM. It periodically fetches the edits log (transaction log) from the NameNode and merges it with the NameNode's FSImage file to produce a new FSImage file. Despite its name, it is not a standby that takes over when the NameNode fails.
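The merge of an FSImage with an edits log can be modelled as replaying a list of operations over a snapshot. The Python sketch below is a toy under stated assumptions, not the real on-disk format: the function `apply_edits`, the dictionary-based image, and the operation names `create`, `delete`, and `rename` are all hypothetical.

```python
def apply_edits(fsimage, edits):
    """Replay logged operations over a snapshot and return the new snapshot.

    fsimage: dict mapping a path to its metadata (toy model of the FSImage).
    edits:   list of tuples such as ("create", path, size),
             ("delete", path) or ("rename", old_path, new_path).
    """
    new_image = dict(fsimage)  # leave the old snapshot untouched
    for op, path, *args in edits:
        if op == "create":
            new_image[path] = {"size": args[0]}
        elif op == "delete":
            new_image.pop(path, None)
        elif op == "rename":
            new_image[args[0]] = new_image.pop(path)
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_image


old_image = {"/a.txt": {"size": 10}}
edits_log = [
    ("create", "/b.txt", 20),
    ("rename", "/a.txt", "/c.txt"),
    ("delete", "/b.txt"),
]
checkpoint = apply_edits(old_image, edits_log)
```

After the replay the new snapshot already reflects every logged change, so the log can start again from empty. That is why checkpointing keeps the edits log from growing without bound.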

You now know the key terminology of HDFS and its architectural design. HDFS is a highly reliable and available data store that is more advanced than a traditional distributed file system. We hope you liked this article; please share your comments in the section below.
