Integrating LVM with Hadoop to Provide Elastic Storage for the DataNode

Mar 14, 2021

Hello guys 😃

In this article, I am going to show how we can integrate LVM with Hadoop to give the DataNode an elastic-storage kind of feature. (Why the DataNode? Because DataNodes are where the data is actually stored in a Hadoop cluster. The master stores only the metadata. That's why only the DataNode 😃)

I am assuming that you are familiar with Hadoop and know how to set up your own Hadoop cluster. So, here we go 🚀

Currently, my cluster has one master node and one slave node.

I attached a hard disk (a VDI in my case, since I am using a virtual machine) of size 1 GiB to my operating system. Then:

I partitioned it with the fdisk utility, dividing it into two parts of 500 MB + 500 MB.
Then, I created a physical volume (PV) from each partition using the command pvcreate <name of the partition>
Then, I created a volume group (VG) from both PVs using the command vgcreate <vg-name-we-want-to-give> <name-of-partition-1> <name-of-partition-2>

Then, I created a logical volume (LV) of size 500 MiB from the VG, which is approximately 1024 MB in size.
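As a rough sketch, the PV, VG and LV steps above could look like the commands printed by the small helper below. The device names /dev/sdb1 and /dev/sdb2, the VG name hadoop_vg and the LV name datanode_lv are placeholders assumed for illustration, not the names used in this setup. The helper only prints the commands, so nothing is changed on disk until you run them yourself as root.

```shell
#!/bin/sh
# Prints the LVM setup commands instead of running them.
# Assumed placeholders (not from the article): /dev/sdb1, /dev/sdb2,
# hadoop_vg, datanode_lv.
lv_create_plan() {
  part1=$1; part2=$2; vg=$3; lv=$4; size=$5
  # One physical volume per partition.
  printf 'pvcreate %s %s\n' "$part1" "$part2"
  # Pool both PVs into a single volume group.
  printf 'vgcreate %s %s %s\n' "$vg" "$part1" "$part2"
  # Carve the logical volume out of the volume group.
  printf 'lvcreate --size %s --name %s %s\n' "$size" "$lv" "$vg"
}

lv_create_plan /dev/sdb1 /dev/sdb2 hadoop_vg datanode_lv 500M
```

Review the printed lines and run them one at a time; pvs, vgs and lvs can then be used to inspect the result.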

Now, I formatted the logical volume, mounted it on the /slave directory (the data node's storage directory), and then started the data node service.
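A minimal sketch of this step, under a few assumptions: the filesystem is ext4 (the article does not name it, but resize2fs works on ext2/3/4), the LV path /dev/hadoop_vg/datanode_lv is a placeholder, and the data node is started with hadoop-daemon.sh as in Hadoop 1.x/2.x installs. Again, the helper just prints the commands.

```shell
#!/bin/sh
# Prints the format, mount and start commands instead of running them.
# Assumptions: ext4 filesystem, placeholder LV path, hadoop-daemon.sh on PATH.
format_mount_plan() {
  lv=$1; mountpoint=$2
  # Create the filesystem on the logical volume (destroys existing data on it).
  printf 'mkfs.ext4 %s\n' "$lv"
  # Mount it on the directory configured as the data node's storage.
  printf 'mount %s %s\n' "$lv" "$mountpoint"
  # Start the data node so it registers the storage with the master.
  printf 'hadoop-daemon.sh start datanode\n'
}

format_mount_plan /dev/hadoop_vg/datanode_lv /slave
```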

Now, I checked the output of hadoop dfsadmin -report to see the storage that the data node contributes to the cluster.

Now, I stored a test file in the data lake provided by the Hadoop cluster.

Now, I increased the size of the data node's LV from 500M to 700M (approx.).
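Here is one way the extension could be expressed, assuming the same placeholder LV path as before. The helper works out the difference between the current and target sizes (in MiB) and prints the matching lvextend command; it refuses a target that is not larger than the current size.

```shell
#!/bin/sh
# Prints an lvextend command for growing an LV from current to target size.
# Sizes are in MiB; the LV path is an assumed placeholder.
lv_extend_plan() {
  lv=$1; current_mib=$2; target_mib=$3
  if [ "$target_mib" -le "$current_mib" ]; then
    echo "target size must be larger than the current size" >&2
    return 1
  fi
  # A leading '+' tells lvextend to add to the existing size.
  printf 'lvextend --size +%sM %s\n' "$((target_mib - current_mib))" "$lv"
}

lv_extend_plan /dev/hadoop_vg/datanode_lv 500 700
```

For the 500M to 700M case this prints an lvextend call that adds 200M. The VG needs at least that much free space, which vgs reports.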

But the output of the df -hT command still shows the old size for /slave.

That is because the existing filesystem does not cover the newly added space yet. So I used the resize2fs command to grow the filesystem online, without unmounting it. (resize2fs supports online extending only, not online shrinking.)
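A small sketch of the online grow, assuming an ext filesystem on the placeholder LV path used earlier. Called without a size argument, resize2fs grows the filesystem to fill the whole device, and df -hT then confirms the new size.

```shell
#!/bin/sh
# Prints the commands for growing a mounted ext filesystem to fill its LV.
# The LV path and mount point passed below are assumed placeholders.
grow_fs_plan() {
  lv=$1; mountpoint=$2
  # No size argument: grow the filesystem to the full size of the device.
  printf 'resize2fs %s\n' "$lv"
  # Check that the mount point now reports the larger size.
  printf 'df -hT %s\n' "$mountpoint"
}

grow_fs_plan /dev/hadoop_vg/datanode_lv /slave
```

As an alternative, lvextend accepts a --resizefs option that extends the LV and the filesystem in one step.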

Now the output of the df -hT command reflects the increased size.

Then I checked the output of the hadoop dfsadmin -report command again, to see whether the cluster picked up the extra capacity.

Next, I tried to reduce the size of the data node's storage while it was still mounted, again using the resize2fs command.

So, from here we can see that the resize2fs command supports only online expanding, not online shrinking. To shrink, we first have to take the folder offline by stopping the data node service and unmounting the LV from the /slave directory.

With the volume offline, the filesystem and then the LV can be reduced. After that, I mounted the LV on the /slave directory again and restarted the data node service.
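The whole offline shrink can be sketched as the ordered list of commands printed below. The target size of 500M, the LV path and the use of hadoop-daemon.sh are assumptions for illustration. The order matters: the filesystem is shrunk before the LV, because cutting the LV first would chop off part of a filesystem that still expects the space.

```shell
#!/bin/sh
# Prints the offline shrink sequence instead of running it.
# Assumptions: ext filesystem, placeholder LV path, 500M target size.
shrink_plan() {
  lv=$1; mountpoint=$2; new_size=$3
  printf 'hadoop-daemon.sh stop datanode\n'          # stop writes to the volume
  printf 'umount %s\n' "$mountpoint"                 # take the filesystem offline
  printf 'e2fsck -f %s\n' "$lv"                      # resize2fs requires a clean check first
  printf 'resize2fs %s %s\n' "$lv" "$new_size"       # shrink the filesystem
  printf 'lvreduce --size %s %s\n' "$new_size" "$lv" # then shrink the LV to match
  printf 'mount %s %s\n' "$lv" "$mountpoint"         # bring the storage back
  printf 'hadoop-daemon.sh start datanode\n'         # rejoin the cluster
}

shrink_plan /dev/hadoop_vg/datanode_lv /slave 500M
```

Run the printed lines one at a time as root; lvreduce asks for confirmation, and the new size must stay larger than the data already stored on the volume.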

Then, I checked the output of hadoop dfsadmin -report once more.

At the end, I also checked whether my data (here, the /text file) was still in the data lake or had been lost in the process of resizing. And what I found is that it was still there in the data lake 😃
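A quick way to do this kind of check, sketched with the /text path from the article and the hadoop fs and hadoop fsck commands as shipped with Hadoop 1.x (newer releases offer the same through hdfs dfs and hdfs fsck):

```shell
#!/bin/sh
# Prints commands for checking that a file survived the resize.
verify_plan() {
  hdfs_file=$1
  # The file should still be listed in the root of the data lake.
  printf 'hadoop fs -ls /\n'
  # Its contents should still be readable from the data node.
  printf 'hadoop fs -cat %s\n' "$hdfs_file"
  # fsck reports whether any blocks of the file are missing or corrupt.
  printf 'hadoop fsck %s\n' "$hdfs_file"
}

verify_plan /text
```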

Thanks for reading! 🔥

I hope I was able to give you at least one new piece of information in this article.

See you in the next one.

Signing off 😃

Written by Prakhar Khandelwal
