Discussion
Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social · 6 days ago

I went on a half-day journey of discovery through QubesOS, LVM, and device-mapper, so that I could get my laptop working again after it rebooted randomly at a very inopportune moment.

Frustrating and scary (this is my main work machine), but I also did learn a bunch.

For example, that LVM is basically an interface over device-mapper, and when something gets fscked in a weird way, you can use dmsetup directly to un-fsck it.

#Linux #QubesOS

Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 6 days ago

You know you're screwed when `lvremove` fails and spits out a message like:

"Manual intervention may be required to remove device dev_id=5678 in thin pool metadata"

It gets even scarier once you see the TODO in the code that generates it:
https://github.com/lvmteam/lvm2/blob/main/libdm/libdm-deptree.c#L1517

> Give some useful advice how to solve this problem,
> until lvconvert --repair can handle this automatically

Uhhhh. :blobcatsweats:

Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 6 days ago

Snapshot creation in LVM thin pools is not atomic, and a hard power cycle happened in the middle of it. The metadata was written, but the device was never actually created.

This meant that there was, somewhere, a "dangling" reference to a device with dev_id=5678 that did not exist. It was simply not there in `dmsetup table` output.

It was also impossible to use `thin_ls` to see the list of devices the thin pool was aware of because there was no way to take a snapshot of the broken pool.

yikes
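
For context, on a healthy pool the usual way to list thin devices while the pool is active looks roughly like the sketch below: reserve a metadata snapshot by messaging the thin pool, point `thin_ls` at the pool's metadata device, then release the snapshot. According to the post above, it was the snapshot step that could not be done on the broken pool. The helper only prints the commands. `thin_ls_plan` and the example device names are placeholders, and the `_tmeta` name follows LVM's usual naming, which is an assumption about this particular setup.

```shell
#!/bin/sh
# Dry-run sketch: prints the usual "list thin devices on a live pool" steps.
# Arguments are device-mapper names (placeholders here, not from the thread).

# Usage: thin_ls_plan <dm-name-of-tpool> <dm-name-of-tmeta>
thin_ls_plan() {
    tpool=$1 tmeta=$2
    # Ask the thin-pool target to reserve a read-only metadata snapshot.
    printf 'dmsetup message %s 0 reserve_metadata_snap\n' "$tpool"
    # Read the device list from that snapshot.
    printf 'thin_ls --metadata-snap /dev/mapper/%s\n' "$tmeta"
    # Release the snapshot again once done.
    printf 'dmsetup message %s 0 release_metadata_snap\n' "$tpool"
}

thin_ls_plan vg-pool-tpool vg-pool_tmeta
```

Printing rather than executing keeps the sketch safe to try on any machine; the printed lines would have to be run as root to have any effect.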

🇺🇦 haxadecimal
@brouhaha@mastodon.social replied · 4 days ago

@rysiek
I've used LVM for a long time, but hadn't heard of thin pools, so I looked it up. Sounds pretty useful.
Non-atomic snapshot creation worries the heck out of me!

Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 6 days ago

But did you know you can *message* your LVM thin pools directly, avoiding all the LVM-related tools? Check out `dmsetup message`.

Using that to delete the dangling device's metadata from the thin pool (`dmsetup message /dev/mapper/<vg>-<thin-pool-lv>-tpool 0 "delete 5678"`), and then re-running the `lvremove` that had refused to work earlier to get rid of the b0rked snapshot, did the trick.

*phew* :blobcat:
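
For anyone following along, the two steps described above can be sketched as a small dry-run helper. It only prints the commands so they can be reviewed before being run as root. The function names and the example names `qubes_dom0`, `vm-pool`, and `broken-snap` are placeholders, not taken from the thread; the `delete <dev_id>` message and the `-tpool` suffix are as quoted above. One detail the sketch adds: LVM doubles any hyphen inside VG and LV names when it builds the device-mapper name, so a pool LV called `vm-pool` shows up as `vm--pool`.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands instead of running them.
# All names passed in are placeholders; dev_id comes from the lvremove error.

# LVM doubles hyphens in VG/LV names when forming device-mapper names.
dm_escape() {
    printf '%s' "$1" | sed 's/-/--/g'
}

# Usage: thin_delete_plan <vg> <thin-pool-lv> <dev_id> <broken-lv>
thin_delete_plan() {
    vg=$1 pool=$2 dev_id=$3 lv=$4
    tpool="/dev/mapper/$(dm_escape "$vg")-$(dm_escape "$pool")-tpool"
    # Step 1: tell the thin-pool target to drop the dangling device id.
    printf 'dmsetup message %s 0 "delete %s"\n' "$tpool" "$dev_id"
    # Step 2: retry the lvremove that failed earlier.
    printf 'lvremove %s/%s\n' "$vg" "$lv"
}

thin_delete_plan qubes_dom0 vm-pool 5678 broken-snap
```

Checking the printed device name against `dmsetup ls` before running anything is a cheap sanity check, since the delete message acts on the pool's metadata directly.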

AUSTRALOPITHECUS 🇺🇦🇨🇿
@lkundrak@metalhead.club replied · 4 days ago

@rysiek clever

🇺🇦 haxadecimal
@brouhaha@mastodon.social replied · 4 days ago

@rysiek
I hadn't heard of dmsetup messaging before, and even after reading about it, don't feel like I understand the concept. What's the point of "sending a message" to a target? Is it something asynchronous? How does the target receive messages? Does it send back a reply? I'm just used to issuing commands directly.

Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 4 days ago

@brouhaha as far as I understand – and I might be wrong here – all LVM operations are "messages" under the hood. So when you use LVM commands "directly", you are really sending messages.

I think it is done to enable queuing of operations. Some of these might take a while to complete I guess.

schnittchen 🏳️‍🌈​ :neocat_flag_gay: :neocat_flag_polyam:
@schnittchen@tech.lgbt replied · 6 days ago

@rysiek Oh wow, back when I last checked (many years ago) LVM over DM was basically just a table of ranges and parameters

Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 6 days ago

@schnittchen I mean, it kind of still is.

Joakim Fors
@joakimfors@mastodon.green replied · 6 days ago

@rysiek Ouch, hope it's containable. I've never dared to use thin LVM volumes after doing some reading.

robryk
@robryk@social.wuatek.is replied · 6 days ago
@rysiek Is this something that LVM metadata backups would protect against?
Michał "rysiek" Woźniak · 🇺🇦
@rysiek@mstdn.social replied · 6 days ago

@robryk I'd say yes, if made often enough. Not 100% sure how easy it would be to revert to a backup while the thin pool is b0rked like that though.

