The netbook is an Acer Aspire One (Intel Atom N270 at 1.6 GHz, 1 GB RAM, 160 GB HDD). What follows are the config changes made in ceph.conf before running ceph-deploy. The Ceph version is Firefly (0.80.x).
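For context, the overall single-node bring-up with ceph-deploy went roughly like the sketch below. This is a reconstruction rather than a transcript: the hostname (netbook) and device names (/dev/sdb, /dev/sdc) are placeholders, and depending on the ceph-deploy version a separate ceph-deploy gatherkeys step may be needed.

ceph-deploy new netbook
# edit ceph.conf here, adding the settings described below
ceph-deploy install netbook
ceph-deploy mon create-initial
# one OSD per USB stick; create both prepares and activates
ceph-deploy osd create netbook:/dev/sdb netbook:/dev/sdc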
Since this is a one-machine cluster, I needed to tell Ceph to replicate across OSDs rather than across hosts:

[global]
osd crush chooseleaf type = 0
I messed up when setting the default journal size. At first I thought: pfft, the journal just robs space, so make it tiny. My 4 MB (yes, four megabytes) journal made the cluster unworkable: with tiny journals and otherwise default settings I could never reliably keep more than two OSDs up, and data throughput was terrible. I rebuilt with 512 MB journals instead.
[osd]
osd journal size = 512
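The value is in megabytes. To verify what a running OSD actually picked up, the admin socket works well; osd.0 here stands in for whichever OSD id exists on the box:

ceph daemon osd.0 config get osd_journal_size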
The machine was way underpowered, so I tuned a few other things. Cephx authentication was turned off; there are risks to this, but this is a hobbyist project on a secured subnet.
[global]
auth cluster required = none
auth service required = none
auth client required = none
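Auth changes only take effect after the daemons restart. On a sysvinit-era install like this one that was something like the following; adjust for your init system:

sudo service ceph restart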
The cluster uses a ton of memory and CPU when recovering objects. It helps to limit this activity somewhat.
[osd]
osd max backfills = 1
osd recovery max active = 2
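These can also be pushed into a live cluster without a restart via injectargs, though values set this way don't survive a daemon restart:

ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 2'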
And since things could get a bit slow, I increased a few timeouts:
[osd]
osd op thread timeout = 180
osd op complaint time = 300
osd default notify timeout = 240
osd command thread timeout = 180
I was not able to get 2 GB flash keys to come up. Given that 8 GB sticks only cost five bucks, this isn't much of a limitation. I suppose I could use LVM striping to join a bunch of 2 GB sticks into a larger unit, something like the sketch below.
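A rough sketch of that idea, assuming four sticks appear as /dev/sdb through /dev/sde (device names are placeholders, and note that striping means one dead stick takes out the whole volume):

# turn each stick into a physical volume and pool them
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate usbvg /dev/sdb /dev/sdc /dev/sdd /dev/sde
# stripe a logical volume across all four sticks, 64K stripe size
lvcreate -i 4 -I 64 -l 100%FREE -n usbstripe usbvg
mkfs.xfs /dev/usbvg/usbstripe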
The speed is not all that quick: ceph -w reports a write speed of about 3 megabytes per second. That doesn't sound like much, except the data pool I was testing on writes three copies of the data, or six if you count journaling. A lot of things could affect speed: tiny memory, a slow CPU, slow USB sticks, and/or a saturated USB bus.
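ceph -w only shows aggregate client traffic; for a more controlled number, rados bench can drive a pool directly. A sketch against the data pool (pool name assumed; --no-cleanup keeps the benchmark objects around so the read test has something to fetch):

rados bench -p data 30 write --no-cleanup
rados bench -p data 30 seq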
This config uses XFS on the USB sticks, where btrfs might perform better. While the speeds look poor, remember that Ceph OSDs don't report a write as successful until the object has been written to both the media and the journal. I could probably mitigate this double write by having fewer OSDs (joining groups of USB sticks with LVM stripes) and/or by moving the journal to a different device; right now Ceph is using the USB sticks for both data and journal.
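Moving a journal is a per-OSD ceph.conf setting. A hypothetical example pointing osd.0's journal at a partition on the internal disk (the path is a placeholder, and an existing journal has to be flushed and recreated before the switch):

[osd.0]
osd journal = /dev/sda5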
I stress tested RADOS by adding objects until the store filled up. It proved robust: not a single OSD timed out of the pool. As of writing this post I am testing untarring the Linux kernel sources to the Ceph filesystem; I'll keep you posted.
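That untar test is roughly the following; the monitor address, mount point and kernel version are placeholders, and with cephx disabled the kernel client can mount without a secret:

mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs
time tar xJf linux-3.15.tar.xz -C /mnt/cephfs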
My future plans are to expand the cluster using old hardware I have lying about. I'd like to add at least two more nodes, though they won't necessarily be backed by USB thumb drives.