Saturday, December 27, 2008

ZFS Server

Obviously I have an interest in OpenSolaris because of who I work for. Several recent events have given me more than a passing interest in OpenSolaris, and more specifically in ZFS.

I have been in the computer industry for a long time now. While I have seen enterprise disks fail, including ones in many different forms of redundant setups, I had never personally had a disk failure. I actually have quite a sizable stack of disks from old systems that I have kept because, you know, some day I might need them.

Over the last 2 years I have started to see my personal disks fail. I have actually had 4 disks fail in that time. One was 12 months and 3 days old. Yes, it had a one-year warranty. I am pretty obsessive about my personal backups, and so far I have not lost any data, but I need something more to feel comfortable.

First I started with the question: how much backup space do I need? I have a sizable collection of songs ripped from my CDs into iTunes. I have also started downloading songs from iTunes, and because of their DRM, if you lose the file, you have to repurchase the track. Next, we have an HDD camcorder, which leads to a lot of digital files for the home videos. Beyond that I have the normal amount of work and personal files. All said and done, I have over 500GB of data that I need to back up. This data also resides on 3 separate systems: 2 Macs and one Windows machine.

I started by looking at on-line backup services. I did quite an extensive search. The services fell into two categories. The first were inexpensive or free, and quite frankly I would not trust them with my data. The second I would trust with my data, but they were cost prohibitive. In both cases the challenge of getting the initial data load to the service was enormous. Pushing 500GB up a 1MB upstream will take a VERY long time. My conclusion is that on-line backup is not right for me.
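To put a rough number on "a VERY long time", here is a small sketch of the arithmetic. It assumes about 500GB of data and treats the upstream speed as a parameter, since "1MB" could be read as 1 megabit or 1 megabyte per second. The function name upload_days is made up for this illustration, and protocol overhead is ignored.

```shell
# Rough upload-time estimate in whole days (integer arithmetic, rounds down).
# $1 = data size in GB (decimal), $2 = upstream speed in megabits per second.
# upload_days is a hypothetical helper name, not a standard tool.
upload_days() {
    # 1 GB = 8000 megabits; there are 86400 seconds in a day
    echo $(( $1 * 8000 / $2 / 86400 ))
}

upload_days 500 1   # upstream read as 1 megabit per second
upload_days 500 8   # upstream read as 1 megabyte (8 megabits) per second
```

On either reading the initial load would tie up the link around the clock for days, and on the slower one for well over a month.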

I then started to research home NAS solutions. Again this left me wanting. The home NAS units are either ridiculously expensive, or proprietary and not very expandable. I wanted a flexible solution that I knew I could easily upgrade, and I was not willing to pay thousands of dollars for a NAS that fits these criteria.

Then, like Sir Isaac Newton getting hit on the head with an apple, it dawned on me. Why not build a ZFS file server? Unlike Newton's, my idea was not so revolutionary. A quick Google search turns up tons of hits from people who have done just this. Another search of blogs.sun.com will turn up many hits from people at Sun who have done this.

When I originally sat down to write this entry it was going to be a how-to. The how-tos are already well documented, so instead I have decided to write about my journey through the process. The links below are the steps that I went through to get my home NAS up and running.


Building the Server

My journey really started with this blog. It has a great overview of why you want to use ZFS, how to build your hardware, how to set up ZFS and how to use it. At first the information might seem daunting, but as with most new projects, once you get into it you realize the project is broken down into pieces that can be easily followed.

The first step of the project is identifying and putting together your hardware. One of our Sun colleagues has documented a spectacular build. I seriously started going down this road. To get the parts in the US you are looking at about $800-$900. While this is still an exceptional value, it is quite frankly a bit more than I was looking to spend.

My build started in another direction. Cheap. My goal was to build the system as cheaply as I could. Again, it is going to be open, and I can always upgrade the different components as needed. I have a difficult time parting with old gear. I have a Dell 4550 that has long given up the ghost for running Windows. Step one of my build was cracking it open and giving it a good dose of canned air to get the cobwebs out of it. The system only had 512MB of RAM, but a quick exploration through my stockpiles found a compatible DIMM to bring the system to 1GB. The system also has a P4 running at 2.5 GHz.

Now if you have read about ZFS you are probably asking yourself what I was thinking, running on this minimum of specs. Doesn't ZFS need a 64-bit chip? Don't I need more RAM? Well, to be honest, I had no idea what my performance was going to be like. But again, my goal was for the system to be a backup server. I have 2 programs, one for the Mac and one for Windows, that copy files from the systems I run to the backup server. These programs are all scheduled to run in the middle of the night. It really doesn't matter if they take 30 minutes or an hour. Also, after the initial load I would only be moving incremental changes, which is really not a significant amount of data. Therefore I decided to press ahead with my 7-year-old box.

I started by installing OpenSolaris 2008.11 on the box. All of the critical devices in the box were found, and the system was up and running in no time. Now came some interesting architectural decisions. My goal was to have 4 drives for the ZFS pool, and a separate drive for the OS. This makes it possible to update the OS or replace the OS drive without interfering with the storage pool.

My first thought was to boot the system from a USB drive running the OpenSolaris operating system. The 4550 did not allow booting from USB. I did find I was 8 revisions behind on the BIOS and upgraded it, but alas, still no way to boot from USB.

At this point my original plan changed. The case has slots for more drives, so I decided to keep the IDE interfaces, one for a hard drive and one for the DVD, and to add a SATA card to hang the rest of the drives off of. When finished, the box has 5 hard drives (one IDE and 4 SATA) and an IDE DVD, and I decided to leave the floppy in it as well.

I decided that since I have a GigE switch I would upgrade the network card as well. After checking the HCL, I was off to my local computer store to see what parts I could come up with.

Based on the HCL and what was in the store I picked up the following additional components:
D-Link DGE-530T GigE network adapter

SIIG SATA 4 channel

3 Seagate 500GB drives

As often goes with adventures like this, the plan changed. My original goal was to buy 4 hard drives. I was hoping for 250GB to 300GB drives to get me to a pool of 750GB or 900GB. The store I went to was having a sale on the 500GB drives, and I was able to pick up 3 of them for less than 4 of the others. I was able to get a 1TB pool, and this sets me up for adding another 500GB to my ZFS pool very easily in the future, by simply getting another disk when I am out of space.

Getting to GigE

I have an old computer that I decided to install OpenSolaris on. I have a GigE switch in my lab, decided the onboard 100Mb card was not good enough, and decided to go GigE. The old adage that if it ain't broke, don't fix it, should have come to mind here, but….

The first thing I did was check the HCL. With a few models in mind I headed off to MicroCenter. They happened to have a D-Link DGE-530T, and since it was on the compatibility list I decided to go for it. I installed the card and followed the instructions on the HCL. During the installation I got errors, and the card did not work.

I went back to the HCL list and looked at the details. The card that is known to work has the following config:
Node 0x00002c

assigned-addresses: 820e0810.00000000.e2000000.00000000.00004000.810e0814.00000000.00002000.00000000.00000100

reg: 000e0800.00000000.00000000.00000000.00000000.020e0810.00000000.00000000.00000000.00004000.010e0814.00000000.00000000.00000000.00000100

compatible: 'pci1186,4b01.1186.4b01.11' + 'pci1186,4b01.1186.4b01' + 'pci1186,4b01' + 'pci1186,4b01.11' + 'pci1186,4b01' + 'pciclass,020000' + 'pciclass,0200'

model: 'Ethernet controller'

power-consumption: 00000001.00000001

66mhz-capable:

fast-back-to-back:

devsel-speed: 00000001

interrupts: 00000001

max-latency: 0000001f

min-grant: 00000017

subsystem-vendor-id: 00001186

subsystem-id: 00004b01

unit-address: '1'

class-code: 00020000

revision-id: 00000011

vendor-id: 00001186

device-id: 00004b01

name: 'pci1186,4b01'


My card has:

Node 0x000010

assigned-addresses: 8100fb20.00000000.0000dc80.00000000.00000020

reg: 0000fb00.00000000.00000000.00000000.00000000.0100fb20.00000000.00000000.00000000.00000020

compatible: 'pci8086,24c3.1028.142.1' + 'pci8086,24c3.1028.142' + 'p

model: 'Ethernet controller'

power-consumption: 00000001.00000001

66mhz-capable:

fast-back-to-back:

devsel-speed: 00000001

interrupts: 00000001

max-latency: 0000001f

min-grant: 00000017

subsystem-vendor-id: 00001186

subsystem-id: 00004c00

unit-address: 'c'

class-code: 00020000

revision-id: 00000011

vendor-id: 00001186

device-id: 00004c00

name: 'pci1186,4c00'


Now, to be fair, I am not a hardware expert, but obviously the names are different. I started googling around and found this blog post. We now know from the blog post that the name encodes the Vendor ID and the Product ID. Since the first half is the same, and since both cards are D-Links, things are adding up.

Now what about the product ID? There is a website that documents all of the unique codes on PCI cards. The website shows that a 4b01 has a Marvell 88E8001 chip and that the 4c00 has the Marvell 88E8003 chip. Even though I have a different chipset, I decided to give it a whirl and ran the following command:
/usr/sbin/update_drv -a -i "pci1186,4c00" skge

This brought the interface on-line and it seems to be working!
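As a small illustration of how the name property breaks down, the sketch below splits a name such as 'pci1186,4c00' into its vendor and product halves using plain shell parameter expansion. The helper name pci_ids is made up for this example; it only handles names of the form pciVVVV,DDDD.

```shell
# Split a PCI node name like "pci1186,4c00" into vendor ID and product ID.
# pci_ids is a hypothetical helper name used only for this illustration.
pci_ids() {
    name=${1#pci}          # drop the leading "pci" prefix
    vendor=${name%%,*}     # everything before the comma
    device=${name#*,}      # everything after the comma
    echo "$vendor $device"
}

pci_ids pci1186,4b01   # the card listed on the HCL
pci_ids pci1186,4c00   # the card I bought
```

Both calls report the same vendor half (1186) and differ only in the product half, which matches what the two listings above show.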


Getting to SATA

I am continuing to experiment with OpenSolaris on an old computer that I have. I decided I wanted to put more disks in the server and began to research what cards were available. The Hardware Compatibility List lists 2 different PCI-based SATA cards that are known to work. Both of the cards have the Silicon Image 3112A chipset in them.

I headed over to my local computer store, MicroCenter, and they had a SIIG 4 channel SATA card with the Silicon Image chipset on it. Knowing it would be a 50/50 chance that it would work I went ahead and picked it up.

I installed the card and hooked up a SATA drive I had available. On boot the card showed up in the boot screens and it detected the drive. Good sign. Once OpenSolaris was up and running, though, no luck. Looks like the 50/50 bet had played against me. OpenSolaris did not recognize the card.

I started googling around and found a lot of hits. Some said the cards did not work. Some said that if you ran the update_drv command it would work. I tried running the command but still no luck. Next I hit upon a couple of Windows users who were having issues with the card. Some of the responses hinted at using different firmware on the card. This hint got me to this OpenSolaris bug ID. It seems the card ships in a RAID configuration, which does not work with OpenSolaris. You can go to the original manufacturer and get a different BIOS for the SATA card that presents the disks as JBOD instead of RAID, and it will work.

The download page for the chipset can be found here. The next challenge: how to update the BIOS on the card? There is a DOS-based and a Windows-based utility in the downloads directory. Obviously, since I am running OpenSolaris, the Windows utility was not going to be of much use to me. The computer does have a floppy drive, so DOS boot it was!

This led to probably the most amusing or ironic step of this whole process. I found myself building a DOS boot disk using a USB floppy drive attached to my Mac, passed through to a Windows Virtual Machine.

There are 3 versions of the BIOS available. One passes the disks through as plain disks, one builds the RAID, and one is to be used with the chipset built into motherboards. Make sure you pick the correct one!

With my SATA card BIOS flashed to the new version, I rebooted. This time OpenSolaris could see the SATA disks.


ZFS File share

Working with ZFS is actually quite easy. If you have done any work with any other type of RAID, NAS, etc., you will truly understand what an amazing file system ZFS is and how easy it is to work with. The first time I built a ZFS filesystem for a customer, they did not believe me because I had done it so quickly, and made me delete it and do it again. No challenge whatsoever with ZFS!

The official docs for ZFS can be found here and a good best practice guide can be found here.

So without further ado, let's create the ZFS pool. First I run the format command to see a list of my disks.

[Screenshot: output of the format command on ElmerFudd]

The disks that I want to use are c3d1, c4d0 and c4d1. These are all 500GB disks that I will put into a raidz1 configuration. raidz1 allows an entire disk to fail without data loss. With raidz1, your pool will be the combined size of your disks minus one disk, so in my case I will wind up with about 1TB of space in my pool (3*500GB - 500GB).
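The capacity rule above is simple enough to sketch as a one-line calculation. This is only back-of-the-envelope arithmetic: it assumes equal-sized disks, ignores filesystem overhead and the difference between decimal and binary gigabytes, and the function name raidz_usable_gb is made up for this illustration.

```shell
# Approximate usable raidz capacity in GB.
# $1 = number of disks, $2 = size of each disk in GB,
# $3 = number of parity disks (defaults to 1, as in raidz1).
# raidz_usable_gb is a hypothetical helper name, not a ZFS command.
raidz_usable_gb() {
    echo $(( ($1 - ${3:-1}) * $2 ))
}

raidz_usable_gb 3 500   # the three 500GB disks used here
raidz_usable_gb 4 250   # the smaller four-disk option I originally had in mind
```

The same rule explains the 750GB and 900GB figures I was originally targeting with four 250GB or 300GB drives.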

Now you may have noticed that the server I am working on is named ElmerFudd, so what better name for the share than shotgun? The command below creates a ZFS pool. The -m option defines the mount point, raidz1 defines the RAID level to use, and the list of disks defines which disks should go into the pool:
#zpool create -m /export/shotgun shotgun raidz1 c3d1 c4d0 c4d1

Almost instantaneously the raid is created!

We can check the status of the pool with the following command:
#zpool status shotgun

[Screenshot: output of zpool status shotgun]

Note with this command we can see that there are 3 drives configured in raidz1.

Great, now we have an almost 1TB raidz file system, ready to store all of my backup data. The next challenge is that my data is on my Macs and Windows machine. How do I share the file system over the network so that the other machines can see it? I chose to use the Solaris SMB service, which implements the CIFS protocol and which I knew both my Mac and Windows box would be able to see.

I again started with Google to find the steps necessary to get SMB sharing a ZFS filesystem on OpenSolaris. I came across this blog which has the necessary steps, which I have included here as well. More details about SMB shares can be found in the Sun docs on the subject.

The SMB service isn't included by default; you can install it with:
# pfexec pkg install SUNWsmbs

# pfexec pkg install SUNWsmbskr

You then need to reboot the system.

Next you will need to enable the server so that it starts automatically at system boot:
# svcadm enable -r smb/server

If you get an error about more than one interface, it is okay.

Check to see if the smb service is running:
# svcs | grep smb
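If you would rather get a single word back than eyeball the grep output, a small filter like the sketch below can help. It assumes the usual svcs column layout of STATE, STIME and FMRI, and the function name smb_state is made up for this illustration.

```shell
# Print the state (e.g. online, offline, maintenance) of the SMB server
# service, reading `svcs`-style lines (STATE STIME FMRI) on stdin.
# Prints "missing" if no matching service line is found.
# smb_state is a hypothetical helper name, not part of Solaris.
smb_state() {
    awk '$3 ~ /smb\/server/ { print $1; found = 1; exit }
         END { if (!found) print "missing" }'
}

# Intended use on the server itself:
#   svcs -a | smb_state
```

A result other than online usually means the service did not start cleanly and is worth investigating before moving on.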

Next we need to modify our pool to turn on SMB sharing. The following command does just that:
#zfs set sharesmb=on shotgun

The following command shows us the properties of our ZFS pool. Notice the sharesmb property is turned on.
#zfs get all shotgun

[Screenshot: output of zfs get all shotgun]

Next we need to modify the pam.conf file to allow users to authenticate against the share. SMB keeps a separate password file, so you need to run the passwd command for any user who you want to be able to mount the SMB share.

Add this line to /etc/pam.conf:
# Seem to need this line for smb / cifs:

other password required pam_smb_passwd.so.1 nowarn

Then reset the password:
#passwd <user who wants share access>

Now for the real test. Can I see the share from my Mac and Windows machine? On my Mac, using Finder, I went to the Go menu and then Connect to Server. In the connect box I typed smb://elmerfudd/shotgun. I was prompted for my username and password and the share mounted.

From my Windows system I went to Map Network Drive and entered \\elmerfudd\shotgun. Again, after providing my credentials, the share was mounted.


Building my ZFS Server Journey comes to an end

Building my home NAS was quite an exciting journey for me. Thank you for coming along with me through these entries!

My goal when I set out was to build a home ZFS file server to provide a safe backup point for my data. My other goal was to do it as inexpensively as possible. My total build cost was ~$225 for the purchase of the hard drives, SATA card, and network card. Truly an exceptional value for what I now have!

From a performance standpoint, the file server is extremely responsive. To be fair, if there were several users hitting it, it might not perform as well, but for my home office environment it is spectacular.

I hope that this journey inspires you to take on the challenge of building a home NAS for yourself. I know many home users have that old computer sitting in the basement or closet wondering what to do with it. Put it to good use and do yourself a huge favor by backing up your data.

And while it is sad to see this journey come to an end, maybe it hasn't quite? There is a new feature in OpenSolaris 2008.11 called Time Slider, a GUI for building snapshots. Sounds like an opportunity to let the adventure continue!

