Archive for the 'Storage' category

Enterprise iSCSI storage with OpenSolaris and COMSTAR

October 28, 2009 4:11 pm

The goal of this project is to build enterprise-grade iSCSI storage that is modular enough to meet any iSCSI needs.

I chose OpenSolaris for the flexibility we get from ZFS, which everyone has at least heard of, but also for its Common Multiprotocol SCSI Target (COMSTAR) project. I’ll only be discussing the iSCSI target portion of this project, but I recommend reading more on the capabilities of COMSTAR outside of the iSCSI space.

HARDWARE

Since we are talking about building our own storage array, let’s look at some hardware options. My personal preference for a chassis is the Supermicro SC846, for its redundant power, the option of two internal disks so that you can use all 24 hot-swap bays for storage alone, and the ability to use both 3.5″ and 2.5″ drives.

If you prefer to use only 2.5″ drives, you might want to check out the SC216, which also provides 24 drive bays but in only 2 rack units of space.

The next important decision is the HBA(s) you will be using. Keep in mind that since we’ll be using ZFS for this project, we do not want a RAID card, but instead a JBOD card. Trust me on this one: RAID cards can turn out to be a nightmare, plus JBOD HBAs are cheaper. My personal preference in HBA for OpenSolaris is the LSI 3081 card, for its Fusion-MPT chip. You don’t have to buy the LSI brand, but I definitely recommend an HBA with Fusion-MPT. Since these provide 8 SATA ports each, you’ll need 3 of them to support 24 disks.
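
If your chassis or card differs from mine, the HBA count is just a ceiling division of drive bays by ports per card. Here is a minimal shell sketch of that arithmetic; the function name hbas_needed is my own invention for illustration, not a system tool.

```shell
#!/bin/sh
# hbas_needed BAYS PORTS_PER_HBA
# Prints how many HBAs are needed so that every bay gets its own port
# (integer ceiling division). Hypothetical helper, for illustration only.
hbas_needed() {
    bays=$1
    ports=$2
    echo $(( (bays + ports - 1) / ports ))
}

# 24 hot-swap bays on 8-port cards
hbas_needed 24 8
```

For the 24-bay chassis and 8-port cards above this prints 3.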

The last major hardware decision is networking. My preference for 1Gb is a NIC with the 82571EB chip. Intel offers a single-port card, the Intel PRO/1000 MT, and a dual-port version, the Intel PRO/1000 PT.

For 10Gb I recommend a NIC with the 82598EB chip. For a dual-port CX-4 version I use the EXPX9502CX4 card, or if you prefer dual-port SR fiber, go with the EXPX9502AFXSR card.

Just to be clear, I currently use all of the hardware recommended here in OpenSolaris servers, and all of it is confirmed by Sun to be supported in OpenSolaris.

CONFIGURATION

Let’s start with a fresh install of OpenSolaris.

Mirror OS disk

First, let’s mirror our OS disk for added reliability. In this example, OpenSolaris was installed on disk c9d0s0. Our second OS disk is c10d0s0.

1. Create a Solaris disk label on the second disk

# format c10d0s0

Select “fdisk” then “create 100% Standard Solaris Partition over the full Disk”

2. Next, we need to copy the Solaris slice layout from the OS disk to the second disk. (Note that we use s2, the slice that represents the whole disk; this is very important.)

# prtvtoc /dev/rdsk/c9d0s2 | fmthard -s - /dev/rdsk/c10d0s2

3. Next, we’ll attach the mirror disk to the OS zpool

# zpool attach -f rpool c9d0s0 c10d0s0

4. Last, we need to make the second disk bootable

# installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c10d0s0

Static IP
If you don’t want to rely on always getting the same DHCP IP, you’ll probably want to statically configure the IP of your storage server.

First, we need to disable the Network Auto-Magic (NWAM) service

# svcadm disable network/physical:nwam

Next, enable the config file-based networking service

# svcadm enable network/physical:default

Now we must configure the IP statically. This is done by creating an /etc/hostname.<interface> file. In this example I’ll use the e1000g0 interface.

# vi /etc/hostname.e1000g0
192.168.1.200

Configure the netmask for the management IP

# vi /etc/netmasks
192.168.1.0 255.255.255.0
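
The first column in /etc/netmasks is the network number, not the host IP. If you are unsure what it is for your own address, you can get it by ANDing each octet of the IP with the matching octet of the netmask. A rough sketch follows; network_of is a hypothetical helper name of mine, not something that ships with OpenSolaris.

```shell
#!/bin/sh
# network_of IP NETMASK
# Prints the network number for a dotted-quad IP and netmask by ANDing
# each pair of octets. Hypothetical helper, for illustration only.
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both addresses into eight octets
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Print the line that belongs in /etc/netmasks for the management IP
echo "$(network_of 192.168.1.200 255.255.255.0) 255.255.255.0"
```

For the example address this prints the same line shown in the file above.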

Configure the default gateway

# vi /etc/defaultrouter
192.168.1.1

Tell the system to resolve host names through DNS

# cp /etc/nsswitch.dns /etc/nsswitch.conf

Now configure the DNS servers

# vi /etc/resolv.conf
nameserver 192.168.1.4

Configure IP Multi-Pathing (IPMP)
If you went with a dual-port card, or two cards, it’s advisable to use IPMP so that a single link going down doesn’t make your iSCSI volumes inaccessible.

In this example I’m using two e1000g interfaces and creating the IPMP interface iscsi0.

# vi /etc/hostname.iscsi0
ipmp group san0 192.168.1.200 up

The primary interface of the IPMP group is e1000g0

# vi /etc/hostname.e1000g0
group san0 -failover up

The backup interface is e1000g1

# vi /etc/hostname.e1000g1
group san0 -failover standby up

Enable COMSTAR
Install stmf (library and service for COMSTAR)

# pkg install SUNWstmf

Now install the iSCSI toolset

# pkg install SUNWiscsit

At this point, reboot your machine before continuing on.

After rebooting, we will enable the stmf service

# svcadm enable stmf

Creating your zpool
I went with a chassis that supports up to 24 disks to build in room for expansion. Based on your needs, you can fill all 24 hot-swap trays with raw storage to be exported as one or more iSCSI volumes, or you can set some aside to gain the performance benefits of a hybrid pool.

If you are unfamiliar with the term hybrid pool, I suggest reading up on ZIL and L2ARC. Here are a few links to get you started:
ZIL: SLOG BLOG
L2ARC: ZFS L2ARC

So for the purposes of this example, I’ll reserve four drive bays for SSDs, a pair for ZIL and a pair for L2ARC, leaving 20 disks. We can then use these 20 drive slots for 4 RAIDZ groups of 5 disks each. I’m going to start by configuring one, then I’ll explain how to grow your zpool by adding a second RAIDZ for storage expansion.
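
To get a feel for what this layout yields, remember that each single-parity RAIDZ group gives up roughly one disk's worth of space to parity. The sketch below works through the numbers; the 1000 GB disk size is purely an assumption for illustration, raidz_usable_gb is a made-up helper name, and ZFS metadata overhead is ignored.

```shell
#!/bin/sh
# raidz_usable_gb VDEVS DISKS_PER_VDEV DISK_GB
# Rough usable capacity of a pool built from single-parity RAIDZ groups:
# each group loses one disk's worth of space to parity.
# Hypothetical helper; ignores ZFS metadata overhead.
raidz_usable_gb() {
    vdevs=$1
    disks=$2
    disk_gb=$3
    echo $(( vdevs * (disks - 1) * disk_gb ))
}

bays=24
ssd_bays=4                              # 2 for ZIL + 2 for L2ARC
data_bays=$(( bays - ssd_bays ))        # 20
groups=$(( data_bays / 5 ))             # 4 RAIDZ groups of 5 disks

# 1000 GB per disk is an assumed size, not a figure from this post
raidz_usable_gb "$groups" 5 1000
```

With those assumed disks, the fully populated pool would come out at about 16000 GB before overhead.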

Before we go any further, now is a good time to demonstrate two useful commands. First, we can use devfsadm to scan for newly added disks.

# devfsadm -Cv

Second, we can use the format command to list all recognized disks.

# format < /dev/null

For the first 5 storage disks, mine were recognized on controller 7 (c7). I'll create my initial zpool named "iSCSIdisks" as a RAIDZ using all 5 disks.

# zpool create iSCSIdisks raidz c7t0d0 c7t1d0 c7t2d0 c7t3d0 c7t4d0

There we go, we now have our storage to start creating iSCSI volumes. I'm going to now create a 20GB zvol (target volume) that will be used as the disk for a virtual machine.

# zfs create -V 20G iSCSIdisks/vm1_hdd

Next, I need to make a LUN (Logical Unit) out of this volume.

# sbdadm create-lu /dev/zvol/rdsk/iSCSIdisks/vm1_hdd

Now that we have created a logical unit, we need to find out the GUID of this volume so that we can provide it to COMSTAR for iSCSI access. Here's how you list all LUNs that have been created.

# sbdadm list-lu
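
Once you have the GUID, the usual way to hand it to COMSTAR is stmfadm add-view. Below is a hedged sketch of pulling the GUID out of the listing and building that command. The sample listing and its GUID are made up, guid_for is a hypothetical helper, and the parsing assumes the GUID is the first column and the source path is the last. The script only prints the stmfadm command; it does not run it.

```shell
#!/bin/sh
# guid_for SOURCE
# Reads `sbdadm list-lu` style output on stdin and prints the GUID of the
# LU whose SOURCE column matches. Assumes the GUID is the first field and
# the source path is the last field of each data row.
guid_for() {
    awk '$NF == src && $1 ~ /^[0-9A-Fa-f]+$/ { print $1 }' src="$1" -
}

# Made-up sample listing; on a real system pipe in `sbdadm list-lu` instead.
sample='Found 1 LU(s)

              GUID                    DATA SIZE           SOURCE
--------------------------------  -------------------  ----------------
600144f0aabbccdd0000000000000001      21474836480      /dev/zvol/rdsk/iSCSIdisks/vm1_hdd'

guid=$(printf '%s\n' "$sample" | guid_for /dev/zvol/rdsk/iSCSIdisks/vm1_hdd)

# Print (do not run) the command that would make the LU visible to all
# initiators through all targets.
echo "stmfadm add-view $guid"
```

Run without host or target group options, add-view exposes the LU to everyone, which matches the wide-open setup used in this walkthrough.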

If you don't already have the iSCSI target service enabled, now would be a good time to do so.

# svcadm enable -r svc:/network/iscsi/target:default

I'm going to create a basic iSCSI target configuration here that leaves this storage wide open to be accessed by anyone. I suggest you secure yours; to do so, read up on itadm in the man page.

# itadm create-target

You can now see your newly created iSCSI target, and all previously created ones, using the itadm command.

# itadm list-target

You're all set to access this storage remotely.

The last thing I want to come back to is how we will grow our underlying storage as we need to expand. Following the previous example of a 5 disk RAIDZ, I'll just add a second 5 disk RAIDZ to the zpool iSCSIdisks.

Since I have 3 LSI HBAs, each with 8 ports, my next 5 disks will consume the last 3 ports of my first HBA and the first 2 ports of my second one. I plug in the 5 new disks, run "devfsadm -Cv" then run "format < /dev/null" to ensure they have been recognized. Now I'm ready to add them.

# zpool add iSCSIdisks raidz c7t5d0 c7t6d0 c7t7d0 c8t0d0 c8t1d0
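
If you would rather not work out by hand which ports the next batch of disks lands on, the mapping can be sketched in a few lines. This assumes 8 ports per HBA and that the HBAs show up as consecutive controllers starting at c7, as in the example above; your controller numbers may well differ, so always check against the format output. disk_name is a hypothetical helper, and the script only prints the command.

```shell
#!/bin/sh
# disk_name INDEX
# Maps a zero-based disk index to a cXtYd0 name, assuming 8 ports per HBA
# and consecutive controller numbers starting at c7 (an assumption taken
# from the example system, not a rule).
disk_name() {
    echo "c$(( 7 + $1 / 8 ))t$(( $1 % 8 ))d0"
}

# Disks 5 through 9 form the second RAIDZ group
names=""
for i in 5 6 7 8 9; do
    names="$names $(disk_name "$i")"
done

# Print (do not run) the resulting command
echo "zpool add iSCSIdisks raidz$names"
```

The printed command matches the one used above for the second group of five disks.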

And that’s it, your zpool is now grown and ready to be sliced up into more iSCSI targets.
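
For completeness, here is roughly what adding the four reserved SSDs to the pool would look like: a mirrored pair of log devices for the ZIL and two cache devices for L2ARC (cache devices cannot be mirrored). The device names are placeholders, since I have not shown where the SSDs enumerate, and hybrid_cmds is a hypothetical helper that only prints the commands so you can review them first.

```shell
#!/bin/sh
# hybrid_cmds POOL LOG1 LOG2 CACHE1 CACHE2
# Prints the zpool commands that would add a mirrored log (ZIL) device
# pair and two cache (L2ARC) devices to POOL. Nothing is executed.
hybrid_cmds() {
    pool=$1
    echo "zpool add $pool log mirror $2 $3"
    echo "zpool add $pool cache $4 $5"
}

# Device names below are placeholders for the four SSD bays
hybrid_cmds iSCSIdisks c11t4d0 c11t5d0 c11t6d0 c11t7d0
```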

Enjoy your new enterprise iSCSI array, and don’t forget to check out ZIL and L2ARC!

AppleTalk on Solaris (AFP)

July 27, 2008 7:45 pm

AppleTalk has been quite a popular choice for networking computers mainly because it’s, well, made by Apple. It was supposedly made solely for Apple computers, but Internet folks have never been ones to settle for should-be’s and would rather run after could-be’s. If cheap mobile phones today can be hacked to become as efficient as high-end phones, why not try making AppleTalk work on other operating systems, right? That’s what we’re aiming to do on this page: make AppleTalk work on Solaris.

[Taken from: www.unixzone.dk]

Netatalk 2.0.3 requires some patching to compile on Solaris 10 (or OpenSolaris)

    Requirements:

  • netatalk
  • Berkeley DB 4.2.52
  • GCC compiler, Sun Studio didn’t work for me
  • Patches: netatalk-2.0.3/sys/netatalk/at.h and netatalk-2.0.3/sys/solaris/tpi.c

Click <HERE> for build instructions for DB

On Solaris we don’t use ranlib, so RANLIB is set to echo. LDFLAGS adds
the directory where my Berkeley DB libs are to the runtime library
search path; the rest of the options are self-explanatory.

# gzip -cd netatalk-2.0.3.tar.gz | tar xf -
# gzip -cd patches.tar.gz | tar xf -
# cd netatalk-2.0.3
# RANLIB=echo CC=gcc LDFLAGS=-R/usr/local/BerkeleyDB.4.2/lib \
./configure --prefix=/opt/netatalk --with-ssl-dir=/usr/sfw \
--with-bdb=/usr/local/BerkeleyDB.4.2 --without-pam --disable-ddp \
--disable-tcp-wrappers --disable-srvloc --with-cnid-dbd-backend
# echo "#define SOLARIS2 10" >>config.h

Depending on the version of your Solaris installation, you’ll want to
change the SOLARIS2 value to match, i.e. 8, 9, 10, or 11 for OpenSolaris.
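
If you script the build, the value can be derived from the release string instead of typed by hand, since uname -r reports 5.10 on Solaris 10 and 5.11 on OpenSolaris. A small sketch follows; solaris2_value is a hypothetical helper name, and fixed release strings are used so the example behaves the same on any machine.

```shell
#!/bin/sh
# solaris2_value RELEASE
# Maps a `uname -r` style release string such as 5.10 to the number used
# for the SOLARIS2 define by dropping the leading "5.".
# Hypothetical helper, for illustration only.
solaris2_value() {
    echo "${1#5.}"
}

# On the build host you could pass "$(uname -r)"; fixed strings are used
# here so the sketch behaves the same everywhere.
for rel in 5.8 5.9 5.10 5.11; do
    echo "#define SOLARIS2 $(solaris2_value "$rel")"
done
```

On the build host itself, the line to append to config.h would be the one produced for "$(uname -r)".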

Patch the source to support x64 Solaris

# patch -i ../patches/at.h.patch sys/netatalk/at.h
Looks like a unified context diff.
done
# patch -i ../patches/config.h.patch ./config.h
Looks like a normal diff.
done
# patch -i ../patches/endian.h.patch sys/netatalk/endian.h
Looks like a normal diff.
done
# patch -i ../patches/tpi.c.patch sys/solaris/tpi.c
Looks like a unified context diff.
done

Build and install the software

# make
# make install

Under Solaris, you must create atalkd.conf, since Solaris provides no
method for determining the names of the available interfaces. It is
sufficient to name the available interfaces in atalkd.conf, one per line.
E.g.
eri0
on a line by itself on many Suns, hme0 on Ultras etc. See atalkd(8).

Create an init script and add it to Sun’s SMF service system

# cp distrib/initscripts/rc.atalk.sysv /opt/local/lib/svc/method/netatalk

Place netatalk.xml somewhere on the file system

# svccfg import /path/to/netatalk.xml
# svcadm enable netatalk
# rm /path/to/netatalk.xml

Now for configuration:

# cd /opt/netatalk/etc/netatalk/
# ls -l
total 96
-rw-r--r--   1 root     root       5066   Apr  4 15:21 AppleVolumes.default
-rw-r--r--   1 root     root       25124  Apr  2 14:49 AppleVolumes.system
-rw-r--r--   1 root     root       11259  Apr  4 14:59 afpd.conf
-rw-r--r--   1 root     root       1059   Apr  4 11:57 atalkd.conf
-rw-r--r--   1 root     root       1429   Apr  4 15:01 netatalk.conf
-rw-r--r--   1 root     root       1479   Apr  2 14:49 papd.conf
drwxr-xr-x   2 root     root       512    Apr  3 11:49 uams
#

Add the following to “afpd.conf”:

 "Solaris AFP" -uamlist uams_guest.so -loginmesg "Welcome, $u!" -transall -noddp -tcp

Configure “netatalk.conf” as seen here:

# Appletalk configuration
# Change this to increase the maximum number of clients that can connect:
AFPD_MAX_CLIENTS=50

# Change this to set the machine’s atalk name and zone, the latter containing
# the ‘@’ sign as first character — compare with nbp_name(3) if in doubt
#
# NOTE: If Netatalk should register AppleTalk services in the standard zone
#       then you need not to specify a zone name here.

#
#       If your zone has spaces in it, you’re better off specifying
#       it in afpd.conf if you realize that your distribution doesn’t
#       handle spaces correctly in the startup script. Remember to use
#       quotes here if the zone name contains spaces.
#
#ATALK_ZONE="@some zone"
ATALK_NAME=`echo ${HOSTNAME}|cut -d. -f1`
# specify the Mac and unix charsets to be used

ATALK_MAC_CHARSET='MAC_ROMAN'
ATALK_UNIX_CHARSET='LOCALE'
# specify this if you don’t want guest, clrtxt, and dhx
# available options: uams_guest.so, uams_clrtxt.so, uams_dhx.so,
#                    uams_randnum.so
#AFPD_UAMLIST="-U uams_clrtxt.so,uams_dhx.so"
# Change this to set the id of the guest user
AFPD_GUEST=nobody
# Set which daemons to run (papd is dependent upon atalkd):

ATALKD_RUN=no
PAPD_RUN=no
CNID_METAD_RUN=yes
AFPD_RUN=yes
TIMELORD_RUN=no
A2BOOT_RUN=no
# Control whether the daemons are started in the background
ATALK_BGROUND=no
# export the charsets, read from ENV by apps

export ATALK_MAC_CHARSET
export ATALK_UNIX_CHARSET

Add the following to “AppleVolumes.default”:

:DEFAULT: cnidscheme:dbd
/Storage "Storage" rwlist:nobody