
Advanced Solaris Administration Class (SA-286)
Written by AdamShand on 21 August 1998

This is just a copy of the notes I took during the class. Mostly they are bits and pieces which I thought would be useful to review a couple of months from now, when the class is a vaguer memory and the details and syntax have faded.

Misc Section

'sys-suspend' will write the current state of the system to disk so you can boot up quicker. it will work with power removed etc.

there is a bug in cde... you can not shutdown into single user mode from inside it. exit to a console first.

two useful commands: 'logins' and 'prtconf'

Hardware Section

block devices (/dev/dsk) access the device by block, raw devices (/dev/rdsk) access the device by byte (character).
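
Several of the commands later in these notes want the raw device rather than the block device. A small sketch of mapping one name to the other, assuming the usual pairing where the same name appears under both directories (the function name and the device c0t3d0s0 are made up for illustration):

```shell
# Convert a block device path to its raw (character) counterpart.
# Assumes the usual /dev/dsk/<name> <-> /dev/rdsk/<name> pairing.
to_raw_device() {
    printf '%s\n' "$1" | sed 's|^/dev/dsk/|/dev/rdsk/|'
}

to_raw_device /dev/dsk/c0t3d0s0
```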

you can attach scsi devices to a hot system. not officially supported and may cause your grandmother to get herpes, but it almost always works just fine.

you can rebuild your device tree (/dev) in case of dire accident (or whim), without rebooting, by manually running:

        drvconfig       builds /devices
        devlinks        builds /dev
        tapes/disks     builds /dev
        ucblinks        builds /dev
        ports           builds serial stuff

You can see what it does on boot up by looking at:

    /etc/init.d/[drvconfig|devlinks|...]

Solaris reserves 10% of the file system for root write only. as of 2.6 this is reduced by 1% for every 600mb of the file system (eg. a 1500mb file system would reserve 8%). you can alter this behaviour with the '-m' option to 'newfs'.
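
The rule above as arithmetic. This is only a rough sketch of the rule of thumb as I wrote it down: the function name and the 1% floor are my assumptions, and the real newfs default may be computed differently.

```shell
# Reserved percentage per the rule of thumb in these notes:
# start at 10% and drop 1% for every full 600 MB of file system.
# The 1% floor is an assumption, not something stated in the notes.
minfree_pct() {
    size_mb=$1
    pct=$((10 - size_mb / 600))
    if [ "$pct" -lt 1 ]; then
        pct=1
    fi
    echo "$pct"
}

minfree_pct 1500    # the 1500mb example from the notes, prints 8
```

If the result is not what you want, that is the number you would override with 'newfs -m'.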

Format will allow you to create overlapping partitions... this is bad, make sure you don't do it (eg. slice 1 ends on cylinder 1200 and slice 2 starts at cylinder 1000). this can run fine until the file systems fill up, at which point files will start randomly disappearing and being corrupted.
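
Since format will not stop you, a sanity check is easy to do by hand. A minimal sketch, assuming you have read the start and end cylinders off the partition table yourself (the function name and the outer bounds 0 and 2000 are made up):

```shell
# Succeeds (exit 0) if two inclusive cylinder ranges overlap.
slices_overlap() {
    start1=$1
    end1=$2
    start2=$3
    end2=$4
    [ "$start1" -le "$end2" ] && [ "$start2" -le "$end1" ]
}

# Example from the notes: one slice ends on cylinder 1200,
# the next one starts at cylinder 1000.
if slices_overlap 0 1200 1000 2000; then
    echo "overlap"
fi
```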

logical devices     /dev                used by sysadmins
physical devices    /devices            the real thing, used by everything
instance devices    /etc/path_to_inst   used by the kernel

The overlap slice (traditionally slice 2) can be redefined without any harm, it is merely convention. The only two recommended reasons for changing it are:

  1. you need eight slices on a disk.
  2. you are only going to have one slice on a disk (so make it slice 0).

Tips, Tricks and Trivia

Most compilers use /var/tmp to write temporary files to. by changing the environment variable TMPDIR (?) to point to /tmp you can have it write the temporary files to ram. this can speed up large compiles substantially.
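
What that looks like in a Bourne/Korn shell session or .profile. A sketch only: whether a particular compiler actually honours TMPDIR is an assumption, so check its man page.

```shell
# Point temporary files at /tmp, which is ram/swap backed when
# the swap <-> /tmp mapping is in place.
# Assumption: the compiler in question honours TMPDIR.
TMPDIR=/tmp
export TMPDIR
```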

The reason it's a performance benefit to have /tmp mapped to your swap partition is that it means that any writes to /tmp are being written to ram.

There are TWO swap entries in /etc/vfstab (how did i never notice that??). You can stop the swap <-> /tmp mapping taking place by removing the 'tmpfs' line.

ethernet abbreviations: le is 'lance ethernet', be is 'big mac ethernet', and hme is 'happy meal ethernet' (ie. a big mac plus some).

inode = index node

ONLY 'shutdown' and 'init' run /etc/init.d scripts when changing run levels. this will be important on the oracle server (eg. 'halt' and 'reboot' do NOT run the rc scripts).

With SunOS you had to have at least one mb of swap per mb of ram. this was because it did a one-to-one map between hd and ram. this is no longer true with solaris, you can have as much or as little swap as you like.

the superblock magic number is used to detect corruption in the superblock (ie. if it's not what it should be... it's corrupted). the primary magic number is 11954 (Kirk McKusick's birthday), and the secondary one is 90255 (someone else's birthday).

the /etc/vfstab option 'largefiles' (it's on by default) means that you can mount a partition with a file larger than 2gb. this is new to 2.6 as it supports triple indirect pointers in the inode.

in solaris 2.6 there is NO performance penalty of using a swap file over a swap slice. this is very cool. when you have multiple swap devices (be they files or slices), it puts data across them in one mb interleaves.

swap can be added dynamically (files or slices) with the swap -a command.

when you do a 'boot cdrom -s' at the proms there is a '/a' directory there for you to mount things on. (doh!)

meta devices (/dev/md) can be added to non-destructively, but not removed from.

/etc/dfs/sharetab is actually what is looked at for nfs export permissions. this is updated automatically from /etc/dfs/dfstab when you run 'shareall'.

web nfs is really stupid (my opinion, not sun's :-), but the latest versions of netscape do support it (as well as hotjava, of course).

you can have a maximum of one line with '-o public' listed in /etc/dfs/dfstab.

'share -F nfs -o public /export' means read/write access to the world for your export directory... neat eh?

when fsck'ing a meta device filesystem you must fsck the /dev/md device, not the /dev/dsk devices which comprise it (you get nasty errors if you try).

in /etc/dfs/dfstab the '-F nfs' is mandatory (even though it is the default) because of the way the file is parsed at boot up.

What the UFS file system structure looks like:

This all uses about 10% of your disk as overhead, though of course it can vary depending on how many inodes you have etc.

The boot process looks like this:

  1. the boot proms load the ufs file system reader from the boot block
  2. the kernel is loaded
  3. the kernel mounts the root filesystem and starts init
  4. init steps up through the run levels doing its thing.

With the 5/98 solaris media there is a maintenance cd on which there is a 'fastpatch' program (by casper dik?) which installs patches much faster than patchadd. check it out.

Commands to Remember

newfs -v -i size -m percent <device>

prtconf

prtvtoc <raw device>

mountall -l

fsck <device1> <device2>

fsck <partition> <device>

newfs -N <device>

fsck -o b=32 <device>

fstyp -v <raw device>

fuser -c <partition>

swap [-l|-s]

nfsstat

share (in /etc/dfs/dfstab): syntax for '-o' options

share|dfshares|dfmounts|showmount

clear_locks

automount -v

bc -l

rdate <server>

snoop <hostname>

How NFS Works

if lots of clients are accessing an nfs server you can increase the number of nfsd's running by changing it in /etc/init.d/nfs.server (on the nfsd command line). for optimum performance you should have one nfsd per simultaneous access.

/var/statmon/sm[.bak] sometimes contains outdated information. if statd starts bitching, go into this directory and delete the offending host names, and then restart statd.

if anything changes the file handle of a mount point (the inode is in the file handle) on the server the clients will not be able to continue to utilize the resource without remounting it. the symptom of this is "stale file handle" error messages.

mountd is the ONLY thing that looks at /etc/dfs/sharetab. so if a client has a share mounted and you then revoke their permissions they can still utilize the share because they are talking to nfsd.

mounting nfs shares in the foreground ('fg' option in /etc/vfstab) is really only useful for diskless clients, where if they can't get the share, they may as well hang forever waiting for it.

the only justification for mounting a share as non-interruptible ('nointr' in /etc/vfstab) is "you are so screwed without this share that you may as well hang forever waiting for it to come back".

with bizarre nfs slowness problems often increasing the blocksize to 32 can help (this is not understood voodoo... but has helped many people).

hosts listed in /etc/dfs/dfstab must have the fqdn listed if you are using dns. mountd will not canonicalize them.

if you have /export exported to a client then the client could mount /export/html since they have access rights to anything under /export.

you can run a cachefs filesystem (with it really being an nfs mount) with consistency checking turned off and a client will continue to run even if the nfs mount fails.

you can run a cachefs file system for a subset of an nfs share. (eg. if you have /share mounted you could run a cachefs filesystem on only /share/bin).

Automounter Stuff

automounter is a client side ONLY thing. the resource is shared normally on the server.

you can use either automountd OR /etc/vfstab to mount a share. you can never use both.

if you would get a "mount point busy" error if you tried to unmount a share, then automounter considers the share non-idle (eg. if you login and your home directory is automatically mounted, it will not unmount until 5 minutes after you log out).

you can't list netgroups in auto_home. you have to list either usernames or a * (for everything).

How the NFS Daemons Work

mountd: (runs on the server) this is the process that authenticates a client's request to mount a share.

nfsd: (runs on the server) this is what actually talks to the client's kernel and does the interesting stuff.

statd: (runs on both) notifies clients if/when an nfs server becomes reachable again after being unreachable.

lockd: (runs on both) sends file lock requests from the client to the server. if a client's lockd is notified that a server is back up it will re-issue locks to the server for all files that it believes it has a lock on. if in the meantime the server has granted one of those locks to another client it will issue an error message to the client.

          NFS CLIENT                                     NFS SERVER
 +--------------------------+                   +-----------------------+
 |                          |   mount request   |                       |
 | # mount server:/opt /opt |    ---------->    | mountd                |
 |   (/etc/vfstab)          |   <------------   | (/etc/dfs/sharetab)   |
 |                          |  file handle (fh) |                       |
 |                          |                   |                       |
 |                          |                   |                       |
 |                          |  file access (fh) |                       |
 | # cat /opt/bigfile.txt   |   ------------->  | nfsd                  |
 |                          |    <-----------   |                       |
 |                          |        data       |                       |
 |                          |                   |                       |
 |                          |                   |                       |
 |              +--> lockd  |   <---------->    | lockd <--+            |
 |              |           |                   |          |            |
 |              +--> statd  |   <---------->    | statd <--+            |
 |                          |                   |                       |
 +--------------------------+                   +-----------------------+

NIS and NIS+ Stuff

NIS+ Basics and Some NIS Specific Info

if /var/yp/domainname/ypservers exists (created by 'ypinit -c') then 'ypbind' is started on bootup bound to a particular master/slave server. if it does not exist then it will broadcast for a server and the fastest one to respond is its daddy.

if ypbind is binding to the wrong master/slave server you can force it to bind to a particular server by changing 'ypstart' to start 'ypbind' like this:

you can then use the command 'ypset <server>' to manually set the server it should bind to.

nsswitch.conf neat stuff, more info at 13-41, the below syntax is useful:

what this says is that if the host doesn't exist in the nis host map then don't bother checking the /etc/hosts file. however if the server is down then when a query is made it won't return a 'NOTFOUND' error, it will return an 'UNAVAIL' error so it will check the /etc/hosts file.
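
The syntax line itself did not survive in these notes. The behaviour described matches an /etc/nsswitch.conf entry of the form 'hosts: nis [NOTFOUND=return] files', which is my reconstruction rather than something copied from the class material. A small sketch of the decision that entry encodes (the function name is made up):

```shell
# Models the lookup decision for:  hosts: nis [NOTFOUND=return] files
# The argument is the status the nis lookup came back with.
after_nis() {
    case "$1" in
        SUCCESS)  echo "return (answer found in nis)" ;;
        NOTFOUND) echo "return (do not check /etc/hosts)" ;;
        *)        echo "continue (check /etc/hosts)" ;;
    esac
}

after_nis NOTFOUND
after_nis UNAVAIL
```

So a host that is genuinely missing from the map stops the search, while a dead nis server falls through to the local file.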

/var/yp/domainname/ypservers is a list on the master server of all the slave servers to push information out to when a map is updated. this file is created when 'ypinit -c' is run so there is no text file to update, only a map. so if you need to add/change slave servers you have to either re-run 'ypinit -c' (which will result in a denial of service as you do it) or you can run the neato script provided on ??-?? which will remake the map file for you.

there is a yppasswdd (spelling?) which will allow you to update the passwd/shadow map from a client server. one gotcha with this is that if this is a large map file, then every time someone runs this it will re-push out the ENTIRE map to all of the slaves (you can disable this behavior).

boeing has 300,000 users in their passwd/shadow NIS (not NIS+) maps and it works great, but they are apparently the largest installation in the world.

How NIS+ is Different

a nis slave equals a nisplus replica

a nis map equals a nisplus table

nisplus supports partial updates of tables (eg. changing a user's password only results in that new password being pushed out to all the replica servers).

nisplus is more secure (although there is still no encryption), mostly by providing authentication and authorization features. running nisplus in compatibility mode simply removes the security features (security level 0). this is no worse than simply running nis.

a nisplus domainname must end in a dot (eg. alaska.net.)
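
The trailing dot is easy to forget. A tiny helper sketch that adds it when missing (the function name is made up, and it does no other validation of the name):

```shell
# Append the trailing dot a nisplus domain name needs, if it is missing.
nisplus_domain() {
    case "$1" in
        *.) echo "$1" ;;
        *)  echo "$1." ;;
    esac
}

nisplus_domain alaska.net    # prints alaska.net.
```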

nisplus supports client side nis caching.

Ideas for Internet Alaska

mount /users nosuid

on name servers turn off dns in /etc/nsswitch.conf, and make it not point at itself for dns. this may help resolve hanging problem (obviously an ugly solution).

check out the E250's ... apparently they cost less than ultra2's (16k from sun with rackmount and 2x250mhz cpu's).

track down asd (a free automounter). it has some really good white papers included on nfs... apparently.

what about mounting /share read only to everything but the admin server?

check out acl's (man -k acl), they work over nfs (but will they work with the netapps?), and are pretty cool.

Making KSH Useful

Esc x2 is file name completion

-- .kshrc -- Note: ^ means "control", so you are embedding literal control characters. VI is useful for this.

alias __A=^P
alias __B=^N
alias __C=^F
alias __D=^B
set -o emacs
export PS1="`hostname`(`whoami`)> "

-- .profile --

export ENV=$HOME/.kshrc
export PATH=/bin:/sbin ...



SunSolarisClassSa286 (last edited 2004-06-24 00:28:16 by AdamShand)