Advanced Solaris Administration Class (SA-286)
Written by AdamShand on 21 August 1998
This is just a copy of the notes I took during the class. Mostly they are bits and pieces which I thought would be useful to review a couple of months from now, when the class is a vaguer memory and the details and syntax have faded.
Misc Section
'sys-suspend' will write the current state of the system to disk so you can boot up quicker. it will work with power removed etc.
there is a bug in cde... you cannot shut down into single user mode from inside it. exit to a console first.
two useful commands: 'logins' and 'prtconf'
Hardware Section
block devices (/dev/dsk) access the device by block, raw devices (/dev/rdsk) access the device by byte (character).
you can attach scsi devices to a hot system. not officially supported and may cause your grandmother to get herpes, but it almost always works just fine.
you can rebuild your devices tree (/dev) in case of dire accident (or whim), without rebooting, by manually running:
drvconfig builds /devices
devlinks builds /dev
tapes/disks builds /dev
ucblinks builds /dev
ports builds serial stuff
You can see what it does on boot up by looking at:
/etc/init.d/[drvconfig|devlinks|...]
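A minimal sketch of running the rebuild commands listed above in one go. The run() wrapper and the DRY_RUN variable are my own additions, not Solaris tools, and by default the script only prints what it would run. The notes list "tapes/disks" together; running disks before tapes here is an assumption.

```shell
#!/bin/sh
# Sketch: rebuild /devices and /dev without rebooting.
# run() and DRY_RUN are hypothetical helpers, not part of Solaris.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 as root to really run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rebuild_dev_tree() {
    run drvconfig   # builds /devices
    run devlinks    # builds /dev
    run disks       # builds /dev/dsk and /dev/rdsk
    run tapes       # builds /dev/rmt
    run ucblinks    # builds the SunOS 4.x compatibility names in /dev
    run ports       # builds the serial port entries
}

rebuild_dev_tree
```

Leaving the dry run on by default seemed the safer choice for something that touches /dev.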
Solaris reserves 10% of the file system for root write only. as of 2.6 this is reduced by 1% for every 600mb of the file system (eg. a 1500mb file system would reserve 8%). you can alter this behaviour with the '-m' option to 'newfs'.
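The arithmetic in that note can be sketched as below. This only encodes the rule as I wrote it down in class (10%, minus 1% per full 600mb); the 1% floor is an assumption, since the note gives no lower limit. As far as I recall, the newfs man page describes the default differently (64mb divided by the partition size, as a percentage, limited to between 1% and 10%), so check the man page before relying on either number.

```shell
# minfree_percent SIZE_MB
# Rule as stated in the note: start at 10%, subtract 1% per full 600 MB.
# The floor of 1% is an assumption; the note does not give one.
minfree_percent() {
    size_mb=$1
    pct=$((10 - size_mb / 600))
    [ "$pct" -lt 1 ] && pct=1
    echo "$pct"
}

minfree_percent 1500   # the note's example, prints 8
```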
Format will allow you to create overlapping partitions... this is bad, make sure you don't do it (eg. slice 1 ends on cylinder 1200 and slice 2 starts at cylinder 1000). this can run fine until the file systems fill up, at which point files will start randomly disappearing and being corrupted.
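Since format will not stop you, a quick check of the cylinder ranges before labelling is cheap insurance. A minimal sketch, assuming you have copied the start and end cylinders of two slices from format's partition table by hand; the function name is made up.

```shell
# slices_overlap START1 END1 START2 END2
# Succeeds (exit 0) if the two inclusive cylinder ranges share a cylinder.
slices_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# The example from the note: slice 1 ends on cylinder 1200 and slice 2
# starts at cylinder 1000. The other two boundaries (0 and 2000) are made up.
if slices_overlap 0 1200 1000 2000; then
    echo "overlap"
else
    echo "ok"
fi
```

The overlap slice (slice 2) will of course overlap everything, so leave it out of the comparison.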
logical devices    /dev                used by sysadmins
physical devices   /devices            the real thing, used by everything
instance devices   /etc/path_to_inst   used by the kernel
The overlap slice (traditionally slice 2) can be redefined without any harm, it is merely convention. The only two recommended reasons for changing it are:
- you need eight slices on a disk.
- you are only going to have one slice on a disk (so make it slice 0).
Tips, Tricks and Trivia
Most compilers use /var/tmp to write temporary files to. by changing the environment variable TMPDIR to point to /tmp you can have them write the temporary files to ram. this can speed up large compiles substantially.
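A small sketch of setting it for the current shell before kicking off a build. Whether a given compiler honours TMPDIR is an assumption you should check for your toolchain.

```shell
# Point compiler temp files at /tmp. The two-step form is used because
# the old Bourne shell does not accept "export VAR=value".
TMPDIR=/tmp
export TMPDIR
echo "TMPDIR is now $TMPDIR"
```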
The reason it's a performance benefit to have /tmp mapped to your swap partition is that it means that any writes to /tmp are being written to ram.
There are TWO swap entries in /etc/vfstab (how did i never notice that??). You can stop the swap <-> /tmp mapping taking place by removing the 'tmpfs' line.
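To see the two entries side by side, something like the sketch below works. The function name and the sample file are made up for illustration, and the device names in the sample are hypothetical; on a real box you would just run it against /etc/vfstab.

```shell
# swap_entries [FILE]
# Print the vfstab lines that involve swap: field 1 is the device to
# mount and field 4 is the file system type. Comment lines are skipped.
swap_entries() {
    awk '$1 !~ /^#/ && ($1 == "swap" || $4 == "swap")' "${1:-/etc/vfstab}"
}

# Demonstration on a made-up vfstab.
sample=${TMPDIR:-/tmp}/vfstab.sample.$$
cat > "$sample" <<'EOF'
#device            device              mount  FS     fsck  mount    mount
#to mount          to fsck             point  type   pass  at boot  options
/dev/dsk/c0t3d0s0  /dev/rdsk/c0t3d0s0  /      ufs    1     no       -
/dev/dsk/c0t3d0s1  -                   -      swap   -     no       -
swap               -                   /tmp   tmpfs  -     yes      -
EOF
swap_entries "$sample"
rm -f "$sample"
```

The first line it prints is the real swap slice, the second is the one that maps /tmp onto swap.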
ethernet abbreviations: le is 'lance ethernet', be is 'big mac ethernet', and hme is 'happy meal ethernet' (ie. a big mac plus some).
inode = index node
ONLY 'shutdown' and 'init' run /etc/init.d scripts when changing run levels. this will be important on the oracle server (eg. 'halt' and 'reboot' do NOT run the rc scripts).
With SunOS you had to have at least one mb of swap per mb of ram. this was because it did a one-to-one mapping between hd and ram. this is no longer true with solaris, you can have as much or as little swap as you like.
the superblock magic number is used to detect corruption in the superblock (ie. if it's not what it should be... it's corrupted). the primary superblock number is 11954 (hex 0x011954; the class said BillJoy's birthday, though it is usually credited to Kirk McKusick, born 19 January 1954), and the secondary one is 90255 (someone else's birthday).
the /etc/vfstab option 'largefiles' (it's on by default) means that you can mount a partition with a file larger than 2gb. this is new to 2.6 as it supports triple indirect pointers in the inode.
in solaris 2.6 there is NO performance penalty of using a swap file over a swap slice. this is very cool. when you have multiple swap devices (be they files or slices), it puts data across them in one mb interleaves.
swap can be added dynamically (files or slices) with the swap -a command.
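A minimal sketch of adding a swap file on the fly. The run() wrapper and DRY_RUN are my own helpers, and the path and size are made-up values; by default it only prints the commands.

```shell
#!/bin/sh
# Sketch: create a swap file and activate it.
# run() and DRY_RUN are hypothetical helpers; /export/swapfile and 64m
# are made-up values. DRY_RUN defaults to 1, which only prints.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

add_swap_file() {
    file=$1
    size=$2
    run mkfile "$size" "$file"   # create the file at its full size
    run swap -a "$file"          # add it as swap (use an absolute path)
    run swap -l                  # list swap devices to confirm
}

add_swap_file /export/swapfile 64m
```

Swap added this way is gone after a reboot; to keep it you would also need a swap line for the file in /etc/vfstab.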
when you do a 'boot cdrom -s' at the proms there is a '/a' directory there for you to mount things on. (doh!)
meta devices (/dev/md) can be added to non-destructively, but not removed from.
/etc/dfs/sharetab is actually what is looked at for nfs export permissions. this is updated automatically from /etc/dfs/dfstab when you run 'shareall'.
web nfs is really stupid (my opinion, not sun's :-), but the latest versions of netscape do support it (as well as hotjava, of course).
you can have a maximum of one line with '-o public' listed in /etc/dfs/dfstab.
'share -F nfs -o public /export' means read/write access to the world for your export directory... neat eh?
when fsck'ing a meta device filesystem you must fsck the /dev/md device, not the /dev/dsk devices which comprise it (you get nasty errors if you try).
in /etc/dfs/dfstab the '-F nfs' is mandatory (even though it is the default) because of the way the file is parsed at boot up.
What the UFS file system structure looks like:
- sector 0 of a disk contains the labelling information.
- sectors 1-16 contain the boot block. the boot block consists of the UFS file system reader (so it can read the kernel).
- sectors 17-32 contain the super block
- next are the cylinder groups (a cylinder group is 16 cylinders). each cylinder group contains:
  - a backup superblock
  - the cylinder group block (information about the cylinder group)
  - the inode table
  - and finally your actual data
This all uses about 10% of your disk as overhead, though of course it can vary depending on how many inodes you have etc.
The boot process looks like this:
- the boot proms load the ufs file system reader from the boot block
- the kernel is loaded
- the kernel mounts the root filesystem and starts init
- init steps up through the run levels doing its thing.
With the 5/98 solaris media there is a maintenance cd on which there is a 'fastpatch' program (by casper dik?) which installs patches much faster than patchadd. check it out.
Commands to Remember
newfs -v -i size -m percent <device>
- where 'size' is the number of bytes per inode (one inode is created for every 'size' bytes), and 'percent' is the amount of root write only space to reserve.
prtconf
- will show information about your drives
prtvtoc <raw device>
- will show information on the labelling of the device
mountall -l
- will mount all *local* partitions.
fsck <device1> <device2>
- you can specify multiple devices on the command line.
fsck <partition> <device>
- you can mix partition names and devices so long as the partition names come first (eg. fsck /opt /dev/dsk/c0t0d0s1)
newfs -N <device>
- shows all backup superblocks for that partition.
fsck -o b=32 <device>
- rebuilds a file system using the backup superblock on block 32 (there is always a back up on block 32 of the slice)
fstyp -v <raw device>
- butloads of info on the slice (like really)
fuser -c <partition>
- will show what pid's are active in the partition. useful for figuring out why you can't unmount a partition.
swap [-l|-s]
- shows how much swap is available. info in 10-17 on how to create a swap file, and make it active.
nfsstat
- gives lots of info on nfs shares.
share (in /etc/dfs/dfstab): syntax for '-o' options
-o rw=calvino:byatt:valkyrie
-o rw=byatt,ro=greyling
-o rw=byatt,ro (all hosts get read only access)
-o rw=byatt,ro=calvino,root=greyling
- you can use nis[+] netgroups instead of hosts (except for the 'root' option). if there is a netgroup/hostname conflict the netgroup wins.
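To make the '-o' examples above concrete, here is a toy model of how I read them. It is not how mountd really decides anything: it ignores 'root=' and netgroups, lets the first matching list win, and the function name is made up.

```shell
# access_for HOST OPTIONS -> prints rw, ro or none
# Toy model only: a host named in an rw= or ro= list gets that access,
# otherwise a bare "rw" or "ro" acts as the default for everybody else.
access_for() {
    host=$1
    result=none
    default=none
    old_ifs=$IFS
    IFS=,
    for opt in $2; do
        case $opt in
            rw|ro) default=$opt ;;
            rw=*|ro=*)
                mode=${opt%%=*}
                case ":${opt#*=}:" in
                    *":$host:"*) [ "$result" = none ] && result=$mode ;;
                esac ;;
        esac
    done
    IFS=$old_ifs
    [ "$result" = none ] && result=$default
    echo "$result"
}

access_for byatt    "rw=byatt,ro=greyling"   # prints rw
access_for greyling "rw=byatt,ro=greyling"   # prints ro
access_for calvino  "rw=byatt,ro"            # prints ro
```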
share|dfshares|dfmounts|showmount
- gives information on what's shared, on your server or other servers
clear_locks
- clears any remaining nfs locks <sigh> ...
automount -v
- shows any changes to the direct or master maps since automount was last run.
bc -l
- allows you to convert between bases.
# bc -l
obase=16 (output base)
192
C0
^D (to exit)
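For scripts, the same conversion can be done without an interactive session. This sketch uses the shell's printf instead of bc; the function names are made up.

```shell
# Decimal to hex and back, non-interactively.
# With bc itself the first one would be: echo 'obase=16; 192' | bc
to_hex()   { printf '%X\n' "$1"; }
from_hex() { printf '%d\n' "0x$1"; }

to_hex 192     # prints C0
from_hex C0    # prints 192
```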
rdate <server>
- sets the local machine's clock from 'server' using the time service.
snoop <hostname>
- will snoop all traffic to or from that host. (doh!)
How NFS Works
if lots of clients are accessing an nfs server you can increase the number of nfsd's running by changing it in /etc/init.d/nfs.server (on the nfsd command line). for optimum performance you should have one nfsd per simultaneous access.
/var/statmon/sm[.bak] sometimes contains outdated information. if statd starts bitching, go into this directory and delete the offending host names, and then restart statd.
if anything changes the file handle of a mount point (the inode is in the file handle) on the server the clients will not be able to continue to utilize the resource without remounting it. the symptom of this is "stale file handle" error messages.
mountd is the ONLY thing that looks at /etc/dfs/sharetab. so if a client has a share mounted and you then revoke their permissions they can still utilize the share because they are talking to nfsd.
mounting nfs shares in the foreground ('fg' option in /etc/vfstab) is really only useful for diskless clients, where if they can't get the share, they may as well hang forever waiting for it.
the only justification for mounting a share as non-interruptible ('nointr' in /etc/vfstab) is "you are so screwed without this share that you may as well hang forever waiting for it to come back".
with bizarre nfs slowness problems, increasing the blocksize to 32 can often help (this is not understood voodoo... but has helped many people).
hosts listed in /etc/dfs/dfstab must have the fqdn listed if you are using dns. mountd will not canonicalize them.
if you have /export exported to a client then the client could mount /export/html since they have access rights to anything under /export.
you can run a cachefs filesystem (with it really being an nfs mount) with consistency checking turned off and a client will continue to run even if the nfs mount fails.
you can run a cachefs file system for a subset of an nfs share. (eg. if you have /share mounted you could run a cachefs filesystem on only /share/bin).
Automounter Stuff
automounter is a client side ONLY thing. the resource is shared normally on the server.
you can use either automountd OR /etc/vfstab to mount a share. you can never use both.
if you would get a "mount point busy" error if you tried to unmount a share, then automounter considers the share non-idle (eg. if you login and your home directory is automatically mounted, it will not unmount until 5 minutes after you log out).
you can't list netgroups in auto_home. you have to list either usernames or a * (for everything).
How the NFS Daemons Work
mountd: (runs on the server) this is the process that authenticates a client's request to mount a share.
nfsd: (runs on the server) this is what actually talks to the client's kernel and does the interesting stuff.
statd: (runs on both) notifies clients if/when an nfs server becomes reachable again after being unreachable.
lockd: (runs on both) sends file lock requests from the client to the server. if a client's lockd is notified that a server is back up it will re-issue locks to the server for all files that it believes it has a lock on. if in the meantime the server has granted one of those locks to another client it will issue an error message to the client.
NFS CLIENT                                         NFS SERVER
+--------------------------+                       +-----------------------+
|                          |   mount request       |                       |
| # mount server:/opt /opt |   ---------->         | mountd                |
| (/etc/vfstab)            |   <------------       | (/etc/dfs/sharetab)   |
|                          |   file handle (fh)    |                       |
|                          |                       |                       |
|                          |   file access (fh)    |                       |
| # cat /opt/bigfile.txt   |   ------------->      | nfsd                  |
|                          |   <-----------        |                       |
|                          |   data                |                       |
|                          |                       |                       |
|  +--> lockd              |   <---------->        | lockd <--+            |
|  |                       |                       |          |            |
|  +--> statd              |   <---------->        | statd <--+            |
|                          |                       |                       |
+--------------------------+                       +-----------------------+
NIS and NIS+ Stuff
NIS+ Basics and Some NIS Specific Info
if /var/yp/domainname/ypservers exists (created by 'ypinit -c') then 'ypbind' is started on bootup bound to a particular master/slave server. if it does not exist then it will broadcast for a server and the fastest one to respond is its daddy.
if ypbind is binding to the wrong master/slave server you can force it to bind to a particular server by changing 'ypstart' to start 'ypbind' like this:
ypbind -ypsetme -broadcast
you can then use the command 'ypset <server>' to manually set the server it should bind to.
nsswitch.conf neat stuff, more info at 13-41, the below syntax is useful:
hosts: nis [NOTFOUND=return] files
what this says is that if the host doesn't exist in the nis host map then don't bother checking the /etc/hosts file. however if the server is down then when a query is made it won't return a 'NOTFOUND' error, it will return an 'UNAVAIL' error so it will check the /etc/hosts file.
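The decision described above can be written out as a small table. This is only a sketch of the logic for that one 'hosts:' line, not anything the name service switch runs; the function name is made up, and the status names are the ones from nsswitch.conf.

```shell
# next_step STATUS
# Given the status returned by the nis lookup, say what the switch does
# next for the line "hosts: nis [NOTFOUND=return] files".
next_step() {
    case $1 in
        SUCCESS)          echo "return the nis answer" ;;
        NOTFOUND)         echo "return not found (files is skipped)" ;;
        UNAVAIL|TRYAGAIN) echo "continue to files" ;;
        *)                echo "unknown status: $1" >&2; return 1 ;;
    esac
}

next_step NOTFOUND   # host is not in the nis map
next_step UNAVAIL    # nis server is down
```

Without the '[NOTFOUND=return]' part, NOTFOUND would fall through to files like the other failures.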
/var/yp/domainname/ypservers is a list on the master server of all the slave servers to push information out to when a map is updated. this file is created when 'ypinit -c' is run so there is no text file to update, only a map. so if you need to add/change slave servers you have to either re-run 'ypinit -c' (which will result in a denial of service as you do it) or you can run the neato script provided on ??-?? which will remake the map file for you.
there is a yppasswdd (spelling?) which will allow you to update the passwd/shadow map from a client machine. one gotcha with this is that if this is a large map file, then every time someone runs this it will re-push out the ENTIRE map to all of the slaves (you can disable this behavior).
boeing has 300,000 users in their passwd/shadow NIS (not NIS+) maps and it works great, but they are apparently the largest installation in the world.
How NIS+ is Different
a nis slave equals a nisplus replica
a nis map equals a nisplus table
nisplus supports partial updates of tables (eg. changing a user's password only results in that new password being pushed out to all the replica servers).
nisplus is more secure (although there is still no encryption), mostly by providing authentication and authorization features. running nisplus in compatibility mode simply removes the security features (security level 0). this is no worse than simply running nis.
a nisplus domainname must end in a dot (eg. alaska.net.)
nisplus supports client side nis caching.
Ideas for Internet Alaska
mount /users nosuid
on name servers turn off dns in /etc/nsswitch.conf, and make it not point at itself for dns. this may help resolve hanging problem (obviously an ugly solution).
check out the E250's ... apparently they cost less than ultra2's (16k from sun with rackmount and 2x250mhz cpu's).
track down asd (a free automounter). it has some really good white papers included on nfs... apparently.
what about mounting /share read only to everything but the admin server?
check out acl's (man -k acl), they work over nfs (but will they work with the netapps?), and are pretty cool.
Making KSH Useful
Esc x2 is file name completion
-- .kshrc --
Note: ^ means "control", so you are embedding literal control characters. VI is useful for this.
alias __A=^P
alias __B=^N
alias __C=^F
alias __D=^B
set -o emacs
export PS1="`hostname`(`whoami`)> "
-- .profile --
export ENV=$HOME/.kshrc
export PATH=/bin:/sbin ...