This paper was originally published in the Proceedings
of the Fifth USENIX LISA System Administration Conference, San
Diego, CA, 1991.
Fdist: A Domain Based File Distribution System
for a Heterogeneous Environment
Linda Bissum
Paul M. Moriarty
ABSTRACT
Administration of a large UNIX site is by no means easy.
The tools provided by standard UNIX implementations are, at
best, inadequate. The explosive growth in network sizes
over the last few years has resulted in larger and more
complex sites but no significant new tools to help system
administrators maintain these sites, particularly in
networked UNIX environments.
One major problem that has not been fully solved is that
of automatic file distribution. The rdist(1)
command works well in small homogeneous environments.
However, in larger and more heterogeneous environments, it
becomes difficult to maintain the rdist command files in an
orderly and systematic manner.
This paper describes the design, implementation and
practical experience with fdist, a domain oriented
distribution tool that enables easy management of automatic
file distribution in a large heterogeneous environment.
Fdist is written in Perl and uses a slightly
modified version of rdist as its underlying file
distribution tool.
Introduction
This paper describes a software package, fdist,
which was developed by MIPS Computer Systems, Inc.
Fdist provides automatic distribution of files
across a large number of heterogeneous systems in a manner
which appears virtually homogeneous.
Prior to the development of the fdist system,
MIPS used a simplistic automatic file distribution approach
which was originally implemented to support a much smaller
host base. The MIPS scheme also used the rdist command, but
the rdist distribution files (distfiles) were maintained
entirely by hand. By the time the fdist project
was initiated, the MIPS corporation had grown to the size
where the manual maintenance process of the distfiles had
become cumbersome, time consuming and error prone. Some of
the problems encountered included:
- The distfile was originally written to handle
approximately 10 hosts. As the number of hosts increased,
the distfile and the recipient hosts list became a
hodge-podge of rules for the various types of hosts. As a
result of the lack of a systematic approach to their
design, the distfile and recipient hosts list became very
difficult to understand and maintain in a reasonable
manner.
- Although there were approximately 500 UNIX hosts in
/etc/hosts, only 123 were receiving updates because the
remaining hosts were too different to group in the
distfile (e.g., they were a hardware platform or version
of the operating system that wasn't already modeled). The
differences in configuration among hosts were a source of
network reliability problems.
- Under sub-optimal conditions, rdist would take more
than 12 hours to complete (sometimes due to network load,
hosts which were down or unreachable, or especially large
numbers of files to distribute).
- Circumstances often arose which necessitated the
frequent (more than daily) distribution of some kinds of
information (e.g., when adding a new host or
changing/updating mail aliases). This wasted considerable
network bandwidth.
- Customizing distributions for a single host or
collection of hosts was cumbersome. The existing set of
rules had evolved into an ``all or nothing'' scheme.
It became clear that this approach would be unable to
handle the expected future growth in the number of file
servers, let alone provide similar support for
workstations. Therefore, the system administration staff
decided to design and implement a system which would be
able to handle automatic file distribution of a large
number of heterogeneous hosts.
Description
Fdist is essentially a distribution system
built on top of rdist. One host, called the distribution
master, distributes both the source files and the fdist
utilities to five distribution slaves. From the perspective
of fdist, the distribution slaves are clones of
the distribution master in that each contains a copy of all
files which are to be distributed as well as the
fdist utilities. The distribution slaves then
distribute the source files to their respective clients
(approximately 100 per slave). Additionally, fdist
generates and maintains the /etc/hosts and /etc/ethers
files for all hosts under its control. Fdist sends
e-mail to notify the system administrator(s) responsible
for maintaining the fdist system of any problems relating
to the distribution of files. The person tasked with
maintaining the fdist system is more commonly
referred to as the distmaster.
Design goals
The above problems motivated the decision to implement a
new distribution scheme. Some of the more important goals
for the new fdist distribution scheme were:
- Full support for heterogeneity.
- The ability to support a minimum of 1000 hosts.
- The ability to specify distribution based primarily
on use of the target system rather than hardware
vendor and OS type.
- Full support for separating policy strategies from
the daily maintenance of the network.
- Centralized control of distribution.
- Ease of use. Fdist should enable a less
skilled system operator to maintain many aspects of
the distribution without the need to know rdist syntax or
any other complex language.
- Enable a simple scheme where files are distributed
with an interval relevant to their type and importance.
To keep network usage to a minimum it was important to
ensure that on any given hour, only a few files flowed
through the network instead of the old ``distribute
everything to the entire world'' approach.
- Support good interaction with other tools. We wanted
a shell command line approach, rather than a graphical
user interface, to ensure that we could integrate the new
distribution system with existing tools where necessary.
If the need for a menu driven system, or a full graphical
user interface arose later, it could always be built on
top of the shell command line interface.
In order to achieve these goals, it was necessary to
have the ability to specify to which hosts a given file
should be distributed, without having to specify a list of
hosts on a command line. The approach chosen was one which
had been discussed in the IEEE POSIX 1003.7 system
administration committee: grouping hosts into
administrative domains. Since the committee gave no
suggestions as to what an administrative domain was, the
Internet's use of fully qualified, domain style names for
e-mail was emulated for the syntax and semantics used in
the administrative domain naming scheme. However, the
administrative domains are, in this implementation,
completely separate from the Internet e-mail domains, and
share only the syntax and semantics of the Internet
approach. Therefore, each host is part of both an Internet
and administrative domain, with neither type imposing any
restriction on the other. For the purposes of this paper,
the term domain refers to administrative domains (described
below).
Administrative domains
As mentioned above, the administrative domain's syntax
was borrowed from the Internet domain structure. However,
the semantics are somewhat different and less general. When
the domain structures were designed, it was necessary to
ensure that:
- Each host could be a member of several administrative
domains, where each domain is a ``view'' of a specific
administrative problem.
- Hosts could be grouped according to their usage.
- Different hardware platforms and operating systems
could be supported.
- Wildcards could be used in place of any domain name
component. This was implemented using the domain name
common, which matches any domain name component.
(common was chosen rather than the asterisk
(`*') character to avoid the conflict with the shell, as
it was envisioned that the system would be used by less
skilled system operators.)
A five component domain name permitted sufficient
flexibility to meet the design goals. (Fdist should
handle an arbitrary number of domain name components but
only five were used in this implementation.) Each level has
a specific meaning:
- The top level component of the domain name is used
for the general specification of the domain. Today, there
are only two top level domains in use, adm and
prt, however the fdist system will
support any number of top level domains.
- The second level component of the domain name
specifies the group to which the host belongs. While
there are a number of these domains, this paper will
exemplify only the cc (computer center) and osgrp
(operating system development group).
- The third level component of the domain name
specifies the specific operating system that the host is
running. In all instances it is necessary to
differentiate between SunOS and RISC/os, but in many
cases it is also necessary to differentiate between
different releases of the same operating system.
- The fourth level component of the domain name is used
to specify the hardware platform on which the OS is
running. These are typically real differences between
different architectures, like Sun 3 and Sun 4. In some
cases, the fourth level distinguishes among system
configurations, like the difference between a file server
and a workstation.
- The fifth level component of the domain name has not
been used much in practice, however when the system was
designed it was believed that there would be a great need
to distinguish between diskless and disked systems. This
has not proven to be the case. Therefore the
implementation of fdist has been changed to
default to the ``disked'' domain unless diskless is
specified explicitly.
Finally, there had to be a way to identify a specific
file within a domain. The colon (`:') was chosen to combine
the domain name with the file name, as this notation has
commonly been used to specify a file's path on a specific
host. The domain and path combination is called a
domainpath.
Using the above definitions, the /etc/hosts.equiv file on
all disked Sun 4's that run SunOS and are used by the
Computer Center can be referred to as:
disked.sun4.sunos.cc.adm:/etc/hosts.equiv
and the services file on all MIPS workstations as:
mipsws.common.common.adm:/etc/services
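As an illustration of the naming scheme, the following Python sketch splits a domainpath into its domain components and file path, and tests whether a host's domain is covered by a domain name that uses common as a wildcard. Fdist itself is written in Perl and its matching routines are not shown in this paper, so the function names, the top-level-first comparison, the rule that a shorter domain name covers all of its sub-domains, and the component name riscos in the usage note are all assumptions made for the example. The default to the ``disked'' domain is not modeled.

```python
WILDCARD = "common"  # component name that matches any domain name component


def split_domainpath(domainpath):
    """Split 'domain:/path' into (components, path), top level component first."""
    domain, sep, path = domainpath.partition(":")
    if not sep or not path.startswith("/"):
        raise ValueError("expected <domain>:<absolute path>: %r" % domainpath)
    # Domain names are written most specific first, so reverse them.
    return domain.split(".")[::-1], path


def domain_covers(pattern, host_domain):
    """Return True if the domain name `pattern` covers `host_domain`.

    Assumption for this sketch: a pattern with fewer components than the
    host's domain covers every sub-domain below it.
    """
    want = pattern.split(".")[::-1]
    have = host_domain.split(".")[::-1]
    if len(want) > len(have):
        return False
    return all(w == WILDCARD or w == h for w, h in zip(want, have))
```

Under these assumptions, domain_covers("mipsws.common.common.adm", "disked.mipsws.riscos.cc.adm") is true, while the same pattern does not cover "disked.sun4.sunos.cc.adm".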
Database structure
This section outlines the implementation of the database
structures. The term database is used very loosely, as
fdist's database includes a copy of all files to
be distributed as well as a number of configuration tables,
all of which are currently flat text files. These files
were supposed to be implemented using a commercial database
product, but dependencies on another corporate project have
so far not permitted this.
Hostdata File
The hostdata file is a text file that contains
all the per host information, one host entry per line,
using white space as the field separator. This file
contains traditional
host information, such as ethernet and IP addresses,
primary host names and aliases (the latter comma
separated). The file also contains the names of the
administrative domains that a host belongs to. This
information is kept in five additional fields, where the
first is a comma separated list of all top level components
of the domain name, followed by the remaining four domain
component levels.
Several ``place holders'' occupy (sub)fields for which
no information is available. Only one such name should be
sufficient for the computerized part of the system (and in
fact is, as all of these names are internally represented
as ``none''). However, the human interface to the system
requires that the people maintaining the system be able to
differentiate between several values. Currently implemented
values are:
- NONE: No value is used for this field.
- UNKNOWN: There should be a value for this field, but
it is not known.
- IGNORE: This domain entry is ignored. This is used in
the top level domain component field for gateways, PC's
and other non-UNIX platforms, where the distribution
information is irrelevant.
The distinction between NONE and UNKNOWN was very
important in the beginning when many field values were
truly unknown. It made it clear that fields labeled UNKNOWN
had not yet been assigned a correct value, as opposed to
those fields which were not supposed to have a value (e.g.,
a host which has no host name alias or where the ethernet
address is of no concern will have NONE in the respective
fields). The original incarnation of this file had many
UNKNOWN's which over time were replaced with their correct
values.
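The hostdata format can be illustrated with a short Python sketch that parses one host entry and maps the three place holders to the single internal value ``none''. The paper names the fields but not their positions on the line, so the field order and field names used here are hypothetical, as are the host name and addresses in the usage note.

```python
PLACEHOLDERS = {"NONE", "UNKNOWN", "IGNORE"}  # all represented internally as "none"

# Hypothetical field order; the paper names the fields but not their positions.
FIELDS = ("name", "aliases", "ip", "ether", "top", "group", "os", "hw", "disk")


def normalize(value):
    """Map every human-readable place holder to the single internal value."""
    return "none" if value in PLACEHOLDERS else value


def parse_hostdata_line(line):
    """Parse one whitespace-separated host entry into a dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    entry = {field: normalize(value) for field, value in zip(FIELDS, values)}
    # Aliases and top level domains are comma-separated sub-fields.
    for field in ("aliases", "top"):
        if entry[field] == "none":
            entry[field] = []
        else:
            entry[field] = [normalize(v) for v in entry[field].split(",")]
    return entry
```

For example, the made-up line "quacky NONE 192.0.2.7 UNKNOWN adm,prt cc sunos sun4 disked" yields an entry with no aliases, an ethernet address of ``none'', and the two top level domains adm and prt.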
Distdata file
The distdata file, also a
flat text file, contains the per file information
for those files under fdist's control, one source
file per line using the colon (`:') as field separators.
For each file that is to be distributed, its source name
(where it is located on the distribution machine),
destination name on the client and distribution domain are
all recorded. The same line also contains the distribution
frequency, any shell command to be executed on the client
after the file has been received and what editor should be
used when editing the source file (using EDITOR to indicate
the editor specified by either the $EDITOR or $VISUAL
environment variables and NONE for no editing allowed).
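A distdata entry could be read along the following lines. This is only a sketch: the order of the six fields, the use of an empty field for ``no shell command'', the precedence of $EDITOR over $VISUAL and the fallback to vi are assumptions, and the sample line in the usage note is invented. A shell command containing a colon would also need some form of quoting, which is not addressed here.

```python
from collections import namedtuple

# Hypothetical field order; the paper lists the fields but not their positions.
DistEntry = namedtuple("DistEntry", "source dest domain frequency command editor")


def parse_distdata_line(line):
    """Parse one colon-separated distdata entry."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != len(DistEntry._fields):
        raise ValueError("expected %d fields: %r" % (len(DistEntry._fields), line))
    return DistEntry(*fields)


def editor_for(entry, environ):
    """Return the editor program for an entry, or None if editing is not allowed."""
    if entry.editor == "NONE":
        return None
    if entry.editor == "EDITOR":
        # Assumed precedence: $EDITOR, then $VISUAL, then a fallback of vi.
        return environ.get("EDITOR") or environ.get("VISUAL") or "vi"
    return entry.editor
```

With the invented entry "17/hosts.3:/etc/hosts:common.adm:hour::EDITOR", editor_for(entry, os.environ) would return whatever editor the operator has configured in the environment.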
Config file
The config file contains
information about how the fdist system is
configured. Config is a text file, with a format
inherited from a previous version of fdist and
whose structure has outlived its usefulness.
Config records which host is the distribution
master (since commands which alter data are only allowed to
execute on the master), as well as a description of the
legal domain component names. Under domain component
information, we also record which hardware platforms
support a given operating system (e.g., ULTRIX does not run
on a non-DEC hardware platform).
Distributed files
Fdist keeps a private
copy of all files which are distributed to both guarantee
authenticity and provide centralized control of the source
files. These source files are kept in an almost flat
directory space. Almost, because there is one additional
directory level, but this was only implemented because of
the legendary inability of UNIX to deal with very large
directories. The files were therefore spread out into 60
directories, named 00 to 59. (A better directory naming
scheme could have been used but, since the operator is
shielded from the path names as all access to the source
files is via fdist commands, the directory names
were kept terse). A new file will be placed into the
directories in a pseudo random manner. The destination
directory is chosen based on the number of seconds elapsed
since the last full minute. This ``poor man's'' approach to
random distribution has worked well; currently no directory
has more than 12 files in it (with over 400 files being
distributed). To further avoid collisions among files with
the same name but which are distributed to different
domains, a sequence number is appended to the file name
when it is installed into the data directory.
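The placement rule described above is simple enough to sketch in a few lines of Python. The function name and the ``name.sequence'' form of the stored file name are assumptions; the paper only states that the directory is chosen from the seconds elapsed since the last full minute and that a sequence number is appended to the file name.

```python
import time


def storage_name(filename, sequence, now=None):
    """Return the relative path under which a new source file is stored.

    The directory (00 to 59) is the number of seconds elapsed since the last
    full minute; the sequence number keeps equally named files apart.
    """
    if now is None:
        now = time.time()
    directory = "%02d" % (int(now) % 60)
    # The "name.sequence" form is an assumption; the paper only says that a
    # sequence number is appended to the file name.
    return "%s/%s.%d" % (directory, filename, sequence)
```

A file installed five seconds past the minute with sequence number 3 would thus be stored as 05/hosts.3.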
Command interface
The command interface enables the system administrator
or operator to create new domains and specify files which
are distributed to those domains. It is also possible to
modify the content of a text file or to change a file's
access permissions, in a manner similar to traditional UNIX
commands.
The format of the command line follows what was the
current IEEE POSIX 1003.7 draft at the time the project was
started. Since the standards group has shown no progress in
this area since then, no changes have been made to the
original strategy. (Future versions of fdist may
or may not continue to track the standard.)
With a few exceptions, all commands to the
fdist system are invoked with a single
fdist command of the form:
fdist -o <command> [<options>] [...] [<argument>] [...]
A brief overview of the existing fdist commands
is located in Appendix A.
Reporting
One problem became very apparent after the first version
of the fdist system had been installed and running
for a few weeks. The output produced by rdist(1) was both
verbose and generated on a number of machines. There was
too much information in a format not fit for human
consumption. It was necessary to implement a message
processing system which extracted the important messages
and presented them to the system administrator in a fashion
which matched the domainpaths used by the fdist
command line interface.
An attempt to write a simple parser failed miserably.
The introduction of the domains made it difficult to
determine if all hosts had received the files destined for
them. The problem was further complicated by the fact that
the distribution was spread out over several hosts in order
to achieve reasonable performance.
Besides dealing with verbose reporting, it was critical
that the distmaster be notified in a timely fashion if
something went wrong with the distribution between the
master and the distribution slaves.
Efficient reporting had the following requirements:
- Ensure that the master need not rely upon any other
machine.
- Ensure that the distmaster be notified by e-mail if
the distribution master or slaves encountered
problems.
- Analyze the distribution results in the domain
context already used by fdist and report
problems with minimum verbosity.
- Split the error information into daily and hourly
reports, where the hourly reports have only the most
urgent information.
Notification of problems
Since it was so important
that the distmaster be notified quickly of any problems
with either the master or any of the slaves, the mechanism
needed to be both simple and robust. Therefore, the
mechanism was based on time stamp files. While this
approach is not very elegant, it has proven to have the
required robustness. The strategy implemented is described
below:
- Notify the distmaster if a distribution slave no
longer receives updates from the distribution master.
This happens if the distribution master crashes, severe
networking problems occur, or if the distribution slave
is no longer recognized as such by the master.
    - Before doing a distribution to the distribution
    slaves, the master updates a time stamp file with the
    current date and time. This file is located within the
    distribution tree so it is always distributed to each of
    the slaves.
    - Each time a distribution slave starts a distribution
    to its clients, it checks whether the time stamp file
    from the master has been updated within the last 75
    minutes. If the file has not been updated, the
    slave notifies the distmaster of the problem by
    e-mail.
- Notify the distmaster if a distribution slave no
longer updates the clients. This happens if the slave
crashes or has networking problems.
    - After completing a distribution to the clients,
    the distribution slave updates a time stamp file with
    the current date and time.
    - Every hour, the distribution master checks each
    of the distribution slaves, to ensure that no time
    stamp is older than 75 minutes. If the file has
    not been updated on any of the slaves, the
    master notifies the distmaster of the problem by
    e-mail.
The above strategy ensures that the distmaster is
always notified when there are problems. The drawback is
that when the master goes down, a large amount of e-mail is
sent (one message from each slave every hour). However, it
was not worth trying to eliminate the duplicate mail in
this situation, as it complicated the design, ultimately at
a cost of reduced robustness.
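The freshness test at the heart of this mechanism can be sketched in Python as follows. The sketch assumes that the age of a time stamp is taken from the file's modification time, which the paper does not specify, and it covers only the check on a locally visible file; how the master inspects the stamps on the slaves, and the sending of e-mail, are left out. The function names are made up for the example.

```python
import os
import time

MAX_AGE = 75 * 60  # seconds; the 75-minute limit described above


def is_stale(mtime, now, max_age=MAX_AGE):
    """Return True if a time stamp is older than max_age seconds."""
    return now - mtime > max_age


def check_stamp(path, now=None):
    """Return a problem description for a stale or missing stamp file, else None."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        return "%s: time stamp file is missing" % path
    if is_stale(mtime, now):
        minutes = int((now - mtime) // 60)
        return "%s: not updated for %d minutes" % (path, minutes)
    return None
```

A caller would mail any non-empty result to the distmaster. Treating a missing stamp file as a problem is a choice made for this sketch.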
Notification of problems distributing to clients
Because the distribution to the clients is, in itself,
distributed, we chose to use syslogd to redirect all rdist
output back to a common location on the distribution
master. All messages are also kept on each distribution
slave, in case they are needed for debugging purposes. To
simplify later processing, all rdist error messages which
span multiple lines are edited on the fly to be only a
single line. This editing is necessary as syslogd cannot
guarantee that such lines will arrive on the master without
being interspersed by other lines.
On the master, each rdist message line is analyzed. One
of the goals was to be able to accurately report whether or
not a host had been updated with a specific file.
Therefore, when a file is updated on the master (e.g.,
through an edit command), this information is recorded in a
dbm database file where the key is the real path and the
data is a list of hosts which are supposed to receive the
updated file. Each time a message reaches the master that a
host has received a specific file, that host is removed
from the list of outstanding hosts. This strategy made it
possible to implement the status command, which
allows the system administrator to see which hosts still
need to receive a given file.
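The bookkeeping behind the status command can be illustrated with a small Python class. A plain dictionary stands in for the dbm file, and the class and method names are hypothetical; only the idea of removing each host from a per-file list as its confirmation arrives is taken from the description above.

```python
class Outstanding:
    """Track which hosts still need to receive each updated file.

    A plain dict stands in for the dbm file; keys are real paths and values
    are the sets of hosts that have not yet reported receiving the file.
    """

    def __init__(self):
        self.pending = {}

    def file_updated(self, path, hosts):
        """Record that `path` changed and must reach every host in `hosts`."""
        self.pending[path] = set(hosts)

    def host_received(self, path, host):
        """Process a message saying that `host` has received `path`."""
        hosts = self.pending.get(path)
        if hosts is None:
            return
        hosts.discard(host)
        if not hosts:
            del self.pending[path]  # every host is up to date

    def status(self, path):
        """Return the sorted list of hosts that still need `path`."""
        return sorted(self.pending.get(path, ()))
```

After file_updated("/etc/hosts", ["a", "b", "c"]) and a confirmation from host b, status("/etc/hosts") reports hosts a and c as outstanding.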
All error messages are collected and processed. The
system administrator can specify whether a given message
should show up in either the hourly or daily reports or
just simply be discarded. This was implemented using two
files, one called filter.delete and the second, filter.day.
Each file can contain any number of regular expressions.
Error messages matching the regular expressions in
filter.delete are discarded; the ones matching the regular
expressions in filter.day are placed in the daily report.
Messages matching neither set of regular expressions are
presented in the hourly report. This approach ensures that
new and unexpected error messages are presented to the
distmaster with the least possible delay.
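The routing of messages through the two filter files can be sketched in Python as shown below. The function names are hypothetical, the sketch assumes that filter.delete takes precedence when a message matches both files, and the patterns and messages in the usage note are invented rather than real rdist output.

```python
import re


def load_patterns(lines):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def classify(message, delete_patterns, day_patterns):
    """Return 'discard', 'daily' or 'hourly' for one error message."""
    if any(p.search(message) for p in delete_patterns):
        return "discard"
    if any(p.search(message) for p in day_patterns):
        return "daily"
    # Anything unrecognized is urgent by default and goes out within the hour.
    return "hourly"
```

With a filter.day containing the single pattern "No space left on device", a message mentioning that condition would be held for the daily report, while a message matching neither file would appear in the next hourly report.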
Distribution delays
A distribution scheme which uses rdist has inherently
longer propagation delays than some of the more well known
distribution mechanisms such as NIS (formerly YP). The
various delays introduced by our method were one of the
concerns in the overall design. Propagation delay has been
minimized as much as possible, especially on those files
distributed hourly.
Experience with fdist has shown that its longer
propagation delay versus NIS has not been much of a problem
due to the kinds of data being distributed. However, using
rdist as a distribution mechanism does have some
significant advantages over the NIS approach:
- Rdist can handle any kind of file, including binary
executables.
- The impact of the unavailability of one of the
distribution slaves, or even the distribution master
itself is far less severe than the unavailability of an
NIS server.
Implementation
When we started the design of fdist, Perl 3.0
had just been released. Although neither of us had
a lot of experience with Perl, we decided to use this as
our programming language. It seemed a faster implementation
vehicle than C; the shell/awk/sed combination would give us
neither the flexibility nor execution speed required.
Looking back, this decision was invaluable to the
success of the project. The current implementation of
fdist is bigger and much more comprehensive than
what we had originally planned for. Its implementation
relies heavily on the builtin regular expression and string
manipulations found in Perl. On the negative side however,
we found that understanding how programming styles
influence the performance of a Perl script is, at best,
counterintuitive. It would have been very helpful if Perl
3.0 had included a profiler, especially since we believe in
the methodology which says ``make it work, before you make
it fast.'' There were times when working on increasing the
performance of the program resulted in runtime changes from
hours to minutes or even seconds! Although a lot of work
has been done on optimizing performance, there is still
ample room for improvement. The two areas which need the
most attention are the replacement of the text
configuration files with a commercial database and the
replacement of the domain name handling routines with ones
that use a better approach.
As the fdist program grew to its present size
of approximately 8000 lines, it became difficult to handle
as one large file. As a result, we took an approach similar
to C programming where each subroutine (or a set of closely
connected subroutines) resides in a separate file. They are
then compiled and linked together to form a single program.
The difference with our approach is that we replace
compilation and linking with concatenation and syntax
checking of the individual Perl files prior to their
execution. (Future versions of fdist may simply use the
require() function instead).
The current implementation of fdist consists of
92 individual Perl files which are concatenated into a
number of programs or program segments before they are
installed.
A list of the more important subroutines, together with
a short explanation of their function is located in
Appendix B.
There are also a number of subroutines that read and
write the configuration files. As fdist evolved,
we needed to treat the configuration files as databases
with a number of indices. A CPU/memory tradeoff was made
where the data is internally replicated for each type of
index. This approach does not reduce the initial access
time of the files, but it significantly reduces the time
for additional accesses. For example, the first time the
hostdata file is read, it takes about eight seconds. It is
stored in an associative array with a simple counter as the
key. Later, if the same data needs to be accessed with, for
example, the primary hostname as the key, a new copy of the
data will be created as an associative array. This
operation takes one to two seconds. Furthermore, access to
an associative array which has already been allocated is
very fast. While this overall approach is sub-optimal, it
is the best one we have been able to implement to date.
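The indexing strategy can be illustrated with a Python sketch in which dictionaries play the role of Perl's associative arrays. The class and method names are hypothetical. Unlike the approach described above, the secondary index here refers to the same record objects instead of holding a second copy of the data, and the sketch assumes that the indexed field is unique per record.

```python
class IndexedTable:
    """Records keyed by a simple counter, with extra indexes built on demand."""

    def __init__(self, records):
        # Primary copy: record number -> record (a dict of field values).
        self.by_number = dict(enumerate(records))
        self.indexes = {}

    def index(self, field):
        """Return a mapping from `field` value to record, building it once."""
        if field not in self.indexes:
            # Costs time and memory on first use; later lookups are cheap.
            self.indexes[field] = {
                rec[field]: rec for rec in self.by_number.values()
            }
        return self.indexes[field]

    def lookup(self, field, value):
        """Return the record whose `field` equals `value`, or None."""
        return self.index(field).get(value)
```

The first lookup by primary host name pays for building that index; every later lookup by the same field is a single dictionary access.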
Future work
Much of the future work needed by fdist has
already been referenced. The configuration files must be
replaced with a commercial database in order to achieve
better performance and the domain handling routines need a
complete rewrite to use a table driven mechanism.
However these items are just a better implementation of
what is already in place today. One new item involves using
multiple types of file distribution mechanisms. Currently
all file distribution is done using rdist. Therefore,
fdist is limited to ``pushing'' files from a
central host. This method has worked well for system
configuration files like /etc/hosts or /etc/group, but is
not as well suited for other kinds of distribution. There
is a need for a ``request service,'' where a host can
request a certain application (e.g., elm) and then receive
the necessary updates. This method, referred to as polling,
would permit users to receive upgrades to their
workstations whenever they feel the timing is right. It is
important to realize that these two methods must complement
each other; both are needed in today's environment.
Also useful would be a subscription type of service
where a host can request to receive updates of a file
whenever updates are available. This method is similar to
the file pushing already in place, but unlike the current
method, it allows for making distribution decisions on a
host-by-host basis.
It would be advantageous to have a much more flexible
domain naming scheme. It is hoped that this flexibility
will be realized when the domain name handling routines are
rewritten.
Finally, it would be useful to allow distribution of
entire directories (e.g., /usr/local). However,
distribution of directories has a number of conceptual
problems which, so far, have not yet been resolved.
Shortcomings
In addition to the limitations referenced above, two
other points worth noting are:
- A new top level domain cannot be created without
making modifications to the fdist sources
because, at present, the fdist system lacks the
functionality to make the required changes to the
hostdata file.
- The system is dependent on one central distribution
server.
Experience with the system
The MIPS user community consists of over 500 users,
approximately 400 workstations (mostly dataless) and about
120 servers containing greater than 300 Gb of backed-up,
on-line storage across a six building network.
Version 2 of fdist has been in use at MIPS for
over a year and has grown to where almost all hosts and
over 400 files are maintained under the system. Files are
easily added, removed or updated, and placed under
fdist's control.
The reports enable the system administrator to identify
both quickly and easily any problems relating to the
distribution, ranging from the simple cases of a client
being down or out of disk space to the more severe cases of
problems with one of the distribution slaves or even the
distribution master itself.
As a result of using fdist, MIPS has been able
to provide a distributed, homogeneous environment to its
user community; one where the user no longer needs certain
machines to accomplish specific tasks. In fact, the
environment has become so homogeneous in appearance, that
many users have become unaware of which machines they
specifically rely on beyond the workstation on their
desk.
Availability
The current version of fdist is 2.1.
Fdist's availability is presently handled on a
case by case basis. Contact Paul M. Moriarty at MIPS for
additional information.
Acknowledgments
- We want to thank Larry Wall for the creation of Perl.
Without Perl, this project would probably never have been
attempted.
- Thanks to Tom Christiansen for helpful suggestions on
how to improve the performance of the messages filter, to
a point where the tool became usable.
- Finally, thanks to Susan Woolf and Rob Kolstad for
proof reading this paper.
References
- IEEE POSIX 1003.7 System Administration, Draft 2.
- RFC 1034, Domain Names - Concepts and Facilities.
Paul M. Moriarty, a senior systems administrator in
Engineering Computer Services, has been with MIPS since
1988. He is responsible for system administration
automation and utilities as well as postmaster and news
administrator for the whole company. Paul is co-founder of
Bay-LISA, a San Francisco Bay Area user's group for system
administrators of large sites.
Linda Bissum is the President of /sys/admin, inc., a
consulting firm which specializes in Large Installation
System Administration. Linda is a member of the IEEE POSIX
1003.7 System Administration Standardization Committee.
Linda is also President and co-founder of Bay-LISA, a San
Francisco Bay Area user's group for system administrators
of large sites, and Senior Editor for ROOT, the UNIX System
Administration Magazine.
Appendix A - Fdist Commands
What follows is a brief overview of the existing
fdist commands.
Maintaining domain information
- Create a new domain
Synopsis: fdist -o create <new-domain>
The create command creates a new, empty
domain. The parent domain must already exist. The
command enters the necessary lines in the config file
to allow files to be added to the new domain as
necessary. If a domain with a similar configuration
already exists, it is probably simpler to clone that
domain rather than create the new domain from
scratch.
This command can only be executed by the super
user.
To create a new domain new under the domain
osgrp.adm:
fdist -o create new.osgrp.adm
- Delete a domain
Synopsis: fdist -o delete [ -f ] <domain>
The delete command deletes an existing
domain. The domain is required to be empty unless the
-f option is specified, in which case all files and
sub-domains contained by this domain are also
deleted.
This command can only be executed by the super
user.
The following command removes the domain
obsolete.osgrp.adm:
fdist -o delete obsolete.osgrp.adm
- Clone an existing domain
Synopsis: fdist -o clone <domain> <new-domain>
The clone command ``clones'' an existing
domain into a new domain. The new and the existing
domain must be on the same level. The clone command
creates the new domain in the config file and creates
an entry in the new domain for each file
found in the old domain. No changes are made to
the old domain. The following command creates a new
domain new.osgrp.adm, by cloning the domain
old.osgrp.adm:
fdist -o clone old.osgrp.adm new.osgrp.adm
-
Compare two domains and note the differences
Synopsis: fdist -o compare <domain1>
<domain2>
The compare command compares two domains
and lists the differences. It reports the results in
five groups:
- Files distributed to <domain1>, but not to
<domain2>;
- Files distributed to <domain2>, but not to
<domain1>;
- Files distributed to both <domain1> and
<domain2> which are different;
- Files distributed to both <domain1> and
<domain2> which are identical but have
different source files;
- Files distributed to both <domain1> and
<domain2> which are shared (same source
file).
-
Print a list of domains to which a host belongs
Synopsis: fdist -o domain [ -F ] <host> [
... ]
The domain command prints a list of all
domains of which the host is a member. The -F option
additionally reports a list of all files
that the host will receive (with distribution frequency
shown).
-
Print list of hosts in a domain
Synopsis: fdist -o hosts [ -F ] <domain> [
... ]
The hosts command prints a list of all hosts in a
domain. If the -F option is specified, it also writes a
list of all files that each host will receive (with
distribution frequency shown).
-
Print a list of files distributed to a domain
Synopsis: fdist -o list <domain> [ ...
]
The list command prints a list of all files
distributed to a domain. The following command
shows which files are distributed to the domain
cc.adm:
$ fdist -o list cc.adm
Files distributed to domain osgrp.adm:
day osgrp.adm:/.forward
hour common.adm:/.rhosts
week common.adm:/etc/TIMEZONE
hour common.adm:/etc/hosts
hour common.adm:/etc/hosts.equiv
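The five groups reported by the compare command above can be illustrated with a short sketch. This is not the fdist implementation (fdist is written in Perl); the data model is an assumption made for illustration: each domain is a mapping from a target path to a (source file, content) pair, and all paths and contents below are hypothetical.

```python
def compare_domains(files1, files2):
    """Classify files into the five groups reported by compare.

    files1 and files2 map a target path to a (source_file, content)
    pair. This data model is an assumption made for the sketch.
    """
    report = {
        "only_in_first": [],
        "only_in_second": [],
        "different": [],
        "identical_separate_sources": [],
        "shared": [],
    }
    for path in sorted(set(files1) | set(files2)):
        if path not in files2:
            report["only_in_first"].append(path)
        elif path not in files1:
            report["only_in_second"].append(path)
        else:
            source1, content1 = files1[path]
            source2, content2 = files2[path]
            if source1 == source2:
                # Same source file: the two domains share the file.
                report["shared"].append(path)
            elif content1 == content2:
                report["identical_separate_sources"].append(path)
            else:
                report["different"].append(path)
    return report


# Hypothetical sample data.
domain1 = {
    "/etc/hosts": ("/data/hosts.1", b"hosts"),
    "/etc/group": ("/data/group.1", b"group A"),
    "/etc/motd": ("/data/motd.1", b"welcome"),
    "/.rhosts": ("/data/rhosts.1", b"trusted"),
}
domain2 = {
    "/etc/hosts": ("/data/hosts.1", b"hosts"),
    "/etc/group": ("/data/group.2", b"group B"),
    "/etc/motd": ("/data/motd.2", b"welcome"),
    "/etc/TIMEZONE": ("/data/TIMEZONE.1", b"PST8PDT"),
}
result = compare_domains(domain1, domain2)
for group, paths in result.items():
    print(group, paths)
```

The sketch checks for a shared source first, because two entries that point at the same source file are necessarily identical in content.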
Maintaining file information
-
Install a new file into a domain
Synopsis: fdist -o install [ -e <editor> ]
[ -s shell_command ] <file> <domainpath>
<frequency>
The install command adds a new file to the
distribution tree. If the new file is already
distributed to another domain, its domainpath can be
used in place of the file argument. However, it must be
noted that this creates a separate, but identical file.
If the two files must be shared, they must be merged
using the mrg command (see below).
The -e option sets the editor to the specified
value. The default is to use the values (at the time of
editing) of the $VISUAL or $EDITOR shell environment
variables. If neither of these is set, vi is used. If
the file is not a text file, the default is no
editing.
The -s option sets the shell command to the
specified value. This shell command is executed on the
client after the file has been distributed. The default
is no shell command.
-
Print a list of domainpaths to which a file is
distributed
Synopsis: fdist -o file [-a] <file1> [
<file2> ... <file n> ]
The file command prints a list of all
domainpaths to which a file is distributed. This is
useful for files shared by multiple domainpaths as one
can see exactly which domainpaths share the file. The
-a option will display the distribution frequency and
optional shell command together with the domainpaths.
The command below shows what domains share the file
/usr/lib/aliases:
$ fdist -o file /usr/lib/aliases
File /usr/lib/aliases is distributed to:
disked.sun3.sunos.common.adm:/usr/lib/aliases
400.common.adm:/usr/lib/aliases
bsd.common.adm:/usr/lib/aliases
450.common.adm:/usr/lib/aliases
ultrix.common.adm:/etc/aliases
-
Set the distribution frequency for a file
Synopsis: fdist -o freq <domainpath>
<frequency>
The freq command changes the distribution
frequency of a file.
-
Merge two domainpaths into one
Synopsis: fdist -o mrg <domainpath1>
<domainpath2>
The mrg command merges two domainpaths
together, changing the real file in domainpath2 to be
the same as domainpath1. All other attributes of
domainpath2 remain unchanged (e.g., if domainpath2 has
a shell command assigned, while domainpath1 does not,
domainpath2 will still have the shell command after the
mrg command has been executed).
The following command results in
domain1.adm:/somefile being the same (shared) as
domain2.adm:/somefile.
fdist -o mrg domain1.adm:/somefile domain2.adm:/somefile
-
Select what editor should be used
Synopsis: fdist -o setedit <domainpath>
[<editor>]
The setedit command changes the editor used
to <editor> whenever the <domainpath> file
is edited using the fdist -o edit command. If the
editor field is omitted, the editor path is set to the
default.
-
Specify shell command to execute after file
distribution.
Synopsis: fdist -o shell <domainpath>
[<shell>]
The shell command modifies the shell
command for a file. This command is executed on all
hosts that <domainpath> is distributed to, every
time that file is updated. If no shell command is
specified, any existing shell command will be
removed.
-
Edit a file
Synopsis: fdist -o edit [-e <editor>]
<domainpath>
The edit command invokes an application
specific editor on the file specified by the domainpath
given as its argument. The editor used depends on which
editor was specified when the file was installed (or
later changed by fdist -o setedit). In each case, the
domainpath is replaced by the real path before
editing.
The actual editor used depends on a number of
factors.
The file to be edited is locked before the editor is
called using a simple file lock that is honored by all
fdist commands. If no changes are made to the
edited file then the date of last modification remains
unchanged. This prevents unnecessary redistribution of
unmodified files.
-
Unlock a locked file
Synopsis: fdist -o unlock
<domainpath>
The unlock command unlocks a file that has
been unintentionally left locked (e.g., after a system
crash) by removing the lock file.
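The locking behaviour described for the edit and unlock commands can be sketched as follows. This is a minimal illustration under stated assumptions, not the fdist code: the lock file name (the path plus ".lock") is hypothetical, and restoring the timestamps after an edit that changed nothing is only one possible way to leave the date of last modification untouched.

```python
import os
import tempfile


def lock_path(path):
    # Hypothetical naming convention: the lock file sits next to the file.
    return path + ".lock"


def acquire_lock(path):
    """Create the lock file atomically; return False if it already exists."""
    try:
        fd = os.open(lock_path(path), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_lock(path):
    """Remove the lock file, which is also all that an unlock has to do."""
    os.remove(lock_path(path))


def edit_locked(path, edit):
    """Run edit(path) under the lock; keep the old mtime if nothing changed."""
    if not acquire_lock(path):
        raise RuntimeError(path + " is locked")
    try:
        before = os.stat(path)
        with open(path, "rb") as handle:
            old_content = handle.read()
        edit(path)
        with open(path, "rb") as handle:
            unchanged = handle.read() == old_content
        if unchanged:
            # Assumption: put the old timestamps back so that the file
            # does not look modified and is not redistributed.
            os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
        return not unchanged
    finally:
        release_lock(path)


def rewrite_same(path):
    """Stand-in for an editor session that saves without changes."""
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(data)


def append_line(path):
    """Stand-in for an editor session that changes the file."""
    with open(path, "a") as handle:
        handle.write("newhost\n")


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "hosts")
with open(target, "w") as handle:
    handle.write("localhost\n")
os.utime(target, (1000000000, 1000000000))  # fixed, old timestamp

changed_first = edit_locked(target, rewrite_same)
mtime_after_noop = os.stat(target).st_mtime
changed_second = edit_locked(target, append_line)
```

Because the lock is an ordinary file, a shell script can honor it with a simple existence test, and a lock left behind by a crash is cleared by deleting that file.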
Traditional file manipulation
All the commands in this group are, in fact, standard
UNIX commands with the necessary packaging around them to
make them work correctly in the fdist environment.
The processing which is done checks for valid parameters
since not all parameters are permitted in the
fdist environment (e.g., since fdist does
not yet support the notion of directories, the -r option on
the rm command is not meaningful).
-
Compare two files
Synopsis: fdist -o cmp <domainpath1>
<domainpath2>
The cmp command compares two domainpaths by
first converting the domainpaths to the real paths.
After conversion, cmp calls the UNIX diff
command for text files; the UNIX cmp command for all
other files. If only one of the two domainpaths is a
text file, the following messages are displayed:
cmp: <domainpath1> is a text file
cmp: <domainpath2> is a binary file
-
Concatenate and print files
The cat command converts all domainpaths to
the real paths, and then calls the UNIX cat command,
passing any command line arguments to the UNIX
command.
-
Change group,
Change mode,
Change owner,
Copy a domainpath,
List information about a file,
Remove a domainpath
The chgrp, chmod, chown, cp, rm, and
ls commands convert all domainpaths to the
real paths and then call their respective UNIX command,
passing any command line arguments to the UNIX
command.
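The packaging around the standard UNIX commands can be pictured as an argument-rewriting step. The sketch below is an assumption-laden illustration, not fdist's own code: the lookup table stands in for the fdist database, the test for what counts as a domainpath is a guess based on the domain:/path form used in this paper, and only the rm -r restriction is taken from the text. The second table entry is hypothetical.

```python
def is_domainpath(arg):
    """Treat 'domain:/path' arguments as domainpaths (assumed syntax check)."""
    domain, sep, path = arg.partition(":")
    return bool(domain) and sep == ":" and path.startswith("/")


def wrap_command(command, args, real_paths, rejected_options):
    """Build the argument vector for the UNIX command behind a wrapper."""
    argv = [command]
    for arg in args:
        if arg.startswith("-"):
            # Options are passed through unless the wrapper rejects them.
            if arg in rejected_options.get(command, ()):
                raise ValueError(command + ": option " + arg + " is not supported")
            argv.append(arg)
        elif is_domainpath(arg):
            # Replace the domainpath by the real path of its source file.
            argv.append(real_paths[arg])
        else:
            argv.append(arg)
    return argv


# Stand-in for the fdist database.
REAL_PATHS = {
    "sysv.osgrp.adm:/etc/group": "/fdist/data/37/group.2",
    "bsd.osgrp.adm:/etc/group": "/fdist/data/37/group.1",  # hypothetical
}
# The text names rm -r as an option that is not meaningful.
REJECTED = {"rm": ("-r",)}

ls_argv = wrap_command("ls", ["-l", "sysv.osgrp.adm:/etc/group"],
                       REAL_PATHS, REJECTED)
print(ls_argv)
```

The resulting list could then be handed to the real command, for example with subprocess.run; the sketch stops short of that so that it has no side effects.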
Maintaining rdist information and distribution
While all the commands mentioned above are aimed at
maintaining domains and files, it is also necessary to
initiate the distribution by calling rdist. There are three
commands for this:
-
Build distribution control files
Synopsis: fdist -o build [-d <domain>]
<frequency> [-f <file>]
The build command builds distribution
control files but does not initiate the distribution of
any files. The <frequency> parameter specifies
for what distribution frequency the files are rebuilt.
If no options are specified, the output is written to a
file where the dist command later finds it. Note that
both the -f and -d options result in redirection of that
output. The output is written in a format that is a
complete command file for the rdist(1) utility.
If the -d <domain> option is specified, the
build command only generates distribution information
for files distributed to that domain. The resulting
information is written to standard output, unless the
-f option is specified.
If the -f option is specified, the build command
redirects the distribution information to the specified
file.
The following command line rebuilds the distribution
control file for the hourly distribution. The output is
placed where `fdist -o dist hour' later finds it.
fdist -o build hour
The next command writes a distribution control file
to stdout, which contains entries for all files
distributed daily to the domain sysv.cc.adm:
fdist -o build -d sysv.cc.adm day
-
Distribute files
Synopsis: fdist -o dist [-v] <frequency> [
<host> ] [ ... ]
The dist command initiates the distribution
of the files distributed with the interval specified by
<frequency>. This is done most conveniently by
executing fdist -o dist at the correct time intervals
from cron. Fdist -o dist initiates the fdist -o ldist
command on all distribution slaves.
If the frequency is specified as all, a
distribution for all frequencies is done.
One or more hosts may be optionally specified, in
which case only the specified hosts receive file
distribution. The use of all in conjunction with
a hostname is convenient for updating new hosts that
are brought online.
-
Distribute files from local distribution slave
Synopsis: fdist -o ldist <frequency> [-r]
[ <host> ] [ ... ]
The ldist command initiates the
distribution of the files distributed with the interval
specified by <frequency>. Ldist is most often
executed on a distribution slave as a result of an
fdist -o dist command having been executed on the
distribution master.
Since no data is locally modified on the
distribution slaves, this command does not build
distribution control files.
Ldist, like dist, accepts all as a
frequency and can limit its distribution to single
hosts.
The -r option causes the ldist command to execute
silently as a background process.
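How the frequency and host arguments of dist and ldist narrow a distribution run can be sketched as a simple filter. The table layout and the host names alpha and beta are hypothetical; the domainpaths and frequencies are borrowed from the list example earlier in this appendix. The sketch only selects what would be distributed and does not call rdist.

```python
def select_for_distribution(entries, frequency, hosts=None):
    """Return the (domainpath, host) pairs that one run would cover.

    entries is a list of (frequency, domainpath, hosts) tuples; this
    layout is an assumption made for the sketch.
    """
    selected = []
    for entry_frequency, domainpath, entry_hosts in entries:
        # The frequency "all" selects every distribution frequency.
        if frequency != "all" and entry_frequency != frequency:
            continue
        for host in entry_hosts:
            # Without a host list, every host of the entry is included.
            if hosts is None or host in hosts:
                selected.append((domainpath, host))
    return selected


ENTRIES = [
    ("hour", "common.adm:/etc/hosts", ["alpha", "beta"]),
    ("day", "osgrp.adm:/.forward", ["alpha"]),
    ("week", "common.adm:/etc/TIMEZONE", ["alpha", "beta"]),
]

hourly = select_for_distribution(ENTRIES, "hour")
new_host = select_for_distribution(ENTRIES, "all", hosts={"beta"})
print(hourly)
print(new_host)
```

The second call mirrors the case of a host that has just been brought online: every file it is entitled to is selected, regardless of frequency.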
Miscellaneous other commands
-
Check for last distribution
Synopsis: fdist -o check [-cmqv]
The check command checks a distribution
slave for its distribution status and writes the
results to stdout. The -c option causes check to ignore
client distribution status. The -m option causes check
to notify the distmaster by e-mail if the distribution
slave has not been updated for the last 75 minutes. The
-q option suppresses any messages that check would
print. The -v option causes check to always print the
distribution status independent of the time period
since last distribution.
Since check is used to notify the distmaster of
missing file distribution from the distribution master,
the following command should be executed hourly from
cron by root:
fdist -o check -cmq
-
Display the hostdata file with new slave
information
Synopsis: fdist -o chslave <from>
<newslave>
The chslave command substitutes one
distribution slave name in hostdata with another and
writes the results to stdout. The hostdata file is left
unmodified. The <from> argument can be a subnet
IP address (e.g., 130.62.6) instead of being a
distribution slave name, in which case all hosts on
that subnet become clients of newslave.
-
Display hostdata in ethers format
Synopsis: fdist -o mkether
The mkether command writes the information
in the hostdata file to stdout in the same format as
that used in the /etc/ethers file. Duplicate ethernet
addresses and hostnames are reported on stderr. Mkether
makes no attempt to correct such errors. Therefore, if
any duplicates exist in the hostdata file, they will
also appear in the output of this command.
-
Display hostdata in hosts format
Synopsis: fdist -o mkhost [-s]
The mkhost command writes the information
in the hostdata file to stdout in the same format as
that used in the /etc/hosts file. Duplicate IP
addresses, hostnames and host aliases are reported on
stderr. Mkhost makes no attempt to correct such errors.
Therefore, if any duplicates exist in the hostdata
file, they will also appear in the output of this
command.
Because some network components use the same
hostname for multiple IP addresses, the reporting of
duplicate hostnames can be turned off by specifying the
-s option. Duplicate IP addresses and host aliases are
still reported.
-
Display slave-client relationship
Synopsis: fdist -o network [<slave> ] [
...]
The network command prints a list of
clients for each distribution slave, sorted by network
address. By default, it prints information for all
distribution slaves. If one or more distribution slave
names are provided as arguments, only information for
those specified distribution slaves is printed.
-
Generate distribution report
Synopsis: fdist -o report hour|day
[<time>]
The report command prints a distribution
report. If the first argument is hour, it prints an
hourly distribution report; if the first argument is
day, it prints a daily distribution report.
By default, the command prints the report for the
current day/hour, unless the optional <time>
argument is specified. <time> is an integer,
which specifies the time period of the report to be
printed (that day in the month or that hour in a 24
hour period). For example,
fdist -o report hour 8
prints the 8 AM distribution report. If the time
period specified is larger than the current time
period, the report from the prior day/month is
printed.
<time> can also be specified as a negative
integer, in which case the command prints a report from
a previous period relative to the current period (e.g.,
fdist -o report day -1 prints yesterday's daily
distribution report).
Note that the daily and hourly reports have no connection
to the distribution frequencies used for individual
domain paths. The two reports contain messages from
all distribution frequencies which are split
into the two report categories, depending on their
estimated importance. See the section on reporting
below.
-
Display distribution status
Synopsis: fdist -o status [ <domainpath> ]
[ ... ]
For each domainpath specified, status
prints a list of all hosts which have yet to
receive the latest version of a specific source file.
By default, the status command prints the distribution
status for all domainpaths and all hosts which have not
received the latest copy of the given domainpath.
If one or more domainpaths are given as argument,
only the status for those domainpaths is displayed.
The data is based on the information kept in the dbm
file called the missing database.
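The rules for the <time> argument of the report command can be worked through in a short sketch. The function name and the use of Python's datetime module are illustrative assumptions; the rules themselves (default to the current period, roll back a day or a month when the value is larger than the current period, treat negative values as relative) follow the description of report above.

```python
from datetime import datetime, timedelta


def resolve_report_period(kind, now, time_arg=None):
    """Return the start of the period whose report would be printed."""
    if kind == "hour":
        current = now.replace(minute=0, second=0, microsecond=0)
        if time_arg is None:
            return current
        if time_arg < 0:
            # Negative values are relative to the current hour.
            return current + timedelta(hours=time_arg)
        chosen = current.replace(hour=time_arg)
        # A later hour than the current one refers to the previous day.
        return chosen - timedelta(days=1) if time_arg > now.hour else chosen
    if kind == "day":
        current = now.replace(hour=0, minute=0, second=0, microsecond=0)
        if time_arg is None:
            return current
        if time_arg < 0:
            # Negative values are relative to the current day.
            return current + timedelta(days=time_arg)
        if time_arg > now.day:
            # A later day than today refers to the previous month.
            last_of_previous = current.replace(day=1) - timedelta(days=1)
            return last_of_previous.replace(day=time_arg)
        return current.replace(day=time_arg)
    raise ValueError("kind must be 'hour' or 'day'")


now = datetime(1991, 9, 30, 10, 15)
print(resolve_report_period("hour", now, 8))
print(resolve_report_period("hour", now, 14))
print(resolve_report_period("day", now, -1))
print(resolve_report_period("day", now, 31))
```

A day number that does not exist in the previous month raises ValueError in this sketch; how fdist handles that case is not stated.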
Appendix B: Fdist Subroutines
Below is a list of the more important subroutines,
together with a short explanation of their function. The
domain name manipulation routines all support the use of
wildcards, but this functionality can be turned on or off
using a boolean parameter.
- DomainCompare compares two domain names and
determines if they are identical (allowing for use of
wildcards in the domain names). For example,
sysv.common.adm is equal to sysv.osgrp.adm (since
``common'' is our wildcard domain name), but
bsd.common.adm is not equal to sysv.osgrp.adm.
- DomainContained compares two domain names to
verify if the first domain name is contained in the
second (allowing for use of wildcards in the domain
names). For example, sysv.osgrp.adm is part of osgrp.adm,
but osgrp.adm is not part of sysv.osgrp.adm.
- DomainCreate creates a new domain,
performing all necessary tests to ensure the new domain
is correct.
- DomainDelete deletes an existing
domain.
- GetDomainPath maps a real path for a
specific host back into a domainpath.
- DomainVerify checks a domain name to ensure
it is valid and exists (allowing for use of wildcards in
the domain name).
- FullDomains expands any wildcards in a domain
name. The result is a list of domain names which match
the wildcards.
- Open and Close are
routines which are used to open and close files when
file locking is required. A simple locking mechanism is
used to create a lock file. This eases integration with
shell scripts when required. With fdist, the
following lock requests are supported:
- Read-only lock: Open the file for read, without
creating a lock file.
- Open for Write after Read: Establish a lock on
the file and then open the file for read. Fails if
the file lock cannot be established.
- Open for Write: Used to open the file for writing
when the lock file has already been established by an
Open for Write after Read.
- GetRealDomain returns the official domain
path as it is used in the database (including wildcards).
For example, when called with sysv.osgrp.adm:/somefile as
argument, it may return sysv.common.adm:/somefile
- GetRealPath returns the full source pathname
corresponding to a domain path. For example,
sysv.osgrp.adm:/etc/group may return
/fdist/data/37/group.2
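The examples given for DomainCompare and DomainContained can be reproduced with a small sketch. It is written in Python for illustration, whereas fdist itself is written in Perl, and the function names are adapted accordingly. Two points are assumptions rather than statements from the text: the wildcard name ``common'' is taken to match exactly one component of a domain name, and a domain is taken to be contained in itself.

```python
WILDCARD = "common"  # the wildcard domain name given in the text


def components_match(a, b, wildcards=True):
    """Compare one component of each domain name."""
    return a == b or (wildcards and WILDCARD in (a, b))


def domain_compare(first, second, wildcards=True):
    """True if both names have the same length and match component-wise."""
    parts1, parts2 = first.split("."), second.split(".")
    return len(parts1) == len(parts2) and all(
        components_match(a, b, wildcards) for a, b in zip(parts1, parts2)
    )


def domain_contained(first, second, wildcards=True):
    """True if the first domain lies within the second.

    Domain names are read right to left, so the second name has to
    match the trailing components of the first.
    """
    parts1, parts2 = first.split("."), second.split(".")
    if len(parts1) < len(parts2):
        return False
    tail = parts1[len(parts1) - len(parts2):]
    return all(components_match(a, b, wildcards) for a, b in zip(tail, parts2))


print(domain_compare("sysv.common.adm", "sysv.osgrp.adm"))
print(domain_compare("bsd.common.adm", "sysv.osgrp.adm"))
print(domain_contained("sysv.osgrp.adm", "osgrp.adm"))
print(domain_contained("osgrp.adm", "sysv.osgrp.adm"))
```

The wildcards parameter plays the role of the boolean switch mentioned above: with wildcards=False, sysv.common.adm and sysv.osgrp.adm no longer compare as equal.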