December 4, 2013

Day 4 - Yum and repository tools by example

Written by: Michael Stahnke (@stahnma)
Edited by: Adam Compton (@comptona)

Over the years, I have mentored quite a few System Administrators. Levelling up means learning about your tools and what they’re capable of (and not memorizing command line flags). For this year’s article on SysAdvent, I wanted to share a lot about one of my favorite tools: yum. When I say yum, I mean not just the yum CLI itself, but the ecosystem of tooling around it. I spend a lot of time doing things like package building, package repository management, and all in all hacking around with rpms and yum.

Yum is a tool that you’ve probably used if you’ve been a system administrator for any period of time. It’s also one of those tools that is very easy to use and then gets out of your way. Yum does network-based dependency resolution, meaning that if you want to install a package, it will download and install all dependencies of that package as well. These are the basics people often know. Under the hood it uses rpm. In normal operation, you use yum for searching, installation and uninstallation of packages. That’s actually pretty awesome, but it is only the trivial use-case for yum.

Beyond that, however, there is much more to the way yum works and interacts with repository metadata. Sometimes being able to query that data can solve heinous problems easily, rather than coming up with odd workarounds. That information can also help you make good decisions about package management.

So, rather than bore you with more background about the yum you already know, let’s dive into some examples. Some of these examples are certainly tailored for beginners, and some are more advanced. I hope the quantity of examples will make this post a good reference.

Yum client tips

Let’s get started on the client side of yum -- meaning just the CLI utility, yum. Beyond the normal update, erase, install there are a few nice enhancements that you may not be aware of.

If you’ve ever wanted a summary of what updates and types are available for your system, try the updateinfo subcommand.

root@f3 ~ # yum updateinfo
Updates Information Summary: available
  1 Security notice(s)
  11 Bugfix notice(s)
  2 Enhancement notice(s)
updateinfo summary done

In this case, the security update looks like something I should probably install. The other fixes I just don’t have time to vet right now. If you only want to install the security updates, just run ‘yum update --security’. There are also specific options for Common Vulnerabilities and Exposures (CVEs) and Red Hat Security Advisories (RHSAs) if you follow those types of announcements and want to be very specific about the updates you take on.

Yum also provides a nice way to list out the repositories it knows about, and how many packages are in each repository.

stahnma@hu ~> yum repolist
repo id                      repo name                                             status
base                         CentOS-6 - Base                                        6,381
cr                           CentOS-6 - CR                                          1,215
epel                         Extra Packages for Enterprise Linux 6 - x86_64        10,081
extras                       CentOS-6 - Extras                                         13
puppetlabs-pepackages        Puppet Labs PE Packages 6 - x86_64                        58
updates                      CentOS-6 - Updates                                     1,555
repolist: 19,303

If you’re on a multi-user system, yum history is a fun way to see who’s been installing packages.

root@f3 ~ # yum history
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    64 | root <root>              | 2013-11-27 03:45 | Update         |    1
    63 | root <root>              | 2013-11-25 03:28 | E, I, U        |   11
    62 | root <root>              | 2013-11-24 03:22 | Update         |    1
    61 | root <root>              | 2013-11-21 03:23 | Update         |    4
    60 | root <root>              | 2013-11-20 03:18 | Update         |    4
    59 | root <root>              | 2013-11-19 03:35 | Update         |    1
    58 | root <root>              | 2013-11-18 03:33 | E, I, U        |    6
    57 | Michael ... <stahnma>    | 2013-11-17 07:49 | Install        |    2

Seeing root perform the majority of the actions may be expected. If you prefixed the yum command with sudo as a normal user, yum tracks that as well. In the Action(s) column, E, I, U stands for Erase, Install, and Update, the actions performed in that transaction.
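If the history is long, a small filter can summarize it per user. The following is only a sketch: the function name history_by_user is made up for illustration, and it assumes the pipe-separated table layout shown above.

```shell
# Count yum transactions per login user from `yum history`-style output.
# Reads the table on stdin; the header and separator rows are skipped
# because their first column is not a numeric transaction ID.
history_by_user() {
  awk -F'|' '
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      user = $2
      gsub(/^[ \t]+|[ \t]+$/, "", user)   # trim column padding
      count[user]++
    }
    END { for (u in count) printf "%d\t%s\n", count[u], u }
  ' | sort -rn
}

# Sample rows copied from the output above, so the sketch runs without yum.
history_by_user <<'EOF'
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
    64 | root <root>              | 2013-11-27 03:45 | Update         |    1
    58 | root <root>              | 2013-11-18 03:33 | E, I, U        |    6
    57 | Michael ... <stahnma>    | 2013-11-17 07:49 | Install        |    2
EOF
```

On a real system the same function would be fed with: yum history | history_by_user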

Similarly, when you want to see where a file came from on an already installed system, you can use yum. This can be important when you have a package that depends on a specific file rather than a package name. Here we get the package it came from, the version and the repository it was installed from.

root@f3 ~# yum resolvedep libpanel.so.5
libpanel.so.5:
0:ncurses-libs-5.9-11.20130511.fc19.i686 fedora

Rather than use resolvedep, if you know the filename you’re looking for, you can just install a package based off of a file in its path. As an example, in a past life, I needed to have uuencode available on several systems. For the life of me, I could never remember what package uuencode was supplied in because it wasn’t uuencode.rpm.

What I used to do was find a system that already had uuencode on it, and run an rpm query against the file (rpm -qf /usr/bin/uuencode).

That obviously didn’t work if I couldn’t find a system that already had uuencode on it. Luckily with yum, I can just say, "hey, I want that command installed."

stahnma@hu ~> sudo yum install /usr/bin/uuencode
Loaded plugins: fastestmirror, security
...
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package sharutils.x86_64 0:4.7-6.1.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=========================================================================================
 Package              Arch              Version      Repository             Size
=========================================================================================
Installing:
 sharutils            x86_64            4.7-6.1.el6  base                  187 k
Transaction Summary
=========================================================================================
Install       1 Package(s)
Total download size: 187 k
Installed size: 617 k
Is this ok [y/N]:

Ah-ha, the package was sharutils. What if you just wanted to find out what package gave you /usr/bin/uuencode without starting a yum package transaction? It turns out that all of this information is readily available in the metadata of the package repositories yum connects to. To query that metadata, you use the appropriately named command, repoquery, which is included in the yum-utils package.

yum-utils

stahnma@hu ~> repoquery --whatprovides /usr/bin/uuencode
sharutils-0:4.7-6.1.el6.x86_64

Here, you get the results of what includes that command, without starting a package installation transaction.

What if you want to see what sharutils is, and what files it comes with? repoquery can help you here very easily. It’s also worth noting that repoquery can be run with a non-privileged user account.

stahnma@hu ~> repoquery -il sharutils
Name        : sharutils
Version     : 4.7
Release     : 6.1.el6
Architecture: x86_64
Size        : 631381
Packager    : CentOS BuildSystem <http://bugs.centos.org>
Group       : Applications/Archiving
URL         : http://www.gnu.org/software/sharutils/
Repository  : base
Summary     : The GNU shar utilities for packaging and unpackaging shell archives
Source      : sharutils-4.7-6.1.el6.src.rpm
Description :
The sharutils package contains the GNU shar utilities, a set of tools
for encoding and decoding packages of files (in binary or text format)
in a special plain text format called shell archives (shar).  This
format can be sent through e-mail (which can be problematic for regular
binary files).  The shar utility supports a wide range of capabilities
(compressing, uuencoding, splitting long files for multi-part
mailings, providing checksums), which make it very flexible at
creating shar files.  After the files have been sent, the unshar tool
scans mail messages looking for shar files.  Unshar automatically
strips off mail headers and introductory text and then unpacks the
shar files.
/usr/bin/compress-dummy
/usr/bin/mail-files
/usr/bin/mailshar
/usr/bin/remsync
/usr/bin/shar
/usr/bin/unshar
/usr/bin/uudecode
/usr/bin/uuencode
….
/usr/share/man/man5/uuencode.5.gz

The -i tells repoquery to provide the package information it can find. The -l is just to list all the files the package contains. In general, the arguments to repoquery are identical to those of rpm, with the exception that the -q from rpm is implied.

repoquery is like a Swiss Army knife for yum-based repositories. Here are a few quick examples using repoquery to learn about the packages in our repositories.

Find out what a package provides

stahnma@hu ~> repoquery --provides git
git = 1.7.1-3.el6_4.1
git(x86-64) = 1.7.1-3.el6_4.1
git-core = 1.7.1-3.el6_4.1
perl(Generators) = 1.00
perl(Generators::QMake) = 1.00
perl(Generators::Vcproj) = 1.00

Find out what a package obsoletes

stahnma@f3 ~> repoquery --obsoletes puppet
hiera-puppet < 1.0.0

Find out what conflicts with a package

stahnma@f3 ~> repoquery --conflicts firefox
xulrunner(x86-64) > 25.1

Figure out what source rpm a package was built from

stahnma@hu ~> repoquery --source perl-version
perl-5.10.1-136.el6.src.rpm

Now, let’s find everything that depends on system-release in any way.

stahnma@f3 ~> repoquery --whatrequires system-release
apt-0:0.5.15lorg3.95-7.git522.1.fc19.i686
apt-0:0.5.15lorg3.95-7.git522.1.fc19.x86_64
ovirt-node-0:2.6.0-1.fc19.noarch
ovirt-node-0:3.0.0-6.0.fc19.x86_64

Since system-release isn’t actually a package, what is providing it?

stahnma@f3 ~> repoquery --whatprovides system-release
generic-release-0:19-2.noarch
fedora-release-0:19-5.noarch

In addition to the above examples, repoquery has several more available options and questions it can help answer. However, there are some other tools that do more specific things than repoquery.

Sometimes, you want to look inside of a package. Normally you can do this from repoquery, but for some functions you just want it on disk. Check out yumdownloader for that. I often use yumdownloader with the --source option to download the source rpm because I plan to modify it and rebuild it.

stahnma@f3 ~> yumdownloader --source ruby
Enabling updates-source repository
...
ruby-2.0.0.247-15.fc19.src.rpm

If you’ve ever had to manage lots of EL systems, you find yourself needing to add on third-party repositories. I have often wanted to know where a package came from. Luckily, a few years ago, find-repos-of-install was written and added to yum-utils.

stahnma@hu ~> find-repos-of-install
...
xz-5.1.2-4alpha.fc19.x86_64 from repo koji-override-0
xz-devel-5.1.2-4alpha.fc19.x86_64 from repo fedora
xz-libs-5.1.2-4alpha.fc19.x86_64 from repo koji-override-0
ykpers-1.13.0-1.fc19.x86_64 from repo updates
…

In this case, I have fedora-updates-testing enabled and some custom koji (the Fedora build system) repositories.

Other times, you’ll want to play the role of an archaeologist and figure out how or why a package is installed on a system. Luckily, with the yumdb, you can figure that out pretty easily. yumdb does need to be run as root.

root@f3 ~ # yumdb get reason ack perl
ack-2.10-1.fc19.noarch
     reason = user
4:perl-5.16.3-266.fc19.x86_64
     reason = dep

This shows that ack was installed because it was explicitly called out by a user. perl, however, was pulled in as a dependency of something else that a user specified. If you’d like to see what else is in the yumdb for a package, try yumdb info <packagename>.

Have you ever noticed how, when you run a GUI on Linux, it alerts you after updates if you need to restart something or reboot? When running headless, you don’t have that nice reminder, but you do still have the logic behind it. After you perform updates, you might need to restart some daemons (or even the whole system). Enter the command needs-restarting.

The command provides a list of PIDs and program path names for processes that started before the components they rely upon were updated.

stahnma@f3 ~> sudo needs-restarting
153 : /usr/lib/systemd/systemd-journald
497 : sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
485 : sendmail: accepting connections
223 : /sbin/rsyslogd -n
208 : /sbin/auditd -n
466 : /sbin/agetty --noclear tty1 38400 linux
257 : avahi-daemon: chroot helper
1 : /usr/lib/systemd/systemd --system --deserialize 21
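When scripting around this output, the PID column is usually the part you want to hand to other tools. A minimal sketch, assuming the "PID : command line" layout shown above (the function name pids_needing_restart is hypothetical):

```shell
# Print just the PID column from `needs-restarting` output read on stdin.
# Lines look like "PID : command line"; anything else is ignored.
pids_needing_restart() {
  awk -F' : ' 'NF >= 2 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Sample lines copied from the output above, so this runs without yum-utils.
pids_needing_restart <<'EOF'
153 : /usr/lib/systemd/systemd-journald
497 : sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
223 : /sbin/rsyslogd -n
EOF
```

On a live system: sudo needs-restarting | pids_needing_restart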

We’ve dug into a few of the ways that some programs inside the yum-utils package can give you more insight into items on your local system. What if we start to look at the repositories as a whole?

Repository Tools

Now we can start to walk through analysis of the metadata and repositories. This can lead to knowing a whole lot more about how your systems are put together. Dependencies for yum are tracked in a directed (and normally acyclic) graph. If you’ve ever wanted to see the graph from a yum repository, you can. You can even see one of all repositories combined (but it will be HUGE and everything will point to glibc). Here you can see the graph of the puppetlabs-products repository found on http://yum.puppetlabs.com. In this use-case, I already have the puppetlabs-products repository set up on my system. To see what repositories are configured, try yum repolist.

stahnma@f3 ~> repo-graph --repoid=puppetlabs-products > repo.dot
stahnma@f3 ~> dot -Tpng repo.dot > repo.png
[Figure: dependency graph of the puppetlabs-products repository, rendered from repo.dot]

If you wish to run your own repositories, yum-utils can help with that as well. Beyond that, you’ll probably want to install createrepo to generate rpm metadata, and a web server.

To make a yum repository in its simplest form, you just need to do something like

cd /path/to/rpms; createrepo .

If that directory is exposed via http (or another protocol that Python’s urlgrabber supports), you should be good to go once you set up a client configuration file in /etc/yum.repos.d.
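As a rough illustration of such a client configuration file, here is a sketch. The repository id internal, the hostname repo.example.com, and the path are placeholders rather than anything from a real setup, and gpgcheck=0 is only reasonable for unsigned test packages. It writes to a scratch directory unless REPO_DIR is set.

```shell
# Write a minimal client-side repository definition.
# For real use, set REPO_DIR=/etc/yum.repos.d and run as root.
repo_dir="${REPO_DIR:-$(mktemp -d)}"

cat > "$repo_dir/internal.repo" <<'EOF'
[internal]
name=Internal packages
baseurl=http://repo.example.com/path/to/rpms/
enabled=1
gpgcheck=0
EOF

echo "wrote $repo_dir/internal.repo"
```

Once the file is in /etc/yum.repos.d, yum repolist should show the new repository alongside the others.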

yum-utils includes a few tools to make managing repositories much easier. As an administrator, you may wish to mirror an entire repository locally. To do this, you can use reposync.

Warning: If you run reposync with no arguments, it will pull down all the yum repositories you have enabled on your system and put them in the local directory. This is almost always not what you want.

In this case, let’s say I wanted to mirror puppetlabs-products from http://yum.puppetlabs.com.

stahnma@f3 ~> reposync --repoid=puppetlabs-products

This took about 20 seconds to download all of the Fedora 19 x86_64 rpms from the repository. reposync did not create the rpm metadata for me. I would have to do that manually or in a cron job. Also, if I wanted more than just fedora-19-x86_64, I would have to make a special configuration file for this mirroring.

For larger tasks and mirrors, making a configuration file is ideal. The configuration file needs to have a [main] section and define reposdir or else you’ll pull in the entire system’s yum configuration in addition to the repositories defined in your configuration file.

This configuration file will mirror EL6 and Fedora 19 x86_64. It also only pulls the newest items and does include source rpms.

# Configuration file found at https://gist.github.com/stahnma/7729972
stahnma@f3 ~> reposync -c ./puppetlabs.repo --newest-only -t --source
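The gist is not reproduced here, but a configuration of the shape described above might look like the following sketch. The repository ids and baseurl paths are assumptions for illustration, not the contents of the author’s file. Pointing reposdir at /dev/null is one common way to keep the system’s /etc/yum.repos.d out of the picture.

```shell
# Write a standalone mirroring configuration to a scratch directory.
conf_dir="${CONF_DIR:-$(mktemp -d)}"

cat > "$conf_dir/puppetlabs.repo" <<'EOF'
[main]
# Keep the system repositories out: only the sections below are used.
reposdir=/dev/null

# The ids and URLs below are illustrative assumptions.
[puppetlabs-products-el6-x86_64]
name=Puppet Labs Products EL 6 - x86_64
baseurl=http://yum.puppetlabs.com/el/6/products/x86_64/
enabled=1

[puppetlabs-products-f19-x86_64]
name=Puppet Labs Products Fedora 19 - x86_64
baseurl=http://yum.puppetlabs.com/fedora/f19/products/x86_64/
enabled=1
EOF

echo "wrote $conf_dir/puppetlabs.repo"
# Then, as in the command above:
#   reposync -c "$conf_dir/puppetlabs.repo" --newest-only -t --source
```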

If I’ve been operating a mirror or package repository for a while, there is a good chance that I have lots of older rpms around that will never get pulled in by yum anymore. What I’d like to do is remove old rpms if there are newer ones that replace them. yum-utils includes repomanage to do that. repomanage is able to list out the packages which won’t be utilized by yum because they have been replaced by newer versions of the package.

stahnma@f3 ~> repomanage -o . | xargs rm
stahnma@f3 ~> createrepo .
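Piping straight into rm is fine once you trust the list, but a dry run first is cheap insurance. A minimal sketch (prune_listed is a hypothetical helper, not part of yum-utils) that only prints what it would delete unless told otherwise:

```shell
# Read one path per line on stdin. By default, print what would be removed;
# with --delete as the first argument, actually remove the files.
prune_listed() {
  mode="${1:-}"
  while IFS= read -r path; do
    [ -f "$path" ] || continue          # skip blank lines and missing files
    if [ "$mode" = "--delete" ]; then
      rm -- "$path"
    else
      echo "would remove $path"
    fi
  done
}

# Demo on throwaway files, so the sketch runs without repomanage installed.
demo_dir=$(mktemp -d)
touch "$demo_dir/foo-1.0-1.noarch.rpm" "$demo_dir/foo-1.1-1.noarch.rpm"
printf '%s\n' "$demo_dir/foo-1.0-1.noarch.rpm" | prune_listed            # dry run
printf '%s\n' "$demo_dir/foo-1.0-1.noarch.rpm" | prune_listed --delete   # removes it
ls "$demo_dir"
```

In a real mirror directory this would be: repomanage -o . | prune_listed, then repomanage -o . | prune_listed --delete followed by createrepo .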

Now my mirror is eating a little less disk space, but it still takes up a lot of space and network bandwidth to get, especially on large repositories like EPEL or Fedora. Also, there are lots of packages in those repositories that I will likely never use. Luckily there is repotrack available.

repotrack takes a package as an argument and then will mirror that package and all its dependencies. Even the smallest package normally has several layers of indirect dependencies. For example, on a Fedora 19 system, perl has 38 and git has 170.

If I want to mirror a few packages (and all deps to be sure it installs cleanly from my repository) I would do something like this.

stahnma@f3 ~> repotrack perl git

I now have perl, git, and all of their dependencies downloaded into a single directory that I can use as a yum repository.

Now, over time, I will likely find myself adding packages, replacing packages, and trimming out older packages. With this type of customization (especially when running my own yum repositories), I will want to make sure the dependencies can still be resolved and I haven’t put my users in an impossible state. To solve this problem, there is repoclosure.

repoclosure, by default, will source yum.conf and see if I have any unmet dependencies in the repositories enabled on the system. I recommend having a custom configuration for this command, much like the one used in the reposync example. The main difference here is that I also want to include the base system repositories for a complete closure.

For example, take the puppet rpm. It depends on a few things in the puppetlabs yum repositories, but those, in turn, depend on items in the base repositories (such as ruby). Without having the base repositories specified in my configuration file, I would never get a clean closure of the puppetlabs-products-el-6-x86_64 repository.

# This configuration file is at https://gist.github.com/stahnma/7739439
stahnma@f3 ~> repoclosure -c el6 -t
Reading in repository metadata - please wait....
Checking Dependencies
Repos looked at: 4
   centos-base
   centos-updates
   puppetlabs-deps-el6-x86_64
   puppetlabs-products-el6-x86_64
Num Packages in Repos: 8295

We’ve seen a few of the helpers that yum-utils provides to make managing repositories a little easier. We can mirror entire repositories, track just a package and its dependencies, prune older content, and ensure all dependencies are resolvable given a known set of repositories. If you do all of these things, your custom yum repository will likely be in pretty good shape.

DNF - the future

If you haven’t already had enough fun learning about yum and its tool set, there’s a new tool on the horizon called dnf, with a longer-term goal of displacing yum. It’s a next-generation yum tool that is nearly 100% ABI/API compatible with yum. It is, however, faster. To accomplish this speed increase, it uses libsolv (of zypper fame, if you’re a SUSE fan) under the hood. dnf is available in Fedora 18 and higher, but not installed by default. You may wish to play with it if you want to see the future of where yum and repodata are headed.

Conclusion

I hope you enjoyed this quick walk through some of the fun and power you can have when working with yum and its associated repository information. Once you can easily query your package manager and repositories, you should be able to head down a path of better times, including automation, and rapid prototyping. Knowing this type of information is also critical when you start making your own packages and repositories.

Lastly, I’d like to thank Seth Vidal for his contributions to making us all better at managing our systems because of the tools he wrote. You are missed.

Further Information

3 comments :

d said...

Hey, Thanks for the nice post it's useful ..as you said you plaid a lot with rpms..i need some suggestion . i am looking for some tool using which application users can create their own rpm files ( they dont know how to create SPEC file ) .. also some tool where they can upload there rpms and install themselves .. similar to HP SA... currently we have HPSA but looking for some other similar option .. current number of rpms we are managing are approx 35k.. excluding OS rpms...

Chairman Alec said...

Have a look at http://search.cpan.org/~chipt/RPM-Specfile-1.51/lib/RPM/Specfile.pm if you have people who can do perl

john said...

It is also worth noting in the article that "yum updateinfo" isn't a default command and you need to install the yum-plugin-security package to enable that command.