Wow… Seriously awesome!

For experiment’s sake, I decided to duplicate my newly created RHEL session using ZFS.  My Fedora and RHEL virtual instances each live on their own zvol.  As you can see below, each VM has a zvol named disk0:

# zfs list -r local
NAME                                    USED  AVAIL  REFER  MOUNTPOINT
local                                   196G   717G    31K  /local
local/shared                            196G   717G   172G  /shared
local/shared/Virtual                   24.2G   717G    33K  /shared/Virtual
local/shared/Virtual/ISO               3.56G   717G  3.56G  /shared/Virtual/ISO
local/shared/Virtual/KVM               20.6G   717G    36K  /shared/Virtual/KVM
local/shared/Virtual/KVM/fedora        10.3G   717G    31K  /shared/Virtual/KVM/fedora
local/shared/Virtual/KVM/fedora/disk0  10.3G   724G  2.94G  -
local/shared/Virtual/KVM/rhel          10.3G   717G    31K  /shared/Virtual/KVM/rhel
local/shared/Virtual/KVM/rhel/disk0    10.3G   724G  2.89G  -

To duplicate the RHEL virtual machine, I first created a snapshot of the volume while the RHEL VM was booted.

# zfs snapshot local/shared/Virtual/KVM/rhel/disk0@booted

I then cloned the RHEL VM to a new zvol:

# zfs clone -p local/shared/Virtual/KVM/rhel/disk0@booted local/shared/Virtual/KVM/rhel2/disk0
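The snapshot and clone steps above can be wrapped in a small helper. This is only a sketch: the function name and its arguments are hypothetical, and it prints the commands rather than running them so you can check the dataset paths first. KVM_ROOT assumes the dataset layout from the listing above.

```shell
#!/bin/sh
# Print (do not run) the zfs commands that duplicate a VM's disk0 zvol.
# KVM_ROOT assumes the dataset layout shown in the zfs list output;
# clone_vm_cmds and its arguments are hypothetical names.
KVM_ROOT="local/shared/Virtual/KVM"

clone_vm_cmds() {
    src_vm=$1   # existing VM, e.g. rhel
    new_vm=$2   # name for the copy, e.g. rhel2
    snap=$3     # snapshot name, e.g. booted
    echo "zfs snapshot ${KVM_ROOT}/${src_vm}/disk0@${snap}"
    # -p creates the missing parent dataset (rhel2) on the way
    echo "zfs clone -p ${KVM_ROOT}/${src_vm}/disk0@${snap} ${KVM_ROOT}/${new_vm}/disk0"
}

clone_vm_cmds rhel rhel2 booted
```

Once the output looks right, it can be piped to sh as root.  Keep in mind that a snapshot taken while the guest is running captures the disk as if the power had been pulled, so the clone will boot like a machine recovering from an unclean shutdown.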

With the new cloned zvol, rhel2, the local zpool now looks like this:

# zfs list -r local
NAME                                    USED  AVAIL  REFER  MOUNTPOINT
local                                   199G   714G    31K  /local
local/shared                            199G   714G   172G  /shared
local/shared/Virtual                   27.1G   714G    33K  /shared/Virtual
local/shared/Virtual/ISO               3.56G   714G  3.56G  /shared/Virtual/ISO
local/shared/Virtual/KVM               23.5G   714G    39K  /shared/Virtual/KVM
local/shared/Virtual/KVM/fedora        10.3G   714G    31K  /shared/Virtual/KVM/fedora
local/shared/Virtual/KVM/fedora/disk0  10.3G   721G  3.04G  -
local/shared/Virtual/KVM/rhel          13.2G   714G    31K  /shared/Virtual/KVM/rhel
local/shared/Virtual/KVM/rhel/disk0    13.2G   724G  2.89G  -
local/shared/Virtual/KVM/rhel2         30.1M   714G    31K  /shared/Virtual/KVM/rhel2
local/shared/Virtual/KVM/rhel2/disk0   30.0M   714G  2.89G  -

I modified my kvm start script with a new MAC address and vnic and booted the new virtual machine.
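My start script isn’t shown here, but one way to come up with a fresh MAC address for the clone is sketched below.  The gen_mac name is made up for illustration, and the 52:54:00 prefix is simply the locally administered prefix conventionally used for QEMU/KVM guests, not something my setup requires.

```shell
#!/bin/sh
# Generate a MAC address for a new guest: the conventional QEMU/KVM
# prefix 52:54:00 followed by three random bytes.
# gen_mac is a hypothetical helper name.
gen_mac() {
    # od prints the 3 bytes as " a1 b2 c3"; turn the spaces into colons
    # and trim any colon left at either end
    suffix=$(od -An -N3 -tx1 /dev/urandom | tr -s ' ' ':' | sed 's/^://; s/:$//')
    printf '52:54:00:%s\n' "$suffix"
}

gen_mac
```

Whatever address you pick, it only needs to be unique among the guests sharing the same network.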

Duplicated RHEL

I created and booted a completely new RHEL VM instance in literally less than a minute!

Posted in computers, openindiana | 1 Comment

Installing RHEL 5.5 and Fedora 13 in OpenIndiana 151a KVM

KVM is really quick on my VT-enabled dual Xeon 5506 system running OpenIndiana 151a. I have to say this is very nice work by the OpenIndiana / Joyent / Illumos teams!  Here’s a screenshot of RHEL 5.5 and Fedora 13 installing at the same time:

Installing RHEL and Fedora with OI-151a KVM

Running RHEL and Fedora with OI-151a KVM

Instructions on using KVM in OpenIndiana 151a can be found here.  One caveat: in order to use /usr/bin/qemu-kvm, you will need a modern Intel processor that supports the new VT extensions.  To find out if you have a compatible processor, first find your processor model (e.g. i5-2500K or E5506).  Use prtdiag on a Solaris derivative or dmesg on Linux to find your processor model, then look up the processor on Google.  You should find a processor marketing sheet from Intel.  If your processor supports the VT extensions, you will see this:

Intel VT is supported on this processor
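If you happen to have Linux booted on the machine, there is a shortcut that skips the search engine: the kernel lists the vmx flag in /proc/cpuinfo when the CPU advertises Intel VT-x.  The sketch below is Linux-only, and has_vmx is a hypothetical helper name.

```shell
#!/bin/sh
# Return success if a cpuinfo-style file lists the Intel VT-x flag (vmx).
# Defaults to /proc/cpuinfo, which exists on Linux only.
has_vmx() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -Eq '^flags[[:space:]]*:.*[[:space:]]vmx([[:space:]]|$)' "$cpuinfo" 2>/dev/null
}

if has_vmx; then
    echo "vmx flag found: the CPU advertises Intel VT-x"
else
    echo "no vmx flag found"
fi
```

The flag only tells you what the CPU advertises.  The firmware can still have VT switched off, so check the BIOS settings too if qemu-kvm refuses to start.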

Posted in computers, openindiana | Comments Off

Compiling Illumos in OpenIndiana 151

My primary Illumos build workstation is a new-ish Core i5-2500K.  To begin tinkering with the Illumos kernel, I installed OpenIndiana 151 using the dev-il repository.  The build process works great on OI-151 up to the point where you upgrade the system via the onu script.  I didn’t realize it was an issue until I ran into trouble after a reboot, when I went to test my modifications.  None of my changes showed up!

I am debugging a particularly nasty problem with my motherboard’s USB 3.0 hardware, and the simple functions I inserted into Illumos to help debug were not showing up in KMDB sessions.  I went searching the internet for a solution and found that the version change of the OpenIndiana dev-il repository conflicted with package builds in the Illumos internal repository.  I reinstalled with OpenIndiana 148 and the problem went away.

You can read the bug report here:  https://www.illumos.org/issues/1118

Posted in computers, illumos, openindiana | Comments Off

Stumbling Through KMDB with New Hardware

Recently, I refreshed my home workstation from an AMD X-2 based system with an ASUS A8N socket 939 motherboard to a brand spanking new Sandy Bridge Core i5-2500K based system with a Gigabyte P67A-UD3-B3 motherboard.  Debian Squeeze runs lightning fast!  However, I did not purchase this machine for the sole purpose of running wobbly windows in Debian.

I want to use this new machine to build and tinker with the latest Illumos releases.  So I popped in the latest OpenIndiana 151 beta install DVD and selected the graphical install mode from the OpenIndiana GRUB menu.  The OI kernel text displayed next and then… nothing.  The system was apparently getting stuck while attempting to load the kernel.

My next logical step was to fire up KMDB (Kernel Modular DeBugger), what else?!

middle of a kmdb session

Stepping through the kernel load functions was a little tedious as my KMDB skills are still rusty.  Since Illumos (the kernel OpenIndiana uses) is open source, I was able to follow along using the Illumos source code on my laptop without much trouble.

Soon, I discovered that the system appeared to be hanging while enumerating a PCI bus.  Specifically, the hang occurred in pci_autoconfig`enumerate_bus_devs.  The enumerate_bus_devs function is called several times during the PCI setup process and it wasn’t until the system attempted to use it in the second for block in the pci_setup_tree function (illumos-gate/usr/src/uts/intel/io/pci/pci_boot.c) that the kernel would lock up.

for (i = 1; i <= pci_bios_maxbus; i++) {
    if (pci_bus_res[i].dip == NULL) {
        pci_bus_res[i].root_addr = root_bus_addr++;
    }
    enumerate_bus_devs(i, CONFIG_INFO);

    /* add slot-names property for named pci hot-plug slots */
    add_bus_slot_names_prop(i);
}

I found that when i = 3, the kernel hung.  I’m not familiar with this part of the Illumos kernel (or any of it, really) but I was able to get a sense of what was going on despite my lack of specific knowledge.  I assumed that a device on the PCI bus was not getting along with Illumos and that finding and disabling the device (if possible) would allow me to boot an Illumos-based kernel successfully (Joyent’s SmartOS hangs in a similar place).

My first thought was the ATI Radeon 5750 graphics card was the culprit.  I wanted to try something a little easier than tearing my machine apart and inserting an NVIDIA based graphics card first.  So I rebooted and entered the BIOS.  It was then I had a brainwave.

“USB 3.0?”

Sure enough, I disabled USB 3.0 on my motherboard and booted straight into the OpenIndiana install without a hitch.

Now that I have a valid build environment, I’m off to find out why enabling USB 3.0 on my mobo was causing Illumos so many fits.

Posted in computers, illumos | Comments Off

“Wait… is that billions?”

Ironically, after yesterday’s post about Oracle’s cavalier attitude toward dealing with communities built around open software, a piece of dyed-in-the-wool proprietary software stepped out of Oracle’s (ever growing) stable of software, neighed a couple of times, stamped its feet and kicked me in the head.  Specifically, one of our 11g database servers shot up to one-minute load averages a little above 2,000.  Yes, two zero zero zero.

Despite the comically high load, our M5000 server running Solaris 10 withstood the onslaught and the system was still quite usable.  A testament to the engineering teams at Sun if I ever saw one.  A quick jump to the old performance standbys (mpstat, iostat, prstat and vmstat) revealed many involuntary context switches and a run queue in the thousands.  My DBA colleague and I went about the business of diagnosing the problem and he promptly ran an AWR report.  Below is what we found.

Oracle 11g AWR Report

"Wait, is that billions?"

“Wait, is that billions?” I asked naively.

“Yeah… something’s wrong,” my DBA colleague replied in his usual calm, understated way.

Oracle Support confirmed that, yes, four billion mutex waits in a span of an hour appeared to be the cause of our pain.  Luckily for us, this is not undiscovered country.  A quick Google search later revealed that 11g is notorious for this particular type of pain.  The fix, of course, was a patch.  Specifically, 10411618: Add different wait schemes for mutex waits.

Annoyingly, had Oracle been gracious enough to add DTrace probes to their enterprise products, we could’ve saved a lot of heartache with a one-liner.

Also, here’s a good intro to Oracle DB mutexes and latches.

Posted in computers, solaris | Comments Off

Oracle-ish Anxiety

For the past year, I have suffered from what I’ve churlishly dubbed Oracle-ish Anxiety.  It’s the feeling that at any moment, Oracle may greedily yank a technology that I depend upon, kicking and screaming, from the clutches of the public domain.

OpenSolaris was the first to go to the chopping block.  No, chopping block is the wrong metaphor.  Void is more like it.  No more OSUG meetings, no more t-shirts with OpenSolaris logos, no more following online wars on topics such as governance bodies and community participation (online wars aren’t necessarily bad, they occur because people care).  The hole OpenSolaris left is raw and only Oracle remains.  A hard pill to swallow, to say the least.

Honestly, it makes me a little sad.  Whether they meant to or not, Oracle killed a vibrant community that, while smaller than the Linux cowboys, was growing a following of rabid devotees.  Strangely, OpenSolaris followers’ devotion was driven by rather boring technologies that stood in stark contrast to the whims of a fanboy (I refuse to use an i, I absolutely refuse) nation.  DTrace just isn’t sexy compared to jiggly desktops.

Then the page at midnight comes.  No one in your house is safe from an angry pager.  Partners threaten to launch the pagers into walls, babies wake up and cry, and even the dogs regard you with disdain (until you give them a treat).  After surviving withering stares from tired family members, you hobble half awake to a terminal, turn off the jiggly windows, and SSH into work.  To your dismay, an important database server has crashed and important folks all over the company want to know what you’re going to do about it.

Suddenly, in a moment of bleary-eyed recognition, DTrace, ZFS, and SMF become very sexy.  The moment these technologies save your bacon is the moment you become a devotee.  You realize that had you been dealing with Windows, Linux, or another *NIX, your goose would be proverbially cooked.  That just puts icing on the cake.

As an Oracle customer responsible for large systems that rely heavily on technologies in Solaris 10 (especially ZFS), I have a nagging feeling that as much as Oracle knows how to make a dollar, they don’t know how to capitalize on the moment that turns people into devotees.  And that, my friends, is what sets my Oracle-ish Anxiety a rock’n.

Posted in computers, solaris | 3 Comments

Distributed Version Control FTW

Linus Torvalds gives a good talk on the evilness of Subversion and CVS and the benefits of Distributed Version Control Systems (DVCS).  It’s pretty interesting.  I prefer Mercurial over Git myself, but the idea is the same.  Having a copy of the entire repository on your local system is a much better way to get distributed work done.

Torvalds on Git:  http://www.youtube.com/watch?v=4XpnKHJAok8

Joel Spolsky on Mercurial:  http://hginit.com/index.html

Posted in computers, development | Comments Off