PowerCLI and the Prompt

As a Linux administrator for the last 10 or so years, I have become pretty accustomed to the way my command prompt is set up.  Basically, it contains the hostname of the machine and the current directory I am in:

 

(jefferson:~) %

 

Over the last week or so I have been playing around with VMware’s PowerCLI as a way to simplify my host deployment. While not having the hostname in the prompt does not bother me, the one thing that has been really driving me crazy is the length of the command prompt:

The prompt takes up nearly all of the screen.

To fix it, I needed to modify Initialize-PowerCLIEnvironment.ps1 in the C:\Program Files (x86)\VMware\Infrastructure\vSphere PowerCLI\Scripts directory.  I searched for the word prompt and found the global:prompt function. I commented out the (Get-Location).Path line and added a snippet that I found on Jason Williams’ blog.  I did have to make a slight modification to it, adding a \ to the Split calls.

 

# Modify the prompt function to change the console prompt.
# Save the previous function, to allow restoring it back.
$originalPromptFunction = $function:prompt
function global:prompt{

# change prompt text
Write-Host "$productShortName " -NoNewLine -foregroundcolor Green
#Write-Host ((Get-location).Path + ">") -NoNewLine
Write-Host ((get-location).Path.Split("\")[(get-location).Path.Split("\").Length -1] + ">") -NoNewLine
return " "
}

 

Now when I launch PowerCLI, I have something much more manageable and familiar.

UPDATE: @jakerobinson let me know I could do the same thing with:

Write-Host ((Get-location).Path.Split('\')[-1] + ">") -NoNewLine
Posted in VMware

Learning and Growing

I have been a SysAdmin in one form or another for almost 20 years. The first 10 years of my career I spent in the U.S. Army, managing personnel databases. Most of that time was spent on proprietary systems, but in the last couple of years we moved to SCO Unix and Informix databases, which is what prepared and propelled me on my path to being a Unix SysAdmin.

When I left the military for civilian life I was extremely lucky: I landed a job at a startup. It was just before the tech bubble burst so there was everything that embodied a tech startup going on from free soda to monthly parties to Aeron chairs. We blew through a lot of money and in the end were sold for not much, but aside from the silliness of a tech startup I LEARNED A LOT.

When I started they were only a few months old and not even incorporated yet. We built datacenters and infrastructure and applications. We developed plans and processes and strategies. Things that most of the places I have been since have already had (not to say that there is never room for improvement) we got to build from scratch. It wasn’t until we started our downward spiral and stopped innovating that I got bored and left the company.

Which turns out to be a pretty specific pattern throughout my career. Almost every single job I have left was because it no longer provided me a challenge or an opportunity to grow. Some places have provided me more than others, but eventually I get to a point where I have mastered the environment. So my choices are to be bored or move on.

I think that is one of the biggest issues with turnover for SysAdmins. I think that people who are drawn to this profession (at least the really good ones) are drawn to the ability to learn and grow. When we are in a place where that stops happening we become unhappy and either start looking for something new or at some point start looking into a different career field altogether.

Posted in Life

High Availability NFS using Red Hat Cluster Server

One of the first things that I had to tackle when I was hired into my current position was how our NFS servers were configured in our pre-production (dev, qa, staging) environments.  Many of the binaries and configuration files for the applications that we run were stored on a single Debian server and shared out to every host.  It worked ok for the most part, but if the server went down, so did everything else.

It was not much better in our production colos.  While we did have a backup filer (something we lacked in pre-prod), we would still have to fail that colo out until we could remount everything to the failover filer manually.

Since I had a good deal of experience with Red Hat Cluster, the first thing that we did in pre-prod was to build a GFS based NFS cluster.  The goal was to give the server a VIP that could float between the hosts and utilize the GFS filesystem mounted on both hosts.  The main reason for GFS was so that we could go active/active and segregate traffic between the two nodes.

After implementation, results were mixed.  On the plus side, if one of the servers died and it failed over, it seemed to move over to the new host without the filesystem going stale.  On the bad side, heavy writes could bring the server to its knees and even occasionally make the filesystem unreadable to the host. Another problem that we had was with rsync.  Apparently GFS and rsync don’t play very well together.

After a few more experiments and some trial and error, we stuck with using Red Hat Cluster but moved to an active/passive system with an ext4 filesystem that was only mounted on one cluster node at a time.  This improved our rsync performance dramatically and resolved our performance issues almost completely.

I am still working on trying to eliminate the need for an NFS filer, or at least greatly reduce it.  Until then I am content that if one of my NFS nodes goes down, I don’t have to spend 3 hours doing “umount -l /mnt/XXX” commands.
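
For the days when the manual cleanup is still needed, here is a minimal sketch of how the list of lazy unmounts could be generated rather than typed by hand. The nfs_mountpoints helper name is made up for illustration, and the loop only prints the commands (a dry run) so nothing is unmounted by accident.

```shell
# Hypothetical helper: print the mount point of every NFS mount listed on stdin.
# Input is expected in /proc/mounts format: device mountpoint fstype options ...
nfs_mountpoints() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }'
}

# Dry run: print the lazy-unmount command for each NFS mount on this host.
if [ -r /proc/mounts ]; then
    nfs_mountpoints < /proc/mounts | while read -r mp; do
        echo "umount -l $mp"
    done
fi
```

Reviewing the printed list before running any of it keeps a healthy mount from being detached along with the stale ones.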

Posted in Linux, Operations

Finally Friday

This week has been a bear.  The Oracle Training class that I took was a little more on the boring side than I generally would have liked.  There was some good information in the class but 5 days of blah blah blah to get a few nuggets of good information would not have been worth it if I had had to pay for it.  This whole past month has been pretty busy and I am not sure when the last time was that I didn’t have to do some bit of work over the weekend.

Not this weekend, however.  I will be refraining from doing very much at all of a technological nature and will instead focus on spending time with family.  Tomorrow I plan to take my wife out kayak fishing and then we are spending Easter with my brother and his family in Temple just relaxing and perhaps enjoying an adult ginger-ale.

Too bad there isn’t football.

Posted in Miscellaneous

Moving Colos

One of the reasons that I have not been doing much out here over the last month is that we have been working on moving one of our Colocations from our hosting provider back to our office. Things have not been going as well with the migration as we had hoped. It has been postponed twice already, and it is looking like it is going to get postponed again.

The first two postponements were due to site stability issues. Some of the more recent code that went out caused some problems that would require us to flip between our two Colos. Since our second Colo will be down for a week or two during the migration, there is a lot of concern about our exposure, so until the code was stabilized we had to postpone.

After our last launch our site seemed to stabilize and we were looking to do a migration next weekend. Unfortunately, it looks like we might have to postpone it yet again. The big problem now is that if we don’t move it, the fault will fall squarely on the Systems team. It appears that we don’t have enough AC to cover all of the new equipment. We currently have about 40 tons of cooling in our office datacenter and are using ~36 tons already.

I brought up my concern regarding power and cooling early on when we were discussing the possibility of moving the Colo. I was concerned that much of the savings that we would see would not be true savings since we would still have to power and cool the equipment here. The concern was brushed aside primarily because even if it cost the exact same to host it in the office vs. our hosting provider, it would fall under the facilities budget and not the IT budget. I let it go after that, (wrongly) assuming that it would be covered.

It was not until about two weeks ago that anybody thought to engage our facilities manager and figure out exactly where we are and what we would need. That answer came yesterday, and it was not very good news. We have ~4 tons available and will be moving back ~15 tons worth. Needless to say, corporate is not happy that we will likely be delaying this yet again.

Personally, I am not happy about us delaying this yet again. There are a number of projects that I would like to get working on, but this thing keeps hanging over my head.

Posted in Operations

Spring Changes

It happens every year.  Bonuses are paid in the middle of March and by the beginning of April we have our first round of resignations.  People off to find bigger and better.  Some people’s definition of bigger and better amuses me, but what can you do.  So far we have two pretty high-profile people leaving, our lead DBA and our Sr. Architect.

Personally and professionally, I am happy that our lead DBA is leaving.  I don’t think he has been happy for the last 8 months and I think that the only reason that he has stuck around this long was to get the bonus.  He has become a cancer not only to his team but also to the organization as a whole.  I think it is telling when the inside joke around the organization is how little he communicates.  In the interest of full disclosure, we didn’t like each other very much personally lately.

Our Sr. Architect is much more of a loss than our DBA is.  He is a good guy that everybody likes.  What I wonder about is what they are going to do to replace that position.  Knowing all the key players as I do, I am not sure that we have the talent to replace him and I worry that someone on the approved list will be placed in that position as a figure-head more than anything else.

We always see the grass as greener on the other side, but sometimes we get over there and we find out that the “grass” is just concrete painted green.  Hopefully they both find happiness in their new endeavours.

Posted in Uncategorized

Teaching an Old Dog

One of the things that I really like about what I do is that there is always something new to learn.  I’m not just talking about new technologies or new paradigms, but also new ways to do old things.  For example, coming from the Unix world, I have always used ‘ifconfig’ to view interface information.  It wasn’t until I started running Red Hat clusters that I even used the ‘ip’ command, which is supposed to be the “right” way to view interface information.
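
For anyone making the same switch, here is a rough side-by-side sketch. The show_interfaces helper name is made up for illustration; it prefers ‘ip’ and falls back to ‘ifconfig’ on hosts that only have the older tool.

```shell
# Hypothetical helper: list interface addresses with whichever tool is installed.
show_interfaces() {
    if command -v ip >/dev/null 2>&1; then
        ip addr show      # iproute2: addresses for every interface
    elif command -v ifconfig >/dev/null 2>&1; then
        ifconfig -a       # net-tools: the older equivalent
    else
        echo "neither ip nor ifconfig is installed" >&2
        return 1
    fi
}
```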

Another command that I had never really used until I started working in my current job was the ‘screen’ command.  An extremely useful command to be sure, I had just never heard of it until one of my co-workers had actually used it in his documentation.  Now it is one of the most indispensable commands I know and I use it constantly.   Before I learned about ‘screen’ I always had to have a desktop just so I could have a way for long running operations to continue when it was time to go home.
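
A minimal sketch of that workflow, assuming ‘screen’ is installed. The run_detached helper and the session name are made up for illustration.

```shell
# Hypothetical helper: start a command in a detached, named screen session
# so it keeps running after the login shell exits.
run_detached() {
    name=$1
    shift
    screen -dmS "$name" "$@"
}

# Example (not run here): run_detached nightly-copy rsync -a /src/ /dest/
# Later, from any login on the same host:
#   screen -ls               # list running sessions
#   screen -r nightly-copy   # reattach; Ctrl-a d detaches again
```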

Today, I learned about the ‘w’ command from the Nubby Admin.  For most of my SysAdmin career I used the ‘who’ command to tell me who was on the system and how long they had been idle.  The ‘w’ command is nice since it gives you the actual idle time, not just the last time the session was active like ‘who’ (basically, I don’t have to do math).  It also appeals to my SysAdmin nature, since I get to type two fewer characters. Kind of a win/win.
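
For a quick look, a minimal sketch. The show_idle name is made up, and the -h flag (suppress the header line) assumes the procps version of ‘w’ found on most Linux distributions.

```shell
# Hypothetical helper: print logged-in users with their idle time.
show_idle() {
    if command -v w >/dev/null 2>&1; then
        w -h    # one line per session; the IDLE column is elapsed idle time
    else
        echo "w is not installed" >&2
        return 1
    fi
}
```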

So cheers to Nubby for teaching an Old Dog a new trick.

Posted in Miscellaneous

Keeping Track

One of the bigger challenges that I face is keeping track of all of the commitments that I have.  Besides being a Linux Admin Extraordinaire, I am also very active with veterans’ groups (being that I am one) and am the Scoutmaster for my son’s Boy Scout Troop.  While these things keep me satisfied, they also keep me extremely busy.  Time management is one of my top priorities.

I have read a bunch of different books on time management, including Getting Things Done, The 7 Habits of Highly Effective People, and of course Time Management for System Administrators.  I have found, like many people before I am sure, that while no one system truly works for me, I have been able to cobble my own working system together from the thoughts and ideas from these as well as many other sources.

What it comes down to, and I think that most of the systems out there subscribe to this, is that you need to get everything out of your head.  The only real difference between the methodologies is the trusted system that you use to capture the information and the process you use to determine your order of operations.

I use a number of tools to help me manage what I have to do.  I recently found Trello, which I have found invaluable for managing some of my bigger projects.  Trello allows me to share my projects with other people as well, giving me the ability to assign tasks to others.  I have also recently started using Toodledo for my task manager.  I had been using Remember the Milk for the last few years, but the ability for Toodledo to do sub-tasks has compelled me to give it a try.

Whatever system or tools you use for your time management, remember that the most important thing to do is get it out of your head.  I trust my brain for a lot of things, but by freeing it up from remembering the things I have to do, there is much more room available for helping solve problems.

Posted in Philosophy

Being a DevOp Before Being a DevOp was Cool

The product team is offering a demo of our product today, and we started talking about the need for the new guy to go to get a better understanding of the application.  He made a comment that in his previous job, he would tell customers that he couldn’t help them because he didn’t know anything about the application.  That type of answer has never flown with me.

Through most of my career, I have managed web-based applications that are usually written in Java with a database on the back-end.  As a result, I have spent a lot of time not only learning various Java application (Tomcat, JBoss, Weblogic, etc.) and database (Oracle, MySQL, PostgreSQL) servers, but also on the web applications themselves.  I am familiar with the database internals, general usage patterns, common exceptions, and data issues.

My general feeling is that to be a good SysAdmin, you not only need to know what is going on with your servers, you need to understand it.  How can you tune your servers for optimal performance when you don’t understand what is going on with the application?

It is also important to understand what is going on when there are issues.  When a site issue occurs, who is the first person that is contacted?  The SysAdmin.  How many of us can get away with something like “The server is up, I’m good.  Call a developer or DBA”.  Probably not many. Many times we need to see what the system is doing when the issue happens, and it is much easier to recreate it ourselves rather than having to contact and coordinate with a product person or developer to do whatever application related stuff needs to be done to recreate the issue.

It is also needed for defense.  The only thing that gets blamed more than the system is the network.  Most of the time (see my earlier post regarding DB issues we have been having) we need to look more deeply at the system or else every time we have downtime it will be our fault.   It wasn’t that the code had an infinite loop, it was that the system ran out of memory.

I didn’t set out to be a DevOp, and I don’t think that the term even existed when I started down the path that has brought me to where I am today.  Instead, it was logic, common sense, and self-preservation that drew me to learn everything about the entire tech stack that we run.  This way, I can help find the real problem, not just point fingers.

Posted in Operations, Philosophy

Fighting the Blame Game

We had a pretty significant event happen yesterday with both of our database clusters.  By default, our Sr. Database Admin blames the hardware.  Period.  In this case, he saw the following in the alert log on one of the servers:

ORA-27090: Unable to reserve kernel resources for asynchronous disk I/O
Linux-x86_64 Error: 11: Resource temporarily unavailable

Now a quick Google of that error pulls up a lot of links that explain that the kernel parameter fs.aio-max-nr is set too low.  A quick check on the server shows not only that this is the case but also that the current count, fs.aio-nr, is bumping up against that limit:

[root@ora1 trace]# sysctl -a|grep aio
fs.aio-max-nr = 65536
fs.aio-nr = 65520

Instead, our DBA has decided that the issue is with the SAN.  It does not matter that no other host has seen any type of issue with connected storage.  He saw I/O in the alert log and as far as he was concerned, that was all the information he needed to blame the underlying hardware.
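
The check against the kernel limit is easy to script. Here is a minimal sketch, with a made-up aio_usage_pct helper, that reports how close fs.aio-nr is to fs.aio-max-nr. The value in the final comment is only an example and should be chosen for the workload.

```shell
# Hypothetical helper: percentage of the async I/O request limit in use.
# Arguments: current count (fs.aio-nr) and limit (fs.aio-max-nr).
aio_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Read the live values from /proc when available (Linux only).
if [ -r /proc/sys/fs/aio-nr ] && [ -r /proc/sys/fs/aio-max-nr ]; then
    nr=$(cat /proc/sys/fs/aio-nr)
    max=$(cat /proc/sys/fs/aio-max-nr)
    echo "aio usage: ${nr}/${max} ($(aio_usage_pct "$nr" "$max")%)"
fi

# To raise the limit at runtime (example value, an assumption):
#   sysctl -w fs.aio-max-nr=1048576
```

With the numbers from the server above, aio_usage_pct 65520 65536 prints 99, which points at the kernel setting rather than the SAN.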

The problem is that when you automatically make up your mind as to what the problem is, the problem will never get resolved.  Pointing the finger at somebody else is one of the leading indicators of a bad Admin as far as I am concerned.  It’s kind of like being a parent and thinking that your kid is an angel and never does anything wrong.  It can be extremely counter-productive if you don’t happen to have any other resources that have the ability to dig into the problem area.

When troubleshooting an issue that crosses functional boundaries, try to resist the urge to assume that the area you manage is without fault.  Finding the solution should be the ultimate goal, not affixing blame.

Posted in Operations, Pet Peeves