[comp.unix.aix] IBM support

grboyce@rodan.acs.syr.edu (George Robert Boyce) (04/18/91)

In trying to add a 3rd party scsi disk to my RS6000/530 server (BTW,
why does IBM support make us call it a 7013?), I ran into two small
problems.

The first was that one of the commands, I forget which, forks a copy
of "mkboot" and I had my own copy of such a program which was found in
my path ahead of /etc/mkboot. My program of the same name, needless to
say, didn't do the expected thing and the command seemed to hang.
Before I knew this cause of the problem, I had decided that I needed
IBM software support since their procedure to add a 3rd party scsi
disk seemed to be failing. I was eager to test out IBM's support, and
IBM support for 3rd party hardware.
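
For anyone hitting the same thing, here is a minimal sketch of how one might check for this kind of PATH shadowing before blaming the procedure. It is written in Python purely for illustration (the original post contains no code), and the helper names `find_on_path` and `is_shadowed` are hypothetical. It only mimics the usual left-to-right PATH search; it does not know which AIX command actually forks "mkboot".

```python
import os

def find_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches

def is_shadowed(name, expected, path=None):
    """True if the first `name` found on the path is not the `expected` file."""
    matches = find_on_path(name, path)
    if not matches:
        return False
    return os.path.realpath(matches[0]) != os.path.realpath(expected)
```

Calling `is_shadowed("mkboot", "/etc/mkboot")` from the account that runs the disk commands would flag a private copy sitting earlier in the path, which is the situation described above.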

That was on Friday morning and I wanted to get this resolved before
the weekend. But since I had followed comp.unix.aix and had called IBM
software support directly in the past, I knew the procedure was to
call my local SE first. I could have told him over the phone that
"lcreatevg" was failing, and I could have read or faxed him the error
message. But he insisted on coming out to help, on Monday. Fine...

So on Monday my SE arrives, we start from scratch and after two or
three commands we run into the problem, he records the error message
and agrees we should call software support. Level one support wasn't
of much help but they did suggest that we reboot the system and see if
that helped. It *seemed* like a reasonable suggestion so that is what
we did. Enter problem number two...

It seems that something (maybe me) had trashed the boot block, err
boot logical volume, of the system disk and the system would not
reboot. This was obvious to me and it seemed to me that there should
be a software solution to this new, more serious, problem. But level
one software support, and my SE, said we had a hardware problem. This
was despite the fact that I could boot the maintenance disks and mount
the system disk and play around without any problems.

Ok, so now I get to call hardware support, report the problem, and
they dispatch a local HW engineer to deal with the problem. A few
hours later, he shows up and we try to run the HW diagnostics. I
offered to run them hours earlier, but my SE seemed to insist that we
let the HW guy do it. His first question when he arrived was, "So, you
run diagnostics yet?". Sigh. Well, the diags run just fine (as I
expected) and so he now calls level one hardware support. We all guess
their answer and sure enough, they say to reload the system. We say,
that is unacceptable and the call gets bumped up to level two hardware
support.

We play around a couple more hours, including trying to boot
diagnostics from the internal disk. We get the same errors as when
we try to boot AIX, which seems to confirm, to me, that the boot
logical volume is messed up. It seems to confirm to level two hardware
support that we need to reload the system.

After insisting that reloading the system was not a valid option, and
hardware support insisting that there was no hardware problem, we get
the call transferred to level two software support. Once connected, we
got the magic commands needed to fix the problem. A third problem came
up; I was using an old set of maintenance disks and the instructions
didn't work. The level two support person was able to recognize my
error, and correct it and the whole procedure took 15 minutes. 

15 minutes is a damn good time for any support call and I was very
happy. But I am still wondering how to cut down on the *six hours* it
takes to get to the right support person.

On 4/9/91, Pierre Asselin wrote

> General conclusions from earlier exercises:
> 
>  o  Software Defect Support is officially limited to its narrow mandate.
>  o  Technical support is available for the RISC-6000's.
>     It's called comp.unix.aix.
>  o  Accurate information on the RISC-6000's is available, but only
>     on comp.unix.aix.
>  o  Accurate information on the IBM support structure is available,
>     but only on comp.unix.aix.
>  o  To this day, IBM is convinced that it's doing a fine job.
>  o  Hardware support does work.  Beats me.

I have to argue that level two software support knows their stuff. The
problem then is that IBM has a level one support system in place (a)
to protect the valuable and expensive resources of level two by (b)
answering the easy questions. I would argue that level one does
half of their job. They do a heck of a job of protecting the level two
folks.

So that leaves us with comp.unix.aix for level one support, and a good
but well protected level two support. It could be worse. I think there
are also other possible solutions to this situation. We could try to
convince IBM that (a) they have a support problem and (b) that it is a
serious problem. That seems like it could be a lot of work and we
haven't even solved the problem yet, just made IBM recognize it.

My own opinion is that IBM should subcontract level one support to
local and regional support service companies and provide all the
necessary support to make it work. But then I've just formed such a
company so my opinion is biased. Regardless, I am calling my local
office right now to suggest it...

George
--
George R. Boyce, Manager, Systems Engineering Group, george@spica.npac.syr.edu
CASE: Computer Applications and Software Engineering Center
NPAC: Northeast Parallel Architectures Center
SCCS: Syracuse Center for Computational Science

And now also: The Computing Support Team

robin@pensoft.uucp (Robin Wilson) (04/19/91)

In article <1991Apr17.195425.8885@rodan.acs.syr.edu> grboyce@rodan.acs.syr.edu (George Robert Boyce) writes:
>In trying to add a 3rd party scsi disk to my RS6000/530 server (BTW,
>why does IBM support make us call it a 7013?), I ran into two small
>problems.

{stuff deleted}

>So on Monday my SE arrives, we start from scratch and after two or
>three commands we run into the problem, he records the error message
>and agrees we should call software support. Level one support wasn't
>of much help but they did suggest that we reboot the system and see if
>that helped. It *seemed* like a reasonable suggestion so that is what

You mean "Level 2" support.  LEVEL 1 SUPPORT IS ONLY THE "800" NUMBER!!!
THE LEVEL 1 PERSON MERELY TRANSFERS YOU TO LEVEL 2.  LEVEL 1 CAN (AT BEST)
ONLY GIVE YOU STATUS (BY READING THE PROBLEM RECORD TO YOU - THEY CAN'T
EVEN ATTEMPT TO INTERPRET IT FOR YOU).  Whenever you open a new problem,
you will provide the problem description to LEVEL 2.

>hardware support insisting that there was no hardware problem, we get
>the call transferred to level two software support. Once connected, we
>got the magic commands needed to fix the problem. A third problem came
>up; I was using an old set of maintenance disks and the instructions
>didn't work. The level two support person was able to recognize my
>error, and correct it and the whole procedure took 15 minutes. 
>
>15 minutes is a damn good time for any support call and I was very
>happy. But I am still wondering how to cut down on the *six hours* it
>takes to get to the right support person.

Sigh.  This is probably the fault of the original Level 2 person who took
your call.  (BTW, if, in fact, Level 1 advised you to do anything to your
machine, there is a serious problem with the support system... Since level 
1 people don't know anything about the machines -- they support ALL IBM 
SYSTEMS, it would be impossible for them to know anything about most of
them.)  

In this particular case, I would say that you got the exception rather than
the rule.  Try to remember that level 2 is training new people "ALL THE 
TIME".  While that doesn't excuse the fact that this person sent you on
a wild goose chase, it might explain it.  If the level 2 person seems to be
telling you something that doesn't make sense, and they cannot logically 
explain to you why it should make sense, then ask them to speak to a manager.
The manager will make sure that the "best" technical person for your problem
type will assist you.

>On 4/9/91, Pierre Asselin wrote
>
>> General conclusions from earlier exercises:
>> 
>>  o  Software Defect Support is officially limited to its narrow mandate.
>>  o  Technical support is available for the RISC-6000's.
>>     It's called comp.unix.aix.

It is actually available through the SE and IBMLINK.  And it is getting
better every day.  (But if nobody calls them, they won't get any better -
you have to exercise the horse if you want to win the race.)

>>  o  Accurate information on the RISC-6000's is available, but only
>>     on comp.unix.aix.

Clearly, this is not true.  The above case illustrates that.  Getting to
the right people may take a little effort, but it is there.

>>  o  Accurate information on the IBM support structure is available,
>>     but only on comp.unix.aix.

This is also not completely true.  If you ask the Level 2 manager to explain
it to you, they will provide as complete a description as IBM policies
allow.

>>  o  To this day, IBM is convinced that it's doing a fine job.
>>  o  Hardware support does work.  Beats me.
>
>I have to argue that level two software support knows their stuff. The
>problem then is that IBM has a level one support system in place (a)
>to protect the valuable and expensive resources of level two by (b)
>answering the easy questions. I would argue that level one does
>half of their job. They do a heck of a job of protecting the level two
>folks.

Since you really only got Level 2 support for all of your assistance, your
comments are a double edged sword.  Level 2 failed to provide the right
answer the first time, but did provide the right answer the second time.
Try to remember that Level 1 knows nothing about any of the IBM systems, 
and is not intended to answer any questions (at least for AIX support) -
they are merely data-entry operators for the level 2 people in Austin.

>So that leaves us with comp.unix.aix for level one support, and a good
>but well protected level two support. It could be worse. I think there
>are also other possible solutions to this situation. We could try to
>convince IBM that (a) they have a support problem and (b) that it is a
>serious problem. That seems like it could be a lot of work and we
>haven't even solved the problem yet, just made IBM recognize it.

Seems that IBM is working to solve this perception.  Training takes time.
The machine is not even 1 yr old yet (officially).  The people at level 2
who know the most have also been there the longest (by-and-large).

>My own opinion is that IBM should subcontract level one support to
>local and regional support service companies and provide all the
>necessary support to make it work. But then I've just formed such a
>company so my opinion is biased. Regardless, I am calling my local
>office right now to suggest it...

Dream on ;-).  They do "sub-contract" some of the level 2 and level 3 work,
but level 1 is only the phone answering service, so that will never change.


+-----------------------------------------------------------------------------+
|The views expressed herein, are the sole responsibility of the typist at hand|
+-----------------------------------------------------------------------------+
|UUCP:     pensoft!robin                                                      |
|USNail:   701 Canyon Bend Dr.                                                |
|          Pflugerville, TX  78660                                            |
|          Home: (512)251-6889      Work: (512)343-1111                       |
+-----------------------------------------------------------------------------+

eravin@panix.uucp (Ed Ravin) (04/19/91)

I've reported a grand total of two problems to IBM support so far.  Here
are the results:

#1) getty messes up permissions for bi-directional ports.  After crawling up
the chain of IBM support staff, I was told they'd call me back.  They did, and
said they had no further information.  I then went back to this newsgroup,
where I found a few other people with the same problem and one IBM person on
the newsgroup gave me a fix number. I called up IBM again to ask for the fix,
and after a few days I received the diskette.  But it's clear that the "fix"
isn't really a fix, and only defers the problem (changing one kind of security
hole into another).  Also, the "fix list" in /usr/lpp/bos INCLUDES the APAR
number that I've got on the diskette that's supposed to be an update!  Sorry,
IBM, I don't trust this update disk and I think I'll live without it.

#2) sna services attachments hang up and exit on certain line events (like
FRMR or loss of DSR) and I think they shouldn't be doing that.  One call
to IBM so far got my problem reported.  They said they'd call me back when
they knew something.  Haven't heard from them yet.  My officemate called and
they offered to send us two updates to SNA Services.  But we've been through
that with the PC RT, and we no longer install updates like that unless the
support staff can explain to us why we need it.

I also asked for help over comp.unix.aix, and I got exactly one response, in
email.  It was from a guy working at Rabbit, telling me to buy my sna
software from a company that supports their products.

Final score: IBM 0, Usenet 1.

At other companies whose support I've dealt with in recent years, the first
person you talk to has at least enough understanding about computers to
answer the FAQs and direct you to someone more technical or more specialized
if your problem is beyond him/her.  At IBM, they insist on having the least
qualified people possible answer the phone first, so you have to
fight your way up the chain to talk to someone who might be able to answer
your questions instead of being just an "information operator" who can look
up keywords or fix numbers in a database and read back what shows up on his/her
screen.

-- 
Ed Ravin            | This random number tells the computer that you are 
cmcl2!panix!eravin  | a member in good standing.  It is not related to your
philabs!trintex!elr | membership number.   --- Sierra Club