SV650.org - SV650 & Gladius 650 Forum



Idle Banter For non SV and non bike related chat (and the odd bit of humour - but if any post isn't suitable it'll get deleted real quick).
There's also a "U" rating so please respect this. Newbies can also say "hello" here too.

Old 15-11-06, 07:10 PM   #11
tricky
Guest
 
Posts: n/a
Default

I hate computers.
We've had some database "issues" over the past few days.

Unfortunately, just before it all went tits-up, the outsource boys in India made an undocumented and unapproved change (for which they received a proper bollocking from me).

Anyway, the application support bods got wind of this and of course this had to be the cause of the incident. (Of course it could never be the prehistoric version of Oracle or the ****ty application causing a problem :P )

I tried to explain that me putting the kettle on in Nottingham and causing a power spike in the data centre 50 miles away was just as likely to have caused the problem as the Mumbai lads adding an extra line in syslog.conf, but they wouldn't have it.

What are the sentencing guidelines for pre-meditated ABH ?

Sorry for the derail/rant.
Old 16-11-06, 10:02 AM   #12
Baph
Guest
 
Posts: n/a
Default

OK, for the geek contingent on the site, the issue is now fixed (I think) so the system is running. It's far from perfect, but it's running.

Just a bit of background information first, I think. Unfortunately I'm tied to an NDA, so I'll have to keep details vague. Of the two clients affected yesterday, one is a large car manufacturer (quite expensive cars too), the other is a worldwide distribution/logistics company.

For the car manufacturer, our applications control everything from production line robotics right the way through their own distribution & GPS tracking systems. Because they had downtime, and I mean, complete downtime, everything was powered down for safety reasons (ever seen a robotic rivet gun getting fed random data when there's a chance for people to be nearby doing inspections? NOT GOOD).

For the logistics company, all their route planning systems, GPS tracking, automated warehousing, the lot, down. This meant that everything for them had to revert to manual, which isn't quite as big a deal as the car manufacturer, but still slows down operations.

fizzwheel, I understand fully what you were getting at. For the car manufacturer, I simply shut down the servers. They have to power down their hardware anyway, so what's the point of having servers commanding hardware that has no power? Also, two clients down, which one to fix first? I can't be on two systems at the same time. The logistics company, well, it works out that if they have to stop operations completely, they lose just short of £41,700 every minute! Crippled systems mean they lose less money; yes, they still lose it, but not quite as much.
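To put that per-minute figure in perspective, here is a rough back-of-the-envelope sketch. It treats £41,700 as an upper bound (the real figure is "just short of" it) and assumes the loss is constant per minute, which is a simplification. The function name `downtime_cost` is made up for illustration, and the 240-minute case corresponds to the four-hour critical SLA mentioned further down.

```python
# Upper bound from the post: "just short of £41,700 every minute".
LOSS_PER_MINUTE_GBP = 41_700


def downtime_cost(minutes: int, loss_per_minute: int = LOSS_PER_MINUTE_GBP) -> int:
    """Upper-bound cost of a full stoppage, assuming a constant loss per minute."""
    return minutes * loss_per_minute


print(f"One hour:   £{downtime_cost(60):,}")   # One hour:   £2,502,000
print(f"Four hours: £{downtime_cost(240):,}")  # Four hours: £10,008,000
```

So a complete stop is roughly £2.5 million an hour, which is why keeping a crippled system limping along was worth it.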

So the setup. Each client has 4 servers (at least). Database server, and 3 application servers. Of those application servers, 2 deal with users, the other automated processes (EDI stuff mainly). All of this runs on various Windows platforms (we don't care, it's up to them to choose), with IIS (yuck) and WebObjects (nice, but rather limiting). Our applications are written in Java, and interface like most other web based applications. User sees HTML, this is fed back through WebObjects to our application.

OK, so the steps to fix it were basically:
1) Get call from clients, have a brief look on both systems, realise that the car manufacturer has a potential to actually hurt people if it's left running, so shut that down. Screw the money they lose, policy states you don't put profit ahead of people, ever.
2) Investigate the logistics client a little more, and realise that this is a MAJOR issue. Call the client & get authorisation to "do whatever is needed to ensure productivity". Basically, I now have the green light to do everything up to and including re-imaging live servers in-situ. It's at this point that I turn around to my boss, explain what I see before me, and his response was literally "OK, you'd best deal with it" as he walked out of the office. He never came back, I've no idea where he went, and I really don't care. He can explain himself to those who pay his wages. It was at this point that I emailed him, CC'd the directors of our company & the client & basically said "This is what I plan on doing, if anyone has any objections I'm on extension 229, you've got 30 seconds before I start. If my first thoughts don't work, I've been left alone to deal with it, it'll be dealt with however I see fit."
3) Get a call from the client asking if I'm going to be able to fix this within critical SLA (4 hours). I explain that I'm not sure, as this is something no-one has ever encountered, and he says he'll call me back in an hour to get an update. If I'm still not sure, he'll arrange for transport to site (which means a private jet). At this point I call the Mrs, and let her know that I'm dealing with a problem, and its magnitude, and that I might not be home in the next month (if we go to site, we return whenever the client says so).
4) Seeing that the server is running, and I can keep a connection to it, but our application isn't talking to anything, either via IIS or standalone EDI file transaction, or our own port communications. Start scratching my head, decide to shut down the entire thing temporarily & restart it. Not having any of it.
5) Run some of our database integrity checking tools, which tell me that the DB has 'inconsistencies' (this could mean anything realistically), so drop to resilience & run the same scripts, same result. Bugger. Live DB, no backup system available. Oh ****.
6) OK, so shutdown & restart hasn't killed this thing. I've got no access to the applications. Need to start thinking now. I know nothing about it, so, I need to know something, how do I do that? Aha, I have a java compiler. So I quickly knock out a rough application to iterate the running processes, memory segments they reside in etc etc, run it on one of the servers & start ruling out legitimate services/applications.
7) Repeat 6, on another server, and cross-reference results. This leaves one standing out from the rest. I take this to be a virus (it might not be, but it damn well looks like it) so call a college friend of mine who just happens to work as a code monkey for an AV company. Again, I have to keep things vague with him, which doesn't help the situation.
8) Spend the next 10mins bouncing ideas off him, explaining that 2 clients and our own AV software haven't caught it. We come up with a plan to watch memory, live, to see what this thing is doing. At least we'll know more about it.
9) Damn, it seems there's a pattern to its processing. That's good. My friend emails me a modified Windows NTx kernel, which I send to one of the application servers and reboot it (there's another application server, so things will keep running). With the new kernel in place, I now have the power to overwrite any memory segments I choose (but so does the virus). So I quickly fire up a memory hex editor, and fill in a few NOPs. It takes a few attempts, but I manage to commit these NOPs at the right time, and kill the damn thing. (For those that know about it, I basically used a customized version of the old NOP sled technique from buffer overflow exploits.)
10) Restore the original kernel, reboot the server, test things (after locking it away from the rest of the LAN by firewall rules), good, she's running.
11) Repeat one server at a time, until the client is sorted out. Thank Allah for that!
12) Report back to the client, tell him to cancel my transport, and do everything above for the other client. By now my head hurts, a lot.
13) Everything fixed, everyone happy, I head home, not too late either
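
For anyone wondering what the "rule out the legitimate stuff, then cross-reference" part of steps 6 and 7 boils down to, here is a minimal sketch of the idea. This is not the actual tool from the post (that was a quick Java application, and its details aren't given). It assumes the process names have already been collected from each server, and the server names, process names and allowlist below are all made-up sample data.

```python
from typing import Dict, Iterable, List, Set


def unexplained(running: Iterable[str], allowed: Set[str]) -> Set[str]:
    """Process names (lower-cased) that are not on the allowlist."""
    return {name.lower() for name in running} - allowed


def cross_reference(snapshots: Dict[str, Iterable[str]],
                    known_good: Iterable[str]) -> List[str]:
    """Unexplained process names that show up on every server."""
    allowed = {name.lower() for name in known_good}
    leftovers = [unexplained(procs, allowed) for procs in snapshots.values()]
    if not leftovers:
        return []  # no snapshots, nothing to compare
    return sorted(set.intersection(*leftovers))


# Hypothetical sample data, purely for illustration.
KNOWN_GOOD = ["System", "svchost.exe", "w3wp.exe", "java.exe"]
SNAPSHOTS = {
    "app1": ["System", "svchost.exe", "java.exe", "updater.exe", "mystery.exe"],
    "app2": ["System", "svchost.exe", "w3wp.exe", "mystery.exe"],
}

print(cross_reference(SNAPSHOTS, KNOWN_GOOD))  # ['mystery.exe']
```

Anything unexplained on one box only (like `updater.exe` here) drops out, and whatever is left on all of them is the one worth staring at. Matching on names alone is a simplification; a real check would likely also look at paths, hashes or memory regions.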

Then I get a call from the logistics client at 10pm last night: "We've got the same issue again"... OH ****! So spend half the night fixing it again.

Now I have to fill in reports about what happened, and try to find out where this thing came from. My college friend will probably come in handy for that, and in reward, his company will get as much information as I can provide (without breaching NDA) about the attack.

I'm also going to spell out that I'd like a chocolate fireguard as my new boss when I file the reports. It'd have been much more useful!

So there you have it folks
Old 16-11-06, 10:15 AM   #13
21QUEST
Member
Mega Poster
 
Join Date: Aug 2003
Location: HomeBound
Posts: 3,302
Default

Quote:
Originally Posted by Baph
....Loads of stuff in a foreign language ....

So there you have it folks
I think blondy( http://forums.sv650.org/viewtopic.php?t=47889 ) was right, you geeks are not normal

Sounds like you did a job.


Cheers
Ben
__________________
Nemo me impune lacessit.
Quote:
Originally Posted by Lissa
Blue, mate, having read a lot of your stuff I'd say 'in your head' is unknown territory for most of us