f13.net

f13.net General Forums => MMOG Discussion => Topic started by: UnSub on November 12, 2011, 02:23:39 AM



Title: The MMO That Deleted Itself
Post by: UnSub on November 12, 2011, 02:23:39 AM
Japanese MMO M2 has had to shut down after the development team have been unable to get it to work following server maintenance. (http://igxpro.net/2011/11/10/hangame-brings-sad-news-mmo-m2-lost-forever-due-to-critical-server-issue/1511503)

The article suggests that they may be using the "our MMO doesn't work any more" as cover for just shutting it down, but I'm not sure the dev team would want to highlight such incompetence unnecessarily before having to go off and find other jobs.


Title: Re: The MMO That Deleted Itself
Post by: cironian on November 12, 2011, 03:56:29 AM
The only scenario that barely makes sense to me would be something like their database server died and they're so far in the red that instead of buying new hardware, management just pulled the plug.


Title: Re: The MMO That Deleted Itself
Post by: Lantyssa on November 12, 2011, 04:39:04 AM
Or they didn't make a back-up.


Title: Re: The MMO That Deleted Itself
Post by: Sir T on November 12, 2011, 07:34:51 AM
Well, it's a step up from EVE, the MMO that deleted other people's computers. Twice.


Title: Re: The MMO That Deleted Itself
Post by: Kageru on November 12, 2011, 05:16:16 PM

I guess it's only when you manage to corrupt the live data you discover just how reliable your backup process really is... that's still pretty sad either way though.


Title: Re: The MMO That Deleted Itself
Post by: IainC on November 13, 2011, 07:37:54 AM

I guess it's only when you manage to corrupt the live data you discover just how reliable your backup process really is... that's still pretty sad either way though.


Oh man, I have some stories about that...


Title: Re: The MMO That Deleted Itself
Post by: Shatter on November 13, 2011, 08:19:16 AM
We our whole MMO


Title: Re: The MMO That Deleted Itself
Post by: Mrbloodworth on November 13, 2011, 03:15:11 PM
I'd hate to be that guy that said: "No, just do it, it will be fine."


Title: Re: The MMO That Deleted Itself
Post by: rk47 on November 14, 2011, 09:37:35 PM
haha. like my ex-boss who told me i can keep the old 1 tb server hard disk.
'i can take this home and format it?'
'sure! no problem, i'm getting 5 Tb!'

a week later, i still left the drive on my table...at home. When on Saturday noon, he suddenly called 'PLEASE TELL ME U DIDN'T FORMAT THE DRIVE'

'err.....no.'
'I NEED IT. NOW.'

Funny shit. He forgot to get the original data in the old HDD before telling me to take it home. He's lucky I'm being lazy as fuck.


Title: Re: The MMO That Deleted Itself
Post by: Xanthippe on November 15, 2011, 11:52:58 AM
I haven't really kept up with the tech end of things the past few years but having backups on some other site (in the cloud?) seems like a very sound business practice, assuming the costs are not out of whack.  Am I assuming that it's much simpler or cheaper than it really is?


Title: Re: The MMO That Deleted Itself
Post by: Mrbloodworth on November 15, 2011, 11:54:09 AM
500$ or less for a few terabyte external raid system.


Title: Re: The MMO That Deleted Itself
Post by: Lum on November 15, 2011, 12:08:46 PM
MTG Online came close. Fired old dev team, new dev team utterly failed to keep the game running. It limped along for about 5 years, 2003-2008, in maintenance/most features horribly broken mode, until a brand new version finally came out.

http://www.starcitygames.com/magic/misc/6985_The_MODO_FiascoCorporate_Hubris_and_Magic_Online.html


Title: Re: The MMO That Deleted Itself
Post by: Trippy on November 15, 2011, 12:10:51 PM
Wait, there's a version of MtG Online that's actually decent now?


Title: Re: The MMO That Deleted Itself
Post by: Ingmar on November 15, 2011, 12:15:19 PM
It isn't pretty but it is very playable.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 15, 2011, 12:16:50 PM
500$ or less for a few terabyte external raid system.

Speed, man. Speed. I can get a backup system for under 2k that will hold our DB servers. But it would take 72 hours to write the changes that took place in the last 24. Backing up your home computer or a lone mysql server with a low rate of change is one thing. Backing up major systems starts to cost serious cash.

To date, I've only worked one place with a proper tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right.

But christ, if I see one more 7200 RPM RAID 5 backup solution, I'm going to stab someone in the eye. I've got one right now that running an rm on old data takes 48 hours to wipe 200G. That's just depressing.
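
A rough way to sanity-check those numbers, sketched in Python. The function names and the sample inputs are illustrative assumptions; only the 200G-in-48-hours figure comes from the post above.

```python
def hours_to_write(changed_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to write `changed_gb` gigabytes at a sustained MB/s rate."""
    seconds = (changed_gb * 1024) / throughput_mb_s
    return seconds / 3600


def keeps_up(changed_gb_per_day: float, throughput_mb_s: float) -> bool:
    """A backup target keeps up only if one day's changes fit inside one day."""
    return hours_to_write(changed_gb_per_day, throughput_mb_s) <= 24


def effective_mb_s(gigabytes: float, hours: float) -> float:
    """Sustained throughput implied by moving `gigabytes` in `hours`."""
    return gigabytes * 1024 / (hours * 3600)
```

For example, `effective_mb_s(200, 48)` works out to a little over 1 MB/s, and `keeps_up(1000, 4)` is false: a hypothetical 1000 GB of daily change written at 4 MB/s needs roughly three days, which is the "72 hours to write 24 hours of changes" situation.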


Title: Re: The MMO That Deleted Itself
Post by: Ingmar on November 15, 2011, 12:21:35 PM
Yeah anything that has the change rate of an MMO needs to have significant hardware to have useful backups, unless you're willing to put down in an SLA that restores might not include the last few weeks of  character progression, which is pretty unacceptable. (Yes I know there's always the 'we might not be able to get your character back' clause that's always there as a CYA, but that's not part of the "real" SLA or quasi-SLA that companies use internally.)


Title: Re: The MMO That Deleted Itself
Post by: Mrbloodworth on November 15, 2011, 12:25:28 PM
500$ or less for a few terabyte external raid system.

Speed, man. Speed. I can get a backup system for under 2k that will hold our DB servers. But it would take 72 hours to write the changes that took place in the last 24. Backing up your home computer or a lone mysql server with a low rate of change is one thing. Backing up major systems starts to cost serious cash.

To date, I've only worked one place with a proper tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right.

But christ, if I see one more 7200 RPM RAID 5 backup solution, I'm going to stab someone in the eye. I've got one right now that running an rm on old data takes 48 hours to wipe 200G. That's just depressing.

I was talking bare minimum. Of course I am tainted by working on small projects. Seems even if they had this, they would still have something.


Title: Re: The MMO That Deleted Itself
Post by: Ironwood on November 15, 2011, 01:19:39 PM

To date, I've only worked one place with a proper tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right.

Testify.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 15, 2011, 01:28:06 PM
I'm not surprised by shoddy backups, really. It becomes really freaking hard to make people buy the backup expansion with new production hardware. "Hey, we increased front end capacity by 30%!" "So uh, you increased the backup system and DR capacity by 30% as well, RIGHT?" "... what do you mean, we'll make it work!"

Backups, DR, and proper fucking version control are the three things I cannot seem to force people to keep in mind while planning systems. It's frustrating as all hell.


But for a small shop, it's easy to just have a backup system of "another fucking server running rsync", though I've also frequently found even that is secretly "some random dev's desktop running rsync"


Title: Re: The MMO That Deleted Itself
Post by: Ironwood on November 15, 2011, 01:29:03 PM
Which is why Cloud systems that have that shit built in or easily factored are the way to go here.



Title: Re: The MMO That Deleted Itself
Post by: IainC on November 15, 2011, 02:26:09 PM
Yeah anything that has the change rate of an MMO needs to have significant hardware to have useful backups, unless you're willing to put down in an SLA that restores might not include the last few weeks of  character progression, which is pretty unacceptable. (Yes I know there's always the 'we might not be able to get your character back' clause that's always there as a CYA, but that's not part of the "real" SLA or quasi-SLA that companies use internally.)

Not to mention that for most MMOs we are talking about multiple overlapping databases. If the character database and the item database are separate for some reason and they aren't in sync, then you basically break every quest that's in progress on a character. I've worked with a major MMO where a database died and had to deal with the aftermath. It wasn't even remotely pretty.
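
One cheap guard against restoring mismatched databases is to compare snapshot times before a restore. A minimal Python sketch, assuming each backup records when it was taken; the database names and the one-minute tolerance are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Mapping


def snapshot_skew(snapshot_times: Mapping[str, datetime]) -> timedelta:
    """Gap between the oldest and newest snapshot in the set."""
    times = list(snapshot_times.values())
    if len(times) < 2:
        return timedelta(0)
    return max(times) - min(times)


def snapshots_in_sync(snapshot_times: Mapping[str, datetime],
                      max_skew: timedelta = timedelta(minutes=1)) -> bool:
    """True if every database snapshot was taken within `max_skew` of the others."""
    return snapshot_skew(snapshot_times) <= max_skew
```

A timestamp check like this only flags the obvious cases. It says nothing about whether rows in one database still reference rows that exist in the other.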


Title: Re: The MMO That Deleted Itself
Post by: UnSub on November 15, 2011, 04:54:42 PM
This seems a possible time to ask: what does it cost for MMO servers to operate? I've seen the range from "it's really expensive!" to "it's really cheap!" but never a solid discussion of numbers.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 15, 2011, 05:20:00 PM
This seems a possible time to ask: what does it cost for MMO servers to operate? I've seen the range from "it's really expensive!" to "it's really cheap!" but never a solid discussion of numbers.

It's probably not that pricey as SaaS operations go, considering the hardware Blizz was auctioning off as their servers.

It's a decent chunk of bandwidth, a redundant switch that will probably cost half the setup because lolcisco, a pricey DB setup (for scaling later in life), decently pricey SAN and his cousin the pricey backup SAN, a small CDN contract for clients and patching, and generally some surprisingly inexpensive front end machines.

200k starting? 500k for solid? Just going off some random buildouts for other things recently. That's all up front and can be mitigated by using ec2 and the like so you're not directly sucking down power/cooling and hardware costs. But I'd imagine trying to cloud an entire semi popular MMO would get pricey as fuck rather quickly.


Title: Re: The MMO That Deleted Itself
Post by: IainC on November 16, 2011, 12:14:56 AM
GOA spent just shy of a million euro on the datacentre for W:AR. That was supporting ~80 actual game servers which were each composed of multiple machines plus auth and patch servers as well as some overhead for tools and metrics. I don't know the hardware breakdown exactly though.


Title: Re: The MMO That Deleted Itself
Post by: UnSub on November 16, 2011, 04:50:24 PM
Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 16, 2011, 07:53:09 PM
Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.

Bandwidth and actual servers are quite cheap in the grand scheme of things. The majority of your costs are going to be the network gear, the SANs, and whatever runs your backend DBs. Server blades themselves (or random 1/2U machines) are inexpensive comparatively. ~2-3k for a decent machine that won't perform that much worse than a hilariously overspecced 25k version of the same box.

80 game servers seems a bit much for a mid to large sized MMO, but I'm blanking on how many shards WoW runs these days. I know it was heavily implied that each realm lives on what looked vaguely like a single HS40 sized blade. Given the crash habits though I'd be surprised if it didn't run one continent per blade.

In all honesty, EvE's model seems far better for MMOs, in that it in theory just uses a compute cloud of a ton of machines to deal with everything and moves the processes around as needed. In reality I think that's actually a somewhat manual process from the "please to be warning us about big fights so we can reinforce a node" crap. I just don't get why you wouldn't run 10 or so systems in a cluster with a good backend network, and adjust load per what the game world is doing. So you don't wind up with a blade doing jack and shit because nobody is in its zones while another chokes and dies because you have some massive overland raid going down.
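
The "adjust load per what the game world is doing" idea can be sketched as a greedy balancer in Python. This is only an illustration under simple assumptions (one load number per zone, identical nodes, free migration); the zone names and `assign_zones` are hypothetical and not how any shipping MMO does it.

```python
import heapq
from typing import Dict, List


def assign_zones(zone_load: Dict[str, float], node_count: int) -> Dict[int, List[str]]:
    """Greedy balance: place the busiest zones first, each on the least-loaded node."""
    # Min-heap of (current load, node id) so the idlest node is always on top.
    heap = [(0.0, node) for node in range(node_count)]
    heapq.heapify(heap)
    placement: Dict[int, List[str]] = {node: [] for node in range(node_count)}
    busiest_first = sorted(zone_load.items(), key=lambda kv: kv[1], reverse=True)
    for zone, load in busiest_first:
        node_load, node = heapq.heappop(heap)
        placement[node].append(zone)
        heapq.heappush(heap, (node_load + load, node))
    return placement
```

With a massive raid in one zone, the raid zone ends up nearly alone on its node while the quiet zones share the other, instead of a fixed zone-to-blade mapping leaving one machine idle and another choking.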


Title: Re: The MMO That Deleted Itself
Post by: Sir T on November 16, 2011, 08:15:19 PM
One of the problems with EVE's model is that it's slow to react. Basically, one of the things that happens in downtime is that the system analyses traffic in systems and redirects resources based on activity on each system, and it does it on a sliding scale. In that it is quite sophisticated, I'll grant you. The problem of course is that in wartime things can move quite quickly and massive fights can break out over systems that haven't seen activity in months. That was one of the reasons for the whole "reinforcing" mechanic, to allow the players to give notice that they were sieging system X.

The other problem is that when the system is giving power to one system it's draining it from others. I was involved in a low sec war during the BOB-ASCN War and we literally could not enter systems, as somehow our systems were on the same node path as the place that fights were going on, meaning that by accident or by design (it was design) the system was giving all the juice to BoB's playground and leaving us poor plebs in the lurch and unable to fight.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 16, 2011, 09:16:46 PM
See, that's silly. It's dynamic allocation of resources on a 24 hour change schedule. It's not hard to do that on a live basis with a proper setup, even if your app in theory doesn't support it. It's the entire reason people run VMware instead of Xen or KVM. I mean, when set up properly and not just murdering itself with clock ticks because you assigned every VM 16 virtual cores.

If you can do it without abstracting to a virtualized platform all the better, but that requires you to code planning on it from the ground up. VMs are basically about taking some slight abstraction hit in order to do things your app would be.. unhappy to learn were happening.

I would like to see more dynamic resource allocation in the MMO space (and logical use of on demand computing for things like your patching and authentication on release months), just because it strikes me as an environment tailor made to that form of computing.


Title: Re: The MMO That Deleted Itself
Post by: Ingmar on November 17, 2011, 10:09:10 AM
Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.

Bandwidth and actual servers are quite cheap in the grand scheme of things. The majority of your costs are going to be the network gear, the SANs, and whatever runs your backend DBs. Server blades themselves (or random 1/2U machines) are inexpensive comparatively. ~2-3k for a decent machine that won't perform that much worse than a hilariously overspecced 25k version of the same box.

Don't underestimate the cost in power, cooling, and dudes that you incur with a large amount of actual hardware, though.

It would not surprise me in the least to find out new MMOs will be on VMs rather than dedicated hardware going forward; with HA, failures/restores should be almost non-existent.


Title: Re: The MMO That Deleted Itself
Post by: Yegolev on November 21, 2011, 09:53:11 AM
It's funny to read about MMOs maybe eventually doing things and using tech that I work on.  Keep hope alive!


Title: Re: The MMO That Deleted Itself
Post by: sam, an eggplant on November 21, 2011, 12:13:46 PM
Which is why Cloud systems that have that shit built in or easily factored are the way to go here.
Yeah, you couldn't be any more wrong about that. Going to the cloud is not a viable solution for several reasons. First off, stability is an illusion-- Amazon's cloud can't even make four 9s. EC2 and S3 have a 99.95% SLA (3 9s); RDS is under no SLA at all. 99.95% is over 4 hours of unscheduled downtime per year-- that is simply not an enterprise product. Incidentally, they didn't make 99.95% this year-- the Amazon cloud was down for over an entire day. Those users were certainly paid something for their service-level guarantee, but can even a full month of free service make up for an entire day outage? No way.
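
The downtime arithmetic is easy to check. A small Python sketch, assuming a 365-day year; the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(sla_percent: float) -> float:
    """Hours of downtime per year that still satisfy the given SLA."""
    return HOURS_PER_YEAR * (1 - sla_percent / 100)


def achieved_percent(downtime_hours: float) -> float:
    """Availability actually delivered after `downtime_hours` of outage in a year."""
    return 100 * (HOURS_PER_YEAR - downtime_hours) / HOURS_PER_YEAR
```

`allowed_downtime_hours(99.95)` comes to about 4.4 hours, four 9s (`99.99`) allows under an hour, and a single 24-hour outage gives `achieved_percent(24)` of roughly 99.73%.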

You hear the cloud this and the cloud that, but if you've never tried to use "the cloud" you don't understand its limitations. Performance is highly variable, particularly on storage (Amazon EBS/S3). You might get as low as 500KB/sec transfers one minute, then when someone else in the cloud finishes their activity, it spikes back up to 20MB/sec. You can't run an enterprise DB on that shit. A wordpress blog, or web 3.0 startup bullshit, sure, but not a backend for an interactive game where response time really matters.

The cloud (meaning EC2) is decent for appservers and storing off-site backups. It's even decent for secondary disaster recovery. It's nice for startups because they can quickly scale up and down as their business grows, to a limit. But databases? No way in hell.

The problem here is very simply that the bunch of mickey-mouse clown-shoes buffoons running that Japanese MMO didn't manage their backups. Real basic IT stuff. NetBackup runs, and sends a report if it fails. Or if you're going ghetto, you have RMAN/export/mysqldump/xtrabackup/whatever shell scripts in crontab that email upon failure.
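
The "script in crontab that emails upon failure" pattern, sketched in Python rather than shell. This is a minimal illustration, not anyone's production setup: `run_backup` and the `notify` callback are hypothetical names, and the callback stands in for whatever actually sends the mail.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_backup(command: Sequence[str], notify: Callable[[str], None]) -> bool:
    """Run a backup command; call `notify` with the details if it fails."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:  # e.g. the backup binary is missing
        notify(f"backup could not start: {exc}")
        return False
    if result.returncode != 0:
        notify(f"backup failed (exit {result.returncode}): {result.stderr.strip()}")
        return False
    return True


if __name__ == "__main__":
    # Demo only: a child process that exits non-zero stands in for a failed dump.
    run_backup([sys.executable, "-c", "raise SystemExit(1)"], notify=print)
```

The point of the wrapper is that silence means success and any failure is pushed to a human, including the case where the backup tool never starts at all.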

As for MMOs running their backends highly sharded onto separate servers, I've only worked on the backend for one major current MMO, and it wasn't set up that way. They used one Oracle RAC DB per datacenter to hold account data, then had each game server (what players see, named server shards) on its own single Oracle DB instance, with a physical standby. Personally-identifiable and credit card data was on its own VLAN with a separate DB running host-based IDS that held nothing but that info, locked down like crazy. The forums and associated MySQL backend, incidentally Valve, were on their own completely segregated VLAN, totally blocked and untrusted by everybody.

If I were to attempt to architect a new MMO, I would veer away from Oracle and look to run transactional stuff on very highly sharded and redundant MySQL(Percona) backends with solid-state storage, with the bulk of user data in some sort of scalable NoSQL DB like mongo or cassandra on supercheap hardware.


Title: Re: The MMO That Deleted Itself
Post by: KallDrexx on November 21, 2011, 01:20:27 PM
Yeah, you couldn't be any more wrong about that. Going to the cloud is not a viable solution for several reasons. First off, stability is an illusion-- Amazon's cloud can't even make four 9s. EC2 and S3 have a 99.95% SLA (3 9s); RDS is under no SLA at all. 99.95% is over 4 hours of unscheduled downtime per year-- that is simply not an enterprise product. Incidentally, they didn't make 99.95% this year-- the Amazon cloud was down for over an entire day. Those users were certainly paid something for their service-level guarantee, but can even a full month of free service make up for an entire day outage? No way.

... etc ...

First of all, just because high profile clouds did not have 4 9s does not mean that an internal IT team can do any better.  You only know about those outages because they are high profile.  I can guarantee that if you start measuring uptime on internal systems you will see MUCH shittier uptimes.  Hell our big corporation already fails the 3 9s for our internal email and TFS servers, as our DNS died for about 5 hours a few weeks back.  Another time our TFS systems completely went fubar.  We have had our website go down for I'm sure more than 4 hours throughout the whole year.  My friend's company can't meet anywhere near 3 9s because they always have some issue that brings down the VOIP servers they provide their clients.

Also, while you mention Amazon you fail to mention that if you had servers in their cloud across multiple data centers (and if you require more than 3 9s, then why don't you?) then you had zero downtime with the downtime they had.

Personally, Amazon, Microsoft and Google have MUCH more experience managing a high load infrastructure and disaster recovery than I could ever hope to, and every time the rare snafu comes up with them people clamor to say "see this is why the cloud is bad", when in reality no one considers the snafus that happen at their current workplaces. Most of the time those don't get as much notoriety (even within the company) as a "Cloud" service issue.


Title: Re: The MMO That Deleted Itself
Post by: sam, an eggplant on November 21, 2011, 01:45:24 PM
Obviously you need competent engineering to support an enterprise environment, whether that comes from internal or outsourced IT staff. Your big corporation and your friend's employer both either lack those competent engineers or were constrained by cost or a garbage legacy environment.

My company guarantees four 9s to our clients and has experienced zero client-visible outages this year.  There's no secret to this-- we use redundant systems and applications, plan all maintenance, and perhaps most importantly, we make all changes during scheduled downtime windows at 3AM, not 4PM Friday afternoon. We aren't cowboys.

FYI, the April 2011 Amazon outage started in the US east availability zone, but it spread to every other avail zone in north america in a "remirroring storm". Everybody tried to failover to other avail zones at the same time, consuming all the bandwidth between them. The entire north america cloud was down for over a full day. Geolocating wouldn't have helped, unless you geolocated in Singapore.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 21, 2011, 06:48:52 PM
"The Cloud" is a terrible idea for primary operations, imo. Because most of it is trusted and unverified. Sure, I think there are backups, but I thought that at my last managed datacenter and lo and behold when we ran to tape.. they'd missed backups constantly, and only had a full from three months back. They'd just been.. less than up front about their issues.

That and a performance guaranteed cloud is basically just managed datacenter space with VMware behind it. If you're going to be building out more than thirty or forty machines, you can afford your own IT staff for them. Outsource networking if you absolutely have to until 80 or so machines. At that point you should be making network changes more than monthly and it would be best to have the expertise in house.

Anywho, not to shit all over the cloud. It's Awesome for what it's designed for: on demand burst computing. The cloud is the solution to running at 60% capacity and suddenly getting double or triple your expected load. Spin off new images and compensate until either the burst goes away or your new hardware comes in (depending if it's sustained growth or just a sudden burst of temporary interest)

Local "clouds" however, as much as you can misuse the term to mean "a fucking mess of processors in a virtualized environment with proper resource pooling" are fucking awesome for a number of application usage patterns to keep from having idle hardware sitting around wasting space, cooling and power.


Title: Re: The MMO That Deleted Itself
Post by: Ironwood on November 22, 2011, 04:03:35 AM
Sigh.

Whatever.


I particularly like the 'I think Cloud is good for backups' followed by a torrent of abuse, summed up by 'Cloud is only good for backups.'  Almost as good as the 'No Way, Cloud Sucks, but I work for a company that does this and I'd quite like Money Please.'


But you keep on shining, you wondrous diamonds.



Title: Re: The MMO That Deleted Itself
Post by: Kageru on November 22, 2011, 05:14:13 AM

I remember cloud computing. It was when you'd have a socket in the wall through which you'd get access to all the computer resources you needed.

... though writing Multics turned out to be harder than they thought. And it ended up being obsoleted by computing power becoming cheaper faster than they expected.

All that is old can be sold again.


Title: Re: The MMO That Deleted Itself
Post by: sam, an eggplant on November 22, 2011, 07:04:26 AM
Excellent attempt at deflection, but you said they should host in the cloud because it has "all that stuff built in". Using it as an offsite backup target would not have helped, since they didn't take local backups in the first place.


Title: Re: The MMO That Deleted Itself
Post by: Yegolev on November 22, 2011, 07:28:58 AM
My company guarantees four 9s to our clients and has experienced zero client-visible outages this year.

This is a very important statement.

For large stuff, the "cloud" looks a whole lot like what Evil Corporate IT already does, only it's owned by someone else.  Which I think is the point, that if your IT team/budget isn't up to it, you can let someone else worry about these things for a nominal fee.  These people will also probably scrub the outage reports before handing them over, but if the clients don't notice then who gives a fuck? :oh_i_see:

As for backups, they mean nothing.  Restores, on the other hand, are critical.
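
A tiny illustration of treating the restore as the thing you test, in Python. It assumes a checksum was recorded when the backup was taken; the file copy is only a stand-in for a real restore procedure, and both function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large dumps don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_is_valid(backup: Path, expected_sha256: str) -> bool:
    """Restore into a scratch directory and compare against the recorded checksum."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / backup.name
        shutil.copy2(backup, restored)  # stand-in for the real restore step
        return sha256_of(restored) == expected_sha256
```

A matching checksum only shows the bytes came back intact. A real restore drill would also load the data into a database and confirm the application can actually use it.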


Title: Re: The MMO That Deleted Itself
Post by: sam, an eggplant on November 22, 2011, 07:34:27 AM
These people will also probably scrub the outage reports before handing them over, but if the clients don't notice then who gives a fuck? :oh_i_see:
I work at an MSP, so our clients notice outages very quickly. If traders can't make their trades, users stop buying shirts from an ecommerce website, or players can't log in to an MMO, you hear about it right away. There's no way to cover that up. Additionally, we're contractually obligated to disclose outages immediately; if we covered up we'd be subject to stiff penalties.

That said, I was talking about internal outages, in our intranet and support infrastructure. Our clients have outages all the time; our job is to make sure they weren't our fault.


Title: Re: The MMO That Deleted Itself
Post by: Yegolev on November 22, 2011, 07:50:58 AM
I agree completely, and I feel fortunate I don't have to support ecommerce sites or more vigilant end-users than I already do.  We are probably talking about the same thing.  Not "my japanese panty e-store is down" but "we had a storage initiator go toes-up, have to engage hardware vendor to replace, get the change requests in place for the next maintenance weekend".  This isn't something I would expect a client to see or to appear in an executive report.  It would show up at the operational level, and some subsection of managers would eyeball this, but at the end of the day it's BAU and these managers aren't going to report all this to the uppers.

It takes more than one domino falling to bring down the panty e-store in a proper IT installation.

About the cloud, I read you typing that the performance would be terrible for some applications, and I agree based on the metrics we get from "traditional" tech across regions.  It requires some work to manage performance on storage arrays that you own, never mind those you cannot touch.  Also partition resource sharing, etc, etc.  Some apps are fine, but some require special handling.  Although I suppose you could pay Amazon enough money to give you a top-tier setup.


Title: Re: The MMO That Deleted Itself
Post by: Ironwood on November 22, 2011, 08:08:22 AM
Excellent attempt at deflection, but you said they should host in the cloud because it has "all that stuff built in". Using it as an offsite backup target would not have helped, since they didn't take local backups in the first place.

Look at the post above mine.  The one I was replying to before you waded in with your nonsense.

There you go.


Title: Re: The MMO That Deleted Itself
Post by: kildorn on November 22, 2011, 09:35:38 AM
Excellent attempt at deflection, but you said they should host in the cloud because it has "all that stuff built in". Using it as an offsite backup target would not have helped, since they didn't take local backups in the first place.

Look at the post above mine.  The one I was replying to before you waded in with your nonsense.

There you go.


I have no idea what your post has to do with mine, either. I didn't mention backups at all. I don't give half a fuck where your backup target is, as long as it can handle the rate of change in a reasonable amount of time and restore properly somewhere.

We wandered off chatting about cloud computing, or at least one random definition of that lovely buzzword of the decade. I have my opinions on what it is and is not good for. I highly disagree with anyone trying to run all their production off EC2 and the like. It's just not what they are built for at all. And I'm definitely not looking for a job, so *boggle*.


Title: Re: The MMO That Deleted Itself
Post by: sam, an eggplant on November 22, 2011, 09:39:22 AM
Yeah, I think he got lost somewhere.

Anyway, how about that schadenfreude, eh? Deleted their whole MMO! Delicious!