| 
	
		| 
				
					| Pages: [1] 2   |  |  |  
	
		|  Author | Topic: The MMO That Deleted Itself  (Read 12635 times) |  
	| 
			| 
					
						| UnSub 
								Contributor 
								Posts: 8064
								
								   | 
 |  
						| 
 |  |  |  | 
			| 
					
						| cironian 
								Terracotta Army 
								Posts: 605
								
								play his game!: solarwar.net | 
 The only scenario that barely makes sense to me would be something like their database server died and they're so far in the red that instead of buying new hardware, management just pulled the plug. |  
						|  |  |  |  | 
			| 
					
						| Lantyssa 
								Terracotta Army 
								Posts: 20848
								
								 | 
 Or they didn't make a back-up. |  
						| 
 Hahahaha!  I'm really good at this! |  |  |  | 
			| 
					
						| Sir T 
								Terracotta Army 
								Posts: 14223
								
								 | 
 Well, it's a step up from EVE, the MMO that deleted other people's computers. Twice. |  
						| 
 Hic sunt dracones. |  |  |  | 
			| 
					
						| Kageru 
								Terracotta Army 
								Posts: 4549
								
								 | 
 I guess it's only when you manage to corrupt the live data you discover just how reliable your backup process really is... that's still pretty sad either way though.
 
 |  
						| 
 Is a man not entitled to the hurf of his durf?- Simond
 |  |  |  | 
			| 
					
						| IainC 
								Developers 
								Posts: 6538
								
								Wargaming.net   | 
 I guess it's only when you manage to corrupt the live data you discover just how reliable your backup process really is... that's still pretty sad either way though.
 
 
 Oh man, I have some stories about that... |  
						| 
 |  |  |  | 
			| 
					
						| Shatter 
								Terracotta Army 
								Posts: 1407
								
								 | 
 We our whole MMO |  
						|  |  |  |  | 
			| 
					
						| Mrbloodworth 
								Terracotta Army 
								Posts: 15148
								
								 | 
 I'd hate to be the guy who said: "No, just do it, it will be fine." |  
						| 
 |  |  |  | 
			| 
					
						| rk47 
								Terracotta Army 
								Posts: 6236
								
								The Patron Saint of Radicalthons | 
 haha. like my ex-boss who told me i could keep the old 1 TB server hard disk. 'i can take this home and format it?'
 'sure! no problem, i'm getting 5 TB!'
 
 a week later, the drive was still sitting on my table...at home. Then on Saturday at noon, he suddenly called: 'PLEASE TELL ME U DIDN'T FORMAT THE DRIVE'
 
 'err.....no.'
 'I NEED IT. NOW.'
 
 Funny shit. He forgot to copy the original data off the old HDD before telling me to take it home. He's lucky I'm lazy as fuck.
 |  
						| 
 Colonel Sanders is back in my wallet |  |  |  | 
			| 
					
						| Xanthippe 
								Terracotta Army 
								Posts: 4779
								
								 | 
 I haven't really kept up with the tech end of things the past few years, but having backups on some other site (in the cloud?) seems like a very sound business practice, assuming the costs are not out of whack.  Am I assuming that it's much simpler or cheaper than it really is? |  
						|  |  |  |  | 
			| 
					
						| Mrbloodworth 
								Terracotta Army 
								Posts: 15148
								
								 | 
 $500 or less for a few-terabyte external RAID system. |  
						| 
 |  |  |  | 
			| 
					
						| Lum 
								Developers 
								Posts: 1608
								
								Hellfire Games | 
 |  
						|  |  |  |  | 
			| 
					
						| Trippy 
								Administrator 
								Posts: 23657
								
								 | 
 Wait, there's a version of MtG Online that's actually decent now?
 |  
						|  |  |  |  | 
			| 
					
						| Ingmar 
								Terracotta Army 
								Posts: 19280
								
								Auto Assault Affectionado | 
 It isn't pretty but it is very playable. |  
						| 
 The Transcendent One: AH... THE ROGUE CONSTRUCT.
 Nordom: Sense of closure: imminent.
 |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 $500 or less for a few-terabyte external RAID system.
 
 Speed, man. Speed. I can get a backup system for under 2k that will hold our DB servers. But it would take 72 hours to write the changes that took place in the last 24. Backing up your home computer or a lone MySQL server with a low rate of change is one thing. Backing up major systems starts to cost serious cash. To date, I've only worked one place with a properly tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right. But christ, if I see one more 7200 RPM RAID 5 backup solution, I'm going to stab someone in the eye. I've got one right now where running an rm on old data takes 48 hours to wipe 200G. That's just depressing. |  
						|  |  |  |  | 
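The "72 hours to write 24 hours of changes" problem is plain throughput arithmetic. A rough sketch, assuming made-up figures (2 TB of changes per day, 8 MB/s sustained to the backup box); none of these numbers come from the thread, and the function names are hypothetical:

```python
def backup_hours(changed_gb: float, write_mb_per_s: float) -> float:
    """Hours needed to write `changed_gb` of changes at a sustained write rate."""
    seconds = (changed_gb * 1024) / write_mb_per_s
    return seconds / 3600


def min_write_rate(changed_gb: float, window_hours: float) -> float:
    """Sustained MB/s needed to write `changed_gb` within `window_hours`."""
    return (changed_gb * 1024) / (window_hours * 3600)


# Hypothetical figures, chosen only to illustrate the shape of the problem.
daily_change_gb = 2048   # assumed: 2 TB of changed data per day
slow_rate = 8.0          # assumed: sustained MB/s the cheap backup box manages

hours = backup_hours(daily_change_gb, slow_rate)
print(f"{hours:.1f} h to write one day of changes")
print("keeps up" if hours <= 24 else "falls further behind every day")
print(f"need at least {min_write_rate(daily_change_gb, 24):.1f} MB/s to break even")
```

If the write time for one day of changes is longer than a day, the backlog grows without bound, so the cheap box never actually holds a current copy.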
			| 
					
						| Ingmar 
								Terracotta Army 
								Posts: 19280
								
								Auto Assault Affectionado | 
 Yeah, anything with the change rate of an MMO needs significant hardware to have useful backups, unless you're willing to put down in an SLA that restores might not include the last few weeks of character progression, which is pretty unacceptable. (Yes, I know there's always the 'we might not be able to get your character back' clause as a CYA, but that's not part of the "real" SLA or quasi-SLA that companies use internally.) |  
						| 
 The Transcendent One: AH... THE ROGUE CONSTRUCT.
 Nordom: Sense of closure: imminent.
 |  |  |  | 
			| 
					
						| Mrbloodworth 
								Terracotta Army 
								Posts: 15148
								
								 | 
 $500 or less for a few-terabyte external RAID system.
 
 Speed, man. Speed. I can get a backup system for under 2k that will hold our DB servers. But it would take 72 hours to write the changes that took place in the last 24. Backing up your home computer or a lone MySQL server with a low rate of change is one thing. Backing up major systems starts to cost serious cash. To date, I've only worked one place with a properly tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right. But christ, if I see one more 7200 RPM RAID 5 backup solution, I'm going to stab someone in the eye. I've got one right now where running an rm on old data takes 48 hours to wipe 200G. That's just depressing.
 
 I was talking bare minimum. Of course, I am tainted by working on small projects. It seems that if they'd had even this, they would still have something. |  
						| 
 |  |  |  | 
			| 
					
						| Ironwood 
								Terracotta Army 
								Posts: 28240
								
								 | 
 To date, I've only worked one place with a proper tested and frequently refreshed backup system. Everywhere else has been trying to do backups on the cheap, which never fucking works right or scales right.
 
 Testify. |  
						| 
 "Mr Soft Owl has Seen Some Shit." - Sun Tzu |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 I'm not surprised by shoddy backups, really. It becomes really freaking hard to make people buy the backup expansion with new production hardware. "Hey, we increased front end capacity by 30%!" "So uh, you increased the backup system and DR capacity by 30% as well, RIGHT?" "... what do you mean, we'll make it work!"
 Backups, DR, and proper fucking version control are the three things I cannot seem to force people to keep in mind while planning systems. It's frustrating as all hell.
 
 
 But for a small shop, it's easy to just have a backup system of "another fucking server running rsync", though I've also frequently found even that is secretly "some random dev's desktop running rsync"
 |  
						|  |  |  |  | 
			| 
					
						| Ironwood 
								Terracotta Army 
								Posts: 28240
								
								 | 
 Which is why Cloud systems that have that shit built in or easily factored are the way to go here.
 
 |  
						| 
 "Mr Soft Owl has Seen Some Shit." - Sun Tzu |  |  |  | 
			| 
					
						| IainC 
								Developers 
								Posts: 6538
								
								Wargaming.net   | 
 Yeah, anything with the change rate of an MMO needs significant hardware to have useful backups, unless you're willing to put down in an SLA that restores might not include the last few weeks of character progression, which is pretty unacceptable. (Yes, I know there's always the 'we might not be able to get your character back' clause as a CYA, but that's not part of the "real" SLA or quasi-SLA that companies use internally.)
 
 Not to mention that for most MMOs we are talking about multiple overlapping databases. If the character database and the item database are separate for some reason and they fall out of sync, you basically break every quest that's in progress on a character. I've worked with a major MMO where a database died and had to deal with the aftermath. It wasn't even remotely pretty. |  
						| 
 |  |  |  | 
			| 
					
						| UnSub 
								Contributor 
								Posts: 8064
								
								   | 
 This seems like a good time to ask: what does it cost for MMO servers to operate? I've seen the range from "it's really expensive!" to "it's really cheap!" but never a solid discussion of numbers.  |  
						| 
 |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 This seems like a good time to ask: what does it cost for MMO servers to operate? I've seen the range from "it's really expensive!" to "it's really cheap!" but never a solid discussion of numbers.
 
 It's probably not that pricey as SaaS operations go, considering the hardware Blizz was auctioning off as their servers. It's a decent chunk of bandwidth, a redundant switch that will probably cost half the setup because lolcisco, a pricey DB setup (for scaling later in life), a decently pricey SAN and its cousin the pricey backup SAN, a small CDN contract for clients and patching, and generally some surprisingly inexpensive front end machines. 200k starting? 500k for solid? Just going off some random buildouts for other things recently. That's all up front and can be mitigated by using EC2 and the like so you're not directly sucking down power/cooling and hardware costs. But I'd imagine trying to cloud an entire semi-popular MMO would get pricey as fuck rather quickly. |  
						|  |  |  |  | 
			| 
					
						| IainC 
								Developers 
								Posts: 6538
								
								Wargaming.net   | 
 GOA spent just shy of a million euro on the datacentre for W:AR. That was supporting ~80 actual game servers which were each composed of multiple machines plus auth and patch servers as well as some overhead for tools and metrics. I don't know the hardware breakdown exactly though. |  
						| 
 |  |  |  | 
			| 
					
						| UnSub 
								Contributor 
								Posts: 8064
								
								   | 
 Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.  |  
						| 
 |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.
 
 Bandwidth and actual servers are quite cheap in the grand scheme of things. The majority of your costs are going to be the network gear, the SANs, and whatever runs your backend DBs. Server blades themselves (or random 1/2U machines) are inexpensive comparatively. ~2-3k for a decent machine that won't perform that much worse than a hilariously overspecced 25k version of the same box.
 
 80 game servers seems a bit much for a mid to large sized MMO, but I'm blanking on how many shards WoW runs these days. I know it was heavily implied that each realm lives on what looked vaguely like one HS40-sized blade. Given the crash habits, though, I'd be surprised if it didn't run one continent per blade.
 
 In all honesty, EVE's model seems far better for MMOs, in that it in theory just uses a compute cloud of a ton of machines to deal with everything and moves the processes around as needed. In reality I think that's actually a somewhat manual process, judging from the "please to be warning us about big fights so we can reinforce a node" crap. I just don't get why you wouldn't run 10 or so systems in a cluster with a good backend network, and adjust load per what the game world is doing. That way you don't wind up with one blade doing jack and shit because nobody is in its zones while another chokes and dies because you have some massive overland raid going down. |  
						|  |  |  |  | 
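The "adjust load per what the game world is doing" idea can be sketched as a simple greedy balancer: sort zones by current population and keep handing the busiest remaining zone to the least-loaded node. This is only an illustration under assumed inputs; the zone names, player counts, and `assign_zones` helper are all hypothetical and say nothing about how any real MMO (EVE included) schedules its nodes.

```python
import heapq


def assign_zones(zone_load: dict[str, int], node_count: int) -> list[list[str]]:
    """Greedy balancing: place the busiest zones first, each on the least-loaded node."""
    # Heap entries are (current load, node index), so the lightest node pops first.
    heap = [(0, i) for i in range(node_count)]
    heapq.heapify(heap)
    nodes: list[list[str]] = [[] for _ in range(node_count)]
    for zone, load in sorted(zone_load.items(), key=lambda kv: kv[1], reverse=True):
        node_load, idx = heapq.heappop(heap)
        nodes[idx].append(zone)
        heapq.heappush(heap, (node_load + load, idx))
    return nodes


# Hypothetical snapshot of players per zone during an overland raid.
players = {"raid_zone": 900, "capital": 400, "starter": 250, "desert": 150, "swamp": 100}
for i, zones in enumerate(assign_zones(players, 3)):
    print(i, zones, sum(players[z] for z in zones))
```

Re-running the assignment on a fresh snapshot every few minutes, rather than once per daily downtime, is the "live basis" version of the same idea; actually migrating a running zone between machines is the hard part and is not shown here.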
			| 
					
						| Sir T 
								Terracotta Army 
								Posts: 14223
								
								 | 
 One of the problems with EVE's model is that it's slow to react. Basically, one of the things that happens in downtime is that the system analyses traffic in systems and redirects resources based on activity in each system, and it does it on a sliding scale. In that it is quite sophisticated, I'll grant you. The problem, of course, is that in wartime things can move quite quickly and massive fights can break out over systems that haven't seen activity in months. That was one of the reasons for the whole "reinforcing" mechanic: to allow the players to give notice that they were sieging system X.
 The other problem is that when the system is giving power to one system, it's draining it from others. I was involved in a low sec war during the BOB-ASCN War and we literally could not enter systems, as somehow our systems were on the same node path as the place where the fights were going on, meaning that by accident or by design (it was design) the system was giving all the juice to BoB's playground and leaving us poor plebs in the lurch and unable to fight.
 |  
						| 
 Hic sunt dracones. |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 See, that's silly. It's dynamic allocation of resources on a 24 hour change schedule. It's not hard to do that on a live basis with a proper setup, even if your app in theory doesn't support it. It's the entire reason people run VMware instead of Xen or KVM. I mean, when set up properly and not just murdering itself with clock ticks because you assigned every VM 16 virtual cores.
 If you can do it without abstracting to a virtualized platform, all the better, but that requires you to plan for it in the code from the ground up. VMs are basically about taking some slight abstraction hit in order to do things your app would be.. unhappy to learn were happening.
 
 I would like to see more dynamic resource allocation in the MMO space (and logical use of on-demand computing for things like your patching and authentication in release months), just because it strikes me as an environment tailor-made for that form of computing.
 |  
						|  |  |  |  | 
			| 
					
						| Ingmar 
								Terracotta Army 
								Posts: 19280
								
								Auto Assault Affectionado | 
 Thanks for that info. I've often seen the "bandwidth is cheap, servers are cheap" meme thrown around and thought that it was probably cheap on a per subscriber basis, but that doesn't mean it is cheap overall.
 Bandwidth and actual servers are quite cheap in the grand scheme of things. The majority of your costs are going to be the network gear, the SANs, and whatever runs your backend DBs. Server blades themselves (or random 1/2U machines) are inexpensive comparatively. ~2-3k for a decent machine that won't perform that much worse than a hilariously overspecced 25k version of the same box.
 
 Don't underestimate the cost in power, cooling, and dudes that you incur with a large amount of actual hardware, though. It would not surprise me in the least to find out new MMOs will be on VMs rather than dedicated hardware going forward. With HA, failures/restores should be almost non-existent. |  
						| 
 The Transcendent One: AH... THE ROGUE CONSTRUCT.
 Nordom: Sense of closure: imminent.
 |  |  |  | 
			| 
					
						| Yegolev 
								Moderator 
								Posts: 24440
								
								2/10 WOULD NOT INGEST   | 
 It's funny to read about MMOs maybe eventually doing things and using tech that I work on.  Keep hope alive! |  
						| 
 Why am I homeless?  Why do all you motherfuckers need homes is the real question.
 They called it The Prayer, its answer was law
 Mommy come back 'cause the water's all gone
 |  |  |  | 
			| 
					
						| sam, an eggplant 
								Terracotta Army 
								Posts: 1518
 
 
 
 | 
 Which is why Cloud systems that have that shit built in or easily factored are the way to go here.
 
 Yeah, you couldn't be any more wrong about that. Going to the cloud is not a viable solution for several reasons.
 
 First off, stability is an illusion-- Amazon's cloud can't even make four 9s. EC2 and S3 have a 99.95% SLA (3 9s); RDS is under no SLA at all. 99.95% is over 4 hours of unscheduled downtime per year-- that is simply not an enterprise product. Incidentally, they didn't make 99.95% this year-- the Amazon cloud was down for over an entire day. Those users were certainly paid something for their service-level guarantee, but can even a full month of free service make up for an entire day's outage? No way.
 
 You hear the cloud this and the cloud that, but if you've never tried to use "the cloud" you don't understand its limitations. Performance is highly variable, particularly on storage (Amazon EBS/S3). You might get as low as 500KB/sec transfers one minute, then when someone else in the cloud finishes their activity, it spikes back up to 20MB/sec. You can't run an enterprise DB on that shit. A WordPress blog, or web 3.0 startup bullshit, sure, but not a backend for an interactive game where response time really matters.
 
 The cloud (meaning EC2) is decent for appservers and storing off-site backups. It's even decent for secondary disaster recovery. It's nice for startups because they can quickly scale up and down as their business grows, to a limit. But databases? No way in hell.
 
 The problem here is very simply that the bunch of mickey-mouse clown-shoes buffoons running that Japanese MMO didn't manage their backups. Real basic IT stuff. NetBackup runs, and sends a report if it fails. Or if you're going ghetto, you have RMAN/export/mysqldump/xtrabackup/whatever shell scripts in crontab that email upon failure.
 
 As for MMOs running their backends highly sharded onto separate servers, I've only worked on the backend for one major current MMO, and it wasn't set up that way. They used one Oracle RAC DB per datacenter to hold account data, then had each game server (what players see, named server shards) on its own single Oracle DB instance, with a physical standby. Personally-identifiable and credit card data was on its own VLAN with a separate DB running host-based IDS that held nothing but that info, locked down like crazy. The forums and associated MySQL backend, incidentally Valve, were on their own completely segregated VLAN, totally blocked and untrusted by everybody.
 
 If I were to attempt to architect a new MMO, I would veer away from Oracle and look to run transactional stuff on very highly sharded and redundant MySQL (Percona) backends with solid-state storage, with the bulk of user data in some sort of scalable NoSQL DB like MongoDB or Cassandra on supercheap hardware. |  
						| 
								|  |  
								| « Last Edit: November 21, 2011, 12:23:16 PM by sam, an eggplant » |  | 
 |  |  |  | 
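The "scripts in crontab that email upon failure" approach might look roughly like the wrapper below. It is a sketch under stated assumptions, not anyone's production setup: the `backup@localhost` sender, the SMTP relay on `localhost`, and the helper names are all hypothetical, and the backup command itself is whatever you already run.

```python
import smtplib
import subprocess
from email.message import EmailMessage


def run_backup(cmd: list[str]) -> tuple[bool, str]:
    """Run a backup command; return (succeeded, combined stdout and stderr)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:  # e.g. the backup binary is missing or not executable
        return False, str(exc)
    return proc.returncode == 0, proc.stdout + proc.stderr


def failure_report(cmd: list[str], output: str, to_addr: str) -> EmailMessage:
    """Build the alert mail; the sender address is a placeholder."""
    msg = EmailMessage()
    msg["Subject"] = "BACKUP FAILED: " + " ".join(cmd)
    msg["From"] = "backup@localhost"  # hypothetical sender
    msg["To"] = to_addr
    msg.set_content(output or "(no output)")
    return msg


def main(cmd: list[str], to_addr: str) -> bool:
    """Run the backup and mail a report only when it fails."""
    ok, output = run_backup(cmd)
    if not ok:
        # Assumes a mail relay is listening on localhost:25.
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(failure_report(cmd, output, to_addr))
    return ok
```

From cron you would call something like `main(["mysqldump", "--all-databases"], "ops@example.com")` on a schedule. A non-zero exit code only proves the dump failed loudly; it does not prove a "successful" dump can be restored, which is why a periodic test restore still matters.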
			| 
					
						| KallDrexx 
								Terracotta Army 
								Posts: 3510
								
								 | 
 Yeah, you couldn't be any more wrong about that. Going to the cloud is not a viable solution for several reasons. First off, stability is an illusion-- Amazon's cloud can't even make four 9s. EC2 and S3 have a 99.95% SLA (3 9s); RDS is under no SLA at all. 99.95% is over 4 hours of unscheduled downtime per year-- that is simply not an enterprise product. Incidentally, they didn't make 99.95% this year-- the Amazon cloud was down for over an entire day. Those users were certainly paid something for their service-level guarantee, but can even a full month of free service make up for an entire day's outage? No way.
 ... etc ...
 
 First of all, just because high profile clouds did not have 4 9s does not mean that an internal IT team can do any better.  You only know about those outages because they are high profile.  I can guarantee that if you start measuring uptime on internal systems you will see MUCH shittier uptimes.  Hell, our big corporation already fails the 3 9s for our internal email and TFS servers, as our DNS died for about 5 hours a few weeks back.  Another time our TFS systems completely went fubar.  Our website has gone down for, I'm sure, more than 4 hours over the whole year.  My friend's company can't meet anywhere near 3 9s because they always have some issue that brings down the VOIP servers they provide their clients.
 
 Also, while you mention Amazon, you fail to mention that if you had servers in their cloud across multiple data centers (and if you require more than 3 9s, then why don't you?) then you had zero downtime during the downtime they had.
 
 Personally, I think Amazon, Microsoft and Google have MUCH more experience managing a high load infrastructure and disaster recovery than I could ever hope to, and every time the rare snafu comes up with them people clamor to say "see, this is why the cloud is bad", when in reality no one considers the snafus that happen at their current workplaces. Most of the time those don't get as much notoriety (even within the company) as a "Cloud" service issue. |  
						|  |  |  |  | 
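The "how many 9s" figures being thrown around reduce to one multiplication. A quick sketch of the yearly downtime budget for a given availability percentage (a 365-day year is assumed, and the helper name is made up for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Yearly downtime budget implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.9, 99.95, 99.99):
    print(f"{pct}% -> {allowed_downtime_hours(pct):.2f} h/year")
```

By this arithmetic 99.95% allows a little under four and a half hours a year, which matches the "over 4 hours" figure quoted above, so a single five-hour DNS outage blows that budget on its own, while four 9s leaves less than an hour for the whole year.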
			| 
					
						| sam, an eggplant 
								Terracotta Army 
								Posts: 1518
 
 
 
 | 
 Obviously you need competent engineering to support an enterprise environment, whether that comes from internal or outsourced IT staff. Your big corporation and your friend's employer both either lack those competent engineers or were constrained by cost or a garbage legacy environment.
 My company guarantees four 9s to our clients and has experienced zero client-visible outages this year.  There's no secret to this-- we use redundant systems and applications, plan all maintenance, and perhaps most importantly, we make all changes during scheduled downtime windows at 3AM, not 4PM Friday afternoon. We aren't cowboys.
 
 FYI, the April 2011 Amazon outage started in the US East availability zone, but it spread to every other avail zone in North America in a "remirroring storm". Everybody tried to fail over to other avail zones at the same time, consuming all the bandwidth between them. The entire North America cloud was down for over a full day. Geolocating wouldn't have helped, unless you geolocated in Singapore.
 |  
						| 
								|  |  
								| « Last Edit: November 21, 2011, 01:49:58 PM by sam, an eggplant » |  | 
 |  |  |  | 
			| 
					
						| kildorn 
								Terracotta Army 
								Posts: 5014
 
 
 
 | 
 "The Cloud" is a terrible idea for primary operations, imo, because most of it is taken on trust and never verified. Sure, I think there are backups, but I thought that at my last managed datacenter and lo and behold, when we ran to tape.. they'd missed backups constantly, and only had a full from three months back. They'd just been.. less than up front about their issues.
 That, and a performance-guaranteed cloud is basically just managed datacenter space with VMware behind it. If you're going to be building out more than thirty or forty machines, you can afford your own IT staff for them. Outsource networking if you absolutely have to until 80 or so machines. At that point you should be making network changes more than monthly and it would be best to have the expertise in house.
 
 Anywho, not to shit all over the cloud. It's Awesome for what it's designed for: on demand burst computing. The cloud is the solution to running at 60% capacity and suddenly getting double or triple your expected load. Spin off new images and compensate until either the burst goes away or your new hardware comes in (depending if it's sustained growth or just a sudden burst of temporary interest)
 
 Local "clouds" however, as much as you can misuse the term to mean "a fucking mess of processors in a virtualized environment with proper resource pooling" are fucking awesome for a number of application usage patterns to keep from having idle hardware sitting around wasting space, cooling and power.
 |  
						|  |  |  |  | 
			| 
					
						| Ironwood 
								Terracotta Army 
								Posts: 28240
								
								 | 
 Sigh.
 Whatever.
 
 
 I particularly like the 'I think Cloud is good for backups' followed by a torrent of abuse, summed up by 'Cloud is only good for backups.'  Almost as good as the 'No Way, Cloud Sucks, but I work for a company that does this and I'd quite like Money Please.'
 
 
 But you keep on shining, you wondrous diamonds.
 
 
 |  
						| 
 "Mr Soft Owl has Seen Some Shit." - Sun Tzu |  |  |  |  |  
	