FRED tek
The tech that runs FRED, the USA's fencing tournament resource.

FRED's new payment processor (2013-11-11)
As some of you know, FRED has recently switched to a new credit card processor. The new service, called Stripe.com, is much more modern and web-app friendly. It has two main advantages over the previous service:
- Better customer security: Stripe uses some JavaScript and cryptographic magic to authorize credit cards without the sensitive data ever being sent to FRED. This means that …

Upcoming event rating and size searches are back (2011-12-14)

Hello tournament seekers-
Over a year ago, I had to disable the "event size" and "expected event rating" search criteria in FRED's upcoming event list. At long last, they are back. These are among the more valuable features of FRED's event search, so I'm very happy to have them back, and I'm sure you will be too.
You'd be surprised how much work it was. Admittedly, most of it was "under the …

Turns out it was a Chinese bot (2011-08-07)

As it turns out, FRED's recent downtime was caused by an ill-behaved crawler run by a Chinese search engine. When this issue first arose, one of the first things I did was to look for excessive numbers of requests coming from single IPs, and this bot had been among the top 3 or 4 clients over the better part of that day. But because it made fewer requests than other crawlers such as Google and Bing, …

Tweaked Apache config (2011-08-04)

OK, so I opted for the "tweak Apache" option. I lowered the MaxClients setting and set MaxRequestsPerChild to 1000. Neither of those will directly stop the number of processes from escalating, but they might cause the server to behave differently when the number of processes gets too high.
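For the curious, a change along these lines in an Apache 2.2 prefork config would look roughly like the sketch below. MaxRequestsPerChild 1000 is the value from this post; the other numbers (including the lowered MaxClients) are assumed for illustration, since the post doesn't say what they were set to.

```apache
# Prefork MPM tuning sketch (assumed values except MaxRequestsPerChild)
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients           50     # cap concurrent children so they fit in RAM
    MaxRequestsPerChild 1000    # recycle each child after 1000 requests,
                                # limiting per-process memory growth
</IfModule>
```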
We'll see.

FRED is flapping (2011-08-03)

FRED's webserver has been down for a number of short (~3 minute) periods all day today. It began with a few incidents on Friday and a few over the weekend, escalating to 24 such incidents today (so far). Apache simply spawns gradually more and more child processes until it exhausts memory on the server, at which point it fails to respond to a probe from FRED's auto-restart monitor. At that …

Well *that* was painful... (2011-04-21)

FRED is back from today's EC2 outage. Amazon has three of the four availability zones in their us-east-1 (Virginia) region operating. Unfortunately, FRED's database server was running in the one zone that is still sick. However, I was able to snapshot the database's EBS volume and create a new volume from that snapshot in one of the other zones, then fire up a new DB server instance in …

EC2 outage (2011-04-21)

Hello faithful FRED users- Today Amazon EC2's us-east-1 region is experiencing a serious, sustained outage in EBS connectivity and creation. FRED lives in us-east-1a, so his database server (whose files live on an EBS volume) became inaccessible at about 1am PDT.
All I can do is wait for EC2 to fix the issue. Very curious or geeky folks can follow their progress here: http://…

Another try at handling the accented characters in fencers' names (2011-03-26)

FRED is getting used more and more in Canada these days, which is very cool. However, it has brought a long-standing problem with FRED closer to the surface: multibyte characters.
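To illustrate why multibyte text trips up byte-oriented string handling, here's a quick sketch (in Python, purely for illustration; the same pitfall exists in PHP's byte-counting string functions):

```python
# "José" is 4 characters, but 5 bytes in UTF-8 (é encodes as 2 bytes).
name = "José"
print(len(name))                  # 4 (characters)
print(len(name.encode("utf-8")))  # 5 (bytes)

# Truncating or slicing by *bytes* can split é in half, corrupting the name:
mangled = name.encode("utf-8")[:4].decode("utf-8", errors="replace")
print(mangled)                    # "Jos�" — the é is destroyed
```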
FRED is written in PHP, a great language for quickly building complex web applications. However, PHP's support for multibyte encodings is less than awesome. Also, lots of the code in FRED was written in …

And here I was getting all happy about the uptime... (2011-03-03)

Only to have FRED go down hard for 6 hours this morning! What happened: every night, logrotate rotates FRED's web server logs (among others) and restarts the web server process. Last night the webserver didn't come back from the restart for some reason. All it took was a simple (re)start to get FRED back up (not of the whole server, just the webserver process). It's hard to tell why it died, …

Query of Death Part Deux (2011-02-16)

Well, I've made some code changes in an attempt to prevent the offending "expensive" query from being executed. To be honest, only time will tell whether I've actually eliminated the exact code path that has been running that query, and even if I have, we'll have to see whether that stops these little 5-minute outages from happening.

Query of death (2011-02-09)

FRED continues to have short periods of downtime, ranging from 3-6 minutes, a few times per week. These appear to be caused by a request that runs a very expensive query that takes a couple of minutes to run and consumes most of the database's resources during that time.
This slows or blocks other users' queries, causing page requests to stack up and take all of the available memory on the web server, …

Wow, FRED sends a lot of mail (2011-02-06)

With FRED's current SMTP (outgoing email) service, it'll cost about $550 per year to send email. Yikes! EC2 did just launch an outgoing email service that would be WAY more cost-effective (maybe even free for FRED, since he might make it under their free tier), but it's not an SMTP service. Instead, it's a webservice API, which makes it harder for FRED to use. I have hope that soon they'll add …

So far so good (2011-01-21)

Well, it's been a week and change since the last problem related to the server move. Is it too early to declare the move a success? At this point, the only thing that is still an open issue is the volume of email FRED sends. The new outgoing mail service I'm using charges a certain amount of money for a certain number of emails per month. At first I just guessed how much quota FRED would need, and …

Ok, maybe the backup wasn't the problem... (2011-01-11)

Now it looks like FRED's PCI compliance scanner took him offline.
Background: like all web-based businesses that accept credit cards online, FRED is required by the credit card companies to have a security scan done periodically by an approved independent vendor. In FRED's case, that's done weekly. As it happens, they do it at noon PST on Mondays, exactly the time FRED went offline today.

DB goes haywire (2011-01-11)

Well, that was embarrassing.
Today at noon PST, FRED's database server stopped responding, which of course takes the site offline. I'm still investigating, but I believe it was caused by a problem in the hourly data backup snapshot job.
Some background: in EC2, Amazon provides highly available, redundant network "drive" storage called Elastic Block Store, or EBS. FRED's data files are …

FRED moves to the cloud (2011-01-04)

Many FRED users may have noticed that in the past half year or so, FRED has developed a progressively worsening case of narcolepsy. That is, he seems to fall unconscious at times, and fails to respond when you come calling. At first it happened only now and then, and not for very long. These days, though, it seems to happen at least once a week, sometimes a couple of times a day, for as much as an …