Is your environment ready for a surge in traffic that you may not even know is coming your way?
One well-known example of this is “the Slashdot effect.” This is what happens when the popular site Slashdot (or others like it) links to a small site. The combined effect of thousands of people attempting to view the site all at once can bring it to its knees – or fill up the traffic quota in a hurry.
Other triggers include the introduction of a popular product (the launches of the iPhone and of Halo 3 come to mind) or a popular conference (such as EAA’s AirVenture, which had some overloading problems).
Examine what happens each time a request is made. Does it result in multiple database queries? If there are x requests, and each results in y queries, there will be x*y database queries. Every additional request adds y more queries, so the database load grows y times as fast as the request rate.
Or let’s say each request results in a login session that may be held for 5 minutes. If you get x requests per second, then after 5 minutes (300 seconds) you’ll have 300x open connections if none drop. Do you have buffers and resources for this?
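The two estimates above can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: the function names and the sample figures (200 requests per second, 5 queries per request) are made up for illustration, and it assumes a steady arrival rate with no session ending early.

```python
def db_queries_per_second(requests_per_second, queries_per_request):
    """x requests, each issuing y queries, produce x*y queries."""
    return requests_per_second * queries_per_request


def open_sessions(requests_per_second, hold_seconds):
    """Sessions open at once if each request holds one for hold_seconds
    and none drop early (arrival rate multiplied by hold time)."""
    return requests_per_second * hold_seconds


if __name__ == "__main__":
    # Illustrative numbers only; substitute figures measured on your own site.
    rate = 200             # requests per second (assumed)
    queries = 5            # database queries per request (assumed)
    hold = 5 * 60          # 5-minute session, in seconds

    print("DB queries per second:", db_queries_per_second(rate, queries))
    print("Sessions open after 5 minutes:", open_sessions(rate, hold))
```

Plugging in your own measured request rate and hold time gives a first target to compare against connection limits and buffer sizes.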
Check your kernel tunables, and run real-world tests to see how the system behaves under load. Examine every aspect of the system to find out what resources it will need. Check TCP buffers for network connections, the number of TTYs allowed, and anything else that you can think of. Go end to end, from client to server to back-end software and back.
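As a starting point for that review, here is a small sketch that collects a few tunables into one report. It assumes a Linux system, where tunables appear as files under /proc/sys; other systems expose them differently. The shortlist of names is only an illustrative selection, not a complete checklist, and read_tunables is a hypothetical helper written for this example.

```python
from pathlib import Path

# Illustrative shortlist (assumes Linux); extend it for your own workload.
TUNABLES = [
    "net/core/somaxconn",            # listen backlog limit
    "net/ipv4/tcp_rmem",             # TCP receive buffer: min, default, max
    "net/ipv4/tcp_wmem",             # TCP send buffer: min, default, max
    "net/ipv4/tcp_max_syn_backlog",  # queue of half-open connections
    "fs/file-max",                   # system-wide open file limit
    "kernel/pty/max",                # maximum number of pseudo-terminals
]


def read_tunables(names, root="/proc/sys"):
    """Return {name: list of values}, or None where a tunable is unreadable."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(root) / name).read_text().split()
        except OSError:
            # Missing on this kernel, or not readable by this user.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tunables(TUNABLES).items():
        shown = " ".join(value) if value is not None else "(not available)"
        print(f"{name}: {shown}")
```

Saving this report before and after a load test makes it easier to see which limits were in effect when something gave way.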
Options for relieving the pressure include caching proxies, clusters, rewriting software, and resizing buffers, among others.
James Hamilton has already collected a large number of articles about how the big guys have handled such scaling problems (although focused on database response), including names such as Flickr, Twitter, Amazon, Technorati, Second Life, and others. Do go check his article out!