So, here’s the deal. A friend of mine posed this scenario (hypothetical or not, I don’t know and I didn’t ask!).
There is a site that has three Web Servers and all of them pointing to one Database. She said that the site wasn’t working and asked how I would help troubleshoot this.
Yes – clear as mud.
I asked if there was a Load Balancer diverting traffic; at first she said she did not know, but later changed her answer to, “No.”
It sounded like that was immaterial.
So I suggested that it might be good to temporarily stop two Web Servers and let the traffic go through only one server. To this she asked, “Then how will the customers shop?”
Yes, I was surprised: if the site was not working, how would it matter whether anyone could shop? But she clarified that she meant the site was slow!
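Since the complaint turned out to be slowness rather than an outage, a gentler variation on my "one server at a time" idea would be to leave all three running and time each one directly. Here is a minimal Python sketch of that, not something my friend's team actually ran. The function names and the `fetch` callable are my own assumptions for illustration; `fetch` is whatever makes one request to a given URL.

```python
import time
from statistics import median


def median_latency_ms(fetch, url, samples=5):
    """Call fetch(url) `samples` times and return the median wall-clock time in ms."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        fetch(url)  # one request, made by the caller-supplied function
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


def probe_servers(fetch, urls, samples=5):
    """Map each server name to its median latency, hitting each server directly."""
    return {
        name: median_latency_ms(fetch, url, samples)
        for name, url in urls.items()
    }
```

With the standard library, `fetch` could be something like `lambda url: urllib.request.urlopen(url, timeout=10).read()`, and `urls` would map a label for each Web Server to its direct address (bypassing any shared entry point). I use the median rather than the mean so that one odd request does not skew the picture.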
Now, as I began to wonder aloud, I figured the issue might lie with one of these three: the network, the Application Code, or the Database.
Let’s look at each of these:
My next follow-up was to find out if the three Web Servers were at the same location or in geographically diverse locations. The rationale is that if they are in different locations and the issue happens on all of them regardless, then the network is less likely to be the cause.
The common aspect to all three Web Servers would be the Application Code that powers the site. It might be worthwhile to see if the same issue is happening in other, non-Production environments. If it is, that would suggest the Application Code is a likely candidate.
Since all Web Servers point to the same Database, a DB that is not properly indexed or is poorly designed might also cause the issue.
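To make the reasoning across these three suspects concrete, here is a rough Python sketch of the triage. It assumes you already have, for each Web Server, a page latency and the portion of that time spent waiting on the Database. The function name, the 1000 ms threshold, and the 50% database share are all assumptions I picked for illustration, not established rules.

```python
def likely_culprit(page_ms, db_ms, slow_ms=1000.0, db_share=0.5):
    """Guess where to look first.

    page_ms: server name -> page latency in ms, measured against that server.
    db_ms:   server name -> ms of that latency spent waiting on the database.
    slow_ms and db_share are illustrative thresholds, not standards.
    """
    slow = sorted(s for s, t in page_ms.items() if t >= slow_ms)
    if not slow:
        return "nothing looks slow"
    if len(slow) < len(page_ms):
        # Only some servers are slow: suspect those hosts or their network path.
        return "server or network: " + ", ".join(slow)
    # Every server is slow, so look at what they share.
    db_heavy = all(db_ms.get(s, 0.0) >= db_share * page_ms[s] for s in page_ms)
    return "database" if db_heavy else "application code"
```

For example, if all three servers take about two seconds and most of that time is spent in database calls, the sketch points at the Database; if the servers are all slow but the database time is small, it points at the Application Code. It is only a first guess to decide where to dig, not a diagnosis.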
As a Technical Program Manager, I have faced all kinds of issues where not a lot of information is available. The situation is riddled with ambiguity, which poses an additional challenge. Sometimes you do not get “closure.” The issue might occur suddenly and then go away; this does not happen often, but I have seen it happen.
Coming back to my friend, she and I discussed these and debated for some time. She is still looking for some answers.
How would you go about identifying the root cause in the above problem to help my friend? ♦