After the great Amazon Prime Day failure of 2018, you’ve probably heard a lot of talk about website monitoring, and the incident may have you thinking about your own website. If you’ve never really monitored your site before, you may not be sure where to begin, how to monitor it, or why you even need to.
Why should you monitor your website?
It’s important that your website always runs at peak performance. A slow site can hurt SEO performance, cost you a potential customer conversion and damage your brand's reputation. Case in point: Amazon’s #PrimeDayFail last week. It’s vital that you monitor your website so you can make sure your business is always up and running when customers come looking for you.
What should I monitor?
Determining what to monitor can be the most challenging aspect of the process. Knowing what your options are and what type of budget is available to you for this purpose should help you narrow down where to go next. There are simple health checks to tell if a website is up or down, or far more complex checks to tell if a form on a site is down for users in a specific geographical region. Let’s take a look at some of the different types of monitors available:
Ping
This is the most basic network monitoring test. A ping is similar to playing the old game of Marco Polo. A monitoring server sends a request to a given server or device and waits for a response. If the device is available, it responds with an acknowledgment, and the time the round trip took is displayed. This tells you whether the server can be reached, but not necessarily that a website is “online” or performing well. In the event of site slowness, this basic test can provide insight into network latency or show whether a device is offline.
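To make the Marco Polo idea concrete, here is a minimal sketch in Python. A true ping uses ICMP, which usually needs elevated privileges, so this example swaps in a TCP connection attempt as a stand-in: it "calls out" to a host and port and times how long the answer takes. The function name tcp_ping and its default port and timeout are assumptions made for illustration, not part of any monitoring product.

```python
import socket
import time
from typing import Optional


def tcp_ping(host: str, port: int = 80, timeout: float = 3.0) -> Optional[float]:
    """Return the time to open a TCP connection in milliseconds.

    Returns None if the host cannot be reached within the timeout.
    Note: this is a TCP stand-in for an ICMP ping, not a real ping.
    """
    start = time.perf_counter()
    try:
        # Opening and immediately closing a connection is the "Marco" / "Polo".
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000
    except OSError:
        # Refused, timed out, or unresolvable: treat all as "no answer".
        return None
```

Calling tcp_ping("example.com", 443) would return a latency in milliseconds if the server answers, or None if it does not. As with a real ping, an answer only means the machine is reachable, not that the website itself is healthy.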
Examples of services for Ping Monitoring: Nagios, Zabbix
URL / Geographic
This is a more common type of monitoring. In this test, a monitoring application sends a web request to a website and checks its response. At a core level, it ensures that it receives a response and that the response was successful (commonly referred to as a 200 OK). If a website begins to show an error message or starts to time out for end users, this type of test provides more insight than a ping test because it can tell you which error it received. An advanced version of this test typically includes a geographic component, where the testing servers are located in different places around the globe. This can help you understand whether certain users are seeing slowness or outages that you might not be experiencing in your own browser.
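A URL check like the one described above can be sketched with nothing but Python's standard library. This is a rough illustration under stated assumptions, not how any particular service works: the names UrlCheck and check_url are made up for this example, and a hosted service would run the same kind of request from several regions rather than from one machine.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlCheck:
    url: str
    status: Optional[int]   # HTTP status code, or None if no response arrived
    seconds: float          # how long the check took
    error: Optional[str]    # error text, if any

    @property
    def ok(self) -> bool:
        return self.status == 200


def check_url(url: str, timeout: float = 10.0) -> UrlCheck:
    """Request a URL once and record status, timing and any error."""
    start = time.monotonic()
    status: Optional[int] = None
    error: Optional[str] = None
    try:
        # urlopen follows redirects, so a 301/302 ending in 200 counts as OK.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error such as 404 or 500.
        status = exc.code
        error = str(exc.reason)
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer at all: DNS failure, refused connection, timeout.
        error = str(exc)
    return UrlCheck(url, status, time.monotonic() - start, error)
```

The useful part compared with a ping is the error field: a result with status 500, or with no status and a timeout message, tells you what kind of failure your visitors are seeing.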
Examples of services for URL/Geographic Monitoring: Pingdom, New Relic
Application / Database
Knowing if a site is online or offline is only part of understanding the website’s true health. You may get indications from a URL monitor that your site is performing slowly or is seeing errors, but without digging into the site’s error logs, you might not understand why. Application-level monitoring can run a variety of probe tests. For example, a test can walk through a checkout process and let you know how long it took or whether it was unable to complete the pre-determined set of steps. It can also monitor the application to see which lines of code appear to be taking a long time to complete. This can tell a developer to focus their attention on a SQL/database query or to investigate why certain data isn’t being stored in memory correctly. There are very few limits on “what” you can test, but it is easy to get overly aggressive in testing and create a variety of false positives, which can send a developer or support team on a wild goose chase.
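The checkout walk-through mentioned above boils down to running a fixed list of steps, timing each one, and stopping at the first step that fails. Here is a minimal sketch of that idea in Python. The names StepResult and run_transaction are hypothetical, and the steps themselves are left as plain functions you would supply (for example, functions that load the cart page or submit a test order).

```python
import time
from typing import Callable, List, NamedTuple, Optional, Tuple


class StepResult(NamedTuple):
    name: str
    seconds: float
    error: Optional[str]  # None means the step succeeded


def run_transaction(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run named steps in order, timing each and stopping at the first failure."""
    results: List[StepResult] = []
    for name, action in steps:
        start = time.perf_counter()
        error: Optional[str] = None
        try:
            action()  # a step signals failure by raising an exception
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
        results.append(StepResult(name, time.perf_counter() - start, error))
        if error is not None:
            break  # later steps depend on this one, so do not run them
    return results
```

The result shows both how long each step took and where the process broke, which is the kind of detail a simple up/down check cannot give you.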
Examples of services for Application/Database Monitoring: New Relic, AWS CloudWatch/APM, Azure Insights
Server / Hardware
The most common thing to monitor is how hard your server is working. This typically includes CPU, memory (RAM), disk and network utilization (%). Once a hosting environment reaches 80% or more on any of these metrics, performance issues can arise, or a site can become unusable.
Another good thing to monitor is the amount of disk space available. Running out of space will degrade site performance, change a site's behavior or even bring a site/server down.
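As a small illustration of the 80% rule of thumb and the disk space check, here is a sketch using only Python's standard library. It reads how full a disk is and flags any metric at or above the threshold. The function names and the metric names in the dictionary are assumptions for this example, and the CPU and memory figures would have to come from your hosting provider or a monitoring agent.

```python
import shutil
from typing import Dict, List

THRESHOLD_PERCENT = 80.0  # the rule-of-thumb limit discussed above


def disk_used_percent(path: str = "/") -> float:
    """Return how full the disk holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # avoid dividing by zero on unusual filesystems
    return usage.used / usage.total * 100


def over_threshold(metrics: Dict[str, float],
                   threshold: float = THRESHOLD_PERCENT) -> List[str]:
    """Return the names of metrics at or above the threshold, sorted."""
    return sorted(name for name, percent in metrics.items() if percent >= threshold)
```

For example, over_threshold({"cpu": 91.5, "memory": 42.0, "disk": disk_used_percent("/")}) would list "cpu" and, if the disk is at least 80% full, "disk" as well.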
Most hosting providers offer a set of monitoring tools and graphs (with some form of graphing history). Understanding how to read these graphs and trends can help you pinpoint and understand potential performance issues. They can also be used post-mortem to determine why a site became degraded during a certain period of time.
Examples of services for Server/Hardware Monitoring: Nagios, Zabbix
Analytics
Surprisingly, most developers and IT professionals we run into don’t typically include a website’s analytics in their collection of monitoring tools. However, understanding how many visitors were on the site and what they were doing at a given time can quickly show a developer which pages were causing performance issues. Overlaying visitor traffic on hardware graphs can also help you understand growth patterns (typically, one new user does not translate into a fixed percentage increase in server usage).
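Overlaying traffic on hardware graphs can be as simple as lining up two sets of numbers by time. The sketch below assumes you have exported visitor counts from your analytics tool and CPU percentages from your host, both keyed by the same timestamp labels. The function name overlay and the data layout are made up for illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int, float, float]  # (timestamp, visitors, cpu %, cpu % per visitor)


def overlay(visitors: Dict[str, int], cpu_percent: Dict[str, float]) -> List[Row]:
    """Line up visitor counts and CPU usage for timestamps present in both."""
    rows: List[Row] = []
    for timestamp in sorted(visitors.keys() & cpu_percent.keys()):
        count = visitors[timestamp]
        cpu = cpu_percent[timestamp]
        # CPU cost per visitor; 0.0 when nobody was on the site.
        per_visitor = cpu / count if count else 0.0
        rows.append((timestamp, count, cpu, per_visitor))
    return rows
```

If the per-visitor figure climbs as traffic grows, the server is not scaling evenly with visitors, which is a hint to look at what those visitors were doing at that time.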
Examples of services for Analytics: Google Analytics, SE Ranking
What are my next steps?
Now that you know the basics of website monitoring, you should have a pretty good idea of which parts of your site you would like to monitor or should be monitoring. Start by putting together a game plan and walking through your website to see what kind of monitoring would best suit it. From there, begin adding the appropriate analytics and monitoring services. Then simply start reporting on those and evaluating the areas of your site that need improvement. If you still have any questions about your specific site/application or would like help getting started or setting up any of these monitoring options, feel free to reach out and we’d be happy to assist.