F.A.Q. and Best Practice

Here we keep questions that are frequently asked on Slack or on GitHub.

Running tests

Read this before you start to collect metrics.

How do I test cached pages?

The easiest way to do that is to use the --preURL parameter:

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --preURL https://www.sitespeed.io/documentation/ https://www.sitespeed.io/

In this example the browser will first go to https://www.sitespeed.io/documentation/ and then, with a primed cache, navigate to https://www.sitespeed.io/.

Since 7.2.0 the best way to add a cookie is by using --cookie name=value, where name is the name of the cookie and value … the value :) The cookie will be set on the domain that you test.
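
For example, a run that sets a cookie before testing could look like this (the cookie name and value are made up; use whatever your site expects):

# sessionid=abc123 is just an example cookie
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --cookie sessionid=abc123 https://www.sitespeed.io/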

How do I test multiple pages in the same run?

If you want to test multiple URLs, you can line them up on the CLI:

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 https://www.sitespeed.io https://www.sitespeed.io/documentation/

You can also use a plain text file with one URL on each line. Create a file called urls.txt (but you can call it whatever you want):

http://www.yoursite.com/path/
http://www.yoursite.com/my/really/important/page/
http://www.yoursite.com/where/we/are/

Another feature of the plain text file is that you can add aliases to the urls.txt file after each URL. To do this, add a string without spaces after each URL that you would like to alias:

http://www.yoursite.com/ Start_page
http://www.yoursite.com/my/really/important/page/ Important_Page
http://www.yoursite.com/where/we/are/ We_are

And then you feed the file to sitespeed.io:

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 urls.txt

How many runs should I do on the same page to get stable metrics?

How many runs depends on your site and on what you want to collect. Pat told us about how he does five runs when testing for Chrome. Hitting a URL 3-5 times is often ok when you want to fetch timing metrics, but increasing to 7-11 runs can give more stable values. Start low and, if you see a lot of variation between runs, increase until you get solid values.
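
For example, to test a page 7 times in the same run (assuming the -n shorthand for the number of iterations):

# run the same URL 7 times to get more stable metrics
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 -n 7 https://www.sitespeed.io/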

Getting timing metrics is one thing, but it’s also important to collect how your page is built. You need to keep track of the size of pages, how many synchronously loaded JavaScript files you have and so on. For that kind of information you only need one run per URL.

You should also try out our new setup with WebPageReplay.

I want to test a user journey (multiple pages), how do I do that?

We currently don’t support that, but feel free to do a PR in Browsertime.

I want to test on different CPUs, how do I do that?

We currently don’t have built-in support for changing the CPU. What we do know is that you should not use the built-in throttling in Chrome or try to simulate slow CPUs by running on a slow AWS instance. What you should do is what wptagent does: check the code at https://github.com/WPO-Foundation/wptagent/blob/master/wptagent.py, apply the same kind of throttling before you start a run, and remove it after the run.
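
As a rough sketch of the idea on Linux, using a cgroup v1 CPU quota (wptagent’s exact mechanism may differ, so check its source; the 50% cap is just an example):

# create a cgroup that caps CPU to 50% of one core (the period defaults to 100000 µs)
sudo mkdir /sys/fs/cgroup/cpu/throttle
echo 50000 | sudo tee /sys/fs/cgroup/cpu/throttle/cpu.cfs_quota_us
# put the current shell (and everything it starts) in the cgroup, then run the test
echo $$ | sudo tee /sys/fs/cgroup/cpu/throttle/cgroup.procs
# after the run: move the shell back and remove the cgroup
echo $$ | sudo tee /sys/fs/cgroup/cpu/cgroup.procs
sudo rmdir /sys/fs/cgroup/cpu/throttle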

Throttle or not throttle your connection?

PLEASE, YOU NEED TO ALWAYS THROTTLE YOUR CONNECTION! You should always throttle/limit the connectivity because it will make it easier for you to find regressions. If you don’t, your tests effectively run with different connectivity profiles, and the regressions/improvements you see may be caused by a flaky internet connection rather than by your page. Check out our connectivity guide.
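
For example, to throttle using the cable profile (assuming the -c shorthand for the connectivity profile; see the connectivity guide for the recommended way to set up throttling in Docker):

# throttle the connection with the cable profile
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 -c cable https://www.sitespeed.io/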

Clear browser cache between runs

By default Browsertime creates a new profile for each iteration you do, meaning the cache is cleared through the WebDriver. If you really want to be sure that everything is cleared between runs you can use our WebExtension to clear the browser cache by adding --browsertime.cacheClearRaw.

That means if you test https://www.sitespeed.io with 5 runs/iterations, the browser cache is cleared between each run, so the browser has no cached assets between the runs.
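
A full run could then look like this (again assuming -n as the shorthand for the number of runs):

# clear the browser cache between each of the 5 runs
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 -n 5 --browsertime.cacheClearRaw https://www.sitespeed.io/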

When you use --preURL the browser starts, accesses the preURL and then the URL you want to test within the same session, without clearing the cache. Use this if you want to measure more realistic metrics for users that first hit your start page and then navigate to another page (with responses in the cache if the URLs have the correct cache headers).

If you use the --preScript feature the behavior is the same: we don’t clear the cache between the preScript and the URL you want to test.

My pre/post script doesn’t work?

We use Selenium for pre/post script navigation. You can read more about our pre/post script setup and focus on the debug section if you have any problems.

If you have problems with Selenium (getting the right element etc.), PLEASE do not create issues in sitespeed.io. Head over to the Selenium community and they can help you.

How do you pass HTML/JavaScript as a CLI parameter?

The easiest way to pass HTML to the CLI is to pass the whole message as a String (use a quotation mark to start and end the String) and then not use that quotation mark inside the HTML.

Say that you want to pass your own link as an annotation message, then do like this:

--graphite.annotationMessage "TEXT <a href='https://github.com/***' target='blank'>link-text</a>"

If you need to debug CLI parameters the best way is to turn on verbose logging. Do that by adding -vv to your run and check the log for the message that starts with Config options. Then you will see all parameters that are passed from the CLI to sitespeed.io and can verify that they are interpreted the right way.
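
For example:

# verbose logging; look for the "Config options" message in the log
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 -vv https://www.sitespeed.io/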

I want a JSON from Browsertime/the Coach/other tools, how do I get that?

There’s a plugin bundled with sitespeed.io called analysisstorer that isn’t enabled by default. It stores the original JSON data from all analyzers (Browsertime, the Coach, WebPageTest etc.) to disk. You can enable the plugin like this:

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 https://www.sitespeed.io --plugins.add analysisstorer

The JSON files for the whole run (summary files) will end up in $RESULT/data/. JSON for each individual page is stored in $RESULT/data/pages/$PAGE/data/.

How do I test pages with #-URLs?

By default the # part of a URL is stripped off. Yep we know, it isn’t the best, but in the old days the # rarely added any value, and crawling a site that links to the same page with different sections made you test the same page over and over again.

If you have pages that are generated differently depending on what’s after the #-sign, you can use the --useHash switch. Then every such page will be tested as a unique page.

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --useHash https://www.sitespeed.io/#/super 

You can also use the --urlAlias switch if you want to give the page a friendly name. Use it multiple times if you have multiple URLs.

docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --useHash --urlAlias super --urlAlias duper https://www.sitespeed.io/#/super https://www.sitespeed.io/#/duper

Servers

What you should know before you choose where to run sitespeed.io.

Cloud providers

We’ve been testing out different cloud providers (AWS, Google Cloud, Digital Ocean, Linode etc.) and the winner for us has been AWS. We’ve been using c4.large instances, and testing the same size (or bigger) instances on other providers doesn’t give the same stable metrics over time.

One important learning is that your server can run at <60% usage and everything looks fine, but the metrics will still not be stable, since your instance is not isolated from other things that run on the same host.

Bare metal

We haven’t tested on bare metal so if you have, please let us know how it worked out.

Kubernetes

On Kubernetes you cannot use tc or Docker networks to set the connectivity, but there have been attempts with TSProxy, check out #1829.

Running tests from multiple locations

Can I test the same URLs from different locations and how do I make sure they don’t override each other’s data in Graphite?

You should set different namespaces depending on location (--graphite.namespace). If you run one test from London, set the namespace to --graphite.namespace sitespeed_io.london. Then you can choose individual locations in the dropdown in the pre-made dashboards.
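
A London run could then look like this (graphite.example.com is a placeholder for your own Graphite host):

# send metrics to Graphite under a location-specific namespace
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --graphite.host graphite.example.com --graphite.namespace sitespeed_io.london https://www.sitespeed.io/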

Store the data

Out of the box you can choose to store your metrics in a time series database (Graphite or InfluxDB).

Should I choose Graphite or InfluxDB?

If your organisation is running Graphite, use that. If you’re used to InfluxDB, use that. If you don’t use either of them, go with Graphite, since we have more ready-made dashboards for Graphite.

Handling big amounts of data

sitespeed.io will generate lots of metrics and data, how do I handle that?

Configuring features

If you want to store less data from sitespeed.io, one way is to configure it to compress the data more.

The heaviest data that sitespeed.io generates is the video, the screenshots and the video filmstrip screenshots. You can disable those features, but it will make it harder for you to verify that everything works ok and to pinpoint regressions.

If you have limited space (and do not store the data on S3, configured to automatically remove old data) you can use the following configurations.

Video

You can change the constant rate factor (CRF). The default is 23; increasing it gives you videos with lower quality that take less space. Use --browsertime.videoParams.crf.

You can also change the quality of the video filmstrip screenshots. The default is 75, but you can set a value between 0 and 100. Use --browsertime.videoParams.filmstripQuality.

If you don’t use the filmstrip (at the moment the filmstrip screenshots aren’t used within the sitespeed.io result pages) you can disable it: --browsertime.videoParams.createFilmstrip false will disable the filmstrip.
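
Putting it together, a space-saving video configuration could look like this (28 is just an example CRF value):

# lower quality video and no filmstrip
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --browsertime.videoParams.crf 28 --browsertime.videoParams.createFilmstrip false https://www.sitespeed.io/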

Screenshot

If you want to decrease the size of the screenshots, you should first switch to jpg screenshots instead of png: --screenshot.type jpg will do that.

You can then also set the jpg quality. The default is 80, but you can set a value between 0 and 100. Use --screenshot.jpg.quality.

As a last thing: you can set the max size of the screenshot (in pixels, applying to both width and height). The default is 2000, meaning the screenshot will probably be full sized (depending on how you configured your viewport). Change it with --screenshot.maxSize.
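
Combined, that could look like this (70 and 1000 are just example values):

# smaller jpg screenshots, capped at 1000 pixels
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --screenshot.type jpg --screenshot.jpg.quality 70 --screenshot.maxSize 1000 https://www.sitespeed.io/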

Disabling features/plugins

Another alternative to minimize the amount of data is to disable plugins. You should be really careful doing that since it will make it harder for you to verify that everything works ok and to pinpoint regressions.

You can list which plugins are running by adding the flag --plugins.list, and in the log you will see something like:

INFO: The following plugins are enabled: assets, browsertime, budget, coach, domains, harstorer, html, metrics, pagexray, screenshot, text, tracestorer

If you want to disable the screenshot plugin (that stores screenshots to disk) you do that by adding --plugins.remove screenshot.
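
For example:

# run without storing screenshots to disk
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --plugins.remove screenshot https://www.sitespeed.io/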

Graphite

Make sure to edit your storage-schemas.conf to match your metrics and how long you want to keep them. See Graphite setup in production.
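
A minimal sketch of what an entry could look like (the pattern must match your --graphite.namespace, and the retentions are just example values):

# keep 10-minute datapoints for 60 days, then 30-minute datapoints for 1 year
[sitespeed]
pattern = ^sitespeed_io\.
retentions = 10m:60d,30m:1y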

S3

When you create your buckets at S3, you can configure how long S3 will keep your data (HTML/screenshots/videos). Make it match how long you keep your metrics in Graphite, or how far back in time you think you need it. Usually that is shorter than you think :) When you find a regression (hopefully within an hour or at least a day) you want to compare that data with what it looked like before. Storing things at S3 for 2 weeks should be ok, but you choose yourself.
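
One way to set that up is an S3 lifecycle rule with the AWS CLI (my-sitespeed-results is a placeholder bucket name, and 14 days matches the 2 weeks above):

# expire all objects in the result bucket after 14 days
aws s3api put-bucket-lifecycle-configuration --bucket my-sitespeed-results --lifecycle-configuration '{"Rules":[{"ID":"expire-old-results","Status":"Enabled","Filter":{"Prefix":""},"Expiration":{"Days":14}}]}'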

Alerting

We’ve been trying out alerts in Grafana for a while and it works really well for us. Check out the alert section in the docs.

Difference in metrics between WebPageTest and sitespeed.io

Now and then an issue pops up on GitHub where users ask why some metrics differ between WebPageTest and sitespeed.io.

There are a couple of things that differ between WebPageTest and Browsertime/sitespeed.io, but first I wanna say that it is wrong to compare between tools; it is right to continuously compare within the same tool to find regressions :)

WPT and sitespeed.io differ by default in when they end the tests. WPT ends when there hasn’t been any network traffic for 2 seconds (if I remember correctly). sitespeed.io ends 2 seconds after loadEventEnd. Both tools are configurable.
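
For example, on the sitespeed.io side you could wait 5 seconds after loadEventEnd instead of 2 (assuming the --browsertime.pageCompleteCheck option, which takes a JavaScript snippet that returns true when the test should end):

# end the test 5 seconds after loadEventEnd instead of the default 2
docker run --shm-size=1g --rm -v "$(pwd)":/sitespeed.io sitespeedio/sitespeed.io:7.7.2 --browsertime.pageCompleteCheck 'return window.performance.timing.loadEventEnd > 0 && Date.now() > window.performance.timing.loadEventEnd + 5000;' https://www.sitespeed.io/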

WebPageTest on Windows (the old version) records the video at 10 fps. 5.x of sitespeed.io uses 60 fps, and the coming sitespeed.io 6.0 will use 30 fps by default. The new WebPageTest on Linux will also use 30 fps by default. Running at 60 fps will give you more correct numbers, but then you need a server that can record a video at that pace.

And a couple of generic things that will make your metrics differ:

  • Connectivity matters - You need to set the connectivity.
  • CPU matters - Running the same tests with the same tool on different machines will give different results.
  • Your page matters - It could happen that your page has different sweet spots on connectivity (that make the page render faster), so even a small change can make the page much slower (we have that scenario on Wikipedia).