For me, one of the technology highlights of 2020 was the release of the Raspberry Pi 400. Its performance is very similar to high-end set-top boxes and streaming media players. As the Raspberry Pi OS also includes an up-to-date Chromium browser, I was very keen to do some head-to-head performance comparisons with Flow.
I was really pleased with Flow’s performance on some of the benchmarks I tried. MotionMark, one of the benchmarks we regularly use, showed Flow to be over six times faster than Chromium. However, when I replaced my TV (720p) with an HD monitor (1080p), the margin on some benchmarks was narrower than I’d expected. Our engineers found the cause, but it took me a few minutes to get my head around what they were saying…
The Raspberry Pi 400 has a built-in algorithm which can slow down both the CPU and GPU to save power. It monitors the load on the CPU which, for many applications, is a good enough indicator. But for Flow, this doesn’t always hold: unlike Chromium, Flow’s CPU load is often very small during heavy animation because all the drawing work is being done directly by the GPU. With the Particle Acceleration benchmark, for instance, the Pi 400’s power management algorithm was wrongly assuming there was little to do and was therefore starving the GPU.
This algorithm can be disabled to keep the CPU running at full speed all the time. Making this change can improve the performance of both Flow and Chromium. The improvement for Flow is larger than that for Chromium whose load appears to be more CPU heavy. For Flow, the performance gain on the Particle Acceleration benchmark was the most impressive: at HD resolution, Flow ran at 56 fps compared with just 26 fps previously. Chromium ran at 15 fps in both cases.
As an embedded browser Flow is already very capable, but as a desktop browser it’s still a work in progress. There are a few things that don’t work on the Raspberry Pi 400 just yet, like video. There’s also no toolbar, so it doesn’t feel like a desktop browser, but even with these restrictions it’s a good basis for an embedded browser evaluation. As we continue to improve Flow, we will update the Raspberry Pi preview so anyone interested can see the progress for themselves.
When I was first implementing support for 3D CSS transforms in Flow a couple of years ago I came across a Web demo by Keith Clark which created an entire 3D environment purely using CSS: https://keithclark.co.uk/labs/css-fps.
The way it uses CSS to create not only the 3D geometry, but also lighting and shadow effects is very impressive. Unfortunately this didn’t work very well in Flow because our support for transform-style: preserve-3d was minimal. This meant the ordering of elements in the scene was incorrect and visually confusing. I thought it would be nice to get this demo working and complete our implementation.
For trees of html elements with transform-style: preserve-3d, the elements all share the same 3D space and their rendering order is no longer defined by their location in the DOM, but instead by their z-position in 3D space.
With three sibling elements #A, #B and #C at z-coordinates of 100px, 0px and -100px respectively, the rendering order is quite simple: #C should be rendered first since it has the lowest z-coordinate of -100px, followed by #B at 0px, and finally #A at 100px. Sorting items by their centres’ z-coordinates is very simple, and is often good enough to render a 3D scene, especially if the items are all of a similar size. But we can easily break that ordering, for example by rotating item #B. With a rotation of 70 degrees, the rendering order suddenly must be the exact opposite of the previous order, even though the z-coordinates of the items’ centres have not changed. This is because item #C now obscures item #B, and #B obscures #A. See the image below to visualise this.
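The simple case can be sketched as a plain sort on centre z-coordinates. This is only an illustration of the idea, not Flow’s code; the `Item` type and the values mirror the #A/#B/#C example above:

```python
# Naive painter's-algorithm ordering: sort items by the z-coordinate of
# their centres and paint back to front (lowest z is furthest from the
# viewer, since positive z points towards the viewer in CSS).
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    center_z: float  # z-coordinate of the element's centre, in px

items = [Item("#A", 100.0), Item("#B", 0.0), Item("#C", -100.0)]

# Paint order: furthest first.
render_order = sorted(items, key=lambda i: i.center_z)
print([i.name for i in render_order])  # ['#C', '#B', '#A']
```

As the rotation example in the text shows, this ordering breaks as soon as an item’s extent in z matters, because a single centre value cannot capture a tilted plane.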
An even more complicated case occurs if we rotate #B by 50 degrees instead. #B now intersects both #A and #C in 3D space, so the rendering order must be: the back part of #B, then #C, then the centre of #B, then #A, and finally the front part of #B.
Since Flow renders everything on the GPU using standard 3D graphics APIs, my initial plan was to implement this using a technique that 3D games typically use, namely depth buffering. In this method a screen-sized buffer stores, for every pixel, the depth of the nearest item drawn so far. Items can then be rendered in any order, but a pixel is only written to the screen and to the depth buffer if its z-value is smaller than the one already stored.
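A minimal sketch of the depth-buffer idea, reduced to a single scanline of pixels; the `draw` helper and the buffer sizes are illustrative, not Flow’s actual code:

```python
# Depth-buffered drawing over one scanline: each "item" covers a pixel
# range at a fixed depth. A pixel is only overwritten when the new
# fragment is closer (smaller depth) than what the buffer already holds,
# so items can be submitted in any order.
WIDTH = 8
INF = float("inf")

color_buffer = [" "] * WIDTH
depth_buffer = [INF] * WIDTH

def draw(start, end, depth, color):
    for x in range(start, end):
        if depth < depth_buffer[x]:   # the depth test
            depth_buffer[x] = depth
            color_buffer[x] = color

draw(0, 6, depth=2.0, color="A")  # nearer item, drawn first
draw(3, 8, depth=5.0, color="B")  # farther item, drawn second
print("".join(color_buffer))      # "AAAAAABB" -- B only shows where A isn't
```

The farther item B is drawn second but cannot overwrite the pixels A already claimed, which is exactly the order-independence that makes depth buffering attractive, and exactly what breaks once blending is involved.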
The main problem with this technique is that it does not work with alpha blending, which is required for edge anti-aliasing and for rendering items with transparency. When games use anti-aliasing they typically rely on techniques, such as multisampling, which don’t require blending, but the overall rendering quality is not as good as that expected of a browser. Games also try to limit the number of items with transparency and render them separately in the correct order, but they do not handle transparent items intersecting other items; content authors must prevent that happening. Since CSS allows transparent intersecting elements, the direct approach is required, and we need to split elements into parts along the lines of intersection with other elements.
The algorithm suggested by the CSS transforms specification for sorting items in a scene is Newell’s algorithm, described in detail in the original 1972 paper. Essentially we start with a list of items sorted by their z-coordinates, then perform extra checks on each item to decide whether it obscures any item with a greater z-coordinate. If it does, the items are reordered and the same tests are performed on the new ordering. If an item is moved more than once then the items must cyclically obscure each other, such as when they intersect. In that case the items are split into pieces, which are added back into the list to be sorted separately.
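The reordering loop can be sketched roughly as follows. This is a much simplified illustration, not Flow’s implementation: it reduces each item to a z-extent (the real Newell tests also compare x/y extents and which side of each item’s plane the other lies on before concluding anything), and it only stubs out the splitting step:

```python
# Simplified Newell-style reordering. Items are dicts with a name and a
# z-extent; may_obscure() stands in for Newell's full battery of tests.
def may_obscure(p, q):
    # p cannot possibly obscure q if p lies entirely behind q in z.
    return p["z_max"] > q["z_min"]

def newell_order(items):
    # Start from a back-to-front sort, furthest (lowest z) first.
    items = sorted(items, key=lambda i: i["z_min"])
    out = []
    moved = set()
    while items:
        p = items[0]
        reordered = False
        for j in range(1, len(items)):
            q = items[j]
            if may_obscure(p, q):
                if id(p) in moved:
                    # p has been moved before: the items cyclically
                    # obscure each other (e.g. they intersect). Real
                    # code would split p along q's plane here and
                    # re-insert the pieces; this sketch just emits p.
                    break
                # Move q in front of p and retest from the top.
                items.insert(0, items.pop(j))
                moved.add(id(q))
                reordered = True
                break
        if not reordered:
            out.append(items.pop(0))
            moved.discard(id(p))
    return out

panels = [
    {"name": "#A", "z_min": 50, "z_max": 150},
    {"name": "#C", "z_min": -150, "z_max": -50},
    {"name": "#B", "z_min": -50, "z_max": 50},
]
print([p["name"] for p in newell_order(panels)])  # ['#C', '#B', '#A']
```

The important control flow is all here: a cheap initial sort, a retest-and-reorder step, and a cycle check that marks where a real implementation must fall back to splitting geometry.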
It was quite challenging to implement the algorithm efficiently, especially dealing with the 3D maths to test whether items obscure each other, and to split the items into polygons when they intersected. However the final result was very satisfying, and Flow manages to render the scene without any glitches at a solid 60 frames-per-second.
However, looking closely at some of the objects, in particular the pipes, there are some rendering artefacts which are not present in other browsers. The pipes are divs with 3D transforms applied, and they use multiple background layers. Unlike other browsers, Flow renders both 2D and 3D transformed elements directly to the screen.
Anti-aliasing in browsers is typically implemented by alpha-blending partially transparent edges against the background. Browsers use this technique for almost all rendering, including fonts and transformed elements. One downside is that when edges overlap exactly, the colour from layers below can leak through the partially transparent edge. The left-hand image below shows a browser rendering of two touching 2D transformed divs with two background layers: a black layer on top of a red layer. Text suffers from the same problem, so to demonstrate this the text in the image shows white text on top of green text. The red colour can be seen leaking through the edges of the divs, and the green colour through the edges of the text.
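The leak is easy to see numerically. This sketch applies standard “source over” compositing to a single colour channel, assuming an edge pixel with 50% coverage on each of the two touching black divs (the function and values are illustrative, not Flow’s code):

```python
# "Source over" compositing on one colour channel: the shared edge pixel
# of each black div has coverage (alpha) 0.5, so compositing both divs
# over a red background still leaves some red showing through.
def over(src, src_alpha, dst):
    # Result = source weighted by its alpha, plus whatever shows through.
    return src * src_alpha + dst * (1.0 - src_alpha)

red_channel = 255.0                           # red background layer
after_first = over(0.0, 0.5, red_channel)     # first div's edge pixel
after_second = over(0.0, 0.5, after_first)    # second div's edge pixel
print(after_first, after_second)              # 127.5 63.75 -- red leaks
```

Even after both black edges are composited, a quarter of the red remains at the seam; if the two divs were flattened into one opaque layer first, the seam would be fully covered and no red would leak.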
When a 3D transform is applied to the div, most browsers no longer show any red around the edges, as in the right-hand image below. This is because they render the div without a transform into a newly allocated layer, and then render that layer with the 3D transform. This has the bonus that any colours from layers below are obscured correctly at the edges, but it doesn’t solve the more common cases of high-quality 2D anti-aliasing and of text.
The edge rendering artefacts in the demo happen because Flow renders 3D items in the same way as 2D items: directly to the screen, meaning we get the same artefacts on edges as other browsers have in 2D rendering. This was a conscious design choice as we wanted to avoid allocating a layer for every 3D transformed element which can have a massive impact on memory usage. For example, when rendering this demo at 4K resolution, Flow uses around 7MB in static GPU textures, and depending on the viewpoint, 50-100MB in temporary GPU surfaces (to deal with lots of intersecting geometry). When using the Frame Rendering Stats tool in Chrome, the GPU memory usage is shown to be anywhere between 200-600MB depending on the viewpoint. Flow’s approach to 3D is possible because its rendering engine is designed around the GPU, and we’ve tried to ensure that everything that can be rendered in 2D can also be rendered in 3D. This means that rendering 3D effects in Flow is viable even on embedded devices with limited memory.
Although the pipes in the demo can look worse than in other browsers, the walls and floors look better in Flow. Flow renders images using trilinear filtering instead of the bilinear filtering used by other browsers, smoothing out the pixels in the distance. When panning around the view, this prevents pixel artefacts popping in and out.
Last month we were shown a new HTML UI from one of our embedded partners. When displayed in Flow, it showed a button with text incorrectly displayed over an icon. We’ve known for a while we had a few problems with buttons as many of them were the wrong size, or their content wrapped too soon. This gave us a good reason to investigate them all.
At Ekioh we’re regularly looking ahead to see what browser features are set to become popular. As part of this, I’ve recently spent some time reading the future design predictions from a number of web designers and design houses. As you might suspect there are some wacky ideas being peddled, but on the whole, most designers seem to be agreeing on a few concepts and techniques that look set to shape the way websites and product user interfaces will evolve over the next few years.
Back in 2003, Honda released Cog. It was an amazing advert. I even ordered the free DVD – there was no YouTube back then. If you haven’t seen it, it’s worth watching – it has no CGI and no smoke and mirrors. It, apparently, took 606 takes.
Because of lockdown we’re all working from home at the moment so my access to hardware is somewhat limited. On a whim, I decided to try Flow on my Amazon FireTV Stick and was a little surprised to find it wouldn’t run. I grabbed some logs and asked colleagues what they thought the problem might be. They were able to quickly spot what the issue was and they gave me a build to test within a few hours. Armed with a working version I thought I’d see how Flow compares with the other browser options that are available.
For a while now, people have been expressing concern about Chrome’s dominance creating a browser monoculture. With Microsoft’s recent switch to Chromium, the choice of rendering engine was effectively reduced to Blink (Chrome), WebKit (Safari) or Gecko (Firefox). Whilst there are a large number of browsers for desktop and mobile users out there, almost all are based on Blink/Chromium.
As well as displaying web content, HTML browsers are used in a wide variety of products for application and user interface rendering. The potential effect of one company dominating the implementation of the standards is a bias towards web page rendering, its largest use case. Focusing solely on web content could limit the wider appeal of HTML.
It’s been a long time since we last posted MotionMark scores. I thought it would make sense to re-run them with the latest versions of all browsers.
All tests were run on a MacBook Pro (15-inch, 2018) with a 2.9GHz 6-Core Intel Core i9. I disabled Mail and Time Machine to minimise interruptions. Unlike our previous, more thorough, analysis, I only ran each test three times. I could have averaged the results but, since I wanted screenshots, I chose the test run with the highest score in each case.
The scores for the various browsers were:
Flow (5.8.0): 1087.51, 1196.89, 964.05
Safari (13.0.4): 755.13, 756.02, 734.09
Chrome (79.0.3945.130): 268.70, 287.16, 286.46
Firefox (72.0.2): 83.27, 110.63, 85.52
Firefox (72.0.2 with WebRender): 226.53, 284.43, 272.61
On every embedded device we have tried, Flow also comes top, but macOS is the only platform where Safari, Chrome and Firefox are available.
Being a big fan of online shopping, I’ve been keen to see when Flow could successfully complete a transaction on amazon.co.uk. It’s fair to say that online shopping is not a target use case for Flow, so I wasn’t going to get any engineering time dedicated to this; instead I was relying upon the product’s ongoing development to do the job for me.
I had an interesting problem with one of the sites I tried loading a while ago. weibo.com was using jQuery (version 1.7.2), which tries to detect browser features, and it decided Flow was actually Internet Explorer. jQuery then tried to use IE-specific filters rather than CSS opacity, so no fades worked. I tracked this down to jQuery’s startup code setting opacity to 0.55 and then reading it back. Since we converted 0.55 into 16.16 fixed point too early, it was read back as 0.549988. That wasn’t equal to 0.55, so jQuery decided Flow didn’t support opacity and therefore must be Internet Explorer.
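The rounding error is easy to reproduce. This short sketch stores 0.55 in 16.16 fixed point (16 fractional bits), as described above, and reads it back:

```python
# Storing 0.55 in 16.16 fixed point truncates the fractional part,
# and converting back no longer compares equal to 0.55.
FRAC_BITS = 16
fixed = int(0.55 * (1 << FRAC_BITS))   # 36044 (36044.8 truncated)
back = fixed / (1 << FRAC_BITS)
print(fixed, back)                     # 36044 0.54998779296875
print(back == 0.55)                    # False -- the feature test fails
```

The value read back, 0.54998779…, prints as 0.549988 at six decimal places, which is exactly the figure jQuery saw and rejected.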
If you buy a new TV today, the chances are it is 4K. It’s quite strange that the UIs on these TVs are still rendered at 1080p and upscaled, but that’s the reality. It means that any text and graphics in the video stream, such as football scores, are visually sharper and clearer than the TV’s own menus and app UIs.
Browser testing has come a long way in the last 15 years. Back then I worked for a small embedded browser company with a test team that manually checked websites. This was tedious, and inefficient as there are only so many sites a person can visit in a day.
When I joined Ekioh, I was pleased to see they had taken a more modern approach from the start. There was a genuine passion for product stability and a strong desire to avoid the embarrassment of regression bugs.
Ekioh’s multithreaded HTML browser is rapidly making a name for itself as being the fastest browser available. Whilst Blink based browsers like Chrome currently dominate the market, they might not be the best choice for application and middleware rendering. As the MotionMark benchmark confirms, Flow’s performance is streets ahead of the competition.
Since starting work on Flow, our focus for the rendering engine has been on HTML/CSS. The number of basic shapes and painting styles used in HTML/CSS is quite small, which has allowed us to create a highly specialised engine using the GPU for all painting tasks. We’ve also supported canvas elements in Flow for a while, but until recently all canvas rendering was performed on the CPU.
Recently I was discussing Flow, our multithreaded browser, with a friend of mine who questioned whether a browser using all the cores would be beneficial in battery operated products like their new smart watch. This prompted me to do some research and the results were surprisingly in favour of our multithreaded approach.
When rendering web pages most browsers use a general purpose graphics library to do all their drawing. For example Chrome uses the Skia graphics library. This makes sense for cross platform browsers since they can use a single drawing API and leave the implementation details to the graphics library. The graphics library can try to optimise the drawing operations using some platform specific 2D hardware acceleration, or using a 3D library such as OpenGL/DirectX to take advantage of the GPU. If there is no hardware acceleration available the graphics library can do all the drawing in software using the CPU.
Flow only recently added limited HTML form support, and that lets us log into Google. We hadn’t concentrated on forms as they’re rarely, if ever, used in TV UIs and there was plenty of other stuff to get on with. Pleasingly, Google Mail (Basic HTML version) rendered very well the first time we were able to log in. Full Google Mail doesn’t work yet, but it makes sense to start with the basic mode first.
In 2006 we started writing a clean room SVG browser, primarily targeting set-top boxes. Back then, user interfaces were written in native code (usually ugly and inflexible) or HTML (very slow). We emphasised how it was equivalent to a web browser but, rather than an HTML parser with CSS box model layout, we parsed SVG markup. SVG takes negligible time to lay out and uses CSS sparsely, so we massively outperformed HTML browsers on equivalent UIs.