How are we improving Firefox Snap performance? Part 2
by Oliver Smith on 16 June 2022
Welcome to Part 2 of our investigation into Firefox snap performance. To minimise duplication we recommend checking out Part 1 of this series. There you’ll find a summary of our approach, glossary of terms, and tips on standardised benchmarking.
Update: Part 3 of this series is now live with details of our latest fixes.
Welcome back, Firefox fans! It’s time for another update on our Firefox snap improvements.
The Firefox snap offers a number of benefits to daily users of Ubuntu as well as a range of other Linux distributions. It improves security, delivers cross-release compatibility and shortens the time for improvements from Mozilla to get into the hands of users.
Currently, this approach has trade-offs when it comes to performance, most notably in Firefox’s first launch after a system reboot. This series tracks our progress in improving startup times to ensure we are delivering the best user experience possible.
Along the way we’ll also be addressing specific use-case issues that have been identified with the help of the community.
Let’s take a look at what we’ve been up to!
Current areas of focus
Here we cover recent fixes, newly identified areas of interest and an update on our profiling investigations.
Jupyter Notebook support – FIXED
For a number of data scientists, Jupyter Notebook support in the browser is critical to their workflow. When launching a notebook there are two ways to view it:
- Opening a file at ~/.local/share/jupyter/…
- Navigating to an http://localhost/ URL
The second route is more compliant with sandboxed environments and has no issues in the Firefox snap. However, the default recommendation is to open the file directly. This caused problems since .local isn’t accessible to confined snaps, which limit access to dot directories by default.
We have merged a snapd workaround giving the browser interface access specifically to ~/.local/share/jupyter to enable the default workflow. We also reported the issue upstream and suggested changing the help text to point to the http://localhost/ URL first as the recommended user journey.
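To make the dot-directory restriction concrete, here is a minimal Python sketch of the behaviour described above. It is a simplified model for illustration only, not snapd's actual AppArmor policy: the function name `is_reachable`, the example home directory and the file names are all hypothetical. It assumes that anything under a hidden top-level directory in the home folder is blocked, with a single exception for ~/.local/share/jupyter.

```python
from pathlib import PurePosixPath

# Simplified, hypothetical model of the rule described in the text.
# This is NOT snapd's real policy, only an illustration of its effect.
JUPYTER_EXCEPTION = PurePosixPath(".local/share/jupyter")


def is_reachable(path: str, home: str) -> bool:
    """Return True if `path` would be reachable under this toy model."""
    try:
        rel = PurePosixPath(path).relative_to(home)
    except ValueError:
        # Paths outside the home directory are out of scope for this sketch.
        return False
    if not rel.parts or not rel.parts[0].startswith("."):
        # Non-hidden top-level entries in the home directory are allowed.
        return True
    # Hidden top-level directory: only the Jupyter exception is allowed.
    return rel == JUPYTER_EXCEPTION or JUPYTER_EXCEPTION in rel.parents


if __name__ == "__main__":
    home = "/home/alice"  # hypothetical user
    for candidate in (
        "/home/alice/Documents/notebook.ipynb",
        "/home/alice/.local/share/jupyter/runtime/example.html",
        "/home/alice/.local/share/other/file.txt",
    ):
        print(candidate, "->", is_reachable(candidate, home))
```

In this model the Jupyter file is allowed only because of the explicit exception, which mirrors the narrow scope of the workaround: the rest of .local stays inaccessible.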
Last time, we talked about the Firefox snap failing to determine which GPU driver it should use. In this circumstance it falls back to software rendering on devices like the Raspberry Pi, which significantly impacts performance. To address this we’ve updated the snapd OpenGL interface to allow access to more PCI IDs, including the ones used on the Raspberry Pi.
However, this fix doesn’t seem to fully address the issue. There are still reports of acceleration being blocked by the platform. Resolving this has the potential to make a large difference on Raspberry Pi so we are continuing to investigate.
Copying the large number of language packs during Firefox’s first start remains a consistent issue in all of our benchmarks.
Mozilla intend to mirror a change in the snap that they made on the Windows version of Firefox. This would add the ability to only load one locale at a time based on the system locale.
Native messaging support
Native messaging support enables key features such as 2FA devices and GNOME extensions. The new XDG Desktop Portal has landed in Jammy but the Firefox integration continues to be iterated on. Things are progressing well and the fix should land soon.
The Firefox snap and flatpak are currently unable to interact with network shares. This problem has to do with the XDG Desktop Portal working in local mode. The fact that the fileselector portal is listing those mounts in the sidebar is also adding to the confusion.
Until the portal issue gets resolved, one workaround is to access the mount through /var/run/user/USERUID/gvfs (NOTE: you need gvfs-fuse installed, which creates local mount points).
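As a rough illustration of the workaround, the Python sketch below builds the gvfs path for a given user ID and lists whatever mounts are visible there. The function names are hypothetical, and the sketch assumes gvfs-fuse is installed and that the path layout matches the one quoted above.

```python
import os
from pathlib import Path


def gvfs_mount_root(uid=None):
    """Build the gvfs path from the text for a user ID (default: current user)."""
    if uid is None:
        uid = os.getuid()
    return Path("/var/run/user") / str(uid) / "gvfs"


def list_gvfs_mounts(uid=None):
    """Return the names of mount points under the gvfs directory, if any."""
    root = gvfs_mount_root(uid)
    try:
        if not root.is_dir():
            return []  # gvfs-fuse missing or nothing mounted
        return sorted(entry.name for entry in root.iterdir())
    except OSError:
        # The directory may be unreadable, for example when it belongs
        # to a different user.
        return []


if __name__ == "__main__":
    print("gvfs root:", gvfs_mount_root())
    for name in list_gvfs_mounts():
        print("mounted share:", name)
```

Any directory listed this way can then be opened as a regular local path, since it does not rely on the portal's file chooser.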
Font and icon handling
New benchmarks for font and icon handling on amd64 suggest that the cache building of icons, themes and fonts is relatively minor when it comes to resource usage. Firefox spends some time doing this, whereas Chromium does not. For most systems this is around 300ms, but on Raspberry Pi the impact is much larger (up to 6-7 seconds).
Investigations show that the caching process is very I/O intensive, and I/O is a lot slower on an SD card with a Raspberry Pi 4 CPU.
This is likely a symptom of an underlying issue that we’re working to identify.
Futex() syscall time
We analyzed the behavior of the confined snap of Firefox against the unconfined version, as well as the Firefox setup confined from the tarball (available as a direct download from the Mozilla site).
With the confined version, in the strace run summaries, we noticed that the futex() system call takes about 20000us to complete on average on Kubuntu 22.04 and about 7000us on Fedora 36, both installed on identical hardware. These numbers indicate memory locking contention, especially when compared to the same results gathered from the unconfined or non-snap versions of Firefox. There, the futex() system call averages only about 20us.
Furthermore, we noticed that the snap version executes far more futex() system calls (as well as others). Some of this is expected, as the execution of the snap differs from non-snap applications, including the use of base and content snaps, as well as security confinement.
The problem has been reported consistently on different hardware platforms and Linux distributions, with the overall futex() system call average time correlating linearly with the observed startup time.
For instance, a sample strace summary (strace -c) of a Firefox snap run:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- -------------------
 82.18  388.576521       18131     21431      2272 futex
 10.31   48.737839        7583      6427         4 poll
  4.09   19.350524        7660      2526         6 epoll_wait
  1.50    7.114924       72601        98        38 wait4
  0.69    3.258415         574      5676      2715 recvmsg
  0.51    2.406544       41492        58           clock_nanosleep
  0.13    0.633050          71      8843         5 read
  0.12    0.564651          34     16452     11403 openat
And in comparison, the tar version on the same host:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- -------------------
 46.13    0.397783           8     47957           clock_gettime
 19.76    0.170379          21      7828      1245 futex
  6.90    0.059470           8      6888           gettimeofday
  5.22    0.044991           8      5353      4324 recvmsg
  3.49    0.030111           8      3745           poll
  2.57    0.022125       22125         1           wait4
  1.75    0.015092           8      1829           read
  1.68    0.014518          15       942       319 openat
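If you want to compare summaries like these on your own machine, the per-call averages can be extracted with a few lines of Python. The sketch below is a minimal example rather than the tooling we used: the function name `parse_strace_summary` is hypothetical, and it assumes the column layout shown above, where the errors column may be empty. The sample rows are copied from the two summaries above.

```python
def parse_strace_summary(text):
    """Map each syscall name to its columns from `strace -c` output."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (errors column empty) or 6 fields.
        if len(fields) not in (5, 6):
            continue
        try:
            seconds = float(fields[1])
            usecs_per_call = int(fields[2])
            calls = int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            continue  # header or separator line
        rows[fields[-1]] = {
            "seconds": seconds,
            "usecs_per_call": usecs_per_call,
            "calls": calls,
            "errors": errors,
        }
    return rows


# Rows taken from the snap and tar summaries in this post.
SNAP_SAMPLE = """\
 82.18  388.576521       18131     21431      2272 futex
 10.31   48.737839        7583      6427         4 poll
  0.51    2.406544       41492        58           clock_nanosleep
"""
TAR_SAMPLE = """\
 46.13    0.397783           8     47957           clock_gettime
 19.76    0.170379          21      7828      1245 futex
"""

if __name__ == "__main__":
    snap = parse_strace_summary(SNAP_SAMPLE)
    tar = parse_strace_summary(TAR_SAMPLE)
    ratio = snap["futex"]["usecs_per_call"] / tar["futex"]["usecs_per_call"]
    print(f"futex usecs/call: snap={snap['futex']['usecs_per_call']}, "
          f"tar={tar['futex']['usecs_per_call']}, ratio={ratio:.0f}x")
```

With the figures above, the average futex() call in the snap run takes roughly 860 times longer than in the tar run, while the snap also issues nearly three times as many futex() calls.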
We have observed similar results with other snaps, including Thunderbird as well as Chromium. While the actual startup time differs from snap to snap, the overall behavior is consistent, and underlines an excess of computation with snap binaries.
We tried to isolate the source of this phenomenon in different ways. First, we tried to understand whether the security confinement might be causing contention in memory management, which would in turn cause Firefox (and other binaries) to experience userspace locking issues. This would then translate into the excess of futex() system calls and their very high time per call. However, disabling the AppArmor and Seccomp confinement did not yield any improvements.
Likewise, we compiled Firefox with its internal sandboxing disabled, and also compiled the browser with tmalloc (to understand if there may be other reasons for memory contention), but these attempts did not yield any startup time improvements either.
We’re continuing to explore this issue with further strace and memory checks against a non-confined snapd version.
Keep in touch
That’s all for this week’s update! Look out for Part 3 in the next few weeks. In the meantime don’t forget to keep an eye on our key aggregators of issues and feedback:
If you want to take benchmarks and track improvements on your own machine, don’t forget to read our section ‘Create your own benchmarks’ in Part 1 of this series and share them on our Discourse topic.