As a software engineer, writing diffs (also called PRs, i.e. pull requests, or CRs, i.e. code reviews) is your bread and butter. You want to code your changes, have them reviewed, merge them, and start the process again. Repeating these cycles effectively is essential for delivering new features and building products quickly. Few things slow this process down as much as low-quality diffs. Some signs of a low-quality diff are:
Compilation errors
Test failures
Lint warnings
Low-quality diffs leave both the author and the reviewer(s) frustrated. The author is frustrated because there is a lot of back and forth and they feel like they will never be able to merge their changes. The reviewers are frustrated because they feel their time is being wasted on diffs that are clearly not ready for review.
Sending a low-quality diff happens to everyone occasionally (e.g., the new file you forgot to include in your diff, or the test you didn’t run because of a typo). However, if it happens repeatedly, it will have a detrimental effect on your career because your team members will simply try to avoid reviewing your code. This will prevent you from iterating quickly and will make it harder for you to adhere to schedules.
While spending a few minutes to double-check that your diff is in good shape may seem like a waste of time, especially when timelines are tight, it is an investment that will save you even more time in the long run. And once you build this habit, it will become second nature.
The std::optional type is a great addition to the C++17 standard library. It lets you end the practice of using a special (sentinel) value, e.g. -1, to indicate that an operation didn’t produce a meaningful result. There is one caveat though – std::optional<bool> may in some situations behave counterintuitively and can lead to subtle bugs.
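To see the counterintuitive behavior, consider a minimal sketch along these lines (the isMorning variable is an illustration, chosen to match the examples used later in this post):

#include <iostream>
#include <optional>

int main() {
  std::optional<bool> isMorning = false;
  // The if statement implicitly converts the optional to bool, and that
  // conversion only checks whether a value is present, not what it is.
  if (isMorning) {
    std::cout << "Good Morning!" << std::endl;
  }
  return 0;
}

Even though isMorning holds the value false, this program prints Good Morning!.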
While this is likely not what we expected, it’s not a bug. The std::optional type defines an explicit conversion to bool that returns true if the object contains a value and false if it doesn’t (exactly like the has_value() method). In some contexts – most notably the conditions of if, while, and for statements, the operands of logical operators, and the condition of the ternary operator (a complete list can be found in the Contextual conversions section on cppreference) – C++ is allowed to use this conversion to perform an implicit cast. In our case it led to behavior that at first sight seems incorrect. If you think about it a bit longer, though, the seemingly intuitive behavior should not be expected. An std::optional<bool> variable can have one of three possible values:
true
false
std::nullopt (i.e., not set)
and there is no interpretation under which the behavior of expressions like if (std::nullopt) is universally meaningful. Having said that, I have seen multiple engineers (myself included) fall into this trap.
The problem is that spotting the bug can be hard, as there are no compiler warnings or any other indications of the issue. This is especially problematic when changing an existing variable from bool to std::optional<bool> in large codebases because it is easy to miss some usages and introduce regressions.
The problem can also easily sneak into your tests. Here is an example of a test that happily passes, but only due to a bug:
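A sketch of what such a test might look like (GoogleTest-style syntax, matching the assertions discussed later; getFlag and the "new-checkout" flag name are hypothetical):

TEST(FeatureFlagTest, FlagIsEnabled) {
  // Assume getFlag() returns std::optional<bool> and, due to a bug,
  // currently returns an optional holding false.
  std::optional<bool> flag = getFlag("new-checkout");
  // Intended to assert that the flag is enabled, but ASSERT_TRUE only
  // checks that the optional holds a value, so the test passes even
  // though the flag is actually disabled.
  ASSERT_TRUE(flag);
}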
Before we discuss ways to handle the std::optional<bool> type in code, it is worth mentioning a few strategies that can prevent bugs caused by std::optional<bool>:
raise awareness of the unintuitive behavior of std::optional<bool> in some contexts
when a new std::optional<bool> variable or function is introduced, make sure all places where it is used are reviewed and amended if needed
have good unit test coverage that can detect bugs caused by introducing std::optional<bool> to your codebase
if feasible, create a lint rule that flags suspicious usages of std::optional<bool>
In terms of code, there are a few ways to handle the std::optional<bool> type:
Compare the optional value explicitly using the == operator
If your scenario allows treating std::nullopt as true or false, you can use the == operator like this:
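A sketch using the isMorning variable from earlier:

std::optional<bool> isMorning = std::nullopt;
// std::nullopt is not equal to false, so this branch runs only when the
// optional actually holds the value false.
if (isMorning == false) {
  std::cout << "Good Afternoon!" << std::endl;
}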
This works because the std::nullopt value never compares equal to an optional that holds a value. One big disadvantage of this approach is that someone will inevitably want to simplify the expression by removing the seemingly unnecessary == false and, as a result, introduce a regression.
Unwrap the optional value with the value() method
If you know that the value on the given code path is always set, you can unwrap it by calling the value() method, as in the example below:
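A sketch, assuming the code path guarantees the value is set:

std::optional<bool> isMorning = false;
// value() returns the contained bool (and would throw if the optional
// were empty), so the condition now checks the value itself.
if (isMorning.value()) {
  std::cout << "Good Morning!" << std::endl;
} else {
  std::cout << "Good Afternoon!" << std::endl;
}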
Note that this won’t work if the value might not be set – invoking the value() method on an optional that doesn’t hold a value throws the std::bad_optional_access exception.
Dereference the optional value with the * operator
This is very similar to the previous option. If you know that the value on the given code path is always set, you can use the * operator to dereference it like this:
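A sketch analogous to the previous one:

std::optional<bool> isMorning = false;
// *isMorning yields the contained bool; dereferencing an empty optional
// is undefined behavior, so this is only safe when the value is known
// to be set.
if (*isMorning) {
  std::cout << "Good Morning!" << std::endl;
} else {
  std::cout << "Good Afternoon!" << std::endl;
}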
One big difference from using the value() method is that the behavior is undefined if you dereference an optional whose value was not set. Personally, I never go with this solution.
Use value_or() to provide the default value for cases when the value is not set
std::optional has the value_or() method, which lets you provide a default value to be returned when no value is set. Here is an example:
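A sketch that treats "not set" as false:

std::optional<bool> isMorning = std::nullopt;
// value_or(false) returns the contained value if there is one and false
// otherwise, so an unset optional is treated as "not morning".
if (isMorning.value_or(false)) {
  std::cout << "Good Morning!" << std::endl;
} else {
  std::cout << "Good Afternoon!" << std::endl;
}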
If your scenario allows treating std::nullopt as true or false, using value_or() can be a good choice.
Handle std::nullopt explicitly
There must have been a specific reason you decided to use std::optional<bool> – you wanted to enable the scenario where the value is not set. Now you need to handle this case. Here is how:
std::optional<bool> isMorning = std::nullopt;
if (isMorning.has_value()) {
  if (isMorning.value()) {
    std::cout << "Good Morning!" << std::endl;
  } else {
    std::cout << "Good Afternoon!" << std::endl;
  }
} else {
  std::cout << "I am lost in time..." << std::endl;
}
Fixing tests
If your tests use ASSERT_TRUE or ASSERT_FALSE assertions with std::optional<bool> variables, they might be passing even though they shouldn’t, as they suffer from the very same issue as your code. As an example, the following assertion will happily pass:
ASSERT_TRUE(std::optional{false});
You can fix this by using ASSERT_EQ to explicitly compare with the expected value or by using some of the techniques discussed above. Here are a couple of examples:
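For instance (a sketch; maybeFlag stands for any std::optional<bool> under test):

// Compare explicitly against the expected contained value; this fails
// both when the optional holds the wrong value and when it is empty.
ASSERT_EQ(maybeFlag, true);
// Or check presence and value separately.
ASSERT_TRUE(maybeFlag.has_value());
ASSERT_TRUE(maybeFlag.value());
// Or treat "not set" as a failure via value_or().
ASSERT_TRUE(maybeFlag.value_or(false));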
We spent a lot of time discussing the std::optional<bool> case. How about other types? Do they suffer from the same issue? The std::optional type is a template so its behavior is the same for any type parameter. Here is an example with std::optional<int>:
std::optional<int> n = 0;
if (n) {
  std::cout << "n is not 0" << std::endl;
}
which generates the following output:
n is not 0
The problem with std::optional<bool> is just more pronounced due to the typical usage of bool. For non-bool types it is fortunately no longer common practice to rely on the implicit conversion to bool. These days it is much more common to write the condition above explicitly as if (n != 0), which compares against the value stored in n as long as it is populated.
Exhaustive switch statements are switch statements that do not have the default case because all possible values of the type in question have been covered by one of the switch cases. Exhaustive switch statements are a perfect match when working with scoped enum types. This can be illustrated by the following code:
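A minimal sketch (the member names Red, Green, and Blue and the toString helper are illustrative; the convertToColor example later in the post suggests the enum is called Color):

#include <stdexcept>
#include <string>

enum class Color {
  Red,
  Green,
  Blue,
};

std::string toString(Color color) {
  // No default case – the switch exhaustively covers every Color member.
  switch (color) {
    case Color::Red:
      return "Red";
    case Color::Green:
      return "Green";
    case Color::Blue:
      return "Blue";
  }
  // Only reachable for values outside the defined members.
  throw std::invalid_argument("Unexpected Color value");
}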
If a new member is added to the enum, the compiler will show warnings pointing to all the places that need to be fixed:
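With GCC or Clang and -Wswitch (part of -Wall) enabled, and assuming a hypothetical new member Color::Yellow, the warning typically looks along these lines (exact wording varies by compiler and version):

warning: enumeration value 'Yellow' not handled in switch [-Wswitch]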
If you configure warnings as errors, which you should do if you can, you will have to address all these errors to successfully compile your program.
Without the exhaustive switch, the compiler would happily compile the program after adding the new enum member because it would be handled by the default case. Even if the default case was coded to throw an exception to help detect unhandled enum members, this exception would only be thrown at runtime which could be too late to prevent failures. In a bigger system or application there could be many switch statements like this and without the help from the compiler it can be hard to find and test all of them. To sum up – exhaustive switch statements help quickly find usages that need to be looked at and fixed before the code can ship.
So far so good, but there is a problem – C++ allows this:
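For example, with the Color enum above:

// 42 does not correspond to any Color member, yet this compiles and runs.
Color mystery = static_cast<Color>(42);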
Believe it or not, this is valid C++ as long as the value being cast is within the range of the enum’s underlying type. If you flow an enum value created this way through an exhaustive switch, it won’t match any of the cases, and since there is no default case, none of the branches will run; if the switch was supposed to produce a value (e.g. every case returns from the function), you end up with undefined behavior.
The right thing to do is to always use enums instead of mixing integers and enums, which is ultimately what makes the cast necessary. Unfortunately, this isn’t always possible. If you receive values from users or external systems, they will typically be integers that you may need to convert to an enum in your system or library, and the way to do this in C++ is a static_cast.
Because you can never trust values received from external systems, you need to convert them in a safe way. This is where the exhaustive switch statement can be extremely useful, as it lets you write a conversion function like this:
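A sketch of such a function, using the illustrative Color members from above:

#include <stdexcept>

Color convertToColor(int value) {
  // The switch is exhaustive over Color, so the compiler flags this
  // function whenever a new member is added but not handled here.
  switch (static_cast<Color>(value)) {
    case Color::Red:
      return Color::Red;
    case Color::Green:
      return Color::Green;
    case Color::Blue:
      return Color::Blue;
  }
  // Anything outside the defined members ends up here.
  throw std::invalid_argument("Invalid Color value");
}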
If a new enum member is added, the compiler will refuse to compile the convertToColor function (or at least show warnings), so you know you need to update it. For integer values outside of the defined set of members, convertToColor throws an exception. If you use a safe conversion like this immediately after receiving a value, you will prevent unexpected values from entering your system. You will also have a single place where you can detect, reject, and log invalid inputs.
Puppeteer is a Node.js module that lets you control a (headless) web browser programmatically. This is extremely useful for automating website testing, generating screenshots and PDFs of web pages, or submitting forms automatically.
Docker offers numerous benefits including a standardized environment, isolation, and rapid deployment to name a few. These benefits might be why you’d want to run Puppeteer inside a Docker container. Unfortunately, doing so on Raspberry Pi is not straightforward. There are a few issues that make it harder than usual. Luckily, they are all solvable. Let’s take a look.
Problem 1: Chromium included in Puppeteer does not work on Raspberry Pi
Puppeteer by default downloads a matching version of Chromium, which is guaranteed to work out of the box on supported platforms. Unfortunately, Chromium does not currently provide an arm build that works on Raspberry Pi, and running stock Puppeteer on Raspberry Pi ends in a crash. This can be solved by installing Chromium with apt-get install chromium -y and telling Puppeteer to use it by passing executablePath: '/usr/bin/chromium' to the launch() function, as follows:
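A minimal sketch of the relevant launch() call (the full application at the end of this post shows it in context):

const browser = await puppeteer.launch({
  // use the Chromium binary installed with apt-get
  executablePath: '/usr/bin/chromium',
});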
When doing this, Puppeteer no longer needs to download Chromium, as it will be using the version installed with apt-get, so it makes sense to skip this step by setting the corresponding environment variable in the Dockerfile:
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
This will significantly reduce the time needed to install node modules.
Problem 2: The base image installs an old version of Chromium
Some node Docker images are based on old distributions that contain only older versions of Chromium. Most notably, the node:16 image is based on buster. If you use this base image, the only version of Chromium you will be able to install with apt-get is 90.0.4430.212-1. Unfortunately, this version doesn’t work in a Docker container – it just hangs indefinitely. Moving to the node:16-bullseye base image lets you install a much newer version of Chromium (108.0.5359.124), where this is no longer a problem.
Problem 3: Puppeteer crashes on launch
Puppeteer will not launch in a Docker container without additional configuration. Chromium is not able to provide sandboxing when running inside a container, so it needs to be launched with at least the --no-sandbox argument. Otherwise it will crash with the following error message:
Failed to move to new namespace: PID namespaces supported, Network namespace
supported, but failed: errno = Operation not permitted
The sandbox is a security feature, and running without it is generally discouraged. Unfortunately, running without a sandbox currently appears to be the only way to run Puppeteer inside a Docker container. In the past the --no-sandbox option required running Puppeteer as root, which only increased the risk. Luckily, this no longer seems to be the case – it is now possible to launch Puppeteer with the --no-sandbox option as a non-privileged user.
There are a few more options that might be worth exploring if launching Puppeteer inside a container fails:
--disable-gpu – disables GPU hardware acceleration (which is usually not available when running in Docker)
--disable-dev-shm-usage – prevents Chromium from using the /dev/shm shared memory
--disable-setuid-sandbox – disables the setuid sandbox
Putting everything together
The information provided above should be all that is needed to build a Docker image for a Node.js app that uses Puppeteer and runs on Raspberry Pi. Below is an example Dockerfile for such an image. It contains comments that make it easy to see how the solutions discussed above were applied.
# Ensure an up-to-date version of Chromium
# can be installed (solves Problem 2)
FROM node:16-bullseye
# Install a working version of Chromium (solves Problem 1)
RUN apt-get update
RUN apt-get install chromium -y
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN mkdir -p $HOME/app
WORKDIR $HOME/app
COPY package*.json ./
COPY index.js ./
RUN chown -R app-user:app-user $HOME
# Run the container as a non-privileged user (discussed in Problem 3)
USER app-user
# Make `npm install` faster by skipping
# downloading default Chromium (discussed in Problem 1)
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
RUN npm install
CMD [ "node", "index.js" ]
Because the application also requires a couple of modifications to how the headless browser is launched, here is a small example application illustrating these changes, with comments:
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch({
// use Chromium installed with `apt` (solves Problem 1)
executablePath: '/usr/bin/chromium',
args: [
// run without sandbox (solves Problem 3)
'--no-sandbox',
// other launch flags (discussed in Problem 3)
// '--disable-gpu',
// '--disable-dev-shm-usage',
// '--disable-setuid-sandbox',
]
});
const page = await browser.newPage();
await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
let e = await page.$('div#hplogo');
let p = await e?.getProperty('title');
if (p) {
console.log(`Today's doodle: ${await p.jsonValue()}`);
} else {
console.log('No Doodle today :(');
}
await browser.close();
})();
Finally, here is the output of this application when run in a container:
Running Puppeteer inside a Docker container is tricky – especially when doing so on Raspberry Pi. This post discussed the key obstacles and provided solutions to overcome them. In addition, a demo containerized app was included to illustrate the main points.
By default, Docker containers run as root. In general, this is not a recommended practice as it poses a serious security risk. The risk can be mitigated by configuring a non-root user to run the container. One of the ways to achieve this is the USER instruction in the Dockerfile. While running a container as a non-root user is the right thing to do, it can often be problematic, as insufficient permissions can lead to hard-to-diagnose errors. This post uses an example node application to discuss a few permission-related issues that can pop up when building a non-root container, along with some strategies that can help troubleshoot these kinds of issues.
Example
Let’s start with a very simple Dockerfile for a node application:
FROM node:16
WORKDIR /usr/app
COPY . .
RUN npm install
CMD [ "node", "/usr/app/index.js" ]
The problem with this Dockerfile is that any container created from it will run as root:
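One quick way to confirm this (assuming the image was built and tagged my-app, a hypothetical name):

docker run --rm my-app whoami
# prints: root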
To fix that, we can modify the Dockerfile to create a new user (let’s call it app-user) and move the application to a subdirectory of that user’s home directory, like this:
FROM node:16
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN chown -R app-user:app-user $HOME
USER app-user
WORKDIR $HOME/app
COPY . .
RUN npm install
CMD [ "node", "index.js" ]
Unfortunately, introducing these changes makes it impossible to build the Docker image – npm install now errors out due to insufficient permissions:
Inspecting the app directory shows that the owner of this directory is root and that other users don’t have write permission:
app-user@d0b48aa18141:~$ ls -l ~
total 4
drwxr-xr-x 1 root root 4096 Jan 15 05:48 app
The error is related to using the WORKDIR instruction to set the working directory to $HOME/app. This is not a problem by itself – it is actually recommended to use WORKDIR to set the working directory. The problem is that because the directory didn’t exist, WORKDIR created it, but made root its owner. The issue can easily be fixed by explicitly creating the working directory with the right ownership before the WORKDIR instruction runs, so that WORKDIR doesn’t have to create it. The new Dockerfile containing this fix looks as follows:
FROM node:16
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN mkdir -p $HOME/app
RUN chown -R app-user:app-user $HOME
USER app-user
WORKDIR $HOME/app
COPY . .
RUN npm install
CMD [ "node", "index.js" ]
Unfortunately, this doesn’t seem to be enough. Building the image still fails due to a different permission issue:
The error message indicates that this time the problem is that npm install cannot access the package-lock.json file. Listing the files again shows that all copied files are owned by root and that other users don’t have write permission:
ls -l
total 12
-rw-r--r-- 1 root root 71 Jan 15 02:03 index.js
-rw-r--r-- 1 root root 849 Jan 15 01:36 package-lock.json
-rw-r--r-- 1 root root 266 Jan 15 05:21 package.json
Apparently, the COPY instruction copies files as root by default, so the files will be owned by root even if the COPY instruction appears after the USER instruction. An easy fix is to change the Dockerfile to copy the files before configuring file ownership (alternatively, it is possible to specify a different owner for the copied files with the --chown flag):
FROM node:16
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN mkdir -p $HOME/app
COPY . .
RUN chown -R app-user:app-user $HOME
USER app-user
WORKDIR $HOME/app
RUN npm install
CMD [ "node", "index.js" ]
Annoyingly, this still doesn’t work – we get yet another permission error:
This time the error indicates that npm install tried creating the node_modules directory directly in the root directory. This is unexpected, as the WORKDIR instruction was supposed to set the working directory to the app directory inside the newly created user’s home directory. The problem is that the last fix was not entirely correct. Before, COPY was executed after WORKDIR, so it copied the files to the expected location. The fix moved the COPY instruction so that it now runs before the WORKDIR instruction. This resulted in copying the application files to the container’s root directory, which is incorrect. Preserving the relative order of these two instructions should fix the error:
FROM node:16
ENV HOME=/home/app-user
RUN useradd -m -d $HOME -s /bin/bash app-user
RUN mkdir -p $HOME/app
WORKDIR $HOME/app
COPY . .
RUN chown -R app-user:app-user $HOME
USER app-user
RUN npm install
CMD [ "node", "index.js" ]
Indeed, building an image with this Dockerfile finally yields:
Successfully built b36ac6c948d3
Yay!
The application also runs as expected:
Debugging strategies
Reading about someone else’s errors is one thing; figuring them out yourself is another. Below are a few debugging strategies I used to understand the errors described in the first part of the post. Even though I mention them in the context of permission errors, they can be applied in a much broader set of scenarios.
Carefully read error messages
All error messages we looked at were very similar, yet each signaled a different problem. While the errors didn’t point directly to the root cause, the small hints were very helpful in understanding where to look to investigate the problem.
Check Docker documentation
Sometimes our assumptions about how a given instruction works may not be correct. Docker documentation is the best place to verify these assumptions and check whether a wrong assumption could be the culprit (e.g. the incorrect assumption that COPY will make the current user the owner of the copied files).
Add additional debug info to Dockerfile
Sometimes it is helpful to print additional debug information when building a Docker image. Some commands I used were:
RUN ls -al
RUN pwd
RUN whoami
They allowed me to understand the state the container was in at a given point. One caveat is that by default Docker caches intermediate layers when building images, which may result in the debug information not being printed when re-building an image, because unchanged steps are served from the cache.
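If that happens, the cache can be bypassed so that every step (and its debug output) runs again, e.g. (using the same hypothetical my-app tag as above):

docker build --no-cache -t my-app .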
Run the failing command manually and/or inspect the container
This is the ultimate debugging strategy – manually reproduce the error and inspect the container state. One way to do this is to comment out all the steps starting from the failing one and then build the image. Once the image is built, start a container like this (replace IMAGE with the image id):
docker run -d IMAGE tail -f /dev/null
This will start a container whose state is exactly as it was before the failing step was executed. The command will also keep the container running, which makes it possible to launch bash inside it (replace CONTAINER with the container id returned by the previous command):
docker exec -it CONTAINER /bin/bash
Once inside the container, you can run the command that was failing (e.g. npm install). Since the container is in the same state it was in when the build failed, you should be able to reproduce the failure. You can also easily inspect the factors that caused it.
Conclusion
This post showed how to create a Docker container that does not run as root and discussed a few permission issues encountered in the process. It also described a few debugging strategies that can help troubleshoot a wide range of issues, including issues related to permissions. The code for this post is available on GitHub in my docker-permissions repo.