Release 3.0.0

@gavinhoward gavinhoward released this 18 Jun 22:57
· 1401 commits to master since this release

Notes for package maintainers:

First, the 2.7.0 release series saw a change in the option parsing. This made me change one error message and add a few others. The error message that was changed lost one format specifier, which means that printf() will segfault on old locale files. Unfortunately, bc cannot use any locale files except the global ones that are already installed, so it will use the old ones while running tests during install. If bc segfaults while running arg tests when updating, it is because the global locale files have not been replaced. Make sure to either prevent the test suite from running on update or remove the old locale files before updating. Once this is done, bc should install without problems.

Second, the option to build without signal support has been removed. See below for the reasons why.

This is a production release with some small bug fixes, a few improvements, three major bug fixes, and a complete redesign of bc's error and signal handling. Users and package maintainers should update to this version as soon as possible.

The first major bug fix was in how bc executed files. Previously, a whole file was parsed before any of it was executed. If a function was defined after code, especially if the definition was actually a redefinition, and the code before the definition referred to the previous function, bc would replace the function before executing any code. The fix was to make sure that all code that exists before a function definition is executed first.

The second major bug fix was in bc's lib2.bc. The ceil() function had a bug where a 0 in the decimal place after the truncation position caused it to output the wrong number if there was any non-zero digit after it.

The third major bug fix concerns passing parameters to functions: if an expression included an array (not an array element) as a parameter, it was accepted when it should have been rejected. It is now correctly rejected.

Beyond that, this bc got several improvements that sped it up, improved the handling of signals, and improved the error handling.

First, the requirements for bc were raised to POSIX 2008. bc uses one function, strdup(), which is not in base POSIX 2001; there, it is only in the X/Open System Interfaces group. It is, however, in POSIX 2008, and since POSIX 2008 is old enough to be supported anywhere that I care about, that should be the requirement.

Second, the BcVm global variable was put into bss. This actually slightly reduces the size of the executable from a massive code shrink, and it will stop bc from allocating a large set of memory when bc starts.

Third, the default Karatsuba length was updated from 64 to 32 after making the optimization changes below, since 32 is going to be better than 64 after the changes.

Fourth, Spanish translations were added.

Fifth, the interpreter received a speedup to make performance on non-math-heavy scripts more competitive with GNU bc. While the improvements did, in fact, get it much closer (see the benchmarks), it isn't quite there.

There were several things done to speed up the interpreter:

First, several small inefficiencies were removed. These inefficiencies included calling the function bc_vec_pop(v) twice instead of calling bc_vec_npop(v, 2). They also included an extra function call for checking the size of the stack and checking the size of the stack more than once on several operations.
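As a rough illustration of the first point, the sketch below contrasts popping one item at a time with popping several at once. The `Vec` type and the `vec_*` names are hypothetical, simplified stand-ins, not bc's actual vector code; the point is only that one call with one bounds check replaces two of each.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a stack-like vector. */
typedef struct Vec {
    int items[16];
    size_t len;
} Vec;

/* Append one item; the caller must stay within the fixed capacity. */
static void vec_push(Vec *v, int x) {
    assert(v->len < sizeof(v->items) / sizeof(v->items[0]));
    v->items[v->len++] = x;
}

/* Pop one item: one function call, one size check. */
static void vec_pop(Vec *v) {
    assert(v->len >= 1);
    v->len -= 1;
}

/* Pop n items at once: still one function call and one size check. */
static void vec_npop(Vec *v, size_t n) {
    assert(v->len >= n);
    v->len -= n;
}
```

With this shape, a binary operator that consumes two operands can call `vec_npop(&v, 2)` instead of calling `vec_pop(&v)` twice.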

Second, since the current bc function is the one that stores constants and strings, the program caches pointers to the current function's vectors of constants and strings to prevent needing to grab the current function in order to grab a constant or a string.

Third, bc tries to reuse BcNum's (the internal representation of arbitrary-precision numbers). If a BcNum has the default capacity of BC_NUM_DEF_SIZE (32 on 64-bit and 16 on 32-bit) when it is freed, it is added to a list of available BcNum's. Then, when a BcNum is allocated with a capacity of BC_NUM_DEF_SIZE and any BcNum's exist on the list of reusable ones, one of those is grabbed instead.
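The reuse scheme described above might look roughly like the following. This is a minimal sketch, not bc's code: the `Num` type, the `num_init()`/`num_free()` names, the fixed-size cache, and the rounding of small requests up to the default capacity are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_DEF_SIZE 32  /* assumed default capacity (the 64-bit figure) */
#define CACHE_MAX 64     /* hypothetical cap on the number of cached buffers */

typedef struct Num {
    unsigned char *digits;
    size_t cap;
} Num;

/* Buffers of exactly NUM_DEF_SIZE bytes waiting to be reused. */
static unsigned char *cache[CACHE_MAX];
static size_t cache_len = 0;

static void num_init(Num *n, size_t cap) {
    assert(n != NULL);
    /* Assumption: small requests are rounded up to the default capacity. */
    if (cap < NUM_DEF_SIZE) cap = NUM_DEF_SIZE;
    if (cap == NUM_DEF_SIZE && cache_len > 0) {
        n->digits = cache[--cache_len];  /* reuse, no allocator call */
    } else {
        n->digits = malloc(cap);
        if (n->digits == NULL) abort();
    }
    n->cap = cap;
}

static void num_free(Num *n) {
    assert(n != NULL);
    if (n->cap == NUM_DEF_SIZE && cache_len < CACHE_MAX) {
        cache[cache_len++] = n->digits;  /* keep for later reuse */
    } else {
        free(n->digits);                 /* other sizes go back to the allocator */
    }
    n->digits = NULL;
    n->cap = 0;
}
```

Only buffers of exactly the default capacity are cached, which matches the variant that the experiments described further down found to be fastest.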

In order to support these changes, the BC_NUM_DEF_SIZE was changed. It used to be 16 bytes on all systems, but it was changed to more closely align with the minimum allocation size on Linux, which is either 32 bytes (64-bit musl), 24 bytes (64-bit glibc), 16 bytes (32-bit musl), or 12 bytes (32-bit glibc). Since these are the minimum allocation sizes, these are the sizes that would be allocated anyway, making it worth it to just use the whole space, so the value of BC_NUM_DEF_SIZE on 64-bit systems was changed to 32 bytes.

On top of that, at least on 64-bit, BC_NUM_DEF_SIZE supports numbers with either 72 integer digits or 45 integer digits and 27 fractional digits. This should be more than enough for most cases since bc's default scale values are 0 or 20, meaning that, by default, it has at most 20 fractional digits. And 45 integer digits are a lot; it's enough to calculate the amount of mass in the Milky Way galaxy in kilograms. Also, 72 digits is enough to calculate the diameter of the universe in Planck lengths.

(For 32-bit, these numbers are either 32 integer digits or 12 integer digits and 20 fractional digits. These are also quite big, and going much bigger on a 32-bit system seems a little pointless since 12 digits is just under a trillion and 20 fractional digits is still enough for about any use since 10^-20 light years is just under a millimeter.)
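The digit counts above can be reproduced with a little arithmetic. The sketch below assumes that numbers are stored in fixed-size limbs: 4-byte limbs holding 9 decimal digits each on 64-bit, and 2-byte limbs holding 4 decimal digits each on 32-bit. Those limb parameters are an assumption (they are not stated in these notes), chosen because they yield exactly the figures quoted; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Total decimal digits that fit in `bytes` of storage. */
static size_t total_digits(size_t bytes, size_t limb_bytes, size_t digits_per_limb) {
    return bytes / limb_bytes * digits_per_limb;
}

/* Integer digits left after reserving whole limbs for `scale` fractional digits. */
static size_t int_digits(size_t bytes, size_t limb_bytes,
                         size_t digits_per_limb, size_t scale) {
    size_t limbs = bytes / limb_bytes;
    /* Fractional digits occupy whole limbs, so round up. */
    size_t frac_limbs = (scale + digits_per_limb - 1) / digits_per_limb;
    assert(frac_limbs <= limbs);
    return (limbs - frac_limbs) * digits_per_limb;
}
```

Under those assumptions, `total_digits(32, 4, 9)` gives 72 and `int_digits(32, 4, 9, 20)` gives 45 (with 27 digits of fractional room), while `total_digits(16, 2, 4)` gives 32 and `int_digits(16, 2, 4, 20)` gives 12.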

All of this together means that for ordinary uses, and even uses in scientific work, the default number size will be all that is needed, which means that nearly all, if not all, numbers will be reused, relieving pressure on the system allocator.

I did several experiments to find the changes that had the most impact, especially with regard to reusing BcNum's. One was putting BcNum's into buckets according to their capacity in powers of 2 up to 512. That performed worse than bc did in 2.7.2. Another was putting any BcNum on the reuse list that had a capacity of BC_NUM_DEF_SIZE * 2 and reusing them for BcNum's that requested BC_NUM_DEF_SIZE. This did reduce the amount of time spent in user code, but it also spent a lot of time in the system allocator for an unknown reason. (When using strace, a bunch more brk calls showed up.) Just reusing BcNum's that had exactly BC_NUM_DEF_SIZE capacity spent the smallest amount of both user and system time. This makes sense, especially with the changes to make BC_NUM_DEF_SIZE bigger on 64-bit systems, since the vast majority of calculations will only ever use numbers with a size less than or equal to BC_NUM_DEF_SIZE.

Last of all, bc's signal handling underwent a complete redesign. (This is the reason that this version is 3.0.0 and not 2.8.0.) The change was to move from a polling approach to signal handling to an interrupt-based approach.

Previously, every single loop condition had a check for signals. I suspect that this could be expensive when in tight loops.

Now, the signal handler just uses longjmp() (actually siglongjmp()) to start an unwinding of the stack until it is stopped or the stack is unwound to main(), which just returns. If bc is currently executing code that cannot be safely interrupted (according to POSIX), then signals are "locked." The signal handler checks if the lock is taken, and if it is, it just sets the status to indicate that a signal arrived. Later, when the signal lock is released, the status is checked to see if a signal came in. If so, the stack unwinding starts.

This design eliminates polling in favor of maintaining a stack of jmp_buf's. This has its own performance implications, but it gives better interaction. And the cost of pushing and popping a jmp_buf in a function is paid at most twice. Most functions do not pay that price, and most of the rest only pay it once. (There are only some 3 functions in bc that push and pop a jmp_buf twice.)

As a side effect of this change, I had to eliminate the use of stdio.h in bc because stdio does not play nice with signals and longjmp(). I implemented custom I/O buffer code that takes a fraction of the size. This means that static builds will be smaller, but non-static builds will be bigger, though they will have less linking time.
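To give a sense of how little code a stdio replacement needs for output, here is a minimal sketch of a write buffer built directly on POSIX `write()`. The `OutBuf` type, the function names, and the buffer size are assumptions made for illustration; bc's actual buffer code is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define OUT_BUF_SIZE 1024  /* hypothetical buffer size */

typedef struct OutBuf {
    int fd;      /* destination file descriptor */
    size_t len;  /* bytes currently buffered */
    char data[OUT_BUF_SIZE];
} OutBuf;

/* Write out everything that is buffered, retrying on partial writes and
 * EINTR. Returns 0 on success and -1 on error. */
static int out_flush(OutBuf *b) {
    size_t off = 0;
    assert(b != NULL);
    while (off < b->len) {
        ssize_t n = write(b->fd, b->data + off, b->len - off);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        off += (size_t) n;
    }
    b->len = 0;
    return 0;
}

/* Append a string, flushing whenever the buffer fills up. */
static int out_puts(OutBuf *b, const char *s) {
    size_t n = strlen(s);
    while (n > 0) {
        size_t room = OUT_BUF_SIZE - b->len;
        size_t chunk = n < room ? n : room;
        memcpy(b->data + b->len, s, chunk);
        b->len += chunk;
        s += chunk;
        n -= chunk;
        if (b->len == OUT_BUF_SIZE && out_flush(b) != 0) return -1;
    }
    return 0;
}
```

Because the buffer is plain data owned by the program, code that unwinds the stack on a signal can decide exactly when `out_flush()` runs, which is harder to arrange with stdio's internal buffers.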

This change is also good because my history implementation was already bypassing stdio for good reasons, and unifying the architecture was a win.

Another reason for this change is that my bc should always behave correctly in the presence of signals like SIGINT, SIGTERM, and SIGQUIT. With the addition of my own I/O buffering, I needed to also make sure that the buffers were correctly flushed even when such signals happened.

For this reason, I removed the option to build without signal support.

As a nice side effect of this change, the error handling code could be changed to take advantage of the stack unwinding that signals used. This means that signals and error handling use the same code paths, which means that the stack unwinding is well-tested. (Errors are tested heavily in the test suite.)

It also means that functions do not need to return a status code that every caller needs to check. This eliminated over 100 branches that simply checked return codes and then passed that return code up the stack if necessary. The code bloat savings from this is at least 1700 bytes on x86_64, before taking into account the extra code from removing stdio.h.
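The sketch below illustrates the general idea of replacing returned status codes with unwinding through a stack of jmp_buf's. It is a simplified illustration with hypothetical names (`raise_error()`, `divide()`, `compute()`, `run()`), not bc's code: leaf functions simply raise an error, functions that own a resource push one landing point so they can clean up, and the top level catches whatever gets that far.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

#define JMP_STACK_MAX 8  /* hypothetical depth limit for this sketch */

static jmp_buf jmp_stack[JMP_STACK_MAX];
static int jmp_top = 0;       /* number of active landing points */
static int last_error = 0;    /* 0 means no error */
static int cleanups_run = 0;  /* counts cleanups, for demonstration only */

/* Raise an error: jump to the most recently pushed landing point. */
static void raise_error(int code) {
    assert(jmp_top > 0);
    last_error = code;
    longjmp(jmp_stack[jmp_top - 1], 1);
}

/* Leaf function: returns no status code and pushes no jmp_buf. */
static void divide(int a, int b, int *out) {
    if (b == 0) raise_error(1);
    *out = a / b;
}

/* Owns a resource, so it pushes exactly one landing point to clean up. */
static void compute(int a, int b, int *out) {
    char *scratch = malloc(16);
    if (scratch == NULL) abort();
    assert(jmp_top < JMP_STACK_MAX);
    jmp_top += 1;
    if (setjmp(jmp_stack[jmp_top - 1]) == 0) divide(a, b, out);
    /* Reached on both the normal path and the error path. */
    jmp_top -= 1;
    free(scratch);
    cleanups_run += 1;
    /* On error, keep unwinding toward the caller's landing point. */
    if (last_error != 0 && jmp_top > 0) longjmp(jmp_stack[jmp_top - 1], 1);
}

/* Top level: the only place that turns an unwind back into a status. */
static int run(int a, int b, int *out) {
    last_error = 0;
    jmp_top = 1;
    if (setjmp(jmp_stack[0]) == 0) compute(a, b, out);
    jmp_top = 0;
    return last_error;
}
```

Here `run(6, 3, &out)` returns 0 with `out` set to 2, while `run(1, 0, &out)` returns 1; in both cases `compute()` frees its scratch buffer exactly once, and no intermediate caller has to check a return value.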

$ sha512sum bc-3.0.0.tar.xz
4961e030274e763aa02541457aa5aab6cd0d61758861b98d2cdac6acc42c3fb55b6adba72749edd3b663225ab844d7ef60809972478992165b071645fe6af65f  bc-3.0.0.tar.xz

$ sha256sum bc-3.0.0.tar.xz
4a7c5cbd5c7c2d3fea4a898c6ce87ff705756dd362cb2e3b241ae55e514e8280  bc-3.0.0.tar.xz

$ stat -c '%s  %n' bc-3.0.0.tar.xz
199304  bc-3.0.0.tar.xz

$ sha512sum bc-3.0.0.tar.xz.sig
db495a449b528a6bee555bafdeb965c1a780d0f9d15d069749e50d96ac9e1fff14a2487bf7dc7f2268011bc0c53093f880fadc4172c50a9ac59d29b280d7f6bf  bc-3.0.0.tar.xz.sig

$ sha256sum bc-3.0.0.tar.xz.sig
980fadbac5e7b5f722cb43df6fd8546e2eb3cc0cbdc2940606a63be523c3023e  bc-3.0.0.tar.xz.sig

$ stat -c '%s  %n' bc-3.0.0.tar.xz.sig
662  bc-3.0.0.tar.xz.sig