logoalt Hacker News

adrian_blast Tuesday at 9:57 PM3 repliesview on HN

RISC-V does not have the pitfalls of experimental ISAs from 45 years ago, but it has other pitfalls that have not existed in almost any ISA since the first vacuum-tube computers, like the lack of means for integer overflow detection and the lack of indexed addressing.

Especially the lack of integer overflow detection is a choice of great stupidity, for which there exists no excuse.

Detecting integer overflow in hardware is extremely cheap, its cost is absolutely negligible. On the other hand, detecting integer overflow in software is extremely expensive, increasing both the program size and the execution time considerably, because each arithmetic operation must be replaced by multiple operations.

Because of the unacceptable cost, normal RISC-V programs choose to ignore the risk of overflows, which makes them unreliable.

The highest performance implementations of RISC-V from previous years were forced to introduce custom extensions for indexed addressing, but those used inefficient encodings, because something like indexed addressing must be in the base ISA, not in an extension.


Replies

camel-cdryesterday at 9:37 PM

OK, look.

Since my previous attempt to measure the impact of trap on signed overflow didn't seem to have moved your position one bit, I thought I'd give it a go in the most representable way I could think of:

I build the same version of clang on a x86, aarch64 and RISC-V system using clang. Then I build another version with the `-ftrapv` flag enabled and compared the compiletimes of compiling programs using these clang builds running on real hardware:

    runtime:         x86         | aarch64                    | RISC-V (RVA23)
                     Zen1        |  A78          A55*         |  X100         A100  !!! all cores clocked to about 2.2GHz, Zen1 can reach almost 4GHz
    clang A:         3.609±0.078 |  4.209±0.050   9.390±0.029 |  5.465±0.070  11.559±0.020
    clang-ftrapv A:  3.613±0.118 |  4.290±0.050   9.418±0.056 |  5.448±0.060  11.579±0.030
    clang B:         8.948±0.100 | 10.983±0.188  22.827±0.016 | 13.556±0.016  28.682±0.023
    clang-ftrapv B:  8.960±0.125 | 11.099±0.294  22.802±0.039 | 13.511±0.018  28.741±0.050


As you can see, once again the overhead of -ftrapv is quite low.

Suprizinglt the -ftrapv overhead seems the highest on the Cortex-A78. My guess is that this because clang generates a seperate brk with unique immediate for every overflow check, while on RISC-V it always branches to one unimp per function.

Please tell me if you have a better suggestion for measuring the real world impact.

Or heck, give me some artificial worst case code. That would also be an interesting data point.

Notes:

* The format is mean±variance

* Spacemit X100 is a Cortex-A76 like OoO RISC-V core and A100 an in-order RISC-V core.

* I tried to clock all of the cores to the same frequency of about 2.2GHz. *Except for the A55, which ran at 1.8GHz, but I linearly scaled the results.

* Program A was the chibicc (8K loc) compiler and program B microjs (30K loc).

    binary size:
                  x86        aarch64    RISC-V
    clang:        212807768  216633784  195231816
    clang-ftrapv: 212859280  216737608  195419512
    increase:     0.24%      0.047%     0.09%
hackyhackylast Tuesday at 10:04 PM

> On the other hand, detecting integer overflow in software is extremely expensive, increasing both the program size and the execution time considerably,

Most languages don't care about integer overflow. Your typical C program will happily wrap around.

If I really want to detect overflow, I can do this:

    add t0, a0, a1
    blt t0, a0, overflow
Which is one more instruction, which is not great, not terrible.
show 4 replies
adgjlsfhk1last Tuesday at 10:20 PM

> On the other hand, detecting integer overflow in software is extremely expensive

this just isn't true. both addition and multiplication can check for overflow in <2 instructions.

show 3 replies