12 Commits

Author SHA1 Message Date
Jacob Young
2e31077fe0 Coff: implement threadlocal variables 2025-10-10 22:47:47 -07:00
Jacob Young
07c3f9ef8e x86_64: fix bool vector init register clobber
Closes #25439
2025-10-03 12:18:53 -04:00
Jacob Young
1fa11e0954 Coff: delete 2025-10-02 17:44:52 -04:00
Jacob Young
e1f3fc6ce2 Coff2: create a new linker from scratch 2025-10-02 17:44:52 -04:00
Jacob Young
d5f09f56e0 x86_64: fix windows calling convention abi 2025-10-02 15:59:51 -04:00
Jacob Young
a896a22932 x86_64: fix @mulAdd miscomp 2025-09-27 20:10:32 -04:00
Jacob Young
a744fbd22f x86_64: fix ~/! miscomps 2025-09-27 18:30:52 -04:00
Jacob Young
b206b0626a x86_64: fix @floatFromInt miscomps 2025-09-27 18:30:52 -04:00
mlugg
611c38e6da x86_64: fix unencodable rem lowerings
The memory operand might use one of the extended GPRs R8 through R15 and
hence require a REX prefix, but having a REX prefix makes the high-byte
register AH unencodeable as the src operand. This latent bug was exposed
by this branch, presumably because `select` now happens to be putting
something in an extended GPR instead of a legacy GPR.

In theory this could be fixed with minimal cost by introducing a way to
communicate to `select` that neither the destination memory nor the
other temporary can be in an extended GPR. However, I just went for the
simple solution which comes at a cost of one trivial instruction: copy
the remainder from AH to AL, and *then* copy AL to the destination.
2025-09-27 18:30:52 -04:00
mlugg
77fca1652f x86_64: fix miscompilation of mul on vectors of large ints 2025-09-27 18:30:52 -04:00
mlugg
0c476191a4 x86_64: generate better constant memcpy code
`rep movsb` isn't usually a great idea here. This commit makes the logic
which tentatively existed in `genInlineMemcpy` apply in more cases, and
in particular applies it to the "new" backend logic. Put simply, all
copies of 128 bytes or fewer will now attempt this path first,
where---provided there is an SSE register and/or a general-purpose
register available---we will lower the operation using a sequence of 32,
16, 8, 4, 2, and 1 byte copy operations.

The feedback I got on this diff was "Push it to master and if it
miscomps I'll revert it" so don't blame me when it explodes
2025-09-27 18:30:52 -04:00
Alex Rønne Petersen
86077fe6bd compiler: move self-hosted backends from src/arch to src/codegen 2025-09-26 02:02:07 +02:00