Very interesting, in my opinion...
Version 0.7.2 and AMD Zen: (March 14, 2017)
I went through a lot of trouble to do this in time for Pi day, but here it is. y-cruncher v0.7.2 has a new binary specifically optimized for AMD's Ryzen 7 processors.
The performance gain is about 5% over the Broadwell-tuned binary and 15% over v0.7.1. It turns out that the optimizations between v0.7.1 and v0.7.2 happened to be more favorable to AMD Zen than to Intel processors. Nevertheless, this is not enough to make Ryzen beat Haswell-E or Broadwell-E.
It's unlikely that any amount of Zen-specific optimizations can make Ryzen beat Haswell/Broadwell-E. The difference in memory bandwidth and 256-bit AVX throughput is simply far too large to overcome.
AMD made a conscious decision to sacrifice HPC to focus on mainstream.
As for the Ryzen platform itself: It's a bit immature at this point.
I went out on launch day to grab the Zen parts. In the end, it took me 3 sets of memory and 2 weeks before I finally found a stable configuration that I could use. From what I've seen on Reddit and various forums, I've been unlucky, but I'm definitely not alone.
Slightly more concerning is a system freeze with FMA instructions which appears to have been confirmed by AMD as a processor erratum. Fortunately, the source also says this is fixable via a microcode update. So it won't lead to something catastrophic like a recall or a fix that disables processor features.
Bjt2? What do you say about this? What's your take?
As for the Zen architecture itself, here are my (early) observations:
The FPU block diagrams that were released about Zen appear to be accurate. The FLOPs benchmark is able to achieve four 128-bit CPU instructions/cycle if there is an equal distribution of FP-adds and FP-multiplies. As expected, FMAs have twice the cost since they need both an FP-add and an FP-multiply.
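To put rough numbers on that observation, here is a back-of-the-envelope sketch of peak double-precision FLOPs per cycle per core. The Zen figures follow the pipe counts described above; the Haswell figure assumes two 256-bit FMA units, which is not stated in the quoted post. The function names are made up for illustration, and nothing here is a measurement.

```python
# Rough peak floating-point throughput per core, per cycle.
# Illustrative arithmetic only; no benchmark data is involved.

def lanes(vector_bits: int, element_bits: int = 64) -> int:
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


def zen_flops_per_cycle(use_fma: bool, element_bits: int = 64) -> int:
    """Zen as described in the post: four 128-bit FP ops per cycle
    (2 adds + 2 multiplies), or 2 FMAs since each FMA needs both an
    add pipe and a multiply pipe."""
    n = lanes(128, element_bits)
    if use_fma:
        # 2 FMAs per cycle, each lane does a multiply and an add (2 flops).
        return 2 * n * 2
    # 4 instructions per cycle, each lane does 1 flop.
    return 4 * n


def haswell_flops_per_cycle(element_bits: int = 64) -> int:
    """Assumption (not from the post): two 256-bit FMA units per core."""
    return 2 * lanes(256, element_bits) * 2


if __name__ == "__main__":
    print("Zen, add/mul mix:", zen_flops_per_cycle(use_fma=False))
    print("Zen, FMA only:   ", zen_flops_per_cycle(use_fma=True))
    print("Haswell, FMA:    ", haswell_flops_per_cycle())
```

Under these assumptions FMA buys Zen nothing in peak throughput over a balanced add/multiply mix, while a Haswell-class core gets twice the per-cycle peak, which fits the "256-bit AVX throughput" gap mentioned earlier.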
256-bit AVX instructions are handled efficiently enough that it seems to be beneficial to use 256-bit instructions when there's no overhead to doing so.
Memory is a huge bottleneck. Latencies are very high and dual-channel memory is simply not enough to feed this much computational throughput.
This is an important statement imho... it seems quad channel would be necessary in an HPC environment..
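For a sense of scale, theoretical peak DRAM bandwidth is just channels x 8 bytes x transfer rate. The sketch below assumes 64-bit channels and uses DDR4-2666 and DDR4-2133 purely as example speeds; neither memory speed comes from the quoted post, and real sustained bandwidth is lower than these peaks.

```python
# Theoretical peak DRAM bandwidth, assuming 64-bit (8-byte) channels.
# The memory speeds used below are example values, not from the post.

BYTES_PER_TRANSFER = 8  # one 64-bit channel moves 8 bytes per transfer


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Peak bandwidth in GB/s (10^9 bytes per second)."""
    return channels * BYTES_PER_TRANSFER * mega_transfers_per_s / 1000


if __name__ == "__main__":
    dual = peak_bandwidth_gbs(2, 2666)   # hypothetical dual-channel setup
    quad = peak_bandwidth_gbs(4, 2133)   # hypothetical quad-channel setup
    print(f"dual-channel DDR4-2666: {dual:.1f} GB/s")
    print(f"quad-channel DDR4-2133: {quad:.1f} GB/s")
```

Even with slower modules, the quad-channel example has well over 1.5x the peak bandwidth of the dual-channel one.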
The Ryzen 7 can run FMA4 instructions even though the corresponding CPUID flag is cleared. This is probably to enable some compatibility with code written for AMD Bulldozer while discouraging further use of FMA4. On the other hand, XOP instructions did not get this treatment so they will crash. Ryzen's ability to run FMA4 instructions makes it possible to run y-cruncher's XOP binary without crashing for small computations.
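The flags in question live in ECX of CPUID leaf 0x80000001 (XOP is bit 11, FMA4 is bit 16). Below is a minimal sketch of how a dispatcher that trusts those bits would decide; the function names and the returned labels are hypothetical and are not y-cruncher's actual dispatch logic. The ECX value is passed in as a plain integer, so the sketch does not execute CPUID itself.

```python
# Decode the XOP and FMA4 feature bits from ECX of CPUID leaf 0x80000001.
# Function names and return labels are hypothetical, for illustration only.

XOP_BIT = 11   # ECX bit 11: XOP
FMA4_BIT = 16  # ECX bit 16: FMA4


def decode_amd_ext_ecx(ecx: int) -> dict:
    """Return the XOP/FMA4 flags encoded in a raw ECX value."""
    return {
        "xop": bool((ecx >> XOP_BIT) & 1),
        "fma4": bool((ecx >> FMA4_BIT) & 1),
    }


def pick_code_path(ecx: int) -> str:
    """A dispatcher that trusts CPUID only takes the XOP/FMA4 path
    when both flags are reported."""
    flags = decode_amd_ext_ecx(ecx)
    if flags["xop"] and flags["fma4"]:
        return "xop"
    return "generic"


if __name__ == "__main__":
    both_set = (1 << XOP_BIT) | (1 << FMA4_BIT)  # both flags reported
    both_clear = 0                               # both flags cleared
    print(pick_code_path(both_set))    # xop
    print(pick_code_path(both_clear))  # generic
```

Software that checks the flags this way would never issue FMA4 on Ryzen, so the undocumented support only matters for code that skips the check, such as forcing the XOP binary by hand.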
For software developers, compiling code on the 1800X is about as fast as the 5960X at stock clocks.
But the 5960X has much more overclocking headroom, so it ends up winning by around 15%. For a $500 processor, the R7 1800X is very impressive.
According to him, the IPC is exactly on par with Haswell's.