Data parallel pretty-printing by Athas in ProgrammingLanguages

[–]matthieum 2 points3 points  (0 children)

Disclaimer: this is nitpicking on a completely unimportant detail, just so you know...

By convention, we say that the first node in an array of exprs is the root of the tree.

I can see how it's handy for processing, but isn't it a pain to build? It requires the root of the tree to refer to not-yet-existing elements.

I mean, I guess you could build then reverse, to get there, but isn't the reverse a bit "wasted"?

It's all the more jarring in:

[#op 3 #mul 4, #num 8, #num 20, #op 1 #add 2, #num 42]

Since #op 3 #mul 4 is foreshadowing, while #op 1 #add 2 refers to previous expressions...
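To make the nitpick concrete, here is a minimal sketch of the "natural" build order, where children are pushed before their parents and the root therefore ends up last. The Expr type and the push, build_root_last and eval helpers are hypothetical names made up for this example; they are not taken from the article.

```rust
/// A node in a flattened expression tree; children are referenced by index.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Expr {
    Num(i64),
    Add(usize, usize),
    Mul(usize, usize),
}

/// Appends a node and returns its index, so that a parent can only ever
/// refer to nodes which already exist.
fn push(exprs: &mut Vec<Expr>, expr: Expr) -> usize {
    exprs.push(expr);
    exprs.len() - 1
}

/// Builds `(8 + 20) * 42` bottom-up; the root is the last element.
fn build_root_last() -> Vec<Expr> {
    let mut exprs = Vec::new();

    let a = push(&mut exprs, Expr::Num(8));
    let b = push(&mut exprs, Expr::Num(20));
    let sum = push(&mut exprs, Expr::Add(a, b));
    let c = push(&mut exprs, Expr::Num(42));
    push(&mut exprs, Expr::Mul(sum, c));

    exprs
}

/// Evaluates the node at `index`, recursing into its children.
fn eval(exprs: &[Expr], index: usize) -> i64 {
    match exprs[index] {
        Expr::Num(n) => n,
        Expr::Add(l, r) => eval(exprs, l) + eval(exprs, r),
        Expr::Mul(l, r) => eval(exprs, l) * eval(exprs, r),
    }
}
```

With this order every index refers backwards, and evaluation starts from exprs.len() - 1 rather than from 0. Getting the root first requires either a reverse pass (with index fix-ups) or reserving slots ahead of time.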

Lore: a version control system from Epic Games optimized for non-textual/binary assets by kibwen in rust

[–]matthieum 0 points1 point  (0 children)

Actually, this is what I mean by "attaching edits".

The idea is that you would push one more "node" into the git tree which would "amend" the wording of a specific ancestor commit.

The default view would then display the amended commit message, while the actual git tree, being append-only, would keep the record of the initial commit message and the N successive edits.

Or otherwise said, just because it's important for the git tree to remain immutable doesn't mean we can't have nice things.
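A rough sketch of what I have in mind, under the assumption that history is a simple append-only list. This is not how git stores anything; Node, Commit, Amend and displayed_message are hypothetical names for illustration.

```rust
/// An entry in an append-only history. (Hypothetical model, not git's.)
#[derive(Debug, Clone, PartialEq)]
enum Node {
    /// A regular commit, with its original message.
    Commit { message: String },
    /// An amendment to the message of an earlier commit, identified by index.
    Amend { target: usize, message: String },
}

/// Returns the message to display for the commit at index `commit`:
/// the latest amendment if there is one, the original message otherwise.
///
/// Returns `None` if `commit` is out of bounds or is not a `Commit`.
fn displayed_message(history: &[Node], commit: usize) -> Option<&str> {
    let Node::Commit { message: original } = history.get(commit)? else {
        return None;
    };

    // Amendments can only appear after the commit they target, so scan the
    // tail of the history backwards and stop at the most recent one.
    let latest = history[commit + 1..].iter().rev().find_map(|node| match node {
        Node::Amend { target, message } if *target == commit => Some(message.as_str()),
        _ => None,
    });

    Some(latest.unwrap_or(original.as_str()))
}
```

Nothing is ever overwritten: the original message and every edit stay in the history, and only the default view changes. The lookup here is a linear scan, which is fine as long as there are few amendments; a real implementation would likely want an index.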

Rust PNG crate gets even faster, used by GNOME and Chromium by Shnatsel in rust

[–]matthieum 128 points129 points  (0 children)

Some of the maintainers of image-png were only able to dedicate time to collaborating with Chromium engineers and meeting GNOME’s needs thanks to an investment by Sovereign Tech Agency. We hope that putting memory-safe PNG decoding into the hands of billions of people is a good return on their investment!

This seems like a high bar to beat for the next projects seeking funding from the Sovereign Tech Agency!

OpenAI joins The Rust Foundation as a Platinun member and donates funds to support Rust maintenance by Kobzol in rust

[–]matthieum 6 points7 points  (0 children)

They do get some power in the running of the Rust Foundation itself, notably in the vetting (or vetoing) of the CEO.

Which is why the Rust Foundation Board of Directors is set up so that 50% of the seats are occupied by Rust Project Members, ensuring that Big Tech representatives do not get a strict majority by themselves, even if somehow the interests of all the various companies aligned.

Lore: a version control system from Epic Games optimized for non-textual/binary assets by kibwen in rust

[–]matthieum 0 points1 point  (0 children)

Even with an immutable tree, it really should be possible to attach "notes" to a given commit later (like CI results).

I could see actually editing the commit title/message being a bit more controversial... though then again, since we're talking about a VCS... why not just "attach" an edit post-facto to the commit title/message?

Attaching edits wouldn't scale so well, but hopefully there's not that many for any given commit!

Lore: a version control system from Epic Games optimized for non-textual/binary assets by kibwen in rust

[–]matthieum 2 points3 points  (0 children)

that you already pushed.

(And, implicitly, that others already pulled)

Lore: a version control system from Epic Games optimized for non-textual/binary assets by kibwen in rust

[–]matthieum 5 points6 points  (0 children)

But that of course still changes history and you'll have to force push your branch and rebase others building on it.

Which is the problem OP is complaining about in the first place.

OpenAI joins The Rust Foundation as a Platinun member and donates funds to support Rust maintenance by JuanAG in programming

[–]matthieum 5 points6 points  (0 children)

They're definitely buying influence in the Foundation itself. Then again, given they're contributing $600,000 it's kinda expected they'd want a word on how it's spent.

With that said:

  1. Do note that the Rust Foundation is set up (in its charter) so that 50% of the seats on the Board of Directors are reserved for team members of the Rust Project, so even if the interests of all Big Tech companies aligned, they still wouldn't have a strict majority in the Foundation.
  2. The Rust Project -- which "controls" the toolchain, standard library, crates.io, etc... -- is a completely separate entity. The Rust Foundation can of course exercise some pressure -- by withholding funding, or deciding what to fund -- but it has no direct control, and a sufficiently determined team member can decide to forge ahead without Foundation funding. I mean, many already do at the moment, since there's not enough funding anyway.

What would make you consider using a new sorting algo? by Standard-Cow-4126 in cpp

[–]matthieum -1 points0 points  (0 children)

I know you said serial but don't discount parallel on a good implementation. MSVC's parallel std::sort starts to win at somewhere between 1000 and 4000 elements.

I'm not saying that parallel implementations cannot win latency-wise. That's not the issue.

The issue is that I am regularly in a position where:

  • Either all cores are already spoken for, so additional threads are not welcome.
  • I have strict requirements on how many & where additional threads may run, and most "auto-parallelization" libraries do not allow me to specify these.

Strictly serial algorithms may not be the fastest from a pure latency perspective, but at least I don't have to worry about rogue threads destroying my carefully arranged plans.

Also just because the keys aren't integers might not matter very much.

Firstly, I wanted to eliminate Radix sorts & the like from the equation. Those are too specialized.

Secondly, ints are cheap. There are 16 of them per cache line. They're copied/moved through registers, etc... This is pretty different from a 32-byte std::string: only 2 per cache line, with possible indirection, more expensive to move, very expensive to copy if out of line, etc...

The profiles between these two keys are different enough that I expect slightly different trade-offs in the sorting algorithm may lead to different performance.

Hence, while I don't mind benchmarks with int, I sure want to see benchmarks with larger keys, with more expensive to compare keys, etc... just to make sure that on a more "realistic" workload (for me) the good results actually hold.
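For reference, the per-cache-line figures above are simple arithmetic. Here is a quick sketch in Rust, assuming a 64-byte cache line (typical, but hardware-dependent) and using a hypothetical Key32 type as a stand-in for a 32-byte string object:

```rust
use std::mem::size_of;

/// Assumed cache line size, in bytes. Typical on x86-64, but this is an
/// assumption, not something the code detects.
const CACHE_LINE: usize = 64;

/// Hypothetical stand-in for a 32-byte string object.
#[allow(dead_code)]
struct Key32([u8; 32]);

/// Number of whole `T` values fitting in one (assumed) cache line.
///
/// Panics for zero-sized types, as it would divide by zero.
const fn per_cache_line<T>() -> usize {
    CACHE_LINE / size_of::<T>()
}
```

per_cache_line::<i32>() gives 16 and per_cache_line::<Key32>() gives 2, which is an 8x difference in how many keys each memory access brings in, before even counting the indirection.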

Whippyunits 0.2.0 - Stable Rust Units of Measure for Applied Numerics by oblarg in rust

[–]matthieum 3 points4 points  (0 children)

Affine

How do you handle operations with affine units?

  • Can I multiply two temperatures expressed in Celsius?
  • Can I rescale from Celsius² to Kelvin²?

Point & Vector

There's a prominently featured Quantity, but I could not find a Point or Vector. Do you not differentiate between, say, a Timestamp and a Duration?

The differentiation between Point and Vector is quite useful to prevent more nonsensical operations. For example, Timestamp - Timestamp makes sense (it's a Duration), but Timestamp + Timestamp doesn't.

It's also quite interesting with the aforementioned affine units.

  • If you have a Point expressed in Celsius/m, you cannot scale it to Kelvin/m.
  • If you have a Vector expressed in Celsius/m, you can scale it to Kelvin/m.

This is a result of Vector, by its nature, erasing the "affine" nature of the unit.

Which raises the point that your statement that 0C = 273.15K is wrong...

  • You're correct for a Point.
  • You're wrong for a Vector.

If I'm saying that the temperature rose by 9F today, it rose by 5C (or 5K), not by -12C.
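To illustrate, here is a minimal sketch of the Point/Vector split for temperatures. None of these types come from whippyunits; Fahrenheit, FahrenheitDelta, Celsius and CelsiusDelta are hypothetical names picked for the example.

```rust
use std::ops::Sub;

/// An absolute temperature (a "point"), in degrees Fahrenheit.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Fahrenheit(f64);

/// A temperature difference (a "vector"), in degrees Fahrenheit.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FahrenheitDelta(f64);

/// An absolute temperature (a "point"), in degrees Celsius.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Celsius(f64);

/// A temperature difference (a "vector"), in degrees Celsius.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CelsiusDelta(f64);

impl From<Fahrenheit> for Celsius {
    /// Points need both the offset and the scale.
    fn from(f: Fahrenheit) -> Self {
        Celsius((f.0 - 32.0) * 5.0 / 9.0)
    }
}

impl From<FahrenheitDelta> for CelsiusDelta {
    /// Differences only need the scale: the offset cancels out.
    fn from(d: FahrenheitDelta) -> Self {
        CelsiusDelta(d.0 * 5.0 / 9.0)
    }
}

impl Sub for Celsius {
    type Output = CelsiusDelta;

    /// Point - Point yields a Vector. Point + Point is deliberately not
    /// implemented, so it fails to compile.
    fn sub(self, rhs: Celsius) -> CelsiusDelta {
        CelsiusDelta(self.0 - rhs.0)
    }
}
```

Converting FahrenheitDelta(9.0) yields CelsiusDelta(5.0), whereas converting Fahrenheit(9.0) yields roughly -12.8 degrees Celsius: same number, same unit name, different results depending on whether it is a Point or a Vector.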

zlib-rs in Firefox - Trifecta Tech Foundation by folkertdev in rust

[–]matthieum 30 points31 points  (0 children)

Thanks for the link to Oodle, and OH MY GOD.

The worst bug I ever had in my life was bad codegen from GCC, but at least I still could map my code to the assembly and point out the nonsense.

Debugging a black box? Ouch.

the best definition of rust i have ever come across by akmessi2810 in rust

[–]matthieum 3 points4 points  (0 children)

I disagree.

Beyond safety, the language provides great tools for correctness: enums & pattern-matching!

But more importantly, there's a culture in the Rust community, starting with the standard library, putting correctness first and foremost.

As a trivial example, str only accepts UTF-8. Attempts at constructing a str from non-UTF-8 bytes will result in an Err. An Err which must be dealt with, lest a lint triggers.

This is not really the language at play here. It's the culture making use of the tools provided by the language to promote correctness, and front-loading error scenarios.
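A small sketch of the str example, using std::str::from_utf8 from the standard library. The parse_text and describe helpers are just names made up for the illustration.

```rust
use std::str::Utf8Error;

/// Attempts to view `bytes` as a `str`; invalid UTF-8 is reported as an
/// `Err` rather than silently accepted.
fn parse_text(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}

/// Handles both outcomes explicitly. Dropping the `Result` on the floor
/// instead would trigger the `unused_must_use` lint.
fn describe(bytes: &[u8]) -> String {
    match parse_text(bytes) {
        Ok(text) => format!("valid: {text}"),
        Err(error) => format!("invalid after {} bytes", error.valid_up_to()),
    }
}
```

The error scenario is front-loaded: the caller has to decide what to do with bad bytes at the point of conversion, not discover the problem three layers later.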

Rimalloc may be a of the art allocator. by PatienceSpiritual134 in rust

[–]matthieum 2 points3 points  (0 children)

I love allocators. Perhaps too much. Brace yourself...

Port?

Is this just a mimalloc port to Rust, or is it independently designed?

Memory Footprint?

It's easy to be fast by leaking. Even if not leaking, adjusting the bar for returning memory to the OS can also make a difference -- especially when memory is returned only to be immediately re-acquired.

A good general-purpose allocator isn't just fast, it balances speed and memory footprint.

On the other hand, a pedal-to-the-metal allocator may simply never release the memory to the OS.

How does rimalloc stand there?

Multi-threading?

I see a few benchmarks which seem related to multi-threading, but it's not clear to me exactly how well they exercise (or not) the workload.

For example, checking xthread, the MPSC queue will be a clear contention point, which may very well dwarf the overhead of the allocators being put to the test. Furthermore, the MPSC is bounded to 1024 elements, when the benchmark plans to push 2M elements through it... the consumer may not keep up, and it becomes difficult to assess what, exactly, is being measured here as allocation & deallocation performances collide.

Finally, xthread is fixed -- 4 producers/1 consumer. I advise designing some scaling benchmarks, to show how the performance evolves at various counts... if it scales sub-linearly, it's typically an indication of contention somewhere. (Perhaps in the benchmark code, perhaps in the benchmarked code, who knows)

You may want to have a look at the bursty crate I wrote specifically to benchmark contention. It allows executing a series of steps in lockstep: all threads wait, then execute their step 1, then wait, then execute their step 2, etc... maximizing contention at each step by ensuring all threads start as close to each other as possible.
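To show the lockstep idea itself, here is a minimal sketch using only std::sync::Barrier. This is not the API of the bursty crate; run_in_lockstep is a hypothetical helper, and the contended operation is a plain atomic increment standing in for whatever is being benchmarked.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs `steps` rounds on `threads` threads. Every thread waits at a barrier
/// before each round, so that all of them start the round together.
///
/// Returns the total number of increments performed on the shared counter.
fn run_in_lockstep(threads: usize, steps: usize) -> usize {
    let barrier = Arc::new(Barrier::new(threads));
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            let counter = Arc::clone(&counter);

            thread::spawn(move || {
                for _ in 0..steps {
                    // All threads are released from the barrier together.
                    barrier.wait();

                    // Stand-in for the contended operation under test.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    counter.load(Ordering::Relaxed)
}
```

Running this with 1, 2, 4, 8... threads and timing each configuration gives the kind of scaling curve mentioned above. A barrier from the standard library is a fairly coarse way to line threads up, so treat this as an illustration of the principle rather than a precise measurement tool.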

deconvolution - a comprehensive image deconvolution and restoration library by [deleted] in rust

[–]matthieum 1 point2 points  (0 children)

I reviewed most of the code manually before committing. I wouldn’t say all of it, because I did skim portions.

Thank you for your honesty.

To be frank with you, I work a full-time job, and I do not intend to spend all of my free time writing boilerplate.

I hear you. My own full-time job is also leaving me with little free time.

We may have a different idea of what boilerplate means, however.

Is the only way for a project to be respectable for every single line of code to be written by a human?

No, but we do ask that a human understand, be able to explain the trade-offs, and ultimately take responsibility for every single line of Rust code. We are more lenient on JS/TS code, for example, being r/rust, not r/javascript.

That is, we make a difference between:

  • Co-Pilot: the human is driving, the AI is assisting.
  • Auto-Pilot: the AI is driving, the human is (at times) asleep.

They too, do not include a disclaimer. Some have the Claude co-contributor in commit history, some don't.

Rules always lag behind usage.

An AI usage disclaimer helps readers understand what they're dealing with, including us, moderators.

For example, it may help a user decide which portions of a codebase they may want to rely on, and which they don't. Or as another example, it may help a reviewer focus their efforts on the areas which have seen the least human involvement, to ensure they're up to snuff (and raise issues if they're not).

Transparency on the quality/production-readiness of every part of the codebase helps everyone.

Please explain to me what rule I violated which justified the removal of my post. Do you believe my project is "slop"?

Yes, in parts.

The current crop of AI are statistical parrots: their output has good form, but too often poor function. Chances are there are many edge cases which are simply not dealt with, or dealt with incorrectly, and may cause panics, or incorrect results.

How memory safety CVEs differ between Rust and C/C++ by Kobzol in rust

[–]matthieum 116 points117 points  (0 children)

I think this is only part of the reasons why there are, actually, so few CVEs in C/C++.

Another reason is that in C/C++ code segfaults tend to be common enough occurrences that their authors tend to consider them to be "just bugs", and fix them without ever raising CVEs in the first place.

deconvolution - a comprehensive image deconvolution and restoration library by [deleted] in rust

[–]matthieum[M] [score hidden] stickied comment (0 children)

It's pretty clear AI was involved in the creation of this repository, yet there is no AI usage disclaimer explaining exactly how.

For now, I'll proceed on the assumption that this repository was vibe-coded.

Feel free to edit the README with a precise disclaimer to change my mind.

Git merges can be better by agentvenom1 in programming

[–]matthieum 8 points9 points  (0 children)

But why?

We already have perfectly good names -- remote for the remote branch, local for the local branch -- so why invent new names -- ours & theirs -- rather than stick to remote & local?

What Do Engineers Mean When We Say "Taste"? by funnybong in programming

[–]matthieum 1 point2 points  (0 children)

Function vs Form.

The form may be annoying -- tainted by AI -- but the function is still worth it, in my opinion.

Taste is an elusive concept, and I do think the article does a good job of identifying it as "calibration" across multiple dimensions, and highlighting some dimensions.

There Is Life Before and After Main in Rust by mmastrac in rust

[–]matthieum 1 point2 points  (0 children)

Scattered Collect is definitely cool indeed.

Code Readability Comparison by Mean-Decision-3502 in Compilers

[–]matthieum 0 points1 point  (0 children)

Expression like this [...] Might be short and powerful, but I find it hard to read.

It definitely is.

As the lengthy comment above mentions, this is hitting a particularly bad spot of the standard library & type inference algorithm, forcing extensive type annotations.

Ideally, we should be looking at darr.iter().sum() and calling it a day.

What would make you consider using a new sorting algo? by Standard-Cow-4126 in cpp

[–]matthieum 2 points3 points  (0 children)

Though then again, when you're looking to optimize a particular part of an application, 2x as fast sorting could be pretty neat.

Code Readability Comparison by Mean-Decision-3502 in Compilers

[–]matthieum 2 points3 points  (0 children)

Everything is readable in small examples.

What matters is that a language remain readable at scale:

  • 10 parameters.
  • Random [10, 20] characters per name & type.
  • 10s of lines of code.

Your Rust code is completely unidiomatic, so it's a bogus comparison:

  • Idiomatic Rust does not use globals.
  • Idiomatic Rust does not use pointers.
  • WTF are you using black_box for???

Rewritten, with inline comments explaining the choices for folks not used to Rust:

fn fill_array_idiomatic(max_val: i32, darr: &mut Vec<i32>) {
    darr.clear();

    darr.extend(0..max_val);
}

fn fill_array_for(max_val: i32, darr: &mut Vec<i32>) {
    darr.clear();

    for i in 0..max_val {
        darr.push(i);
    }
}

fn calc_sum_idiomatic_i32(darr: &[i32]) -> i32 {
    //  Type inference for `sum` generally sucks, for some reason.
    //
    //  The type of its output is known (return type), yet must be specified...
    darr.iter().sum::<i32>()
}

fn calc_sum_for_i32(darr: &[i32]) -> i32 {
    let mut sum = 0;

    for i in darr {
        sum += i;
    }

    sum
}

fn calc_sum_idiomatic(darr: &[i32]) -> i64 {
    //  Using `as` is not recommended.
    //
    //  It works, but it can do _anything_: lossless casts, lossy casts,
    //  integer to float, integer to pointer, etc...
    //
    //  Into should be used for lossless casts... but:
    //  - Since `sum` accepts _anything_, type inference is unable to know what we
    //    wish to convert to, requiring a type annotation.
    //  - `<&i32 as Into<i64>>` does not exist, so we must dereference `i` first.
    darr.iter().map(|&i| -> i64 { i.into() }).sum::<i64>()
}

fn calc_sum_for(darr: &[i32]) -> i64 {
    let mut sum = 0;

    for &i in darr {
        let i: i64 = i.into();

        sum += i;
    }

    sum
}

There is no pointer version, because nobody sane would use a pointer version. Pointers are to be used sparingly in Rust, as each use must be carefully annotated with comments discharging soundness obligations.

(I mean, ideally it should be the same in C or C++, but they're so omnipresent it would be untenable...)
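For the curious, here is roughly what I mean by discharging soundness obligations, should one insist on a pointer version. This is a sketch, not a recommendation; calc_sum_ptr and calc_sum_via_ptr are names I made up, they are not from the original post.

```rust
/// Sums `len` values starting at `ptr`.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and valid for reads of `len`
/// consecutive, initialized `i32` for the duration of the call.
unsafe fn calc_sum_ptr(ptr: *const i32, len: usize) -> i64 {
    let mut sum = 0i64;

    for offset in 0..len {
        // SAFETY: `offset < len`, and the caller guarantees that `ptr` is
        // valid for reads of `len` consecutive `i32`, so `ptr.add(offset)`
        // is in bounds and points to an initialized value.
        let value = unsafe { *ptr.add(offset) };

        sum += i64::from(value);
    }

    sum
}

/// Safe wrapper: the slice itself provides all the guarantees required.
fn calc_sum_via_ptr(darr: &[i32]) -> i64 {
    // SAFETY: the pointer and length both come from the same live slice,
    // which is not mutated for the duration of the call.
    unsafe { calc_sum_ptr(darr.as_ptr(), darr.len()) }
}
```

That's two safety comments and a safety section for a loop which the slice version expresses in one line, which is why the pointer version is not worth benchmarking as if it were normal Rust.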