• 1 Post
  • 9 Comments
Joined 17 days ago
Cake day: May 25th, 2026

  • Agree most with the audit-fatigue point. A signal that is always red trains everyone to ignore red, and the same failure kills lint warnings and flaky test suites. The other line that stuck was taking a dependency without deciding to. We started listing direct dependencies in review for exactly that reason. Adding one became a decision someone makes rather than a side effect of npm install, and the conversation it forces is usually short but occasionally stops a bad addition.
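
A rough sketch of how listing new direct dependencies for review could be automated. This is an assumption about one possible approach, not the process described above: the function names are hypothetical, and it compares only the `dependencies` and `devDependencies` sections of two `package.json` texts (base branch versus proposed change), ignoring the lockfile.

```python
import json


def direct_dependencies(package_json_text: str) -> dict[str, str]:
    """Return the direct dependencies declared in a package.json document."""
    manifest = json.loads(package_json_text)
    deps: dict[str, str] = {}
    # Only the declared sections count as direct; transitive packages are ignored.
    for section in ("dependencies", "devDependencies"):
        deps.update(manifest.get(section, {}))
    return deps


def added_dependencies(base_text: str, head_text: str) -> dict[str, str]:
    """Return direct dependencies present in head but absent from base."""
    base = direct_dependencies(base_text)
    head = direct_dependencies(head_text)
    return {name: version for name, version in head.items() if name not in base}


# Illustrative manifests, made up for this example.
base = '{"dependencies": {"react": "^18.2.0"}}'
head = '{"dependencies": {"react": "^18.2.0", "left-pad": "^1.3.0"}}'

print(added_dependencies(base, head))  # {'left-pad': '^1.3.0'}
```

Anything the function returns is what a reviewer would be asked to approve explicitly.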


  • The gap between finishing the book and surviving a real project is the normal shape of it, and not just for Rust. A book teaches the rules one at a time; a project makes you hold them all at once while also learning the framework, and Tauri adds its own layer on top. The borrow checker mostly moves pain you’d have hit at runtime in C up to compile time, so the fights are front-loaded rather than new. From what I’ve seen it settles once the ownership model becomes how you plan a change rather than something you fight afterwards.


  • Agreed. An agent only multiplies what’s already in the codebase. If you’ve got tests, clear boundaries and the rules written down, it genuinely flies. If it’s the usual undocumented mess, you just get more mess, faster. Which is probably why the shops that dodged that groundwork for years are getting the least out of AI now. There’s nothing solid under it to build on.


  • What carries over from the old rockstar is that they produced faster than anyone else could follow, and whoever inherited the code paid for it later. An agent does the same without the ego. It’ll turn out a week of plausible-looking code in an afternoon, and the slow part becomes reading and understanding it rather than writing it. What’s worked for us is making the agent meet the standards before the code lands, with a linter and a couple of runnable checks in the way, rather than trusting a reviewer to catch every miss when they’re forty files deep and tired.
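
A minimal sketch of what putting checks "in the way" might look like, assuming each check is a command that exits non-zero on failure. The `run_checks` helper and the stand-in commands are hypothetical; a real setup would substitute the project's own linter and test runner.

```python
import subprocess
import sys


def run_checks(checks: dict[str, list[str]]) -> list[str]:
    """Run each command and return the names of the checks that failed."""
    failed = []
    for name, command in checks.items():
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Stand-in commands so the sketch runs anywhere Python is installed.
example_checks = {
    "passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
}

failed = run_checks(example_checks)
print(failed)  # ['fails']
```

The agent's change would only land when the returned list is empty, so the reviewer starts from code that already meets the mechanical standards.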




  • Claude Code, mostly, but I’m with Scipitie that the tool matters less than the process around it. What’s helped most is writing the project’s rules and conventions into files the agent reads each session, then putting the non-negotiable ones behind a linter or a test so it can’t quietly skip them. Treated that way it behaves a lot like the junior who’s read all the books and understood half of them. Left to its own judgement it drifts, which is the part the guardrails are there to catch.


  • nark3d@thelemmy.club to Programming@programming.dev · Local LLM agents
    13 days ago

    There’s a useful split lurking in this. For narrow agentic work (retrieval over internal docs, structured classification, test scaffolding, deterministic refactor passes), a self-hosted 30B-class model can be fine and the inference economics work out at team scale. For multi-step planning and the harder agent loops, the frontier gap still shows up in the number of retries and the time-to-correct-answer.

    The honest test is to pick the prompt category that’s costing you the most and benchmark something like Qwen 2.5 Coder 32B or DeepSeek V3 against whatever you’re paying for now. If the gap is small you’ve found your candidate. If it isn’t, you’ve at least costed the gap accurately rather than guessing at it.

    The two costs people underestimate are the GPU box (plus a second one for the eval/staging path) and the maintenance overhead. Model picks go stale fast and someone on the team has to own that, or you end up shipping a Llama 3.1 stack into 2026 because nobody rebuilt the harness for whatever’s current.
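
A sketch of the kind of benchmark described above, under stated assumptions: each model is wrapped as a plain function from prompt to answer, each case carries its own pass/fail check, and the two metrics are the ones named earlier (retries and time to a correct answer). `Case`, `Score`, `benchmark` and the fake models are all hypothetical; real use would replace the fakes with calls to the local model and the paid one.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    prompt: str
    check: Callable[[str], bool]  # True when the answer is acceptable


@dataclass
class Score:
    solved: int
    total: int
    attempts: int
    seconds: float


def benchmark(model: Callable[[str], str], cases: list[Case], max_attempts: int = 3) -> Score:
    """Run every case, retrying up to max_attempts, and tally the results."""
    solved = 0
    attempts = 0
    start = time.perf_counter()
    for case in cases:
        for _ in range(max_attempts):
            attempts += 1
            if case.check(model(case.prompt)):
                solved += 1
                break
    return Score(solved, len(cases), attempts, time.perf_counter() - start)


def always_right(prompt: str) -> str:
    """Fake model that answers correctly on the first try."""
    return "4"


def make_flaky() -> Callable[[str], str]:
    """Fake model that is wrong on every odd-numbered call."""
    calls = {"n": 0}

    def model(prompt: str) -> str:
        calls["n"] += 1
        return "4" if calls["n"] % 2 == 0 else "5"

    return model


cases = [Case("2 + 2 = ?", lambda answer: answer.strip() == "4")] * 2
baseline = benchmark(always_right, cases)
candidate = benchmark(make_flaky(), cases)
print(baseline.solved, baseline.attempts)    # 2 2
print(candidate.solved, candidate.attempts)  # 2 4
```

Both fakes solve every case, but the flaky one needs twice the attempts, which is the gap the comparison is meant to put a number on.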