Bio field too short. Ask me about my person/beliefs/etc if you want to know. Or just look at my post history.

  • 0 Posts
  • 35 Comments
Joined 2 years ago
Cake day: August 3rd, 2023

  • I really like this comment. It covers a variety of use cases where an LLM/AI could help with the mundane tasks and calls out some of the issues.

    The ‘accuracy’ aspect is my 2nd greatest concern: an LLM agent that I ask to find me a nearby Indian restaurant, and which then hallucinates one, is not going to kill me. I’ll deal, but be hungry and cranky. When that LLM (and they are notoriously bad at numbers) updates my spending spreadsheet with a 500 instead of a 5000, that could have a real impact on my long-term planning, especially if it’s somehow tied into my actual bank account and makes up numbers. As we/they embed AI into everything, the number of people who think they have money because the AI agent queried their bank balance, saw 15, and turned it into 1500 will be too damn high. I don’t ever foresee trusting an AI agent to do anything important for me.

    “Trust”/“privacy” is my greatest fear, though. There’s documentation from the major players that prompts are used to train the models. I can’t immediately find an article link because searching ‘chatgpt prompt train’ finds me a ton of slop about the various “super” prompts I could use. Here’s OpenAI’s own policy page about how they will use your input to train their models unless you specifically opt out: https://openai.com/policies/how-your-data-is-used-to-improve-model-performance/

    Note that this means when you ask for an Indian restaurant near your home address, OpenAI now has that address in its data set and may hallucinate that address as an Indian restaurant in the future. The result being that some hungry, cranky dude may show up at your doorstep asking, “where’s my tikka masala?” This could be a net gain, though; new bestie.

    The real risk, though, is that your daily life is now collected, collated, harvested, and added to the model’s data set, all without any clear, explicit action on your part: using these tools requires accepting a ToS that most people will not really read or understand. Maaaaaany people will expose otherwise sensitive information to these tools without understanding that their data becomes visible as part of that action.

    To get a little political, I think there’s a huge downside on the trust aspect: these companies have your queries (prompts), and I don’t trust them to maintain my privacy. If I ask something like “where to get an abortion in Texas”, I can fully see OpenAI selling that prompt to law enforcement. That’s an egregious example chosen for impact, but imagine someone being able to query those prompts (using an AI which might make shit up) and asking “who asked about anti-X topics?” or “pro-Y?”


    My personal use of AI: I like the NLP paradigm of turning a verbose search query into other search queries that are more likely to find me results. I run a local 8B model that has, for example, helped me find a movie from my childhood that I couldn’t get Google to identify.

    There’s a use case here, but I can’t accept this as a SaaS-style offering. Any modern gaming machine can run one of these LLMs and get value without the privacy tradeoff.

    Adding agent power just opens you up to having your tool make stupid mistakes on your behalf. These kinds of tools need to have oversight at all times. They may work 90% of the time, but they will eventually send an offensive email to your boss, delete your whole database, wire money to someone you didn’t intend, or otherwise make a mistake.


    I kind of fear the day that you have a crucial confrontation with your boss and the dialog goes something like:

    “Why did you call me an asshole?”

    “I didn’t; the AI did, and I didn’t read its response as carefully as I should have.”

    “Oh, OK.”


    Edit, adding my own use case: I’ve heard LLMs described as “a blurry JPEG of the internet,” and to me this is their true value.

    We don’t need an 800B model; we need an easy 8B model that anyone can run that helps turn “I have a question” into a pile of relevant actual searches.
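
    To make that concrete, here’s a minimal sketch of the kind of tool I mean, assuming a local Ollama instance on localhost:11434 serving an 8B model tagged llama3.1:8b (the endpoint, model tag, and prompt wording are all just placeholders for whatever you actually run):

    ```python
    # Sketch: turn one verbose question into a handful of concrete web searches
    # using a small local model. Assumes Ollama's /api/generate endpoint and an
    # 8B model tagged "llama3.1:8b" -- both are placeholders.
    import json
    import urllib.request

    OLLAMA_URL = "http://localhost:11434/api/generate"
    MODEL = "llama3.1:8b"

    def question_to_searches(question: str, n: int = 5) -> list[str]:
        prompt = (
            f"Rewrite the following question as {n} short, distinct web search "
            f"queries. One per line, no numbering, no commentary.\n\n{question}"
        )
        body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(
            OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            text = json.loads(resp.read())["response"]
        # Strip any bullets/dashes the model adds despite being told not to.
        return [line.strip("-* ") for line in text.splitlines() if line.strip()]

    if __name__ == "__main__":
        question = "movie from my childhood where a kid shrinks and rides an ant"
        for query in question_to_searches(question):
            print(query)
    ```

    Nothing fancy: the model never answers the question, it just rewrites it into searches that you then run and read yourself.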



  • korazail@lemmy.myserv.one to 196@lemmy.blahaj.zone · Corruption rule

    I don’t have a link, but I remember an exposé a few years ago where some politician sold out their constituents for like 10k-100k in campaign contributions.

    The response was along the lines of, ‘why don’t we just make a Kickstarter to buy them back’

    Obviously this results in a bidding war we probably can’t win… and it’s, in theory, what a PAC is supposed to be; but it might be useful both in defining a given politician’s price and in driving up the cost of corruption.

    That feels very ‘free market’ to me.

    edit: fixed autocorrect typo.


  • I think that adage used to work… however nowadays, with corporate greed enshittifying everything, I think it’s safe to presume malice by default, at least when the actor is a company. Your neighbor probably didn’t mean to do that thing that made you mad.

    They no longer get the ‘benefit of the doubt’ after years of evidence that they will attempt to squeeze every penny out of their customers.


  • I tripped over this awesome analogy that I feel compelled to share. “[AI/LLMs are] a blurry JPEG of the web”.

    This video pointed me to this article (paywalled)

    The headline gets the major point across. LLMs are like taking the whole web as an analog image and lossily digitizing it: you can make out the general shape, but there might be missed details or compression artifacts. Asking an LLM is, in effect, googling your question in more natural language… but instead of getting source material or memes back as a result, you get a lossy version of those sources, and it’s random by design, so ‘how do I fix this bug?’ could result in ‘rm -rf’ one time and something that looks like an actual fix the next.

    Gamers Nexus just did a piece about how YouTube’s AI summaries could be manipulative. While I think that is a possibility and the risk is real (go look at how many times elmo has said he’ll fix grok for real this time), another big takeaway was how bad LLMs still are at numbers and at tokens that have data encoded in them: there was a segment where Steve called out the inconsistent model names, and how the AI would mistake a 9070 for a 970, etc., or make up its own models.

    Just like googling a question might give you a troll answer, querying an ai might give you a regurgitated, low-res troll answer. ew.



  • And this is why Digit wanted a clarification. Let’s make a quick split between “Tech Bro” and Technology Enthusiast.

    I’d maybe label myself a “tech guy”, and forego the “bro”, but I could see other people calling me a “tech bro”. I like following tech trends and innovations, and I’m often a leading adopter of things I’m interested in if not bleeding edge. I like talking about tech trends and will dive into subjects I know. I’ll be quick to point out how machine learning can be used in certain circumstances, but am loudly against “AI”/LLMs being shoved into everything. I’m not the CEO or similar of a startup.

    Your specific and linked definition requires low critical thinking skills, a big ego, and access to “too much” money. That doesn’t describe me and probably doesn’t describe Digit’s network.

    Their whole point seemed to be that the tech-aware people in their sphere are antagonistic to the idea of “AI” being added to everything. That doesn’t deserve derision.


  • Hell, I don’t submit help requests without a confident understanding of what’s wrong.

    Hi Amazon. My cart, ID xyz123, failed to check out. Your browser JavaScript seems to be throwing an error on line 173: “null is not an object”. I think this is because the variable is overwritten on line 124, but only when the number of items AND the total cart price are prime.

    Generally, by the time I have my full support request, I have either solved my problem or solved theirs.


  • I agree that this is a problem.

    “Responsible disclosure” is a thing where an organization is given time to fix their code and deploy it before the vulnerability is made public. Failing to fix the issue in a reasonable time, especially a timeline that your org has publicly agreed to, causes reputational harm and is thus an incentive to write good code that is free of vulns and to remediate them when they are identified.

    This breaks down when the “organization” in question is just a few people with some free time who made something so fundamentally awesome that the world depends on it, and who have never been compensated for their incredible contributions to everyone.

    “Responsible disclosure” in this case needs a bit of a redesign when the org is a volunteer effort instead of a company making a profit. There’s no real reputational harm to ffmpeg, since users don’t necessarily know they use it, but the broader community recognizes the risk, and the maintainers feel obligated to fix issues. Additionally, a publicly disclosed vulnerability puts tons of innocent users at risk.

    I don’t dislike AI-based code analysis. It can theoretically prevent zero-days by finding an issue before someone malicious does, but running AI tools against that xkcd-tiny-block and expecting the maintainers to fit into a billion-dollar company’s timeline is unreasonable. Google et al. should keep risks or vulnerabilities private when disclosing them to FOSS maintainers instead of holding them to the same standard as a corporation by posting issues to a public git repo.

    An RCE or similar critical issue in ffmpeg would be a real problem with widespread impact, given how broadly it is used. That suggests it should be broadly supported. The social contract with LGPL, GPL, and FOSS in general is that code is released ‘as is, with no warranty’. Want to fix a problem? Go for it! Only calling out problems just makes you a dick: Google, Amazon, Microsoft, hundreds of others.

    As many have already stated: if a grossly profitable business depends on a “tiny” piece of code they aren’t paying for, they have two options: pay for the code (fund maintenance) or make their own. I’d also support a few headlines like “New Google Chrome vulnerability will let hackers steal your children and house!” or “Watching this YouTube video will set your computer on fire!”


  • I’m happy you provided a few examples. This is good for anyone else reading along.

    Equifax in 2017: the penalty was, let’s assume the worst case, $700M. The company made $3.3B in 2017, and I’d assume that was after the penalty, but even if it wasn’t, that’s a penalty of roughly 21% of revenue (700/3300). That actually seems like it would hurt.

    TSB in 2022: fined ~£48.6M by two separate agencies. TSB made £183.5M in revenue in 2022; still unclear if that was pre- or post-penalty, but this probably actually hurt.

    Uber in 2018: your link suggests Uber avoided any legal discovery that might have exposed their wrongdoing. There are no numbers in the linked article, and a search suggests the numbers are not public. Fuck that. A woman was killed by an AI-driven car, and the family deserves respect and privacy, but Uber DOES NOT. Because it’s not a public record, I can’t tell how much they paid out for the death of the victim, and since Uber is one of those modern venture-capital-loss-leader companies, this is hard to respond to.

    I’m out of time (and won’t likely be able to finish before the weekend, so I’m trying to wrap up). Boeing seems complicated; I’m more familiar with CrowdStrike, and I know they fucked up. In both cases, I’m not sure how much of a penalty they paid out relative to income.

    I’ll cede the point: there are some companies who have paid a price for making mistakes. When you’re talking companies, though, the only metric is money-paid/money-earned. I would really like there to be criminal penalties for leadership who chase profit over safety, so there’s a bit of ‘wishful thinking’ in my worldview. If you kill someone as a human being (or 300 people, Boeing), you end up with years in prison, but a company just pays 25% of its profit that year instead.

    I still think Cassandra is right, and that more often than not, software companies are not held responsible for their mistakes. And I think your other premise, ‘if software is better at something’, is doing a lot of work: software is good at explicit computation, such as math, but is historically incapable of empathy (a significant part of the original topic… I don’t want to be a number in a cost/benefit calculation). I don’t want software replacing a human in the loop.

    Back to my example of a Flock camera telling the police that a stolen car was identified… the software was just wrong. The police department didn’t admit any wrongdoing, and maaaaybe at some point the victim will be compensated for their suffering, but I expect Flock will not be on the hook for that. It will be the police department, which is funded by taxpayers.

    Reading your comments outside this thread, I think we would agree on a great many things and have interesting conversations. I didn’t intend to come across as snide, condescending, or arrogant. You made the initial point, Cassandra challenged you, and I agreed with them, so I jumped in where they seemed to have left off.

    The “bizarre emotional reaction” is probably that I despise AI and want it nowhere near any decision-making capability. I think that as we embed “AI” in software, we will find that real people are put at more risk and that software companies will be able to deflect blame when things go wrong.


  • The burden of proof is on you. Show me one example of a company being held liable (really liable, not a settlement/fine for a fraction of the money they made) for a software mistake that hurt people.

    The reality is that a company can make X dollars with software that makes mistakes, and then pay X/100 dollars when that hurts people and goes to court. That’s not a punishment, that’s a cost of doing business. And the company pays that fine, and the humans who made those decisions are shielded from further repercussions.

    When you said:

    the idea that the software vendor could not be held liable is farcical

    We need YOU to back that up. The rest of us have never seen it hold true.

    And it gets worse when the software vendor is a step removed: see Flock cameras making big mistakes. The software decided that a car was stolen, but it was wrong. The police intimidated an innocent civilian because the software was wrong. Not only were the police not held accountable, Flock was never even in the picture.




  • Super This:

    Organized, non-violent protests are not riots. They are people, en masse, using their freedom of speech to complain about something.

    A common issue is that some people, whether within the protest group or outside instigators, will then prod the protest into violence in order to discredit it. Two examples:

    • Police using rubber bullets/tear gas/pepper spray to disperse a lawful gathering. This escalates and adds tension. Not everyone is prepared to weather abuse and stay non-violent. Gassing a peaceful protest is going to make at least some of the protesters really mad, and is a pretty trivial way to turn a peaceful protest into something else and remove its message, making it just a “riot.”
    • Agitators claiming to be within the group, but who are actually against it, performing actions such as property damage or violence in order to discredit the whole event. If a non-violent march is walking down a street and some dick throws a rock through a store window and steals something, the whole march is called a riot by the media.

    It’s important that, if you are involved in a protest, you stay calm despite what is thrown your way. The protest is the message, and fighting back during that event only harms your message. Please do things like capture pictures/videos of people inciting violence, of police using crowd control on peaceful protesters, of generic unfair treatment; but during the event, the goal is to be calm. Afterwards, you can take all your grievances to the media. If you’ve been harmed during a protest, find a lawyer – many will work pro bono for cases like this, and if your first pick doesn’t… fuck 'em: name and shame – and then fight back after the event, when you have legal standing.

    Your grievances are real. Your pain is real. The people in power will use every trick to discredit your issues. Don’t give them ammo.


  • Thanks for your reply, and I can still see how it might work.

    I’m curious if you have any resources that do some end-to-end examples. This is where I struggle. If I have an atomic piece of code I need, I can maybe get it started with an LLM and finish it by hand, but anything larger just seems to always fail. So far the best video I found for a start-to-finish demo was this: https://www.youtube.com/watch?v=8AWEPx5cHWQ

    He spends plenty of time describing the tools and how to use them, but when we get to the actual work, we spend 20 minutes telling the LLM that it’s doing stuff wrong. There’s eventually a prototype, but to get there he had to alternate between ‘I still can’t jump’ and ‘here’s the new error.’ He eventually modified code himself, so even getting a ‘mario clone’ running required an actual developer, and the final result was underwhelming at best.

    For me, a ‘game’ is this tiny product that could be a viable unit. It doesn’t need to talk to other services, it just needs to react to user input. I want to see a speed-run of someone using LLMs to make a game that is playable. It doesn’t need to be “fun”, but the video above only got to the ‘player can jump and gets game over if hitting enemy’ stage. How much extra effort would it take to make the background not flat blue? Is there a win condition? How to refactor this so that the level is not hard-coded? Multiple enemy types? Shoot a fireball that bounces? Power Ups? And does doing any of those break jump functionality again? How much time do I have to spend telling the LLM that the fireball still goes through the floor and doesn’t kill an enemy when it hits them?

    I could imagine that if the LLM was handed a well described design document and technical spec that it could do better, but I have yet to see that demonstrated. Given what it produces for people publishing tutorials online, I would never let it handle anything business critical.

    The video is an hour long, and spends about 20 minutes in the middle actually working on the project. I probably couldn’t do better, but I’ve mostly forgotten my javascript and HTML canvas. If kaboom.js was my focus, though, I imagine I could knock out what he did in well under 20 minutes and have a better architected design that handled the above questions.

    I’ve, luckily, not yet been mandated to embed AI into my pseudo-developer role, but they are asking.


  • I think this is what will kill vibe coding, but not before there’s significant damage done. Junior developers will be let go and senior devs will be told they have to use these tools instead and to be twice as efficient. At some point enough major companies will have had data breaches through AI-generated code that they all go back to using people, but there will be tons of vulnerable code everywhere. And letting Cursor touch your codebase for a year, even with oversight, will make it really tricky to find all the places it subtly fucked up.


  • I have 3 questions, and I’m coming from a heavily AI-skeptic position, but am open:

    1. Do you believe that providing all that context, describing the existing patterns, creating an implementation plan, etc., allows the AI to write better code, and faster, than if you just did it yourself? To me, this just seems like you have to re-write your technical documentation in prose each time you want to do something. You are saying this is better than ‘Do XYZ’, but how much twiddling of your existing codebase do you need to do before an AI can understand the business context of it? I don’t currently do development on an existing codebase, but every time I try to get these tools to do something fairly simple from scratch, they just flail. Maybe I’m just not spending the hours to build my AI-parsable functional spec. Every time I’ve tried this, asking something as simple as (paraphrased for brevity) “write an Asteroids clone using JavaScript and HTML5 Canvas” results in a full failure, even with multiple retries chasing errors. I wrote something like that a few years ago to learn JavaScript, and it took me a day-ish to get something that mostly worked.

    2. Speaking of that context: are you running your models locally, or do you use some cloud service? If you give your entire codebase to a third party as context, how much of your company’s secret sauce have you disclosed? I’d imagine most sane companies are doing something to keep their models local, but we see regular news articles about how ChatGPT is training on user input and leaking sensitive data if you ask it nicely, and I can’t imagine all the pro-AI CEOs are aware of the risks here.

    3. How much pen-testing time are you spending on this code: error handling, edge cases, race conditions, data sanitization? An experienced dev understands these things innately, having fixed these kinds of issues in the past, and knows the anti-patterns and how to avoid them. In all seriousness, I think this is going to be the thing that actually kills AI vibe coding, but it won’t be fast enough. There will be tons of new exploits in what used to be solidly safe places. Your new web front-end? It has a really simple SQL injection hole (sketch below). Your phone app? You can tell it your username is admin’joe@google.com and it’ll let you order stuff for free since you’re an admin.
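
    To be concrete about what I mean by “a really simple SQL injection,” here’s a minimal sketch, assuming a Python/sqlite3 back end and a made-up users table purely for illustration. The vulnerable version glues user input into the SQL string; the boring fix is a parameterized query:

    ```python
    import sqlite3

    # Toy schema, purely for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.execute("INSERT INTO users VALUES ('joe@google.com', 0)")

    def is_admin_vulnerable(username: str) -> bool:
        # User input is pasted straight into the SQL, so the attacker can
        # rewrite the WHERE clause and comment out the admin check.
        query = f"SELECT 1 FROM users WHERE name = '{username}' AND is_admin = 1"
        return conn.execute(query).fetchone() is not None

    def is_admin_safe(username: str) -> bool:
        # Parameterized query: the driver treats the input as data, not SQL.
        query = "SELECT 1 FROM users WHERE name = ? AND is_admin = 1"
        return conn.execute(query, (username,)).fetchone() is not None

    payload = "nobody' OR is_admin = 0 --"
    print(is_admin_vulnerable(payload))  # True: injected clause matches joe
    print(is_admin_safe(payload))        # False: payload treated as a literal name
    ```

    An experienced dev writes the second version on autopilot; my worry is how often generated code looks like the first.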

    I see a place for AI-generated code: instant functions that land somewhere between simple and complex. “Hey claude, write a function to take a string and split it at the end of every sentence containing an uppercase A” (something like the sketch below). I had to write weird functions like that constantly as a sysadmin, and transforming data seems like a thing an AI could help me accelerate. I just don’t see that working on a larger scale, though, nor do I trust an AI enough to let it integrate a new function like that into an existing codebase.
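
    For what it’s worth, that example is small enough to just write by hand. Here’s a naive sketch, using my own lazy definition of a “sentence” (anything ending in ., ! or ?), which is exactly the kind of thing I’d want to compare an LLM’s attempt against:

    ```python
    import re

    def split_after_sentences_with_A(text: str) -> list[str]:
        """Split `text` at the end of every sentence containing an uppercase 'A'.

        Naive definition of a sentence: a run of characters ending in ., ! or ?
        (plus trailing whitespace). Sentences without an uppercase 'A' stay
        glued to whatever follows them.
        """
        # Chunk the text into sentences, keeping their trailing whitespace.
        sentences = re.findall(r"[^.!?]*[.!?]+\s*|[^.!?]+$", text)

        pieces, current = [], ""
        for sentence in sentences:
            current += sentence
            if "A" in sentence:          # uppercase 'A' only; 'a' doesn't count
                pieces.append(current)
                current = ""
        if current:
            pieces.append(current)
        return pieces

    print(split_after_sentences_with_A(
        "Alice went home. bob stayed. Ask me why! nobody did."
    ))
    # ['Alice went home. ', 'bob stayed. Ask me why! ', 'nobody did.']
    ```

    The edge cases (abbreviations, quotes, “Dr. A. Smith”) are the part I’d have to hash out with the LLM anyway.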


    I’d wager that the votes are irrelevant. Stack Overflow is generously <50% good code and is mostly people saying ‘this code doesn’t work – why?’, and that is the corpus these models were trained on.

    I’ve yet to see something like a vibe coding livestream where something got done. I can only find a lot of ‘tutorials’ that tell how to set up tools. Anyone want to provide one?

    I could… possibly… imagine a world where someone took quality code from a variety of sources and trained a model specific to a single language, and that model was able to generate good code, but I don’t think we have that.

    Vibe coders: Even if your code works and seems to be a success, do you know why it works, how it works? Does it handle edge cases you didn’t include in your prompt? Does it expose the database to someone smarter than the LLM? Does it grant an attacker access to the computer it’s running on, if they are smarter than the LLM? Have you asked your LLM how many 'r’s are in strawberry?

    At the very least, we will have a cybersecurity crisis due to vibe coding; especially since there seems to be a high likelihood of HR and Finance vibe coders who think they can do traditional IT/dev work without understanding what they are doing or how to do it safely.


    This is my fear. It’s still possible, barely, to buy a dumb TV. When my current fridge/dishwasher/stove/etc. dies in a few years, will there even be a dumb version? Will it cost 5x the price of a spyware version? How about my thermostat? HVAC? Car? And will attempting to disable any of this spyware land me in prison?

    Right now, uninformed/unaware/stupid people are affected by this. Pretty soon, everyone will be, or they will have to forego things we consider to be necessities now, like refrigeration and cell phones or be rich enough to buy the privacy-focused models.

    I can’t immediately find it, but I just saw another post about a new privacy-focused cellphone with a huge price tag. The established manufacturers have a cost advantage. Samsung et al. can easily make a new fridge with fewer consumer rights, but a new company will have to spend tons of capital to make a factory to put out a comparable product; and they won’t have the advantage of selling your data to subsidize the price.

    Privacy already is a commodity, and it will only become more of one unless we fight for it.


  • That new hire might eat resources, but they actually learn from their mistakes and gain experience. If you can’t hold on to them once they have experience, that’s a you problem. Be more capitalist and compete for their supply of talent; if you are not willing to pay for the real human, then you can have a shitty AI that will never grow beyond a ‘new hire.’

    The future problem, though, is that without the experience of being a junior dev, where do you think senior devs come from? Can’t fix crappy code if all you know how to do is engineer prompts to a new hire.

    “For want of a nail,” no one knew how to do anything in 2030. Doctors were AI, Programmers were AI, Artists were AI, Teachers were AI, Students were AI, Politicians were AI. Humanity suffered and the world suffocated under the energy requirements of doing everything poorly.