Instead of using Character.AI, which will send all my private conversations to governments, I found this solution. Any thoughts? 😅

  • fishynoob@infosec.pub · 1 year ago

    I was going to buy the Arc B580 when it comes back down in price, but with the tariffs I don’t think I’ll ever see it at MSRP. Even the used market is very expensive. I’ll probably hold off on buying GPUs for a few more months, until I can afford the higher prices or something changes. Thanks for the Lexi V2 suggestion.

    • Naz@sh.itjust.works · 1 year ago

      If you are running CPU-only, you need to look at very small models or 2-bit quants.
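      As a concrete sketch of that suggestion: with llama.cpp (one common way to run quantized models on a CPU), loading a 2-bit K-quant might look like the command below. The model filename, thread count, and prompt are placeholders, not details from this thread.

```shell
# Hypothetical example: running a 2-bit (Q2_K) GGUF quant on the CPU
# with llama.cpp. The model path below is a placeholder.
# A Q2_K quant shrinks a model to roughly a quarter of its fp16 size,
# often small enough to fit in system RAM.
./llama-cli -m models/model.Q2_K.gguf \
    --threads 8 \
    -n 128 \
    -p "Hello"
```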

      Everything will be extremely slow otherwise:

      GPU:

      Loaded power: 465 W

      Speed: 18.5 tokens/second

      CPU:

      Loaded power: 115 W

      Speed: 1.60 tokens/second

      GPUs deliver roughly 3 times the tokens per watt, and more than 10 times the raw speed.
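      The per-watt comparison can be sanity-checked with a quick calculation from the figures quoted above (my own sketch, not part of the original post):

```python
# Perf-per-watt check using the numbers from the comment above.
gpu_power_w, gpu_tok_s = 465, 18.5
cpu_power_w, cpu_tok_s = 115, 1.60

gpu_eff = gpu_tok_s / gpu_power_w   # tokens per second per watt
cpu_eff = cpu_tok_s / cpu_power_w

print(f"GPU: {gpu_eff:.4f} tok/s/W")                             # 0.0398
print(f"CPU: {cpu_eff:.4f} tok/s/W")                             # 0.0139
print(f"GPU advantage per watt: {gpu_eff / cpu_eff:.1f}x")       # 2.9x
print(f"GPU advantage outright: {gpu_tok_s / cpu_tok_s:.1f}x")   # 11.6x
```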

      • fishynoob@infosec.pub · 1 year ago

        Yeah, I’m not going to run them on a CPU; that wouldn’t perform very well. I’ll buy the GPUs when I can.