• 0 Posts
  • 3 Comments
Joined 3 years ago
Cake day: July 5th, 2023

  • LLM companies have argued they should get to ignore all copyright, and now that some of their own code has leaked, suddenly they care greatly about copyright.

    Anthropic itself has argued that digitizing and using the digitized copies to train models is fair use, so long as:

    • They don’t redistribute the physical copies they bought
    • They don’t allow an end user to retrieve the contents of any one specific work through the user interface (if you ask Claude to spit out the entire text of a copyrighted work used to train it, it is designed to resist copying too much out of a single work)

    So they don’t argue that copyright doesn’t count, exactly. They argue that copyright doesn’t prevent model training from ingesting an entire copyrighted work, as long as it’s done with so many other copyrighted works that any given original isn’t a huge contributor to the model or its outputs.

    There’s tension between their positions, but not so much that they would totally fall apart.



  • Arch’s package management is actually the ideal, in my opinion. You get official repositories for the stuff the distro maintainers want to officially support, a user-maintained AUR for other common packages, and the ability to build your own software with the Arch Build System while still letting pacman know where everything is. In a sense, the stuff in the official repositories has a privileged position, and you should be aware of the difference between the AUR and the official repositories, but you’re still always in control of what software is installed.

    The AUR packages and user-specific builds can be thought of as side loading, and the distinction can matter in some circumstances. So I’m ok with having another name for different installation/upgrade/update methods.
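
    To make that distinction concrete, here is a rough Python sketch that labels installed packages as "official" or "side-loaded". It assumes the one-name-per-line output of `pacman -Qqn` (native packages, found in a configured sync repository) and `pacman -Qqm` (foreign packages, such as AUR or local builds). The function names and the labels are made up for illustration, and "native" only means "official" if you haven't added third-party repositories to pacman.conf.

```python
import subprocess


def parse_package_list(output: str) -> set[str]:
    """Turn one-name-per-line text (as printed by `pacman -Qq...`) into a set."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def classify_packages(native_output: str, foreign_output: str) -> dict[str, str]:
    """Label each package by where pacman thinks it came from.

    The labels "official" and "side-loaded" are illustrative only:
    "native" really means "present in a configured sync repository".
    """
    labels = {name: "official" for name in parse_package_list(native_output)}
    # Foreign packages are installed but not found in any sync database.
    labels.update(
        {name: "side-loaded" for name in parse_package_list(foreign_output)}
    )
    return labels


def query_pacman(flag: str) -> str:
    """Run `pacman <flag>` and return its stdout (only works on an Arch system).

    check=False because pacman exits non-zero when a query matches nothing,
    for example when no foreign packages are installed.
    """
    result = subprocess.run(
        ["pacman", flag], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    labels = classify_packages(query_pacman("-Qqn"), query_pacman("-Qqm"))
    for name in sorted(labels):
        print(f"{labels[name]:12} {name}")
```

    Either way, pacman tracks both groups in the same local database, so the side-loaded packages are still fully under its control.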