Quite frequently I come across scanned books that are viewable for free online. Sometimes the publisher put them there (preview chapters, for example), sometimes a library did (old books from its collection that are in the public domain), and so on. Since I like hoarding data, and the online viewers used to present the books are often not very practical, I frequently try to download the books one way or another. This requires toying with the “inspect element” tool and various other methods of getting at the images or the PDF. Now, all I access is what is, well, accessible; I don’t hack into the servers or anything. But the stuff is meant to be hidden from the normal user. Does that act of hiding the material, no matter how primitive and easily circumvented, mean that I’m not allowed to access it at all?

I suppose ripping a public domain book is no big deal, but would books under copyright fare differently?

Mainly I’m asking out of curiosity; I don’t expect the police to come visit me for ripping a 16th-century dictionary.

Note: I live in the EU, but I’d be curious to hear how this is treated elsewhere too.

Edit: I also remembered a funny trick I noticed on one site: it allows viewing PDFs on its website, but not downloading them unless you pay for the PDF. But when you load the page, even without paying, the PDF is already downloaded onto your computer and can be found in the browser cache. Is it legal to simply save the file that is already on your computer?

  • Vipsu@lemmy.world
    3 months ago

    According to big tech, it’s OK if you’re training a large language model with it.

    • lugal@lemmy.world
      3 months ago

      You’re confusing the law that applies to the ruling class with the one that applies to common people.

      • SlothMama@lemmy.world
        3 months ago

        Unironically, yes. You would not know who Spiderman was without viewing a copyrighted work demonstrating what he looks like, and now you understand why generative AI fundamentally has to ingest copyrighted works.