• ArtVandelay@lemmy.world
    13 hours ago

    Data scientist here. DuckDB is the best. I use it all the time in notebooks when I have a data frame and want to query it with SQL. An in-memory DuckDB database to the rescue.

  • futatorius@lemm.ee
    1 day ago
    1. It’s pretty straightforward to install PostgreSQL and its GIS extensions. Maybe not one line, but within the abilities of any semi-experienced Linux user.

    2. If you want some visualization capability with your data, IBL Visual Weather (Go, Bratislava!) can also be made to be highly functional and performant, though it can be tricky to set up.

    3. There’s no mention of EDR in the DuckDB blurb (QGIS now has a semi-mature plugin for it). EDR is a newish OGC standard that lets you do multidimensional GIS queries in a sensible way. This is especially useful for environmental data, where you might want to query a large number of parameters in a region, a volume, or along a trajectory. Previous approaches to doing this in GIS systems were frustrating at best and, more often, nonexistent.

    4. My job involves wrangling metric shit-tons of geo data and I know literally nobody who uses DuckDB. I’ll have a look in my copious free time, but if its main selling point is ease of installation, that one-time benefit means next to nothing compared to getting the DB (and its visualization capabilities, if any) to actually store and manipulate data in a useful way.

    5. Having said all that, it’s nice that there are new entrants in the field. But please don’t make it end up like the situation with content management systems, where everyone thinks it’s a good idea to write a new CMS and 99% of them are crap.
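
To make point 3 a bit more concrete, here is a rough sketch of what EDR-style requests for a region and for a trajectory can look like. It assumes the server follows the OGC API - EDR conventions (query types such as `area` and `trajectory`, with `coords`, `parameter-name`, and `datetime` query parameters). The base URL, collection ID, and parameter names are hypothetical placeholders, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode

BASE = "https://example.org/edr"   # hypothetical server
COLLECTION = "surface-obs"         # hypothetical collection ID


def edr_url(query_type, coords_wkt, parameters, datetime_range):
    """Build an EDR-style query URL (sketch only, no request is made)."""
    query = urlencode(
        {
            "coords": coords_wkt,                    # geometry as WKT
            "parameter-name": ",".join(parameters),  # comma-separated list
            "datetime": datetime_range,              # start/end interval
        }
    )
    return f"{BASE}/collections/{COLLECTION}/{query_type}?{query}"


# Several parameters inside a region (polygon).
area = edr_url(
    "area",
    "POLYGON((16 48,18 48,18 49,16 49,16 48))",
    ["temperature", "wind_speed"],      # hypothetical parameter names
    "2024-01-01T00:00:00Z/2024-01-02T00:00:00Z",
)

# One parameter along a trajectory (line string).
trajectory = edr_url(
    "trajectory",
    "LINESTRING(17.1 48.1,16.4 48.2)",
    ["temperature"],
    "2024-01-01T00:00:00Z/2024-01-02T00:00:00Z",
)

print(area)
print(trajectory)
```

The point of the sketch is that the same request shape covers a region or a path, with only the query type and the geometry changing.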

  • jubilationtcornpone@sh.itjust.works
    1 day ago

    It seems like the singular benefit is that DuckDB (or similar OLAP models) can quickly handle lots of expensive read queries on large datasets.

    It’s not a replacement for a traditional RDBMS. I’ve never used it, so I don’t know if it’s worth the effort to maintain instead of just using a read-only Postgres instance to run analytics queries, but somehow I doubt it.

    My guess would be that it has a few very specific use cases where it can provide some added benefit. So I fully expect that, for the foreseeable future, it will be crammed forcefully into software projects where it provides no tangible benefit. Just like *cough* MongoDB *cough*.