Simon Tournier
@zimoun@social.sciences.re  ·  activity timestamp 3 days ago

« Conda ≠ PyPI: Why Conda Is More Than a Package Manager »

From a scientific practitioner perspective (= verify, reuse, rebuild 2-5 years later), the key of package managers is: « Not just binaries: FHS-like user space inside environments ». Yeah! 🤩

About Conda 🤔 2 design choices seem to be limitations:

• Solver-driven consistency. Conda uses SAT-based dependency solvers.

• Building for the future. This ensures that binaries remain forward compatible.

🤔 An under-the-hood assumption seems wrong here. The SAT-based solver won’t remain consistent 2-5 years later.

Conda builds for the immediate future (2-6 months) which isn’t the timescale for scientific projects.
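
A toy sketch of this concern, not Conda's actual solver: a greedy "newest version in range" resolver run against two snapshots of a package index. The package names, version numbers and the `resolve` helper are all made up for illustration. The same spec gives a different environment once the index has moved on.

```python
def resolve(spec, index):
    """Pick the newest available version of each package within its range.

    spec:  {package: (min_version, max_version_exclusive or None)}
    index: {package: [available versions as tuples]}
    """
    solution = {}
    for name, (low, high) in spec.items():
        candidates = [v for v in index.get(name, [])
                      if v >= low and (high is None or v < high)]
        if not candidates:
            raise LookupError(f"no version of {name} satisfies the spec")
        solution[name] = max(candidates)
    return solution

# Hypothetical project spec with loose version ranges.
spec = {"numlib": ((1, 20), None), "plotlib": ((3, 0), (4, 0))}

# Hypothetical index at publication time, and a few years later
# (old numlib builds are gone, new releases have appeared).
index_then = {"numlib": [(1, 20), (1, 21)], "plotlib": [(3, 5)]}
index_later = {"numlib": [(1, 26), (2, 1)], "plotlib": [(3, 5), (3, 9)]}

then = resolve(spec, index_then)
later = resolve(spec, index_later)
print(then)
print(later)
```

Both runs satisfy the spec, yet they do not describe the same environment, which is the point about rebuilding years later.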

Scientific practitioner: STOP Conda! 🤒
Switch to #Nix or #Guix. 😍
The future will thank you. 😘

https://conda.org/blog/conda-is-not-pypi

d@nny disc@ mc²
@hipsterelectron@circumstances.run replied  ·  activity timestamp 2 days ago

@zimoun spack's approach is to generate new solutions for a checksum-based strong consistency which mirrors guix's graph through an ASP solver based upon the weaker consistency you describe. i think this approach reduces maintenance difficulty and makes packages maintained by a single team more generalizable without needing to individually consider the combinatorial relationship to all potential dependees

d@nny disc@ mc²
@hipsterelectron@circumstances.run replied  ·  activity timestamp 2 days ago

@zimoun i likely don't have enough free time to do this anytime soon but would very much like to look into somehow overlaying a solver-based graph onto the guix checksummed graph at some point

Simon Tournier
@zimoun@social.sciences.re replied  ·  activity timestamp 2 days ago

@hipsterelectron It’d be interesting.

🤔 Since Guix isn’t designed to be solver-based, I don’t know how it would be technically doable.

Roughly, the packages are defined inside a Scheme library, thus the “guix checksummed graph” is one specific revision (version) of this Scheme library.

A solver-based overlay would mean mixing different revisions of the same library. A difficult topic, IMHO.
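
A rough sketch of why mixing revisions is hard, assuming a simplified model in which each package's identity is a hash over its own source and the hashes of its inputs. This is not Guix's actual derivation hashing; the package names, source labels and the `graph_hashes` helper are placeholders.

```python
import hashlib

def graph_hashes(packages):
    """Map each package name to a hash covering its source and all inputs.

    packages: {name: (source_id, [input names])}; assumed acyclic.
    """
    cache = {}

    def visit(name):
        if name not in cache:
            source, inputs = packages[name]
            digest = hashlib.sha256(source.encode())
            # Fold in the hash of every input, so a change propagates upward.
            for dep in sorted(inputs):
                digest.update(visit(dep).encode())
            cache[name] = digest.hexdigest()
        return cache[name]

    for name in packages:
        visit(name)
    return cache

# Two hypothetical revisions of the package collection: only zlib differs.
revision_a = {
    "libc": ("libc-src-1", []),
    "zlib": ("zlib-src-1", ["libc"]),
    "python": ("python-src-1", ["libc", "zlib"]),
}
revision_b = dict(revision_a, zlib=("zlib-src-2", ["libc"]))

a = graph_hashes(revision_a)
b = graph_hashes(revision_b)
changed = sorted(name for name in a if a[name] != b[name])
print(changed)  # ['python', 'zlib']
```

In this model, swapping one package from another revision changes the identity of everything that depends on it, so an overlay cannot simply pick nodes from two graphs.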

Well, Guix provides a mechanism called “inferiors”: from one revision of the Scheme library, extract one specific package defined inside another revision and allow co-installation of packages from both revisions. But this is limited and scales poorly because of some barriers; see “a difficult topic”. 😁

All in all, I’d be very interested in your idea of some overlay to have a solver-based graph. 🤩

d@nny disc@ mc²
@hipsterelectron@circumstances.run replied  ·  activity timestamp 2 days ago

@zimoun i really deeply appreciate the effort you exerted in explaining these points to me. it is definitely clear why such an overlay would be nontrivial. i believe there is actually a similar issue that spack has in that its package.py descriptions exist in memory in python code, which is converted into a clingo logic program using its python API, but never in a serialized format (which would allow us to cache much of the process that contributes to solver invocation time, but this would be more complex than that implies). i am therefore rather impressed that there is any "inferior" mechanism at all, which spack cannot claim (although this isn't necessary, it's still interesting).

i have been working on representing the pip resolve in separate distinct layers for quite a while https://github.com/pypa/pip/issues/12921 and i am generally somewhat fascinated by representing dynamic processes such as autoconf configuration into other tools such as cargo--but i don't think that would be useful to guix, except inasmuch as it would enable structured overrides: so static and safe systems like guix would have a structured interface to plug in the user's dependencies, configurable systems like spack can link build configuration to solver values, and builds in other contexts can fall back on dynamic configuration like autoconf or the pip resolve.

so i have this mental model of configuration mapping, which is intended to represent package manager relationships within a build system. i think a lot of my study has been around these more dynamic relationships, but i think understanding how guix users navigate the graph (which is "static" in a sense, yet composed of scheme code) would be a useful exercise for me to understand its representations of software relationships.

once again thanks so much for your time. have always found guix contributors very helpful and supportive.

Konrad Hinsen
@khinsen@scholar.social replied  ·  activity timestamp 3 days ago

@zimoun Long-term consistency is one issue with the SAT solver approach, as every long-time conda user has probably found out.

But the SAT solver approach also has a more fundamental problem: there is no guarantee that a solution to all declared version constraints actually results in a working system.
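
A minimal illustration of that gap, using a brute-force search rather than a real SAT solver. The packages, the declared constraints and the `actually_works` ground truth are all hypothetical; the assumption is that one incompatibility was never recorded in the metadata.

```python
from itertools import product

# Hypothetical index: available versions per package.
versions = {"app": [1], "libfoo": [1, 2], "libbar": [1, 2]}

# Constraints as the package metadata declares them.
declared = [
    lambda s: s["libfoo"] >= 1,
    lambda s: s["libbar"] >= 1,
]

def actually_works(s):
    """Assumed ground truth: libfoo 2 breaks libbar 1, but nobody declared it."""
    return not (s["libfoo"] == 2 and s["libbar"] == 1)

names = list(versions)
candidates = [dict(zip(names, combo)) for combo in product(*versions.values())]

# Everything a solver would accept, judged only by declared metadata.
valid = [s for s in candidates if all(check(s) for check in declared)]
# Accepted solutions that nevertheless fail at run time.
broken = [s for s in valid if not actually_works(s)]

print(len(valid))
print(broken)
```

The solver is correct with respect to the metadata and still returns an environment that fails; only building and testing that exact combination would catch it.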

matrss
@matrss@mastodon.social replied  ·  activity timestamp 2 days ago

@khinsen agree, as long as install-time dependency resolution happens, testing packages is fundamentally impossible. @zimoun

Simon Tournier
@zimoun@social.sciences.re replied  ·  activity timestamp 3 days ago

@khinsen Yes. About the fundamental problem, see [1]. 😀

Then Conda users ask what’s the issue when installing “explicit packages”. Sadly, they haven’t read the Conda docs [2]:

« Since the solver is not involved, the dependencies of the explicit package(s) are not processed at all. This can leave the environment in an inconsistent state, which can be fixed by running conda update --all, for example. »

Somehow, Conda advocates are lying: either Conda is SAT-based and thus not future-proof, or Conda isn’t SAT-based and thus deals poorly with the situation.

Therefore, all scientific practitioners should know that Conda isn’t designed for scientific projects.

1: https://www.mancoosi.org/edos/algorithmic/#toc15
2: https://docs.conda.io/projects/conda/en/stable/dev-guide/deep-dives/solvers.html#explicit-package-installs

Filip Buric
@filipb@mathstodon.xyz replied  ·  activity timestamp 2 days ago

@zimoun @khinsen Not contesting the issue with SAT dep solving, but good practices would have the environment spec exported with exact package versions for later reproduction anyway, greatly limiting the SAT problem when recreating the env (mostly to a pick of OS + architecture build, if i understand correctly). One issue from my experience with recreating from such a YAML file [1] may be lack of package versions for the given target OS + arch. Another issue is old versions disappearing from the channels but this isn't specific to Conda. I don't recall the solver ever being a problem when reproducing, but I do tend to only use the same constellation of packages.
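
A small sketch of that workflow under simplifying assumptions: with exact pins there is nothing left to solve, only a lookup per platform, and the failure mode is a pinned version that is not published for the target. The `recreate` helper, the package names and the index contents are invented for illustration and do not reflect Conda's internals.

```python
def recreate(lock, index, platform):
    """Return the pinned environment, or raise if a pin is unavailable.

    lock:  {package: exact version string}
    index: {platform: {package: [version strings currently published]}}
    """
    published = index.get(platform, {})
    missing = sorted(f"{name}=={version}" for name, version in lock.items()
                     if version not in published.get(name, []))
    if missing:
        raise LookupError(f"not available for {platform}: {', '.join(missing)}")
    return dict(lock)

# Hypothetical exported environment with exact versions.
lock = {"aligner": "2.4.1", "statslib": "1.9.0"}

# Hypothetical channel contents per platform.
index = {
    "linux-64": {"aligner": ["2.4.1", "2.5.0"], "statslib": ["1.9.0"]},
    "osx-arm64": {"aligner": ["2.5.0"], "statslib": ["1.9.0"]},
}

same_env = recreate(lock, index, "linux-64")
try:
    recreate(lock, index, "osx-arm64")
    error = None
except LookupError as exc:
    error = str(exc)

print(same_env)
print(error)
```

On the original platform the pins reproduce the environment exactly; on the other platform the recreation fails outright instead of silently picking something else.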

I've my own frustrations with Conda, but with the prospect of a replacement in mind, I see 3 big pros with it, mainly having to do with convenience (at least when working on a project):

1. Since it's a general package manager, one can install not just Python libraries, but also various software. In the bioinformatics world particularly, this has been an immense help to set up a working heterogeneous processing pipeline easily.

2. On the surface it behaves like a drop-in replacement for Python native environments + PyPI, making the switch quite easy, while also offering useful extras like committed environment versions that can be rolled back.

3. It's completely userspace, meaning you don't need a back-and-forth with a server admin to set it up.

I would like to find something better than Conda but have limited time to experiment and learn a new system, which I'm sure is the situation for many. Guix does sound quite interesting though and is on my TODO list 😊 (I know it has some of the features I listed above).

[1] https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file

Samuel
@samuel@social.nihil.ws replied  ·  activity timestamp 3 days ago

The whole reason I got into #guix was to have something that actually solves the problem Conda claims to solve. With Conda, I've never had a project that, from beginning to end: (a) ran on multiple machines, (b) had a consistent environment where I could run my experiments, and (c) stayed like that.

I still need to convince my co-authors to switch to #guix (or even #nix), but I have hope we'll get there.
