Environment Path Analysis

Scope

This page integrates the maintained findings from the PyTorch dependency and environment-path evaluations. The source material came from:

the 2-node profiling evaluation with python3 -c "import torch"
the environment-path analysis for the same workload
the full-path profiling evaluation in the final experiment set

The goal is to explain why a simple import touches so many files and how to reduce unnecessary path fan-out without breaking the active runtime.

Workload Context

The maintained conclusions come from profiling-enabled runs that launched a Python environment through Copper and then executed a minimal python3 -c "import torch" workload. Although the user-visible workload is small, the runtime behavior is not. The interpreter, import system, package manager environment, dynamic loader, and native Torch dependency graph all participate in the startup path.

The corresponding profiling outputs showed the mounted environment dominating the lookup stream, especially under:

the environment root itself
lib/
lib/python3.12
lib/python3.12/site-packages
site-packages/torch

The maintained iter3 artifacts under docs/source/iter3 also preserved a full-path usage analysis for the active Conda environment. That analysis is useful because it complements the hot-path tables with a coarse “used versus available” estimate for the exact same workload family.

Why `import torch` Touches So Many Paths

The import is simple at the Python source level, but not at the runtime level. In one launch, the following all happen before user code does meaningful work:

Python interpreter startup
import-system path discovery under lib/python* and site-packages
package discovery inside torch and its transitive imports
dynamic-loader resolution for compiled extension modules
ROCm shared-library loading
repeated probes for optional or absent libraries and helper paths

In practice, the observed path fan-out is the combined effect of:

the selected python3 executable from PATH
Conda activation variables such as CONDA_PREFIX
Python import search rules and site.py processing
LD_LIBRARY_PATH search behavior for native libraries
PyTorch’s compiled ROCm dependency graph
repeated missing-path probes that are normal for loader startup

This is why a nominally simple import often fans out into:

Python interpreter startup work
standard-library discovery
site.py processing
package-directory walks inside site-packages
import of many Torch Python subpackages
loading of compiled extension modules
dynamic-loader resolution of large native dependency sets
optional-library probing that is expected to fail in many cases

Observed Path Classes

The full-path profiling run showed the hottest filesystem classes clearly. Across the four-rank cluster summary, the dominant classes were:

Path class	Total events	Meaning
`environment_prefix`	`5,656,710`	broad activity under the environment root and its parent directories
`python_stdlib`	`394,464`	interpreter startup and stdlib discovery
`torch_python_package`	`377,216`	Python-side Torch package import activity
`python_site_packages`	`327,640`	package-discovery traffic in the active environment
`torch_native_library`	`180,297`	compiled Torch and ROCm libraries loaded during startup
`missing_shared_library_probe`	`2,180`	optional or absent shared-library probes

Representative missing-path classes included:

libhsa-amd-aqlprofile64.so
python312.zip
glibc-hwcaps
pyvenv.cfg

These are usually normal probes rather than application bugs.

The profiling notes also showed heavy data reads from native libraries such as:

libtorch_cpu.so
libtorch_python.so
libamdhip64.so
libMIOpen.so
libmagma.so
librocblas.so
librocsolver.so
librocsparse.so

That pattern is consistent with a large GPU-enabled Torch stack rather than a small pure-Python package import.

Path Coverage in the Iter3 Environment Copy

The iter3 path-usage analysis compared the observed full-path outputs against the full existing path universe under the selected Conda environment root:

Measure	Value
All existing paths under the selected root	`44,262`
Existing files under the selected root	`42,028`
Existing directories under the selected root	`2,234`
Existing paths observed in the run	`2,350`
Missing probe paths	`559`
Existing paths not observed in the run	`41,912`

The same summary expressed that as coverage of the selected root:

Coverage metric	Value
Observed files	`1,982` of `42,028`
File coverage	`4.72%`
Observed directories	`368` of `2,235`
Directory coverage	`16.47%`

That result is useful, but it needs to be interpreted carefully. It does not mean the remaining roughly 95% of files are safe to delete in general. It means only that, in this same-app, same-node-count, same-configuration run, the observed import path touched a relatively small fraction of the total environment tree.

Operationally, the main value of this result is:

it shows that the active workload depends on a minority of the available file tree during this exact startup path
it supports using observed paths as an initial allowlist for cloned or filtered follow-up experiments
it argues for evidence-driven pruning rather than assuming the whole environment is equally active

Environment Variables That Matter Most

PATH: Chooses which python3 is launched. Once the interpreter comes from the Conda environment mounted through Copper, many later paths are derived from that prefix automatically.
CONDA_PREFIX: Anchors the active environment root, including bin, lib, and lib/python*/site-packages.
PYTHONPATH: Adds optional import roots. It is important, but it is not the whole story; Python still derives a large built-in search path from the interpreter and its install prefix.
VIRTUAL_ENV: Is often not the main driver for Conda-based runs, but Python still probes for virtual-environment style markers such as pyvenv.cfg while establishing its runtime layout.
LD_LIBRARY_PATH: Controls shared-library search order for compiled extensions and ROCm libraries. Duplicate or stale entries here can create large probe storms.
srun --export=ALL: Replicates the activated environment across all ranks, which is necessary for correctness but also multiplies import and loader discovery activity.

Path Sources by Subsystem

Different path classes come from different subsystems, so path reduction works best when those subsystems are considered separately.

Python import machinery contributes:

interpreter-prefix discovery
stdlib and lib-dynload walks
site-packages scanning
package and subpackage traversal for torch and its transitive imports

Environment activation contributes:

active environment prefixes from CONDA_PREFIX and related variables
path insertion in PATH
optional import roots from PYTHONPATH
propagated shell state when tasks are launched with full environment export

The dynamic loader contributes:

shared-library searches under the active environment
probing across LD_LIBRARY_PATH entries
optional-library probes for features that may not be installed
hardware-capability directory probes such as glibc-hwcaps

The iter3 path-class summary is consistent with that subsystem view. The largest path classes were:

environment_prefix with 2,828,400 events
python_stdlib with 197,232 events
torch_python_package with 188,608 events
python_site_packages with 163,484 events
torch_native_library with 57,503 events

Those totals show that most activity remains concentrated in a small set of environment, interpreter, package, and native-library regions rather than being evenly distributed across the full environment tree.

Why Missing Paths Repeat

The profiling data shows many repeated negative probes. This is expected for Python and shared-library startup:

the first lookup discovers a path is absent
later lookups ask for the same exact path again
Copper can serve that repeated miss from the metadata ENOENT TTL

This is why high TTL-serve counts are a positive signal. They mean Copper is collapsing repeated negative metadata work that the workload would otherwise reissue.

The version4 path-analysis note highlighted several repeated examples:

libhsa-amd-aqlprofile64.so
python312.zip
glibc-hwcaps
pyvenv.cfg

The iter3 artifacts preserved the same pattern in both the TTL top-path tables and the missing-probe lists. Representative repeated probe paths included:

.../torch/lib/libhsa-amd-aqlprofile64.so
.../lib/python312.zip
.../lib/glibc-hwcaps
.../conda_env/pyvenv.cfg
.../conda_env/bin/pyvenv.cfg

These should generally be interpreted as normal startup probes first and optimization opportunities second.

Pruning and Cleanup Guidance

The safest cleanup sequence is:

remove duplicate path entries first
remove obviously nonexistent path entries
remove stale environment or toolchain directories
only then experiment with a reduced or allowlist-based environment copy

The environment-path and full-path profiling evaluations support the following practical rules:

keep the active environment core intact first: environment root, bin, lib, lib/python*, and site-packages
prefer trimming duplicate or stale LD_LIBRARY_PATH entries before touching Torch library directories
prefer trimming duplicate or unnecessary PYTHONPATH additions before modifying the interpreter tree
treat python*.zip, pyvenv.cfg, and glibc-hwcaps as optimization hints, not as correctness failures

Minimization Priorities

The maintained guidance from the path-analysis work is to minimize the active environment in layers rather than trying to remove all path fan-out at once.

The safest order is:

eliminate duplicate path entries
eliminate obviously nonexistent path entries
remove stale toolchain or environment references that are no longer active
preserve the active runtime core while measuring again
only then consider more aggressive allowlist-style environment reduction

This approach keeps the debugging loop tied to observed profiling evidence instead of guessing which paths are safe to remove.

Operational Interpretation

The right question is usually not “why is Python probing so many files?” but “which of those probes are avoidable in the active environment?”

The maintained guidance from these evaluations is:

keep the active environment small and purpose-built
route only the necessary environment prefixes through Copper
keep the metadata ENOENT TTL enabled
use profiling outputs to identify duplicate, stale, or noisy environment paths before changing package contents