Troubleshooting
PyPlyne diagnostics identify the phase where the problem happened. Start there, then match the error symptom.
| Phase | Meaning | First thing to check |
|---|---|---|
| parse | The source did not match the grammar. | Syntax near the caret, unsupported expression forms, or shape annotations on the wrong side. |
| compile | The source parsed, but the compiler rejected the pipeline semantics. | Shape contracts, verb families, placeholders, and pipeline source annotations. |
| runtime | Python or Polars raised while executing the compiled code. | Exception type, data shape, column names, files, dependencies, and Python tracebacks. |
First Pass Triage
Use text diagnostics when reading failures by hand:
uv run pyplyne send --expr 'rows |> where(amount > 0)'
Use JSON when another tool, editor, or test needs structured fields:
uv run pyplyne send --expr 'rows |> where(amount > 0)' --json
The JSON response includes ok, phase, error, traceback, shapes, and a
diagnostic object with fields such as error_type, message, line,
column, source, caret, hint, and display.
Failed server requests return a non-2xx HTTP status, but the response body still contains the useful JSON diagnostic. Clients and tests should read the body even when their HTTP library raises an error.
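A minimal sketch of a client that still reads the body on a non-2xx status, using only the Python standard library. The helper name fetch_diagnostic and the fake request are hypothetical; the server URL and route are not specified in this guide, so the example simulates a failed response instead of calling a real endpoint.

```python
import io
import json
import urllib.error


def fetch_diagnostic(open_request):
    """Run open_request() and return (status, payload) for 2xx and non-2xx alike.

    open_request is any zero-argument callable that returns an HTTP response,
    for example a lambda wrapping urllib.request.urlopen(...).
    """
    try:
        with open_request() as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the JSON body is still readable.
        return err.code, json.loads(err.read())


def fake_failed_request():
    """Stand-in for a failed server call (hypothetical payload values)."""
    body = json.dumps({"ok": False, "phase": "parse"}).encode("utf-8")
    raise urllib.error.HTTPError(
        "http://example.invalid/", 400, "Bad Request", {}, io.BytesIO(body)
    )


status, payload = fetch_diagnostic(fake_failed_request)
print(status, payload["phase"])
```

Other HTTP libraries expose the same idea differently, but most keep the response body available on the raised error or on the response object itself.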
Good debugging loop:
- Read phase and error_type first.
- Use source, line, column, and caret to identify the smallest failing expression.
- Check hint before reading the full traceback.
- Use shapes to confirm whether each name is seq, df, or scalar.
- Fix one contract at a time and rerun the smallest expression.
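The loop above can be sketched as a small helper that reads a parsed JSON response in the same order. The function name triage, the sample values (including the error type name), and the assumption that shapes is a name-to-shape mapping are all illustrative; only the field names come from this guide.

```python
def triage(payload):
    """Summarize a failed JSON response in the order the debugging loop suggests."""
    diag = payload.get("diagnostic") or {}
    # 1. Phase and error type first.
    lines = [f"{payload.get('phase')}: {diag.get('error_type')}: {diag.get('message')}"]
    # 2. Location of the smallest failing expression.
    if diag.get("line") is not None:
        lines.append(f"at line {diag['line']}, column {diag.get('column')}")
    # 3. The hint, before any traceback reading.
    if diag.get("hint"):
        lines.append(f"hint: {diag['hint']}")
    # 4. Known shapes (assumed here to be a name -> shape mapping).
    for name, shape in sorted((payload.get("shapes") or {}).items()):
        lines.append(f"shape: {name} is {shape}")
    return lines


# Hypothetical response; real values depend on the failing snippet.
sample = {
    "ok": False,
    "phase": "compile",
    "diagnostic": {
        "error_type": "CompileError",  # assumed name, for illustration only
        "message": "where is a df verb, but the current pipeline is seq",
        "line": 1,
        "column": 9,
        "hint": "start with df, or convert with to_table()",
    },
    "shapes": {"rows": "seq"},
}

summary = triage(sample)
for line in summary:
    print(line)
```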
For piped or generated source, you can pass a stable virtual source name so diagnostics point to something recognizable:
printf 'numbers = seq [1, 2, 3\n' \
| uv run pyplyne send --json --source-name scratch.pyplyne
Parse Errors
Parse errors mean PyPlyne could not read the source as valid PyPlyne syntax. Fix the source form before looking for type, data, or Polars issues.
Common parse symptoms:
- "syntax error" near the end of a line
  Cause: missing ], ), }, quote, or newline.
  Fix: balance delimiters and rerun the smallest statement.
- "shape annotations go on the right-hand side"
  Cause: a shape was written before the name.
  Fix: write name = seq expression or name = df expression.
- The caret is on Python syntax that works outside PyPlyne
  Cause: the expression uses syntax not in PyPlyne's current grammar, such as comprehensions, slices, if expressions, in, is, **, star arguments, or import *.
  Fix: move that logic into an imported Python helper, or rewrite with supported expressions and verbs.
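As a sketch of the last fix, logic that needs a comprehension or a conditional expression can live in an ordinary Python function, where the full Python grammar is available. The module name helpers.py and both function names are hypothetical, and how the helper gets imported into a PyPlyne session depends on your setup.

```python
# helpers.py (hypothetical module name)


def squares_of_evens(values):
    """Comprehension kept in plain Python rather than in pipeline source."""
    return [v * v for v in values if v % 2 == 0]


def label_amount(amount, threshold=100):
    """Conditional expression kept in plain Python rather than in pipeline source."""
    return "high" if amount > threshold else "low"


print(squares_of_evens([1, 2, 3, 4]))
print(label_amount(120))
```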
Shape Annotation On The Wrong Side
Use shape annotations on the right-hand side:
sales = df read_csv("sales.csv")
Not:
df sales = read_csv("sales.csv")
Compile Errors
Compile errors mean the syntax is readable, but the pipeline contracts do not make sense yet.
Common compile symptoms:
- "where is a df verb, but the current pipeline is seq"
  Cause: a table verb is running on sequence-shaped data.
  Fix: start with df, convert with to_table(), or use the matching sequence verb.
- "filter is a seq verb, but the current pipeline is df"
  Cause: a sequence verb is running on table-shaped data.
  Fix: start with seq, convert with to_rows(), or use the matching table verb.
- "requires a known seq pipeline" or "requires a known df pipeline"
  Cause: a pipeline starts from an arbitrary expression whose shape is unknown.
  Fix: seed the source with seq or df.
- "cannot mix _ with numbered placeholders"
  Cause: a callback uses _ and _1/_2 together.
  Fix: use only _ for one argument, or only numbered placeholders.
- "numbered placeholders must start at _1 and be consecutive"
  Cause: a callback skips placeholder numbers.
  Fix: use _1, _2, ... without gaps.
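The two placeholder rules can be illustrated with a small standalone checker. This is only a sketch of the rules as stated above, not PyPlyne's compiler logic: the function name check_placeholders is hypothetical, and the regex-based scan ignores details such as placeholders inside string literals.

```python
import re

# Matches a standalone "_" or "_<digits>" that is not part of a longer identifier.
PLACEHOLDER = re.compile(r"(?<![A-Za-z0-9_])_\d*(?![A-Za-z0-9_])")


def check_placeholders(callback_source):
    """Return None when placeholder use looks valid, else the matching message."""
    tokens = set(PLACEHOLDER.findall(callback_source))
    numbered = sorted(int(token[1:]) for token in tokens if len(token) > 1)
    if "_" in tokens and numbered:
        return "cannot mix _ with numbered placeholders"
    if numbered and numbered != list(range(1, len(numbered) + 1)):
        return "numbered placeholders must start at _1 and be consecutive"
    return None


print(check_placeholders("_1 + _2"))
print(check_placeholders("_ + _1"))
print(check_placeholders("_1 + _3"))
```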
Wrong Verb Family
where is a table verb. filter is a sequence verb.
rows = seq [{"amount": 120}]
rows |> filter(amount > 100)
rows = df [{"amount": 120}]
rows |> where(amount > 100)
If you need to cross shapes, use to_rows() or to_table():
rows = df [{"amount": 120}]
high_value_rows = rows
  |> to_rows()
  |> filter(amount > 100)
Missing Shape For A Pipeline Source
When a pipeline starts from an arbitrary expression, annotate it:
total = seq load_values()
  |> reduce(_1 + _2)
This tells PyPlyne which verb family should apply.
Runtime Errors
Runtime errors come from executing generated Python. The last line of the error usually gives the exception type and message; the traceback and PyPlyne diagnostic show the context that led there.
Common runtime symptoms:
- "TypeError: seq annotation expects iterable data"
  Cause: seq was applied to a scalar or non-iterable value.
  Fix: use iterable data, or wrap one item in a list.
- "TypeError: df annotation expects table-shaped data"
  Cause: df was applied to a scalar or other non-table value.
  Fix: use dictionaries, row lists, DataFrames, LazyFrames, or file/table reads.
- ColumnNotFoundError
  Cause: a Polars table expression referenced a missing column.
  Fix: check spelling, inspect the columns before this step, or move select(...) until after all needed columns are used.
- SchemaError, ShapeError, InvalidOperationError, or ComputeError
  Cause: Polars found a schema, dtype, shape, or computation mismatch.
  Fix: confirm the input schema, cast or parse values before the operation, and materialize lazy boundaries with collect() when needed.
- FileNotFoundError
  Cause: a file read is using the wrong path or working directory.
  Fix: run from the expected project root, use an absolute path, or check the path before read_csv, read_json, read_parquet, or read_excel.
- "NameError: name ... is not defined"
  Cause: a function, helper, or import is missing from the session.
  Fix: import the module, define the helper, or use a supported PyPlyne verb.
- "group_by(...) must be followed by summarize"
  Cause: a grouped table state was assigned or materialized too early.
  Fix: add summarize(...), or remove group_by(...) before assignment.
- Excel import or engine errors
  Cause: Excel support was not installed.
  Fix: install the optional Excel dependencies.
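For the FileNotFoundError case, one way to check a path before handing it to a read function is a small standard-library helper. The name resolve_input and the temporary project layout below are hypothetical; the sketch only shows resolving a relative path against an explicit root and failing early with the working directory in the message.

```python
import tempfile
from pathlib import Path


def resolve_input(path, root=None):
    """Resolve path against root (default: current working directory) and verify it."""
    base = Path(root) if root is not None else Path.cwd()
    candidate = Path(path)
    resolved = candidate if candidate.is_absolute() else base / candidate
    if not resolved.is_file():
        raise FileNotFoundError(
            f"{resolved} not found (working directory: {Path.cwd()})"
        )
    return resolved.resolve()


# Demonstration with a throwaway directory standing in for a project root.
with tempfile.TemporaryDirectory() as project_root:
    (Path(project_root) / "sales.csv").write_text("amount\n120\n")
    found = resolve_input("sales.csv", root=project_root)
    print(found.name)
    try:
        resolve_input("missing.csv", root=project_root)
    except FileNotFoundError as err:
        missing_message = str(err)
        print("missing input detected")
```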
seq Got A Scalar Or Mapping
seq expects iterable data. Wrap a single record in a list:
row = seq [{"item": "coffee", "qty": 3}]
If the data is table-shaped, use df instead:
row = df {"item": "coffee", "qty": 3}
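When the value comes from Python code and may be either a single record or a collection, a helper can normalize it before it reaches a seq annotation. This is a sketch under the assumption that a lone mapping or string should count as one item; the name as_sequence is hypothetical.

```python
from collections.abc import Iterable, Mapping


def as_sequence(value):
    """Wrap a single record or scalar in a list; return other iterables unchanged."""
    if isinstance(value, (Mapping, str, bytes)) or not isinstance(value, Iterable):
        return [value]
    return value


print(as_sequence({"item": "coffee", "qty": 3}))
print(as_sequence(42))
print(as_sequence([1, 2, 3]))
```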
Missing Column In A Table Verb
Bare names inside table verbs are Polars column references:
sales = df [{"amount": 120}]
sales |> where(missing > 0)
If this raises ColumnNotFoundError, inspect the available columns and use the
actual column name:
sales |> where(amount > 0)
If a column is created in mutate(...), keep the mutate(...) before any
where(...), select(...), group_by(...), or summarize(...) step that uses
the new column.
group_by Without summarize
group_by(...) creates a grouped table state. Follow it with summarize(...)
before assignment:
summary = sales
  |> group_by(region)
  |> summarize(total = sum(amount))
Excel Helpers Are Missing Dependencies
Excel support is optional:
uv sync --extra excel
Interactive Sessions
Persistent sessions keep variables, imports, shapes, and _ between requests.
That is useful, but stale state can make a new snippet look broken.
Use the REPL shape commands:
:vars
:shapes
With the session server, request JSON diagnostics:
uv run pyplyne send --expr 'rows |> where(amount > 0)' --json
Fix patterns for session confusion:
- A name exists but has the wrong shape: reassign it with the intended seq or df annotation, or start a fresh session.
- A previous failed run left later code confusing: check shapes; parse and compile failures do not execute the snippet, but runtime failures can leave earlier successful statements from the same snippet in env.
- Diagnostics refer to a generic session label: use --source-name for piped or expression source.
- Text output hides the full stack trace: use --json or run(..., raise_on_error=False) and inspect traceback.
Python API Diagnostics
By default, PyPlyneSession.run(...) raises errors. For applications, tests, and
agents, capture the result and inspect the raw Python result fields:
from pyplyne import PyPlyneSession
session = PyPlyneSession()
result = session.run("bad = seq 42\n", raise_on_error=False)
if not result.ok:
    print(result.phase)
    print(type(result.error).__name__, result.error)
    print(result.traceback)
    print(result.shapes)
The Python API result exposes phase, error, traceback, stdout,
stderr, result, and shapes. The HTTP JSON response adds the formatted
diagnostic object with error_type, line, column, caret, hint, and
display.
When turning a fix into a regression test, assert the specific failure mode instead of only checking that "something failed":
result = session.run("nonsense = seq 42\n", raise_on_error=False)
assert not result.ok
assert result.phase == "runtime"
assert type(result.error).__name__ == "TypeError"
assert "seq annotation expects iterable data" in str(result.error)
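If several regression tests repeat these four assertions, they can be folded into one helper. The sketch below assumes only the result fields described above (ok, phase, error); the helper name assert_failure is hypothetical, and a SimpleNamespace stands in for a real run result so the example works without a PyPlyne session.

```python
from types import SimpleNamespace


def assert_failure(result, phase, error_type, message_part):
    """Assert that a run result failed in a specific, named way."""
    assert not result.ok, "expected the snippet to fail"
    assert result.phase == phase, f"expected phase {phase!r}, got {result.phase!r}"
    actual_type = type(result.error).__name__
    assert actual_type == error_type, f"expected {error_type}, got {actual_type}"
    assert message_part in str(result.error), f"{message_part!r} not in error message"


# Stand-in for session.run(..., raise_on_error=False); values mirror the example above.
fake_result = SimpleNamespace(
    ok=False,
    phase="runtime",
    error=TypeError("seq annotation expects iterable data"),
)

assert_failure(fake_result, "runtime", "TypeError", "seq annotation expects iterable data")
print("failure mode matched")
```

The per-assertion messages make a changed failure mode visible in test output, for example a snippet that starts failing at compile time instead of at runtime.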