
PyPlyne

A language for clean pipelines in Python

Write clean, readable data pipelines as small scripts that run with the Python code and tools you already use.

Install
uv add "pyplyne @ git+https://github.com/pyplyne-org/pyplyne.git"

Requires Python 3.13+. Use the Git source install until the PyPI package is published.

tables.pyplyne
df pipeline
summary = df sales
  |> group_by(region)
  |> summarize(total = sum(amount))
  |> arrange(total)
Start with the table named sales.

input

region  amount
north   110
south   165
north   200
west    70
south   90
west    120

output

region  amount
north   110
south   165
north   200
west    70
south   90
west    120
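To see what the full pipeline would compute from the input rows above, here is a plain-Python sketch. It does not use PyPlyne. It assumes that group_by, summarize, and arrange behave like their usual dataframe counterparts and that arrange sorts in ascending order; the helper name summarize_by_region is made up for this illustration.

```python
from collections import defaultdict

# The sales rows from the input table above.
sales = [
    {"region": "north", "amount": 110},
    {"region": "south", "amount": 165},
    {"region": "north", "amount": 200},
    {"region": "west", "amount": 70},
    {"region": "south", "amount": 90},
    {"region": "west", "amount": 120},
]


def summarize_by_region(rows):
    """Hypothetical stand-in for group_by(region) |> summarize(...) |> arrange(total)."""
    # group_by(region) + summarize(total = sum(amount))
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["amount"]
    summary = [{"region": region, "total": total} for region, total in totals.items()]
    # arrange(total): assumed to sort ascending by the total column
    return sorted(summary, key=lambda row: row["total"])


summary = summarize_by_region(sales)
for row in summary:
    print(row["region"], row["total"])
```

Under those assumptions the three regions come out ordered from the smallest total to the largest.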
records.pyplyne
seq pipeline
restock = seq orders
  |> filter(qty > 1)
  |> keep_fields(item)
  |> set_fields(buy = item == "pens")
Start with a list of order records.

input

item "notebook"  qty 1
item "coffee"    qty 3
item "pens"      qty 2

output

item "notebook"  qty 1
item "coffee"    qty 3
item "pens"      qty 2
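The record pipeline can be read as three list transformations. The sketch below mirrors it in plain Python without using PyPlyne. It assumes filter keeps matching records, keep_fields drops every other field, and set_fields adds a computed field; the function name restock_plan is made up for this illustration.

```python
# The order records from the input above.
orders = [
    {"item": "notebook", "qty": 1},
    {"item": "coffee", "qty": 3},
    {"item": "pens", "qty": 2},
]


def restock_plan(records):
    """Hypothetical stand-in for filter |> keep_fields |> set_fields."""
    # filter(qty > 1): keep only records ordered more than once
    kept = [r for r in records if r["qty"] > 1]
    # keep_fields(item): assumed to drop every field except item
    trimmed = [{"item": r["item"]} for r in kept]
    # set_fields(buy = item == "pens"): add a boolean field per record
    return [{**r, "buy": r["item"] == "pens"} for r in trimmed]


restock = restock_plan(orders)
for record in restock:
    print(record)
```

Under those assumptions the notebook order is dropped, and only the pens record ends up with buy set to True.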

A small language layer for everyday Python data work.

Run it once, embed it in Python, or keep a live session warm.

Write a pipeline file

Write a .pyplyne pipeline script and run it once.

Start with the simplest workflow: write clean pipeline code, import the Python helpers you already trust, then run it from the command line.

  • Use normal Python imports at the top of the file.
  • Write one readable pipeline for tables or records.
  • Run the script with the PyPlyne CLI.
Read the quickstart
pipeline.pyplyne
from pathlib import Path

sales = df read_csv(Path("sales.csv"))
summary = sales
  |> where(amount > 100)
  |> group_by(region)
  |> summarize(total = sum(amount))

print(summary)
summary
terminal
uv run pyplyne pipeline.pyplyne

Use it from Python

Run PyPlyne and get real Python objects back.

Run a short PyPlyne source string in one call, end it with the value you want, and read the live Polars DataFrame from the result.

  • Pass existing Python or Polars objects into the session.
  • Use a one-shot run when you do not need persistent state.
  • Read the value of the source string's final expression from result.result.
See the Python API
python
import polars as pl
from pyplyne import run

sales = pl.DataFrame([
    {"region": "north", "amount": 120},
    {"region": "south", "amount": 80},
])

result = run("""
summary = df sales
  |> where(amount > 100)
  |> select(region, amount)

summary
""", context={"sales": sales})

summary = result.result
print(summary)

Explore from VS Code

Run blocks interactively while editing a normal pipeline file.

The local VS Code extension gives `.pyplyne` files syntax highlighting, block execution, diagnostics, and a persistent PyPlyne session.

  • Load data once, then run the block you are currently editing.
  • Use the assignment command when you want to inspect the assigned value.
  • Jump between blocks with the default editor keybindings.
Set up VS Code
large_sales.pyplyne
sales = df read_csv("sales.csv")

large_sales = sales
  |> where(amount > 100)
  |> select(region, amount)

large_sales

Agent iteration loop

Let agents refine pipelines against warm data.

Keep imports, helpers, and loaded data in one PyPlyne session so an agent can send small snippets, read structured feedback, and keep improving the transformation.

  • Preload expensive setup once with --load.
  • Return JSON with result, diagnostics, traceback, and shape metadata.
  • Use stable source names so errors point back to each agent step.
See the agent loop
terminal
uv run pyplyne serve --port 8765 --load setup.pyplyne

uv run pyplyne send --json <<'PYPLYNE'
summary = df sales
  |> group_by(region)
  |> summarize(total = sum(amount))

summary
PYPLYNE
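On the agent side, each JSON reply has to be turned into a decision about what to try next. The sketch below shows one way that could look in Python. The key names result, diagnostics, traceback, and shape are taken from the bullet list above, but the exact payload layout is an assumption, and summarize_reply is a hypothetical helper, not part of PyPlyne.

```python
import json


def summarize_reply(raw: str) -> str:
    """Condense one JSON reply into a short status line for the agent.

    The payload layout is assumed, not taken from the PyPlyne docs.
    """
    reply = json.loads(raw)

    # A traceback means the snippet failed; surface its last line.
    traceback_text = reply.get("traceback")
    if traceback_text:
        return "error: " + traceback_text.strip().splitlines()[-1]

    # Diagnostics without a traceback are treated as warnings to act on.
    diagnostics = reply.get("diagnostics") or []
    if diagnostics:
        return "diagnostics: " + "; ".join(str(d) for d in diagnostics)

    # Otherwise report the result's shape when it is present.
    shape = reply.get("shape")
    if shape:
        return f"ok: result shape {tuple(shape)}"
    return "ok"


# Made-up replies, used only to exercise the helper.
ok_reply = json.dumps(
    {"result": [], "diagnostics": [], "traceback": None, "shape": [3, 2]}
)
failed_reply = json.dumps(
    {
        "result": None,
        "diagnostics": [],
        "traceback": "Traceback (most recent call last):\nNameError: name 'sales' is not defined",
        "shape": None,
    }
)

print(summarize_reply(ok_reply))
print(summarize_reply(failed_reply))
```

Checking for a traceback first keeps a failed step from being mistaken for an empty result.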