BED
from_pyr(pyr)
Convert a PyRanges object to a BED-like DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pyr | PyRanges | PyRanges object with at least the columns "Chromosome", "Start", and "End". | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | BED-like DataFrame with columns renamed to "chrom", "chromStart", and "chromEnd". |
Source code in python/seqpro/bed.py
read(path)
Read a BED-like (BED3+) file as a pandas DataFrame. The file type is inferred from the file extension; .bed, .narrowPeak, and .broadPeak are supported.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | PathLike | Path to the bed-like file. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | BED-like DataFrame with typed columns and zero-based coordinate metadata. |
Source code in python/seqpro/bed.py
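The extension-based inference described for `read` can be illustrated with a minimal sketch. This is not seqpro's source: `infer_format` and `COLUMNS` are hypothetical names, the column lists follow the standard UCSC/ENCODE format descriptions, and how `read` treats other extensions is not covered here.

```python
from pathlib import Path

# Column names from the UCSC/ENCODE format descriptions.
BED3 = ["chrom", "chromStart", "chromEnd"]
BED6 = BED3 + ["name", "score", "strand"]
PEAK_STATS = ["signalValue", "pValue", "qValue"]

# Hypothetical lookup table: format name -> expected columns.
COLUMNS = {
    "bed": BED3,  # minimum for a plain BED file; more columns may follow
    "narrowPeak": BED6 + PEAK_STATS + ["peak"],
    "broadPeak": BED6 + PEAK_STATS,
}


def infer_format(path):
    """Return the format name implied by the file extension."""
    suffix = Path(path).suffix.lstrip(".")
    if suffix not in COLUMNS:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return suffix


# Look up the expected columns for a narrowPeak file.
fmt = infer_format("peaks.narrowPeak")
print(fmt, COLUMNS[fmt])
```

Keeping the mapping in a single table makes it easy to see which formats share the BED6 prefix and which add peak statistics.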
set_schema(bed, to, from_=None)
Rename coordinate columns to match a target schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | | A polars or pandas DataFrame with genomic coordinate columns. | required |
| to | SchemaLike | Target schema: a shorthand string ("bed", "pb", "pr", "gtf") or a tuple of column names (chrom, start, end[, strand]). | required |
| from_ | SchemaLike \| None | Source schema hint. Auto-detected if not provided. | None |
Source code in python/seqpro/_coords.py
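As a rough illustration of schema-based renaming (not seqpro's source), the sketch below works on a plain list of column names. `SCHEMAS`, `resolve`, `detect`, and `rename_coords` are hypothetical helpers. Only the "bed" and "pr" shorthands are included, using the column names this page gives for BED and PyRanges; the names behind "pb" and "gtf" are not stated here, so they are left out. Auto-detection is approximated as "the first known schema whose columns are all present", which is an assumption.

```python
# Hypothetical shorthand table, limited to the names documented on this page.
SCHEMAS = {
    "bed": ("chrom", "chromStart", "chromEnd"),
    "pr": ("Chromosome", "Start", "End"),
}


def resolve(schema):
    """Accept a shorthand string or an explicit tuple of column names."""
    return SCHEMAS[schema] if isinstance(schema, str) else tuple(schema)


def detect(columns):
    """Assumed detection rule: first known schema fully present in columns."""
    for cols in SCHEMAS.values():
        if all(c in columns for c in cols):
            return cols
    raise ValueError("could not detect a known coordinate schema")


def rename_coords(columns, to, from_=None):
    """Return the column names with the coordinate columns renamed."""
    source = resolve(from_) if from_ is not None else detect(columns)
    target = resolve(to)
    # zip stops at the shorter tuple, so an optional strand entry is
    # renamed only when both schemas provide one.
    mapping = dict(zip(source, target))
    return [mapping.get(c, c) for c in columns]


print(rename_coords(["Chromosome", "Start", "End", "Name"], to="bed"))
```

Non-coordinate columns such as "Name" pass through unchanged, since only names present in the mapping are replaced.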
sort(bed)
Sort a BED-like DataFrame by chromosome, start, and end position, using the natural order of chromosome names (e.g. 1, 2, ..., 10, ...).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bed | FrameT | DataFrame with BED-format columns: "chrom", "chromStart", "chromEnd". Accepts polars or pandas DataFrames. | required |
Returns:
| Type | Description |
|---|---|
| FrameT | Sorted DataFrame of the same type as the input. |
Source code in python/seqpro/bed.py
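Natural ordering means numeric parts of a chromosome name compare as numbers, so "chr2" comes before "chr10". The sketch below shows one common way to build such a sort key in plain Python. It is not seqpro's source; `natural_key` and `sort_regions` are hypothetical names, and where names like "chrX" land relative to numbered chromosomes is a choice of this sketch that may differ from the library.

```python
import re


def natural_key(chrom):
    """Split a name into text and integer chunks for natural comparison."""
    key = []
    # re.split with a capturing group alternates text (even index)
    # and digit runs (odd index).
    for i, part in enumerate(re.split(r"(\d+)", chrom)):
        if not part:
            continue
        # Tag each chunk so integers and strings are never compared directly.
        key.append((0, int(part)) if i % 2 else (1, part))
    return key


def sort_regions(regions):
    """Sort (chrom, start, end) tuples by natural chromosome order, then position."""
    return sorted(regions, key=lambda r: (natural_key(r[0]), r[1], r[2]))


regions = [("chr10", 5, 9), ("chr2", 7, 8), ("chr2", 1, 4), ("chr1", 3, 6)]
print(sort_regions(regions))
```

A plain lexicographic sort would place "chr10" before "chr2", which is the behavior natural ordering avoids.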
to_pyr(bedlike)
Convert a BED-like DataFrame to a PyRanges object.
Warning
PyRanges automatically sorts the DataFrame by chromosome and start position, so the order of the regions may change after conversion. You can keep track of the original order by adding an index column before converting to a PyRanges object. After converting back to a DataFrame, you can sort the DataFrame by the index to get the original order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bedlike | | BED-like DataFrame (polars or pandas) with at least the columns "chrom", "chromStart", and "chromEnd". | required |
Returns:
| Type | Description |
|---|---|
| PyRanges | PyRanges object with columns renamed to "Chromosome", "Start", and "End". |
Source code in python/seqpro/bed.py
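The index-tracking approach from the warning for `to_pyr` can be sketched without any DataFrame library. In the plain-Python illustration below, a `sorted` call stands in for the reordering that the conversion applies, and the tuple layout is an assumption made for brevity. With real DataFrames the same idea is an extra integer column added before conversion and sorted on afterwards.

```python
regions = [("chr2", 100, 200), ("chr1", 500, 600), ("chr1", 50, 80)]

# Attach the original position to every region before any reordering step.
indexed = [(i, *region) for i, region in enumerate(regions)]

# Stand-in for the conversion's reordering: sort by chromosome, then start.
reordered = sorted(indexed, key=lambda r: (r[1], r[2]))

# Sorting by the saved index recovers the original order; then drop the index.
restored = [r[1:] for r in sorted(reordered, key=lambda r: r[0])]

print(reordered)
print(restored)
```

The index only needs to be unique and increasing, so a simple row counter is enough.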
with_len(bed, length)
Set the length of regions in a BED-like DataFrame to a fixed length by expanding or shrinking relative to the center (or peak) of the window. If the original region size + length is odd, the center will be 1 position closer to the right end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bed | FrameT | BED-like DataFrame with at least the columns "chromStart" and "chromEnd". | required |
| length | int | Desired length of the windows. Must be non-negative. | required |
Returns:
| Type | Description |
|---|---|
| FrameT | DataFrame of the same type as the input with updated "chromStart" and "chromEnd" columns. |
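
The centering arithmetic can be sketched for a single region as follows. This is one reading of the rounding rule described above, not seqpro's source: `resize_centered` is a hypothetical helper, it uses the midpoint of the region rather than a peak position, and it does not clip negative coordinates.

```python
def resize_centered(start, end, length):
    """Return (new_start, new_end) of the given length around the region's center."""
    if length < 0:
        raise ValueError("length must be non-negative")
    size = end - start
    # Move the start by ceil((size - length) / 2), computed with integer
    # floor division. When size + length is odd, rounding up pushes the
    # new window half a position toward the right end.
    shift = -((length - size) // 2)
    new_start = start + shift
    return new_start, new_start + length


print(resize_centered(0, 4, 2))    # shrink, size + length even
print(resize_centered(0, 5, 2))    # shrink, size + length odd
print(resize_centered(10, 20, 20)) # expand symmetrically
```

Computing the end as `new_start + length` guarantees every output window has exactly the requested length, whatever the rounding.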