Merged
8 changes: 6 additions & 2 deletions README.md
@@ -129,11 +129,15 @@ Developers, please make pull requests to the https://github.com/elmbeech/physice


## Release Notes:
+ version 4.1.2 (2026-03-06): elmbeech/physicelldataloader
+ new **custom_data_astype** TimeStep class and TimeSeries class function to set the dtype of custom\_data variables even after the timestep or timeseries is loaded.
+ TimeSeries \_\_init\_\_ function can now handle a list of TimeStep objects as input instead of a path.

+ version 4.1.1 (2026-02-28): elmbeech/physicelldataloader
+ reduce memory footprint.
+ reduced memory footprint.

+ version 4.1.0 (2025-12-31): elmbeech/physicelldataloader
+ new TimeStep class and TimeSeries class function **get_spatialdata** and command line command **pcdl_get_spatialdata**.
+ new **get_spatialdata** TimeStep class and TimeSeries class function and **pcdl_get_spatialdata** command line command.

+ version 4.0.5 (2025-10-22): elmbeech/physicelldataloader
+ **settingxml** default is now set to False, because the cell\_type id label mapping can, in recent PhysiCell output, be retrieved from output\*.xml too.
2 changes: 2 additions & 0 deletions man/REFERENCE.md
@@ -39,6 +39,7 @@ Basically, there are four types of functions:

### TimeStep initialize
+ [help(pcdl.TimeStep.\_\_init\_\_)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.__init__.md) #! workhorse function
+ [help(mcds.custom_data_astype)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.custom_data_astype.md) #! workhorse function
+ [help(mcds.set_verbose_false)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.set_verbose_false.md) #! workhorse function
+ [help(mcds.set_verbose_true)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.set_verbose_true.md) #! workhorse function

@@ -124,6 +125,7 @@ Basically, there are four types of functions:
+ [help(mcdsts.read_mcds)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.read_mcds.md)
+ [help(mcdsts.get_mcds_list)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.get_mcds_list.md) #! workhorse function
+ [help(mcdsts.get_annmcds_list)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.get_annmcds_list.md) #! workhorse function
+ [help(mcdsts.custom_data_astype)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.custom_data_astype.md) #! workhorse function
+ [help(mcdsts.set_verbose_false)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.set_verbose_false.md) #! workhorse function
+ [help(mcdsts.set_verbose_true)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.set_verbose_true.md) #! workhorse function

3 changes: 2 additions & 1 deletion man/TUTORIAL_python3_timestep.md
@@ -33,7 +33,8 @@ python3 -c"import pathlib, pcdl, shutil; pcdl.install_data(); s_ipath=str(pathli
By default, all data related to the snapshot is loaded.
For speed and less memory usage, it is however possible to only load the essential (output xml and cell mat data),
and exclude microenvironment, graph data, PhysiBoss data, and the PhysiCell\_settings.xml cell type ID label mapping. \
For custom\_data variables it is possible to specify data types, apart from the generic float type, namely: int, bool, and str. \
For custom\_data variables it is possible to specify data types, apart from the generic float type, namely: int, bool, and str.
This can also be done after the data is loaded, using the mcds.custom\_data\_astype function. \
For paths, in general, unix (slash) and windows (backslash) notation will work.
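
What a per-variable type mapping does can be sketched without pcdl. In this sketch a plain dictionary of lists stands in for the loaded cell table, and `cell_table` and `cast_columns` are hypothetical names chosen only for illustration. In pcdl itself the equivalent call on a loaded time step is `mcds.custom_data_astype({'sample': bool})`.

```python
# Hypothetical stand-in for a loaded cell table: column name -> float values.
# pcdl stores a pandas DataFrame instead; this plain table is an assumption.
cell_table = {
    'sample': [0.0, 1.0, 1.0],
    'generation': [2.0, 3.0, 5.0],
    'volume': [2494.0, 2510.5, 2488.1],
}


def cast_columns(table, custom_data_type):
    """Return a copy of table with the listed columns cast to the given types."""
    casted = dict(table)
    for column, dtype in custom_data_type.items():
        casted[column] = [dtype(value) for value in table[column]]
    return casted


# Columns that are not listed in the mapping keep the generic float type.
typed_table = cast_columns(cell_table, {'sample': bool, 'generation': int})
print(typed_table['sample'])      # [False, True, True]
print(typed_table['generation'])  # [2, 3, 5]
print(typed_table['volume'])      # [2494.0, 2510.5, 2488.1]
```

Because the cast only touches the columns named in the dictionary, it can be applied at load time or afterwards with the same result.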

The basic way to load a mcds time step object:
2 changes: 1 addition & 1 deletion man/docstring/mcds.__init__.md
@@ -11,7 +11,7 @@
relative or absolute path to the directory where
the PhysiCell output files are stored.

custom_dtype: dictionary; default is {}
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
26 changes: 26 additions & 0 deletions man/docstring/mcds.custom_data_astype.md
@@ -0,0 +1,26 @@
# mcds.custom_data_astype()


## input:
```
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
bool as Boolean, and str as categorical data.

```

## output:
```
self.data['cell']['df_cell']:
the dtype of columns as specified in the custom_data_type dictionary.

```

## description:
```
function to set the dtype of custom_data variables,
even after the data is loaded.

```
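
The input section above says that float and int are handled as numeric, bool as Boolean, and str as categorical data. A minimal sketch of that grouping follows. The helper `downstream_kind` and the variable names `generation` and `label` are hypothetical and not part of pcdl; only `sample` appears in the test suite.

```python
def downstream_kind(dtype):
    """Map a custom_data dtype to the kind of data it is treated as downstream."""
    # bool is tested first because bool is a subclass of int in Python
    if dtype is bool:
        return 'boolean'
    if dtype in (int, float):
        return 'numeric'
    if dtype is str:
        return 'categorical'
    raise ValueError(f'unsupported custom_data dtype: {dtype}')


# hypothetical mapping in the {var: dtype, ...} form described above
custom_data_type = {'sample': bool, 'generation': int, 'label': str}
kinds = {var: downstream_kind(dtype) for var, dtype in custom_data_type.items()}
print(kinds)
```

Variables that are absent from the dictionary stay float and therefore fall into the numeric group.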
5 changes: 3 additions & 2 deletions man/docstring/mcdsts.__init__.md
@@ -3,9 +3,10 @@

## input:
```
output_path: string, default '.'
output_path: string or list of mcds objects, default '.'
relative or absolute path to the directory where
the PhysiCell output files are stored.
the PhysiCell output files are stored or
a list of mcds timestep objects.

custom_data_type: dictionary; default is {}
variable to specify custom_data variable types
26 changes: 26 additions & 0 deletions man/docstring/mcdsts.custom_data_astype.md
@@ -0,0 +1,26 @@
# mcdsts.custom_data_astype()


## input:
```
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
bool as Boolean, and str as categorical data.

```

## output:
```
mcds.data['cell']['df_cell'] of every time step in self.l_mcds:
the dtype of columns as specified in the custom_data_type dictionary.

```

## description:
```
function to set the dtype of custom_data variables,
even after the data is loaded.

```
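
For a time series, the same dictionary is passed on to every time step. The sketch below mirrors that delegation with two hypothetical stand-in classes, `StepSketch` and `SeriesSketch`. They are not the pcdl classes and store plain lists instead of DataFrames.

```python
class StepSketch:
    """Hypothetical stand-in for one loaded time step."""

    def __init__(self, table):
        self.table = table  # column name -> list of values

    def custom_data_astype(self, custom_data_type=None):
        # cast only the columns named in the mapping, in place
        for column, dtype in (custom_data_type or {}).items():
            self.table[column] = [dtype(value) for value in self.table[column]]


class SeriesSketch:
    """Hypothetical stand-in for a time series built from a list of steps."""

    def __init__(self, steps):
        self.steps = list(steps)

    def custom_data_astype(self, custom_data_type=None):
        # forward the same mapping to each time step
        for step in self.steps:
            step.custom_data_astype(custom_data_type=custom_data_type)


series = SeriesSketch([
    StepSketch({'sample': [0.0, 1.0]}),
    StepSketch({'sample': [1.0, 1.0]}),
])
series.custom_data_astype({'sample': bool})
print([step.table['sample'] for step in series.steps])  # [[False, True], [True, True]]
```

Keeping the cast on the time step and only looping in the series means both levels share one implementation of the type conversion.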
8 changes: 8 additions & 0 deletions man/scarab.py
@@ -127,6 +127,10 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'):
ls_doc = pcdl.TimeStep.__init__.__doc__.split('\n'),
s_header = "mcds = pcdl.TimeStep('path/to/outputnnnnnnnn.xml')"
)
docstring_md(
s_function = 'mcds.custom_data_astype',
ls_doc = pcdl.TimeStep.custom_data_astype.__doc__.split('\n'),
)
docstring_md(
s_function = 'mcds.set_verbose_false',
ls_doc = pcdl.TimeStep.set_verbose_false.__doc__.split('\n'),
@@ -332,6 +336,10 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'):
ls_doc = pcdl.TimeSeries.__init__.__doc__.split('\n'),
s_header = "mcdsts = pcdl.TimeSeries('path/to/output')"
)
docstring_md(
s_function = 'mcdsts.custom_data_astype',
ls_doc = pcdl.TimeSeries.custom_data_astype.__doc__.split('\n'),
)
docstring_md(
s_function = 'mcdsts.get_xmlfile_list',
ls_doc = pcdl.TimeSeries.get_xmlfile_list.__doc__.split('\n'),
2 changes: 1 addition & 1 deletion pcdl/VERSION.py
@@ -1 +1 @@
__version__ = '4.1.1'
__version__ = '4.1.2'
84 changes: 60 additions & 24 deletions pcdl/timeseries.py
@@ -164,9 +164,10 @@ class TimeSeries:
def __init__(self, output_path='.', custom_data_type={}, load=True, microenv=True, graph=True, physiboss=True, settingxml=False, verbose=True):
"""
input:
output_path: string, default '.'
output_path: string or list of mcds objects, default '.'
relative or absolute path to the directory where
the PhysiCell output files are stored.
the PhysiCell output files are stored or
a list of mcds timestep objects.

custom_data_type: dictionary; default is {}
variable to specify custom_data variable types
@@ -208,31 +209,67 @@ def __init__(self, output_path='.', custom_data_type={}, load=True, microenv=Tru
TimeSeries.__init__ generates a class instance; the instance offers
functions to process all time steps in the output_path directory.
"""
output_path = output_path.replace('\\','/')
while (output_path.find('//') > -1):
output_path = output_path.replace('//','/')
if (output_path.endswith('/')) and (len(output_path) > 1):
output_path = output_path[:-1]
if not os.path.isdir(output_path):
sys.exit(f'Error @ TimeSeries.__init__ : this is not a path! could not load {output_path}.')
self.path = output_path
# bue 2022-10-22: is output*.xml always the correct pattern? i could add initial or final step.
self.ls_xmlfile = [s_pathfile.replace('\\','/').split('/')[-1] for s_pathfile in sorted(glob.glob(self.path + f'/output*.xml'))]
if (len(self.ls_xmlfile) == 0):
sys.exit(f'Error @ TimeSeries.__init__ : could not detect any output*.xml! is the given output_path correct? {output_path}')
self.custom_data_type = custom_data_type
self.microenv = microenv
self.graph = graph
self.physiboss = physiboss
self.settingxml = settingxml
# set generic variables
self.verbose = verbose
if load:
self.read_mcds()
else:
self.l_mcds = None
self.l_annmcds = None
self.l_sdmcds = None

# load mcds timeseries from a list of mcds timesteps
if isinstance(output_path, list):
self.ls_xmlfile = None
self.l_mcds = output_path
self.custom_data_type = None
self.microenv = None
self.graph = None
self.physiboss = None
self.settingxml = None

# load mcds timeseries from physicell output directory
else:
output_path = output_path.replace('\\','/')
while (output_path.find('//') > -1):
output_path = output_path.replace('//','/')
if (output_path.endswith('/')) and (len(output_path) > 1):
output_path = output_path[:-1]
if not os.path.isdir(output_path):
sys.exit(f'Error @ TimeSeries.__init__ : this is not a path! could not load {output_path}.')
self.path = output_path
# bue 2022-10-22: is output*.xml always the correct pattern? i could add initial or final step.
self.ls_xmlfile = [s_pathfile.replace('\\','/').split('/')[-1] for s_pathfile in sorted(glob.glob(self.path + f'/output*.xml'))]
if (len(self.ls_xmlfile) == 0):
sys.exit(f'Error @ TimeSeries.__init__ : could not detect any output*.xml! is the given output_path correct? {output_path}')
self.custom_data_type = custom_data_type
self.microenv = microenv
self.graph = graph
self.physiboss = physiboss
self.settingxml = settingxml
if load:
self.read_mcds()
else:
self.l_mcds = None


def custom_data_astype(self, custom_data_type={}):
"""
input:
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
bool as Boolean, and str as categorical data.

output:
mcds.data['cell']['df_cell'] of every time step in self.l_mcds:
the dtype of columns as specified in the custom_data_type dictionary.

description:
function to set the dtype of custom_data variables,
even after the data is loaded.
"""
# variable typing
for mcds in self.get_mcds_list():
mcds.custom_data_astype(custom_data_type=custom_data_type)


def set_verbose_false(self):
"""
@@ -1826,4 +1863,3 @@ def get_sdmcds_list(self):
function returns a binding to the self.l_sdmcds list of spdata mcds objects.
"""
return self.l_sdmcds

23 changes: 22 additions & 1 deletion pcdl/timestep.py
@@ -491,7 +491,7 @@ def __init__(self, xmlfile, output_path='.', custom_data_type={}, microenv=True,
relative or absolute path to the directory where
the PhysiCell output files are stored.

custom_dtype: dictionary; default is {}
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
@@ -545,6 +545,27 @@ def __init__(self, xmlfile, output_path='.', custom_data_type={}, microenv=True,
self.data = self._read_xml(xmlfile, output_path)


def custom_data_astype(self, custom_data_type={}):
"""
input:
custom_data_type: dictionary; default is {}
variable to specify custom_data variable types other than
floats (namely: int, bool, str) like this: {var: dtype, ...}.
downstream float and int will be handled as numeric,
bool as Boolean, and str as categorical data.

output:
self.data['cell']['df_cell']:
the dtype of columns as specified in the custom_data_type dictionary.

description:
function to set the dtype of custom_data variables,
even after the data is loaded.
"""
# variable typing
self.data['cell']['df_cell'] = self.data['cell']['df_cell'].astype(custom_data_type)


def set_verbose_false(self):
"""
input:
8 changes: 4 additions & 4 deletions pyproject.toml
@@ -71,18 +71,18 @@ classifiers = [
# bue 2024-12-06: enforcing some versions
dependencies = [
"anndata>=0.10.8",
"bioio>=1.2.1", # needs numpy < 2.0.0
"bioio>=2.0.0",
"bioio-ome-tiff",
"geopandas>=0.14", # spatialdata==0.6.0
"geopandas>=0.14", # spatialdata
"matplotlib",
"neuroglancer",
"numpy",
"pandas>=2.2.2",
"requests",
"scikit-image>=0.24.0",
"scipy>=1.13.0",
"shapely>=2.0.1", # spatialdata==0.6.0
"spatialdata>=0.6.0",
"shapely>=2.0.1", # spatialdata
"spatialdata>=0.7.2",
"vtk",
]

11 changes: 11 additions & 0 deletions test/test_timeseries_2d.py
@@ -119,6 +119,12 @@ def test_mcdsts_make_movie_jpeg6(self, mcdsts=mcdsts):
class TestTimeSeriesInit(object):
''' tests for loading a pcdl.TimeSeries data set. '''

def test_mcdsts_custom_data_astype(self):
mcdsts = pcdl.TimeSeries(s_path_2d, load=True, verbose=False)
mcdsts.custom_data_astype({'sample': bool})
assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
(mcdsts.l_mcds[-1].data['cell']['df_cell']['sample'].dtype == bool)

def test_mcdsts_set_verbose_true(self):
mcdsts = pcdl.TimeSeries(s_path_2d, load=False, verbose=False)
mcdsts.set_verbose_true()
@@ -177,6 +183,11 @@ def test_mcdsts_read_mcds_xmlfilelist(self):
(len(mcdsts.l_mcds) == 3) and \
(mcdsts.l_mcds == l_mcds)

def test_mcdsts_from_list_of_mcds(self):
mcdsts = pcdl.TimeSeries([])
assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
(len(mcdsts.l_mcds) == 0)


## micro environment related functions ##

11 changes: 11 additions & 0 deletions test/test_timeseries_3d.py
@@ -46,6 +46,12 @@
class TestTimeSeries3dInit(object):
''' tests for loading a pcdl.TimeSeries data set. '''

def test_mcdsts_custom_data_astype(self):
mcdsts = pcdl.TimeSeries(s_path_3d, load=True, verbose=False)
mcdsts.custom_data_astype({'sample': bool})
assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
(mcdsts.l_mcds[-1].data['cell']['df_cell']['sample'].dtype == bool)

def test_mcdsts_set_verbose_true(self):
mcdsts = pcdl.TimeSeries(s_path_3d, load=False, verbose=False)
mcdsts.set_verbose_true()
@@ -104,6 +110,11 @@ def test_mcdsts_read_mcds_xmlfilelist(self):
(len(mcdsts.l_mcds) == 3) and \
(mcdsts.l_mcds == l_mcds)

def test_mcdsts_from_list_of_mcds(self):
mcdsts = pcdl.TimeSeries([])
assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
(len(mcdsts.l_mcds) == 0)


## micro environment related functions ##
