PhysiCell-Tools · elmbeech · Mar 7, 2026 · Mar 1, 2026 · Mar 7, 2026 · Mar 7, 2026
diff --git a/README.md b/README.md
@@ -129,11 +129,15 @@ Developers, please make pull requests to the https://github.com/elmbeech/physice
 
 
 ## Release Notes:
++ version 4.1.2 (2026-03-06): elmbeech/physicelldataloader
+    + new **custom_data_astype** TimeStep class and TimeSeries class function to set the dtype of custom\_data variables after the time step or time series is loaded.
+    + TimeSeries \_\_init\_\_ function can now handle a list of TimeStep objects as input instead of a path.
+
 + version 4.1.1 (2026-02-28): elmbeech/physicelldataloader
-    + reduce memory footprint.
+    + reduced memory footprint.
 
 + version 4.1.0 (2025-12-31): elmbeech/physicelldataloader
-    + new TimeStep class and TimeSeris class function **get_spatialdata** and command line command **pcdl_get_spatialdata**.
+    + new **get_spatialdata** TimeStep class and TimeSeries class function and **pcdl_get_spatialdata** command line command.
 
 + version 4.0.5 (2025-10-22): elmbeech/physicelldataloader
     + **settingxml** default is now set to False, because the cell\_type id label mapping can, in recent PhysiCell output, be retrieved from output\*.xml too.

diff --git a/man/REFERENCE.md b/man/REFERENCE.md
@@ -39,6 +39,7 @@ Basically, there are four types of functions:
 
 ### TimeStep initialize
 + [help(pcdl.TimeStep.\_\_init\_\_)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.__init__.md)  #! workhorse function
++ [help(mcds.custom_data_astype)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.custom_data_astype.md)  #! workhorse function
 + [help(mcds.set_verbose_false)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.set_verbose_false.md)  #! workhorse function
 + [help(mcds.set_verbose_true)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcds.set_verbose_true.md)  #! workhorse function
 
@@ -124,6 +125,7 @@ Basically, there are four types of functions:
 + [help(mcdsts.read_mcds)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.read_mcds.md)
 + [help(mcdsts.get_mcds_list)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.get_mcds_list.md)  #! workhose function
 + [help(mcdsts.get_annmcds_list)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.get_annmcds_list.md)  #! workhose function
++ [help(mcdsts.custom_data_astype)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.custom_data_astype.md)  #! workhorse function
 + [help(mcdsts.set_verbose_false)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.set_verbose_false.md)  #! workhorse function
 + [help(mcdsts.set_verbose_true)](https://github.com/elmbeech/physicelldataloader/tree/master/man/docstring/mcdsts.set_verbose_true.md)  #! workhorse function
 

diff --git a/man/TUTORIAL_python3_timestep.md b/man/TUTORIAL_python3_timestep.md
@@ -33,7 +33,8 @@ python3 -c"import pathlib, pcdl, shutil; pcdl.install_data(); s_ipath=str(pathli
 By default, all data related to the snapshot is loaded.
 For speed and less memory usage, it is however possible to only load the essential (output xml and cell mat data),
 and exclude microenvironment, graph data, PhysiBoss data, and the PhysiCell\_settings.xml cell type ID label mapping. \
-For custom\_data variables it is possible to specify data types, apart from the generic float type, namely: int, bool, and str. \
+For custom\_data variables it is possible to specify data types, apart from the generic float type, namely: int, bool, and str.
+This can also be done after the data is loaded, using the mcds.custom\_data\_astype function. \
 For paths, in general, unix (slash) and windows (backslash) notation will work.
 
 The basic way to load a mcds time step object:

diff --git a/man/docstring/mcds.__init__.md b/man/docstring/mcds.__init__.md
@@ -11,7 +11,7 @@
                 relative or absolute path to the directory where
                 the PhysiCell output files are stored.
 
-            custom_dtype: dictionary; default is {}
+            custom_data_type: dictionary; default is {}
                 variable to specify custom_data variable types other than
                 floats (namely: int, bool, str) like this: {var: dtype, ...}.
                 downstream float and int will be handled as numeric,

diff --git a/man/docstring/mcds.custom_data_astype.md b/man/docstring/mcds.custom_data_astype.md
@@ -0,0 +1,26 @@
+# mcds.custom_data_astype()
+
+
+## input:
+```
+            custom_data_type: dictionary; default is {}
+                variable to specify custom_data variable types other than
+                floats (namely: int, bool, str) like this: {var: dtype, ...}.
+                downstream float and int will be handled as numeric,
+                bool as Boolean, and str as categorical data.
+
+```
+
+## output:
+```
+            self.data['cell']['df_cell']:
+                the dtype of columns as specified in the custom_data_type dictionary.
+
+```
+
+## description:
+```
+            function to set the dtype of custom_data variables,
+            even after the data is loaded.
+
+```
diff --git a/man/docstring/mcdsts.__init__.md b/man/docstring/mcdsts.__init__.md
@@ -3,9 +3,10 @@
 
 ## input:
 ```
-            output_path: string, default '.'
+            output_path: string or list of mcds objects, default '.'
                 relative or absolute path to the directory where
-                the PhysiCell output files are stored.
+                the PhysiCell output files are stored or
+                a list of mcds timestep objects.
 
             custom_data_type: dictionary; default is {}
                 variable to specify custom_data variable types

diff --git a/man/docstring/mcdsts.custom_data_astype.md b/man/docstring/mcdsts.custom_data_astype.md
@@ -0,0 +1,26 @@
+# mcdsts.custom_data_astype()
+
+
+## input:
+```
+            custom_data_type: dictionary; default is {}
+                variable to specify custom_data variable types other than
+                floats (namely: int, bool, str) like this: {var: dtype, ...}.
+                downstream float and int will be handled as numeric,
+                bool as Boolean, and str as categorical data.
+
+```
+
+## output:
+```
+            self.data['cell']['df_cell']:
+                the dtype of columns as specified in the custom_data_type dictionary.
+
+```
+
+## description:
+```
+            function to set the dtype of custom_data variables,
+            even after the data is loaded.
+
+```
diff --git a/man/scarab.py b/man/scarab.py
@@ -127,6 +127,10 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'):
     ls_doc = pcdl.TimeStep.__init__.__doc__.split('\n'),
     s_header = "mcds = pcdl.TimeStep('path/to/outputnnnnnnnn.xml')"
 )
+docstring_md(
+    s_function = 'mcds.custom_data_astype',
+    ls_doc = pcdl.TimeStep.custom_data_astype.__doc__.split('\n'),
+)
 docstring_md(
     s_function = 'mcds.set_verbose_false',
     ls_doc = pcdl.TimeStep.set_verbose_false.__doc__.split('\n'),
@@ -332,6 +336,10 @@ def docstring_md(s_function, ls_doc, s_header=None, s_opath='man/docstring/'):
     ls_doc = pcdl.TimeSeries.__init__.__doc__.split('\n'),
     s_header = "mcdsts = pcdl.TimeSeries('path/to/output')"
 )
+docstring_md(
+    s_function = 'mcdsts.custom_data_astype',
+    ls_doc = pcdl.TimeSeries.custom_data_astype.__doc__.split('\n'),
+)
 docstring_md(
     s_function = 'mcdsts.get_xmlfile_list',
     ls_doc = pcdl.TimeSeries.get_xmlfile_list.__doc__.split('\n'),

diff --git a/pcdl/VERSION.py b/pcdl/VERSION.py
@@ -1 +1 @@
-__version__ = '4.1.1'
+__version__ = '4.1.2'
diff --git a/pcdl/timeseries.py b/pcdl/timeseries.py
@@ -164,9 +164,10 @@ class TimeSeries:
     def __init__(self, output_path='.', custom_data_type={}, load=True, microenv=True, graph=True, physiboss=True, settingxml=False, verbose=True):
         """
         input:
-            output_path: string, default '.'
+            output_path: string or list of mcds objects, default '.'
                 relative or absolute path to the directory where
-                the PhysiCell output files are stored.
+                the PhysiCell output files are stored or
+                a list of mcds timestep objects.
 
             custom_data_type: dictionary; default is {}
                 variable to specify custom_data variable types
@@ -208,31 +209,67 @@ def __init__(self, output_path='.', custom_data_type={}, load=True, microenv=Tru
             TimeSeries.__init__ generates a class instance the instance offers
             functions to process all time steps in the output_path directory.
         """
-        output_path = output_path.replace('\\','/')
-        while (output_path.find('//') > -1):
-            output_path = output_path.replace('//','/')
-        if (output_path.endswith('/')) and (len(output_path) > 1):
-            output_path = output_path[:-1]
-        if not os.path.isdir(output_path):
-            sys.exit(f'Error @ TimeSeries.__init__ : this is not a path! could not load {output_path}.')
-        self.path = output_path
-        # bue 2022-10-22: is output*.xml always the correct pattern? i could add initial or final step.
-        self.ls_xmlfile = [s_pathfile.replace('\\','/').split('/')[-1] for s_pathfile in sorted(glob.glob(self.path + f'/output*.xml'))]
-        if (len(self.ls_xmlfile) == 0):
-            sys.exit(f'Error @ TimeSeries.__init__ : could not detect any output*.xml! is the given output_path correct? {output_path}')
-        self.custom_data_type = custom_data_type
-        self.microenv = microenv
-        self.graph = graph
-        self.physiboss = physiboss
-        self.settingxml = settingxml
+        # set generic variables
         self.verbose = verbose
-        if load:
-            self.read_mcds()
-        else:
-            self.l_mcds = None
         self.l_annmcds = None
         self.l_sdmcds = None
 
+        # load mcds timeseries from list of mcds timesteps
+        if (type(output_path) is list):
+            self.ls_xmlfile = None
+            self.l_mcds = output_path
+            self.custom_data_type = None
+            self.microenv = None
+            self.graph = None
+            self.physiboss = None
+            self.settingxml = None
+
+        # load mcds timeseries from physicell output directory
+        else:
+            output_path = output_path.replace('\\','/')
+            while (output_path.find('//') > -1):
+                output_path = output_path.replace('//','/')
+            if (output_path.endswith('/')) and (len(output_path) > 1):
+                output_path = output_path[:-1]
+            if not os.path.isdir(output_path):
+                sys.exit(f'Error @ TimeSeries.__init__ : this is not a path! could not load {output_path}.')
+            self.path = output_path
+            # bue 2022-10-22: is output*.xml always the correct pattern? i could add initial or final step.
+            self.ls_xmlfile = [s_pathfile.replace('\\','/').split('/')[-1] for s_pathfile in sorted(glob.glob(self.path + f'/output*.xml'))]
+            if (len(self.ls_xmlfile) == 0):
+                sys.exit(f'Error @ TimeSeries.__init__ : could not detect any output*.xml! is the given output_path correct? {output_path}')
+            self.custom_data_type = custom_data_type
+            self.microenv = microenv
+            self.graph = graph
+            self.physiboss = physiboss
+            self.settingxml = settingxml
+            if load:
+                self.read_mcds()
+            else:
+                self.l_mcds = None
+
+
+    def custom_data_astype(self, custom_data_type={}):
+        """
+        input:
+            custom_data_type: dictionary; default is {}
+                variable to specify custom_data variable types other than
+                floats (namely: int, bool, str) like this: {var: dtype, ...}.
+                downstream float and int will be handled as numeric,
+                bool as Boolean, and str as categorical data.
+
+        output:
+            self.data['cell']['df_cell']:
+                the dtype of columns as specified in the custom_data_type dictionary.
+
+        description:
+            function to set the dtype of custom_data variables,
+            even after the data is loaded.
+        """
+        # variable typing
+        for mcds in self.get_mcds_list():
+            mcds.custom_data_astype(custom_data_type=custom_data_type)
+
 
     def set_verbose_false(self):
         """
@@ -1826,4 +1863,3 @@ def get_sdmcds_list(self):
             function returns a binding to the self.l_sdmcds list of spdata mcds objects.
         """
         return self.l_sdmcds
-
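
The two loading branches added to `TimeSeries.__init__` can be sketched in isolation. This is a minimal stand-in, not pcdl code: `normalize_output_path` and `SeriesSketch` are hypothetical names, and the sketch sets `path` to `None` in the list branch, which is an assumption the diff itself does not make.

```python
def normalize_output_path(output_path):
    """Mirror the path clean-up in the diff: forward slashes only,
    no doubled slashes, no trailing slash (except for the root '/')."""
    output_path = output_path.replace('\\', '/')
    while '//' in output_path:
        output_path = output_path.replace('//', '/')
    if output_path.endswith('/') and len(output_path) > 1:
        output_path = output_path[:-1]
    return output_path


class SeriesSketch:
    """Hypothetical stand-in for the two loading branches; not the pcdl class."""

    def __init__(self, output_path='.'):
        if isinstance(output_path, list):
            # list branch: the time steps are already loaded objects
            self.path = None  # assumption: the diff leaves path unset here
            self.l_mcds = output_path
        else:
            # directory branch: this sketch only cleans up the path
            self.path = normalize_output_path(output_path)
            self.l_mcds = None


ts_from_list = SeriesSketch(['step0', 'step1'])  # placeholder objects
ts_from_path = SeriesSketch('output//')
print(ts_from_list.l_mcds, ts_from_path.path)
```

The sketch uses `isinstance` rather than the `type(...) is list` check from the diff, so list subclasses would also take the list branch. That is a design choice of the sketch only.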
diff --git a/pcdl/timestep.py b/pcdl/timestep.py
@@ -491,7 +491,7 @@ def __init__(self, xmlfile, output_path='.', custom_data_type={}, microenv=True,
                 relative or absolute path to the directory where
                 the PhysiCell output files are stored.
 
-            custom_dtype: dictionary; default is {}
+            custom_data_type: dictionary; default is {}
                 variable to specify custom_data variable types other than
                 floats (namely: int, bool, str) like this: {var: dtype, ...}.
                 downstream float and int will be handled as numeric,
@@ -545,6 +545,27 @@ def __init__(self, xmlfile, output_path='.', custom_data_type={}, microenv=True,
         self.data = self._read_xml(xmlfile, output_path)
 
 
+    def custom_data_astype(self, custom_data_type={}):
+        """
+        input:
+            custom_data_type: dictionary; default is {}
+                variable to specify custom_data variable types other than
+                floats (namely: int, bool, str) like this: {var: dtype, ...}.
+                downstream float and int will be handled as numeric,
+                bool as Boolean, and str as categorical data.
+
+        output:
+            self.data['cell']['df_cell']:
+                the dtype of columns as specified in the custom_data_type dictionary.
+
+        description:
+            function to set the dtype of custom_data variables,
+            even after the data is loaded.
+        """
+        # variable typing
+        self.data['cell']['df_cell'] = self.data['cell']['df_cell'].astype(custom_data_type)
+
+
     def set_verbose_false(self):
         """
         input:
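
The new `custom_data_astype` method passes the `{var: dtype, ...}` dictionary straight to pandas `DataFrame.astype`. The effect can be tried on a small frame without any PhysiCell output. This is a sketch under assumptions: the column names and values are invented for illustration, and the list of frames only imitates how the TimeSeries version loops over its time steps.

```python
import pandas as pd


def make_df_cell():
    """Build a toy cell table; all columns start out as floats,
    as custom_data variables do by default."""
    return pd.DataFrame({
        'sample': [0.0, 1.0, 1.0],      # hypothetical custom_data variable
        'generation': [0.0, 1.0, 2.0],  # hypothetical custom_data variable
        'oxygen': [38.0, 37.5, 36.9],   # left untouched by the mapping
    })


custom_data_type = {'sample': bool, 'generation': int}

# single time step: only the columns named in the dictionary are cast
df_cell = make_df_cell().astype(custom_data_type)

# time series: apply the same mapping to every time step in turn
l_df_cell = [df.astype(custom_data_type) for df in (make_df_cell() for _ in range(3))]

print(df_cell.dtypes)
```

Columns not named in the dictionary keep their dtype. A key that is not a column name raises a `KeyError` in pandas, so a misspelled variable name fails loudly rather than silently.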

diff --git a/pyproject.toml b/pyproject.toml
@@ -71,18 +71,18 @@ classifiers = [
 # bue 2024-12-06: enforcing some versions
 dependencies = [
     "anndata>=0.10.8",
-    "bioio>=1.2.1",  # needs numpy < 2.0.0
+    "bioio>=2.0.0",
     "bioio-ome-tiff",
-    "geopandas>=0.14",  # spatialdata==0.6.0
+    "geopandas>=0.14",  # spatialdata
     "matplotlib",
     "neuroglancer",
     "numpy",
     "pandas>=2.2.2",
     "requests",
     "scikit-image>=0.24.0",
     "scipy>=1.13.0",
-    "shapely>=2.0.1",  # spatialdata==0.6.0
-    "spatialdata>=0.6.0",
+    "shapely>=2.0.1",  # spatialdata
+    "spatialdata>=0.7.2",
     "vtk",
 ]
 

diff --git a/test/test_timeseries_2d.py b/test/test_timeseries_2d.py
@@ -119,6 +119,12 @@ def test_mcdsts_make_movie_jpeg6(self, mcdsts=mcdsts):
 class TestTimeSeriesInit(object):
     ''' tests for loading a pcdl.TimeSeries data set. '''
 
+    def test_mcdsts_custom_data_astype(self):
+        mcdsts = pcdl.TimeSeries(s_path_2d, load=True, verbose=False)
+        mcdsts.custom_data_astype({'sample': bool})
+        assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
+              (mcdsts.l_mcds[-1].data['cell']['df_cell']['sample'].dtype == bool)
+
     def test_mcdsts_set_verbose_true(self):
         mcdsts = pcdl.TimeSeries(s_path_2d, load=False, verbose=False)
         mcdsts.set_verbose_true()
@@ -177,6 +183,11 @@ def test_mcdsts_read_mcds_xmlfilelist(self):
               (len(mcdsts.l_mcds) == 3) and \
               (mcdsts.l_mcds == l_mcds)
 
+    def test_mcdsts_from_list_of_mcds(self):
+        mcdsts = pcdl.TimeSeries([])
+        assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
+              (len(mcdsts.l_mcds) == 0)
+
 
 ## micro environment related functions ##
 

diff --git a/test/test_timeseries_3d.py b/test/test_timeseries_3d.py
@@ -46,6 +46,12 @@
 class TestTimeSeries3dInit(object):
     ''' tests for loading a pcdl.TimeSeries data set. '''
 
+    def test_mcdsts_custom_data_astype(self):
+        mcdsts = pcdl.TimeSeries(s_path_3d, load=True, verbose=False)
+        mcdsts.custom_data_astype({'sample': bool})
+        assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
+              (mcdsts.l_mcds[-1].data['cell']['df_cell']['sample'].dtype == bool)
+
     def test_mcdsts_set_verbose_true(self):
         mcdsts = pcdl.TimeSeries(s_path_3d, load=False, verbose=False)
         mcdsts.set_verbose_true()
@@ -104,6 +110,11 @@ def test_mcdsts_read_mcds_xmlfilelist(self):
               (len(mcdsts.l_mcds) == 3) and \
               (mcdsts.l_mcds == l_mcds)
 
+    def test_mcdsts_from_list_of_mcds(self):
+        mcdsts = pcdl.TimeSeries([])
+        assert(str(type(mcdsts)) == "<class 'pcdl.timeseries.TimeSeries'>") and \
+              (len(mcdsts.l_mcds) == 0)
+
 
 ## micro environment related functions ##