This repository was archived by the owner on Jul 27, 2024. It is now read-only.

ProtoFromDataFrames fails for dataframes with categorical columns #237

@ysayeed

When creating the proto for facets-overview, the operation fails with an AttributeError if any column of the dataframe has a categorical dtype. I would expect it to parse the dataframe properly, treating the category dtype as a string and displaying the column in the "Categorical Features" section the same way a string column is displayed.

Below is example code that reproduces the error, followed by the traceback:

from facets_overview.generic_feature_statistics_generator import GenericFeatureStatisticsGenerator
import pandas as pd
df = pd.DataFrame({'col1': pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])})
proto = GenericFeatureStatisticsGenerator().ProtoFromDataFrames([{'name': 'test', 'table': df}])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 54, in ProtoFromDataFrames
    table_entries[col] = self.NdarrayToEntry(table[col])
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 119, in NdarrayToEntry
    data_type = self.DtypeToType(x.dtype)
  File ".../facets_overview/base_generic_feature_statistics_generator.py", line 66, in DtypeToType
    if dtype.char in np.typecodes['AllFloat']:
AttributeError: 'CategoricalDtype' object has no attribute 'char'

This is using facets-overview 1.0.0 and pandas 1.1.4.
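
A possible workaround until this is handled in the library, sketched under a few assumptions: the traceback shows that DtypeToType reads `dtype.char`, which NumPy dtypes have and pandas' CategoricalDtype does not, so casting category columns to plain object dtype before building the proto should avoid the AttributeError. The helper name `categoricals_to_strings` is hypothetical (it is not part of facets-overview), and I am assuming that object-dtype columns are treated as strings by the generator; I have not verified that against 1.0.0.

```python
import pandas as pd


def categoricals_to_strings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every category column cast to object dtype.

    Hypothetical helper, not part of facets-overview. Object dtype is a
    NumPy dtype, so it exposes the `.char` attribute that the traceback
    shows DtypeToType reading.
    """
    out = df.copy()
    for col in out.select_dtypes(include="category").columns:
        out[col] = out[col].astype(object)
    return out


df = pd.DataFrame({'col1': pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])})
fixed = categoricals_to_strings(df)

# Assumed usage, same call as in the report but with the converted frame:
# proto = GenericFeatureStatisticsGenerator().ProtoFromDataFrames(
#     [{'name': 'test', 'table': fixed}])
```

The original dataframe is left untouched, so the categorical columns remain available for anything else that relies on them. This only sidesteps the error on the caller's side; it does not change how the library handles the category dtype.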
