`pycmtensor.data`#

PyCMTensor data module

Module Contents#

class pycmtensor.data.Data(df: pandas.DataFrame, choice: str, **kwargs)[source]#

Base Data class object.

Parameters:

df (pandas.DataFrame) – the input Pandas dataframe
choice (str) – column string name of the choice dependent variable
**kwargs – Optional keyword arguments. Accepted keys are drop (pd.Series), autoscale (bool), autoscale_except (list[str]) and split (float)

Note

The following is an example initialization of the swissmetro dataset:

import pandas as pd
import pycmtensor

# Load the tab-separated swissmetro dataset into a DataFrame
swissmetro = pd.read_csv("../data/swissmetro.dat", sep="\t")
db = pycmtensor.Data(
    df=swissmetro,
    choice="CHOICE",
    drop=[swissmetro["CHOICE"]==0],
    autoscale=True,
    autoscale_except=["ID", "ORIGIN", "DEST"],
    split=0.8,
)

property x[source]#

property y[source]#

property all[source]#

property n_train_samples[source]#

property n_valid_samples[source]#

property train_data[source]#

property valid_data[source]#

split_db(split_frac: float)[source]#

Splits the dataset into training and validation sets.

Parameters:: split_frac (float) – fractional value between 0.0 and 1.0
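The idea of a fractional split can be sketched in plain Python. This is only an illustration, not the PyCMTensor implementation: the helper name `split_rows` is hypothetical, and it assumes the first `split_frac` of the rows become the training set without shuffling, which this reference does not confirm.

```python
def split_rows(rows, split_frac):
    """Split rows into (train, valid) at a fractional cut point.

    Assumption: the leading split_frac share of rows is the training set.
    """
    if not 0.0 <= split_frac <= 1.0:
        raise ValueError("split_frac must be between 0.0 and 1.0")
    n_train = round(len(rows) * split_frac)  # number of training rows
    return rows[:n_train], rows[n_train:]


rows = list(range(10))
train, valid = split_rows(rows, 0.8)
print(len(train), len(valid))
```

With ten rows and a fraction of 0.8, eight rows land in the training set and two in the validation set.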

get_nrows() → int[source]#: Returns the length (number of rows) of the DataFrame object

get_train_data(tensors, index=None, batch_size=None, shift=None)[source]#

Alias that returns a slice of the training data from self.pandas.inputs()

See PandasDataFrame.inputs() for details

get_valid_data(tensors, index=None, batch_size=None, shift=None)[source]#

Alias that returns a slice of the validation data from self.pandas.inputs()

See PandasDataFrame.inputs() for details

scale_data(**kwargs)[source]#

Scales data values by dividing them by a scale factor (data/scale), where each key=scale keyword argument names the data to scale and its factor

Parameters:: **kwargs – {key: scale} keyword arguments
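The data/scale arithmetic can be illustrated with a small standalone sketch. It uses a plain dict of lists instead of a DataFrame so that it runs without dependencies. The helper `scale_columns` and the column names are hypothetical, and the sketch assumes each key refers to a column label.

```python
def scale_columns(table, **scales):
    """Return a copy of table with each named column divided by its scale."""
    scaled = {name: list(values) for name, values in table.items()}
    for name, scale in scales.items():
        if name not in scaled:
            raise KeyError(f"unknown column: {name}")
        # data / scale, applied element-wise to the named column
        scaled[name] = [value / scale for value in scaled[name]]
    return scaled


# Hypothetical columns, for illustration only
table = {"travel_time": [120.0, 90.0], "cost": [48.0, 52.0]}
scaled = scale_columns(table, travel_time=100.0)
print(scaled["travel_time"], scaled["cost"])
```

Columns that are not named in the keyword arguments are left unchanged.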

autoscale_data(except_for=[None])[source]#

Autoscales variable values so that they fall within -10.0 < x < 10.0

Parameters:: except_for (list[str]) – list of column labels to skip autoscaling step
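One way to reach the stated range is to divide a column by growing powers of ten until its largest absolute value drops below 10. The sketch below assumes that strategy. It is not taken from the PyCMTensor source, and the helper `autoscale_columns` and the column names are hypothetical.

```python
def autoscale_columns(table, except_for=()):
    """Scale each column by a power of ten so that -10.0 < x < 10.0.

    Assumption: scaling is done by repeated division by 10.
    Columns listed in except_for keep a scale of 1.0.
    """
    scaled, scales = {}, {}
    for name, values in table.items():
        scale = 1.0
        if name not in except_for:
            largest = max((abs(v) for v in values), default=0.0)
            while largest / scale >= 10.0:
                scale *= 10.0
        scaled[name] = [v / scale for v in values]
        scales[name] = scale
    return scaled, scales


# Hypothetical columns, for illustration only
table = {"ID": [101, 102], "cost": [480.0, 52.0], "time": [9.5, 3.0]}
scaled, scales = autoscale_columns(table, except_for=["ID"])
print(scales)
```

Returning the applied scales alongside the data makes it possible to interpret estimated coefficients in the original units later.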

info()[source]#: Outputs information about the Data class object
