:py:mod:`pycmtensor.data`
=========================

.. py:module:: pycmtensor.data

.. autoapi-nested-parse::

   PyCMTensor data module



Module Contents
---------------


.. py:class:: Data(df: pandas.DataFrame, choice: str, **kwargs)


   
   Base Data class object.

   :param df: the input Pandas dataframe
   :type df: pandas.DataFrame
   :param choice: name of the column that holds the choice (dependent) variable
   :type choice: str
   :param \*\*kwargs: Optional keyword arguments. Accepted arguments are `drop:pd.Series`,
                      `autoscale:bool`, `autoscale_except:list[str]`, and `split:float`

   .. note::

      Example initialization using the swissmetro dataset::

          swissmetro = pd.read_csv("../data/swissmetro.dat", sep="\t")
          db = pycmtensor.Data(
              df=swissmetro,
              choice="CHOICE",
              drop=[swissmetro["CHOICE"]==0],
              autoscale=True,
              autoscale_except=["ID", "ORIGIN", "DEST"],
              split=0.8,
          )

   .. py:property:: x


   .. py:property:: y


   .. py:property:: all


   .. py:property:: n_train_samples


   .. py:property:: n_valid_samples


   .. py:property:: train_data


   .. py:property:: valid_data


   .. py:method:: split_db(split_frac: float)

      Splits the database into training and validation sets.

      :param split_frac: fractional value between 0.0 and 1.0
      :type split_frac: float
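
      The snippet below is a minimal sketch of a fractional split, assuming
      that `split_frac` is the share of rows assigned to the training set and
      that rows keep their existing order. `split_frame` is a hypothetical
      helper written with plain pandas; it is not the `split_db`
      implementation.

      ```python
      import pandas as pd


      def split_frame(df: pd.DataFrame, split_frac: float):
          """Hypothetical helper: first share of rows -> train, rest -> valid."""
          if not 0.0 <= split_frac <= 1.0:
              raise ValueError("split_frac must be between 0.0 and 1.0")
          # Number of rows that go to the training set (assumed rounding rule).
          n_train = round(len(df) * split_frac)
          return df.iloc[:n_train], df.iloc[n_train:]


      # Toy data standing in for a real dataset.
      df = pd.DataFrame({"CHOICE": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1], "TT": range(10)})
      train, valid = split_frame(df, 0.8)
      print(len(train), len(valid))
      ```

      With 10 rows and a fraction of 0.8, this sketch puts 8 rows in the
      training set and 2 in the validation set.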


   .. py:method:: get_nrows() -> int

      Returns the number of rows in the DataFrame object


   .. py:method:: get_train_data(tensors, index=None, batch_size=None, shift=None)

      Alias that returns a slice of the training data from `self.pandas.inputs()`.

      See :meth:`PandasDataFrame.inputs()` for details.


   .. py:method:: get_valid_data(tensors, index=None, batch_size=None, shift=None)

      Alias that returns a slice of the validation data from `self.pandas.inputs()`.

      See :meth:`PandasDataFrame.inputs()` for details.
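
      To illustrate how `index`, `batch_size`, and `shift` could select a
      mini-batch, here is a minimal sketch. The slicing rule is an assumption
      (batch `index` starts at row `index * batch_size`, offset by `shift`),
      and `batch_slice` is a hypothetical helper, not the
      `PandasDataFrame.inputs()` implementation.

      ```python
      import numpy as np


      def batch_slice(values, index=None, batch_size=None, shift=None):
          """Hypothetical helper: return one mini-batch, or everything."""
          # Without an index or batch size, return the full array.
          if index is None or batch_size is None:
              return values
          # Assumed rule: batches are contiguous, optionally offset by `shift`.
          start = index * batch_size + (shift or 0)
          return values[start:start + batch_size]


      data = np.arange(10)
      second_batch = batch_slice(data, index=1, batch_size=4)
      shifted_batch = batch_slice(data, index=1, batch_size=4, shift=1)
      print(second_batch, shifted_batch)
      ```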


   .. py:method:: scale_data(**kwargs)

      Scales data values by dividing each variable by its scale factor (data/scale), where the variable and factor are passed as a `key=scale` keyword argument

      :param \*\*kwargs: {key: scale} keyword arguments
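
      The data/scale rule can be sketched with plain pandas as follows. This
      is an illustration only: `scale_columns` is a hypothetical helper, the
      column names are made up, and each keyword is assumed to name a column
      of the DataFrame.

      ```python
      import pandas as pd


      def scale_columns(df: pd.DataFrame, **kwargs) -> pd.DataFrame:
          """Hypothetical helper: divide each named column by its scale."""
          scaled = df.copy()  # leave the caller's DataFrame untouched
          for column, scale in kwargs.items():
              scaled[column] = scaled[column] / scale
          return scaled


      df = pd.DataFrame({"TRAIN_TT": [120.0, 90.0], "TRAIN_CO": [50.0, 75.0]})
      scaled = scale_columns(df, TRAIN_TT=100.0, TRAIN_CO=100.0)
      print(scaled)
      ```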


   .. py:method:: autoscale_data(except_for=[None])

      Autoscales variable values so that they lie within the range -10.0 < x < 10.0

      :param except_for: list of column labels to exclude from autoscaling
      :type except_for: list[str]
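
      One way to bring values into the range -10.0 < x < 10.0 is to divide
      each column by a power of 10. The sketch below assumes that approach;
      the method's actual scaling rule is not documented here, and `autoscale`
      is a hypothetical helper rather than the `autoscale_data`
      implementation.

      ```python
      import pandas as pd


      def autoscale(df: pd.DataFrame, except_for=()):
          """Hypothetical helper: scale columns by powers of 10 into (-10, 10)."""
          scaled = df.copy()
          scales = {}
          for column in scaled.columns:
              if column in except_for:
                  continue  # skipped columns keep their original values
              scale = 1.0
              max_abs = scaled[column].abs().max()
              # Grow the divisor until the largest magnitude drops below 10.
              while max_abs / scale >= 10.0:
                  scale *= 10.0
              scaled[column] = scaled[column] / scale
              scales[column] = scale
          return scaled, scales


      df = pd.DataFrame({"ID": [101, 102], "TT": [250.0, 40.0], "CO": [5.0, -3.0]})
      scaled, scales = autoscale(df, except_for=["ID"])
      print(scales)
      ```

      Returning the scale factors alongside the data makes it possible to
      convert estimated quantities back to the original units later.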


   .. py:method:: info()

      Outputs information about the Data class object



