    [Notes] pytorch_tabular

    • 183****0229

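      Repository: https://github.com/manujosephv/pytorch_tabular
      Overview: a standard framework for building deep learning models on tabular data, built on PyTorch and PyTorch Lightning.
      PyTorch Tabular aims to make deep learning on tabular data easy and accessible for both real-world use cases and research.
      Installation: pip install pytorch_tabular[all]
      Documentation: https://pytorch-tabular.readthedocs.io/en/latest/
      Available models: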

      • FeedForward Network with Category Embedding is a simple feed-forward network with embedding layers for the categorical columns.
      • Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data is a model presented at ICLR 2020 that, according to its authors, has beaten well-tuned gradient boosting models on many datasets.
      • TabNet: Attentive Interpretable Tabular Learning is a model from Google Research that uses sparse attention over multiple decision steps to model the output.
      • Mixture Density Networks is a regression model that uses Gaussian components to approximate the target function and provides a probabilistic prediction out of the box.
      • AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks is a model that tries to learn interactions between features automatically, build a better representation, and then use that representation in the downstream task.
      • TabTransformer is an adaptation of the Transformer model for tabular data that creates contextual representations for categorical features.
      • FT Transformer, from "Revisiting Deep Learning Models for Tabular Data".
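
To make the "probabilistic prediction" of the Mixture Density Networks entry above more concrete, here is a minimal sketch in plain Python. It is not pytorch_tabular code: the function names and the example weights, means, and standard deviations are hypothetical stand-ins for what a mixture density head might output for a single input row.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of a univariate normal distribution evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, weights, means, stds):
    """Density of a Gaussian mixture: sum over k of weight_k * N(y; mean_k, std_k)."""
    return sum(w * gaussian_pdf(y, m, s) for w, m, s in zip(weights, means, stds))


def mixture_mean(weights, means):
    """Expected value of the mixture, usable as a point prediction."""
    return sum(w * m for w, m in zip(weights, means))


# Hypothetical outputs for one row: mixing weights must sum to 1.
weights = [0.7, 0.3]
means = [1.0, 4.0]
stds = [0.5, 1.0]

point_prediction = mixture_mean(weights, means)        # 0.7 * 1.0 + 0.3 * 4.0
density_at_1 = mixture_pdf(1.0, weights, means, stds)  # how plausible y = 1.0 is
```

The point is that the model returns a full distribution over the target rather than a single number, so a point estimate and an uncertainty estimate can both be read from the same output.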

      Usage:

      from pytorch_tabular import TabularModel
      from pytorch_tabular.models import CategoryEmbeddingModelConfig
      from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig, ExperimentConfig
      
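      # num_col_names and cat_col_names are lists of column names that you define
      # for your own DataFrame; they are not provided by the library.
      data_config = DataConfig(
          # target should always be a list. Multi-target is only supported for
          # regression; multi-task classification is not implemented.
          target=['target'],
          continuous_cols=num_col_names,
          categorical_cols=cat_col_names,
      )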
      trainer_config = TrainerConfig(
          auto_lr_find=True,  # runs the LRFinder to automatically derive a learning rate
          batch_size=1024,
          max_epochs=100,
          gpus=1,  # index of the GPU to use; 0 means CPU
      )
      optimizer_config = OptimizerConfig()
      
      model_config = CategoryEmbeddingModelConfig(
          task="classification",
          layers="1024-512-512",  # number of nodes in each layer
          activation="LeakyReLU",  # activation between layers
          learning_rate=1e-3,
      )
      
      tabular_model = TabularModel(
          data_config=data_config,
          model_config=model_config,
          optimizer_config=optimizer_config,
          trainer_config=trainer_config,
      )
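      # train, val, and test are pandas DataFrames that contain the columns named above
      tabular_model.fit(train=train, validation=val)
      result = tabular_model.evaluate(test)  # metrics on the held-out test set
      pred_df = tabular_model.predict(test)  # DataFrame with the predictions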
      tabular_model.save_model("examples/basic")
      loaded_model = TabularModel.load_from_checkpoint("examples/basic")
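
As a rough illustration of what layers="1024-512-512" in the snippet above describes, the following sketch turns the hyphen-separated string into a list of dense layer shapes. It is not the library's internal code: parse_layers, dense_shapes, and parameter_count are hypothetical helpers, and the input width of 32 and the two output classes are assumed values. It also ignores embeddings, batch normalization, and dropout, which a real model may add.

```python
def parse_layers(spec):
    """Split a hyphen-separated layer spec such as "1024-512-512" into ints."""
    return [int(width) for width in spec.split("-")]


def dense_shapes(input_dim, spec, output_dim):
    """Return (in_features, out_features) for each linear layer, output head included."""
    widths = [input_dim] + parse_layers(spec) + [output_dim]
    return list(zip(widths[:-1], widths[1:]))


def parameter_count(shapes):
    """Weights plus biases of a plain stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


# Assumed sizes, for illustration only.
input_dim = 32      # width of the concatenated continuous + embedded features
num_classes = 2     # binary classification

shapes = dense_shapes(input_dim, "1024-512-512", num_classes)
total_params = parameter_count(shapes)
```

A quick estimate like this can help judge whether a layer spec is reasonably sized for the amount of training data before starting a long run.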
      