Backbone
A backbone is a model that extracts features for higher-level computer vision tasks such as object detection and image classification. Transformers provides an AutoBackbone class for initializing a Transformers backbone from pretrained model weights, and two utility classes:
- BackboneMixin enables initializing a backbone from Transformers or timm and includes functions for returning the output features and indices.
- BackboneConfigMixin sets the output features and indices of the backbone configuration.
timm models are loaded with the TimmBackbone and TimmBackboneConfig classes.
Backbones are supported for the following models:
- BEiT
- BiT
- ConvNext
- ConvNextV2
- DiNAT
- DINOV2
- FocalNet
- MaskFormer
- NAT
- ResNet
- Swin Transformer
- Swin Transformer v2
- ViTDet
AutoBackbone
BackboneMixin
BackboneConfigMixin
A Mixin to support handling the out_features and out_indices attributes for the backbone configurations.
set_output_features_output_indices
( out_features: list | None, out_indices: list | None )
Sets output indices and features to new values and aligns them with the given stage_names. If only one of out_features or out_indices is given, the other is derived from stage_names.
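The alignment between names and indices can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name align_features_and_indices and the stage names are hypothetical, and falling back to the last stage when neither argument is given is an assumption.

```python
def align_features_and_indices(out_features, out_indices, stage_names):
    """Hypothetical helper: fill in whichever of the two arguments is missing."""
    if out_features is None and out_indices is None:
        # Assumption: with nothing requested, fall back to the last stage.
        return [stage_names[-1]], [len(stage_names) - 1]
    if out_indices is None:
        # Look up the position of each requested stage name.
        out_indices = [stage_names.index(name) for name in out_features]
    elif out_features is None:
        # Look up the stage name at each requested position.
        out_features = [stage_names[idx] for idx in out_indices]
    return list(out_features), list(out_indices)


# Illustrative stage names, not taken from a specific model.
stage_names = ["stem", "stage1", "stage2", "stage3", "stage4"]

features, indices = align_features_and_indices(["stage2", "stage4"], None, stage_names)
print(features, indices)  # ['stage2', 'stage4'] [2, 4]

features, indices = align_features_and_indices(None, [1, 3], stage_names)
print(features, indices)  # ['stage1', 'stage3'] [1, 3]
```

Either argument alone is enough to identify the stages, which is why a configuration can accept names or positions.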
to_dict
Serializes this instance to a Python dictionary. Overrides the default to_dict() from PreTrainedConfig to include the out_features and out_indices attributes.
Verify that out_indices and out_features are valid for the given stage_names.
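A check of this kind might look like the following sketch. The function name check_features_and_indices and the stage names are hypothetical, and the exact rules the library enforces may differ. For simplicity, this version accepts only non-negative indices.

```python
def check_features_and_indices(out_features, out_indices, stage_names):
    """Hypothetical validator: raise ValueError if the selection is inconsistent."""
    if out_features is not None:
        # Every requested name must be a known stage.
        unknown = [name for name in out_features if name not in stage_names]
        if unknown:
            raise ValueError(f"Unknown stage names: {unknown}")
    if out_indices is not None:
        # Assumption: only non-negative, in-range indices are allowed here.
        out_of_range = [idx for idx in out_indices if not 0 <= idx < len(stage_names)]
        if out_of_range:
            raise ValueError(f"Indices out of range: {out_of_range}")
    if out_features is not None and out_indices is not None:
        # When both are given, they must point at the same stages in the same order.
        expected = [stage_names[idx] for idx in out_indices]
        if list(out_features) != expected:
            raise ValueError("out_features and out_indices refer to different stages")


# Illustrative stage names, not taken from a specific model.
stage_names = ["stem", "stage1", "stage2", "stage3", "stage4"]

check_features_and_indices(["stage2"], [2], stage_names)  # consistent, no error

try:
    check_features_and_indices(["stage2"], [3], stage_names)
except ValueError as err:
    print(err)  # out_features and out_indices refer to different stages
```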
TimmBackbone
Wrapper class for timm models to be used as backbones. This allows timm models to be used interchangeably with the other models in the library while keeping the same API.
TimmBackboneConfig
class transformers.TimmBackboneConfig
( backbone = None, num_channels = 3, features_only = True, out_indices = None, freeze_batch_norm_2d = False, output_stride = None, **kwargs )
Parameters
- backbone (str, optional): The timm checkpoint to load.
- num_channels (int, optional, defaults to 3): The number of input channels.
- features_only (bool, optional, defaults to True): Whether to output only the features or also the logits.
- out_indices (optional): Indices of the intermediate hidden states (feature maps) to return from the backbone. Each index corresponds to one stage of the model.
- freeze_batch_norm_2d (bool, optional, defaults to False): Converts all BatchNorm2d and SyncBatchNorm layers of the provided module into FrozenBatchNorm2d.
- output_stride (optional): The ratio between the spatial resolution of the input and output feature maps.
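To make the output_stride ratio concrete, here is a small arithmetic sketch. The helper feature_map_size is hypothetical and not part of timm or Transformers. It assumes the input height and width divide evenly by the stride.

```python
def feature_map_size(input_size, output_stride):
    """Spatial size of the output feature map for a given input size and stride."""
    height, width = input_size
    # Assumption: both dimensions divide evenly by the stride.
    return height // output_stride, width // output_stride


print(feature_map_size((224, 224), 32))  # (7, 7)
print(feature_map_size((224, 224), 8))   # (28, 28)
```

A smaller stride keeps more spatial detail in the output, at the cost of larger feature maps.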
This is the configuration class to store the configuration of a TimmBackbone. It is used to instantiate a Timm Backbone model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the
Configuration objects inherit from PreTrainedConfig and can be used to control the model outputs. Read the documentation from PreTrainedConfig for more information.
Example:
>>> from transformers import TimmBackboneConfig, TimmBackbone
>>> # Initializing a timm backbone
>>> configuration = TimmBackboneConfig("resnet50")
>>> # Initializing a model from the configuration
>>> model = TimmBackbone(configuration)
>>> # Accessing the model configuration
>>> configuration = model.config