TinyChatEngine
Public Member Functions
Fp32CLIPVisionTransformer (std::string param_path, const struct model_config config, bool is_vila)
struct Fp32CLIPVisionTransformer_output | forward (const struct Fp32CLIPVisionTransformer_input &input, bool is_vila) |
Public Attributes
Embedding | embed_positions |
Conv2D | embed_patch |
LayerNorm | pre_layernorm |
Linear_FP | mm_proj_0 |
Linear_FP | mm_proj_2 |
int | voc_size |
int | embed_dim |
int | padding_idx |
int | hidden_dim |
int | num_heads |
int | image_size |
int | patch_size |
int | num_patches |
int | num_positions |
int | projection_dim |
int | mmproj_dim |
std::vector< Fp32CLIPEncoderLayer > | layers |
std::string | profile_name = "Fp32CLIPVisionTransformer" |
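
The size-related attributes above (image_size, patch_size, num_patches, num_positions) are listed without their relationship. The sketch below shows how they are commonly derived in a CLIP-style vision transformer. It assumes square images, non-overlapping patches, and one extra position for a class token. The struct ClipVisionDims and the helper make_clip_vision_dims are hypothetical names used only for illustration; they are not part of TinyChatEngine, and the actual class may compute these values differently.

```cpp
#include <cassert>

// Hypothetical illustration, not TinyChatEngine code: mirrors the
// size-related attributes of the class to show how they usually relate.
struct ClipVisionDims {
    int image_size;     // side length of the (square) input image, in pixels
    int patch_size;     // side length of each (square) patch, in pixels
    int num_patches;    // number of patches per image
    int num_positions;  // number of position-embedding entries
};

// Assumption: patches tile the image exactly, and the position table
// holds one entry per patch plus one for a class token.
inline ClipVisionDims make_clip_vision_dims(int image_size, int patch_size) {
    assert(patch_size > 0 && image_size % patch_size == 0);
    const int patches_per_side = image_size / patch_size;
    const int num_patches = patches_per_side * patches_per_side;
    return ClipVisionDims{image_size, patch_size, num_patches, num_patches + 1};
}
```

For example, under these assumptions a 336-pixel image with 14-pixel patches gives 24 patches per side, so num_patches would be 576 and num_positions would be 577.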