TinyChatEngine
Public Member Functions | |
| Int4llamaDecoderLayer (std::string param_path, const struct model_config config, int layer_idx) |
struct Int4llamaDecoderLayer_output | forward (std::string param_path, const struct Int4llamaDecoderLayer_input &input, int layer_idx) |
Public Attributes | |
std::string | profile_name = "Int4llamaDecoderLayer" |
int | embed_dim |
int | num_attention_heads |
int | hidden_dim |
int | layer_idx |
float | rms_norm_eps |
Int4llamaAttention | attn |
LlamaRMSNorm | input_layernorm |
LlamaRMSNorm | post_attention_layernorm |
Linear_FP_int4 | gate_proj |
Linear_FP_int4 | down_proj |
Linear_FP_int4 | up_proj |
float * | input_layernorm_weight_ptr = nullptr |
float * | post_attention_layernorm_ptr = nullptr |
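
The members listed above (two RMSNorm layers, an attention block, and gate/up/down projections) match the standard LLaMA decoder-layer layout. The following is a minimal sketch of that dataflow, not TinyChatEngine's implementation. It assumes plain float matrices in place of `Linear_FP_int4`, a single token vector as input, and an identity placeholder for attention. `ToyDecoderLayer`, `rms_norm`, `linear`, and `silu` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[out_feature][in_feature]

// RMSNorm: y_i = w_i * x_i / sqrt(mean(x^2) + eps)
Vec rms_norm(const Vec &x, const Vec &weight, float eps) {
    float sum_sq = 0.0f;
    for (float v : x) sum_sq += v * v;
    const float inv_rms = 1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + eps);
    Vec y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = weight[i] * x[i] * inv_rms;
    return y;
}

// Dense matrix-vector product; a float stand-in for an int4-quantized linear layer.
Vec linear(const Mat &w, const Vec &x) {
    Vec y(w.size(), 0.0f);
    for (std::size_t o = 0; o < w.size(); ++o) {
        assert(w[o].size() == x.size());
        for (std::size_t i = 0; i < x.size(); ++i) y[o] += w[o][i] * x[i];
    }
    return y;
}

float silu(float v) { return v / (1.0f + std::exp(-v)); }

// Hypothetical toy layer; field names mirror the attributes listed above.
struct ToyDecoderLayer {
    Vec input_layernorm_weight;           // size embed_dim
    Vec post_attention_layernorm_weight;  // size embed_dim
    Mat gate_proj;                        // hidden_dim x embed_dim
    Mat up_proj;                          // hidden_dim x embed_dim
    Mat down_proj;                        // embed_dim x hidden_dim
    float rms_norm_eps = 1e-6f;

    // Assumption: identity placeholder where self-attention would run.
    Vec attn(const Vec &x) const { return x; }

    Vec forward(const Vec &x) const {
        // Attention sub-block with residual connection.
        Vec h = attn(rms_norm(x, input_layernorm_weight, rms_norm_eps));
        Vec out(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) out[i] = x[i] + h[i];

        // Gated MLP sub-block: down_proj(silu(gate_proj(n)) * up_proj(n)) plus residual.
        Vec n = rms_norm(out, post_attention_layernorm_weight, rms_norm_eps);
        Vec gate = linear(gate_proj, n);
        Vec up = linear(up_proj, n);
        for (std::size_t j = 0; j < gate.size(); ++j) gate[j] = silu(gate[j]) * up[j];
        Vec down = linear(down_proj, gate);
        for (std::size_t i = 0; i < out.size(); ++i) out[i] += down[i];
        return out;
    }
};
```

In this sketch `embed_dim` is the length of the input vector and `hidden_dim` is the row count of `gate_proj` and `up_proj`. The real class takes a `param_path` and a `layer_idx`, presumably to locate per-layer weights on disk; that loading step is omitted here.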