TinyChatEngine
Public Member Functions
| | Int4llamaDecoderLayer (std::string param_path, const struct model_config config, int layer_idx) |
| struct Int4llamaDecoderLayer_output | forward (std::string param_path, const struct Int4llamaDecoderLayer_input &input, int layer_idx) |
Public Attributes
| std::string | profile_name = "Int4llamaDecoderLayer" |
| int | embed_dim |
| int | num_attention_heads |
| int | hidden_dim |
| int | layer_idx |
| float | rms_norm_eps |
| Int4llamaAttention | attn |
| LlamaRMSNorm | input_layernorm |
| LlamaRMSNorm | post_attention_layernorm |
| Linear_FP_int4 | gate_proj |
| Linear_FP_int4 | down_proj |
| Linear_FP_int4 | up_proj |
| float * | input_layernorm_weight_ptr = nullptr |
| float * | post_attention_layernorm_ptr = nullptr |
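The attributes above (two RMSNorm modules, an attention block, and gate/up/down projections) match the usual LLaMA decoder layer layout. The listing does not show how `forward` combines them, so the following is only a minimal sketch of the standard pre-norm data flow, assuming that layout applies here. `ToyDecoderLayer`, `rms_norm`, `linear`, and `silu` are hypothetical names that are not part of TinyChatEngine; dense FP32 matrices stand in for `Linear_FP_int4`, and attention is replaced by an identity placeholder.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[out_feature][in_feature]

// RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * weight_i
Vec rms_norm(const Vec& x, const Vec& weight, float eps) {
    float sum_sq = 0.0f;
    for (float v : x) sum_sq += v * v;
    const float inv_rms =
        1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + eps);
    Vec y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * inv_rms * weight[i];
    return y;
}

// Dense FP32 stand-in for a quantized linear layer (assumption): y = W x
Vec linear(const Mat& w, const Vec& x) {
    Vec y(w.size(), 0.0f);
    for (std::size_t o = 0; o < w.size(); ++o)
        for (std::size_t i = 0; i < x.size(); ++i) y[o] += w[o][i] * x[i];
    return y;
}

// SiLU activation: v * sigmoid(v)
float silu(float v) { return v / (1.0f + std::exp(-v)); }

// Hypothetical toy layer; field names loosely mirror the attributes listed above.
struct ToyDecoderLayer {
    float rms_norm_eps;
    Vec input_layernorm_weight;           // size embed_dim
    Vec post_attention_layernorm_weight;  // size embed_dim
    Mat gate_proj;                        // hidden_dim x embed_dim
    Mat up_proj;                          // hidden_dim x embed_dim
    Mat down_proj;                        // embed_dim x hidden_dim

    // Placeholder only: real self-attention is out of scope for this sketch.
    Vec attention(const Vec& x) const { return x; }

    // Processes one token's hidden state of size embed_dim.
    Vec forward(const Vec& hidden) const {
        // 1. Pre-attention norm, attention, first residual connection.
        Vec normed = rms_norm(hidden, input_layernorm_weight, rms_norm_eps);
        Vec attn_out = attention(normed);
        Vec residual(hidden.size());
        for (std::size_t i = 0; i < hidden.size(); ++i)
            residual[i] = hidden[i] + attn_out[i];

        // 2. Post-attention norm, gated MLP: down(silu(gate(x)) * up(x)).
        Vec normed2 =
            rms_norm(residual, post_attention_layernorm_weight, rms_norm_eps);
        Vec gate = linear(gate_proj, normed2);
        Vec up = linear(up_proj, normed2);
        for (std::size_t i = 0; i < gate.size(); ++i)
            gate[i] = silu(gate[i]) * up[i];
        Vec mlp_out = linear(down_proj, gate);

        // 3. Second residual connection.
        for (std::size_t i = 0; i < residual.size(); ++i)
            residual[i] += mlp_out[i];
        return residual;
    }
};
```

In this sketch each sub-block writes back into a residual stream, so zeroing `down_proj` reduces the layer to the first residual step alone. That makes the two halves easy to check independently.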