The lifecycle of ggml tensor #1109

soccercheng · 2025-02-11T01:14:43Z

soccercheng
Feb 11, 2025

I'm working on enabling GGML on a PCIe card with a RISC-V AMP (asymmetric multiprocessing) architecture, and some of the tests and examples already run on my port.

I'm curious about the lifecycle of ggml tensors. In the CPU and RPC implementations, I can't find any clues about when and how ggml tensors are released on the RPC server side.

For example, in "examples/gpt-2", I see that "identical" ggml tensors are set multiple times with different sizes:

[add_ggml_tensor] in 1st graph compute

[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f73530070), [tensor(inp_tokens, 110fee3a0): buffer(0X110fee540), data(0X1e30983c0), data_size(0X10)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: A647E2F1-B5AD-40DE-B2A8-E6F60E4AA138, command=0X00000007
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735301f0), [tensor(position, 110fee1e0): buffer(0X110fee540), data(0X1e30983e0), data_size(0X10)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: 7DF94AE0-3B5F-4EC6-8A40-49EBA4AB2C6D, command=0X00000007
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735307f0), [tensor(KQ_mask, 110fedfe0): buffer(0X110fee540), data(0X1e3098400), data_size(0X40)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: 48A337B4-A73E-49EE-AC94-8C26B9D26BD8, command=0X00000007
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f73530370), [tensor(node_0, 110fede20): buffer(0X110fee540), data(0X1e30a2400), data_size(0X3000)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: B375DA85-462C-4ADF-8EAE-B1A2667B96C1, command=0X00000007
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735304f0), [tensor(node_1, 110fedc20): buffer(0X110fee540), data(0X1e30a6000), data_size(0X3000)]

[add_ggml_tensor] in 2nd graph compute

[MCU]GGML: [WARN] add_ggml_tensor: Existing tensor id(0x636f73530070), [tensor(inp_tokens, 110fee3a0): buffer(0X110fee540), data(0X1e30983c0), data_size(0X10)]
[MCU]GGML: [WARN] add_ggml_tensor: [serialized tensor(inp_tokens, 1e3098280, id(0x636f73530070)): buffer(0X636f7350e6a0), data(0X1e30983c0), data_size(0X14)]
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f73530070), [tensor(inp_tokens, 110fb0ce0): buffer(0X110fee540), data(0X1e30983c0), data_size(0X14)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: 77D0C8F9-E3CD-4744-84CC-5F764FB20580, command=0X00000007
[MCU]GGML: [WARN] add_ggml_tensor: Existing tensor id(0x636f735301f0), [tensor(position, 110fee1e0): buffer(0X110fee540), data(0X1e30983e0), data_size(0X10)]
[MCU]GGML: [WARN] add_ggml_tensor: [serialized tensor(position, 1e3098280, id(0x636f735301f0)): buffer(0X636f7350e6a0), data(0X1e30983e0), data_size(0X14)]
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735301f0), [tensor(position, 110fb0b20): buffer(0X110fee540), data(0X1e30983e0), data_size(0X14)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: 5092049B-7AFD-40A2-BE7C-7022C9C85793, command=0X00000007
[MCU]GGML: [WARN] add_ggml_tensor: Existing tensor id(0x636f735307f0), [tensor(KQ_mask, 110fedfe0): buffer(0X110fee540), data(0X1e3098400), data_size(0X40)]
[MCU]GGML: [WARN] add_ggml_tensor: [serialized tensor(KQ_mask, 1e3098280, id(0x636f735307f0)): buffer(0X636f7350e6a0), data(0X1e3098400), data_size(0XB4)]
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735307f0), [tensor(KQ_mask, 110fb0920): buffer(0X110fee540), data(0X1e3098400), data_size(0XB4)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: C767DCF0-273B-470E-A90E-8F683501007F, command=0X00000007
[MCU]GGML: [WARN] add_ggml_tensor: Existing tensor id(0x636f73530370), [tensor(node_0, 110fede20): buffer(0X110fee540), data(0X1e30a2400), data_size(0X3000)]
[MCU]GGML: [WARN] add_ggml_tensor: [serialized tensor(node_0, 1e3098280, id(0x636f73530370)): buffer(0X636f7350e6a0), data(0X1e30a2400), data_size(0X3C00)]
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f73530370), [tensor(node_0, 110fb0760): buffer(0X110fee540), data(0X1e30a2400), data_size(0X3C00)]
[MCU]MCU_WORKERS: [INFO] (0) Processing command task: BBCFC35A-CEB4-4A76-9068-906E8B5B9720, command=0X00000007
[MCU]GGML: [WARN] add_ggml_tensor: Existing tensor id(0x636f735304f0), [tensor(node_1, 110fedc20): buffer(0X110fee540), data(0X1e30a6000), data_size(0X3000)]
[MCU]GGML: [WARN] add_ggml_tensor: [serialized tensor(node_1, 1e3098280, id(0x636f735304f0)): buffer(0X636f7350e6a0), data(0X1e30a6000), data_size(0X3C00)]
[MCU]GGML: [DEBUG] add_ggml_tensor: New tensor id(0x636f735304f0), [tensor(node_1, 110fb0560): buffer(0X110fee540), data(0X1e30a6000), data_size(0X3C00)]

Could you please provide some hints for my reference?

slaren · 2025-02-11T12:44:47Z

slaren
Feb 11, 2025
Maintainer

For most backends, tensors are just a pointer within a ggml_backend_buffer. Normally, tensors are never released. There are some exceptions to this, but that's true for both the CPU and RPC backends.
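To illustrate the "pointer within a buffer" idea, here is a minimal, self-contained sketch. It does not use the ggml API: `SketchBuffer`, `SketchTensor`, and `place_tensor` are hypothetical names, and the 32-byte alignment is an assumption chosen only to mirror the 0x20 spacing between the `inp_tokens` and `position` data addresses in the log above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a backend buffer: one allocation that owns
// the storage for every tensor placed in it.
struct SketchBuffer {
    std::vector<std::uint8_t> storage;
    std::size_t used = 0;  // bytes handed out so far
    explicit SketchBuffer(std::size_t size) : storage(size) {}
};

// Hypothetical stand-in for a tensor: metadata plus a NON-owning pointer
// into the buffer. It has no destructor and is never freed on its own.
struct SketchTensor {
    const char   *name;
    std::uint8_t *data;
    std::size_t   size;
};

// Carve a tensor out of the buffer with simple bump allocation.
// Offsets are aligned relative to the start of the buffer.
SketchTensor place_tensor(SketchBuffer &buf, const char *name,
                          std::size_t size, std::size_t align = 32) {
    std::size_t offset = (buf.used + align - 1) / align * align;
    assert(offset + size <= buf.storage.size() && "buffer too small");
    buf.used = offset + size;
    return SketchTensor{name, buf.storage.data() + offset, size};
}
```

In this sketch, nothing releases an individual tensor. The data stays valid until the `SketchBuffer` is destroyed, which is the kind of lifetime coupling described above.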


soccercheng · 2025-02-17T09:36:06Z

soccercheng
Feb 17, 2025
Author

I have the following further questions:

Does the PC-side ggml runtime reuse the tensors of a previous ggml_cgraph as input tensors to the following ggml_cgraph?
It seems that in the RPC backend implementation, the ggml_cgraph and its tensors are dropped after computation. Does that mean that, in the PC-side ggml runtime, all the nodes (tensors) of a ggml_cgraph are "initialized" and "set" before the ggml_cgraph is submitted to the backend for execution?

The reason I am curious about the lifecycle of tensors is that, in my current backend implementation, to avoid re-serializing tensors during the set and ggml_cgraph submission stages, a tensor is serialized to the backend during the init stage only, along with its pointer address in the PC-side ggml runtime, which serves as the tensor ID. In the set and ggml_cgraph submission stages, only the tensor IDs are encapsulated and sent to the backend side.

So, I need to control the lifecycle of tensors on the backend side accordingly.
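One possible way to manage this on the backend side is sketched below. It is only an illustration under stated assumptions, not how ggml or its RPC backend works: `RemoteTensor` and `TensorRegistry` are hypothetical names, a re-used ID is assumed to mean "replace the old entry" (as the second-graph log above suggests, where the same ID reappears with a larger data_size), and entries are assumed to be dropped only when their owning buffer is freed.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical backend-side record for a tensor announced by the host.
struct RemoteTensor {
    std::uint64_t buffer_id;  // buffer the tensor data lives in
    std::uint64_t data;       // device address of the tensor data
    std::size_t   data_size;  // size in bytes
};

// Hypothetical registry keyed by the host-side tensor pointer (the "tensor ID").
class TensorRegistry {
public:
    // Register a tensor. If the ID is already known (the host re-used the
    // address for a new tensor), the stale entry is overwritten.
    // Returns true when an existing entry was replaced.
    bool add(std::uint64_t id, const RemoteTensor &t) {
        auto it = tensors_.find(id);
        if (it != tensors_.end()) {
            it->second = t;
            return true;
        }
        tensors_.emplace(id, t);
        return false;
    }

    // Look up a tensor by ID; returns nullptr when the ID is unknown.
    const RemoteTensor *find(std::uint64_t id) const {
        auto it = tensors_.find(id);
        return it == tensors_.end() ? nullptr : &it->second;
    }

    // Drop every tensor that lives in the given buffer, for example when
    // the host frees that buffer. Returns the number of entries removed.
    std::size_t release_buffer(std::uint64_t buffer_id) {
        std::size_t removed = 0;
        for (auto it = tensors_.begin(); it != tensors_.end();) {
            if (it->second.buffer_id == buffer_id) {
                it = tensors_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t size() const { return tensors_.size(); }

private:
    std::unordered_map<std::uint64_t, RemoteTensor> tensors_;
};
```

With this design the backend never needs an explicit "free tensor" command: an entry is either overwritten when its ID shows up again or removed together with its buffer. Whether a replaced entry may still be referenced by an in-flight graph is a question the sketch leaves open.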
