LibPATA (Parallel Asynchronous Tensors Abstraction) is a PoC experiment that creates a low-level tensor programming model built on the abstraction of a Tensor Operator Set. The developer controls parallelization through asynchronous execution, with built-in, seamless synchronization for data-dependent tasks.
We also provide an easy and powerful tool for splitting tensors into sub-tensor views, so that work can be divided across the data plane and executed in parallel.
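
To make the idea concrete, here is a minimal C++11 sketch of splitting a tensor into sub-tensor views and running one asynchronous task per view, followed by a task that depends on all of them. It does not use the LibPATA API: `TensorView`, `split_rows`, `scale`, and `scale_then_sum` are hypothetical names invented for this illustration, and plain `std::async` on the CPU stands in for whatever execution backend the library targets.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical non-owning view over a block of rows of a row-major 2D tensor.
// This is NOT the LibPATA type; it exists only for this illustration.
struct TensorView {
    float* data;       // first element covered by the view
    std::size_t rows;  // number of rows in the view
    std::size_t cols;  // number of columns per row
};

// Hypothetical helper: split a rows x cols tensor into `parts` row-block views.
// Leftover rows are spread over the first views.
std::vector<TensorView> split_rows(std::vector<float>& t, std::size_t rows,
                                   std::size_t cols, std::size_t parts) {
    assert(parts > 0 && t.size() == rows * cols);
    std::vector<TensorView> views;
    std::size_t base = rows / parts, extra = rows % parts, start = 0;
    for (std::size_t p = 0; p < parts; ++p) {
        std::size_t n = base + (p < extra ? 1 : 0);
        views.push_back(TensorView{t.data() + start * cols, n, cols});
        start += n;
    }
    return views;
}

// Scale every element of a view in place.
void scale(TensorView v, float k) {
    for (std::size_t i = 0; i < v.rows * v.cols; ++i) v.data[i] *= k;
}

// Launch one asynchronous task per view, then run a dependent reduction.
float scale_then_sum(std::vector<float>& t, std::size_t rows, std::size_t cols,
                     std::size_t parts, float k) {
    std::vector<std::future<void>> pending;
    for (const TensorView& v : split_rows(t, rows, cols, parts))
        pending.push_back(std::async(std::launch::async, scale, v, k));
    // The sum reads every scaled view, so wait for all producers first.
    for (std::future<void>& f : pending) f.get();
    float total = 0.0f;
    for (float x : t) total += x;
    return total;
}
```

The views do not overlap, so the scaling tasks can run concurrently without locking. In this sketch the wait on the futures is written out by hand, whereas the goal stated above is for that synchronization to be built in. On some toolchains `std::async` requires linking with `-pthread`.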
Experimentation is around:
- Async execution: can we abstract it?
- Parallelization control: how can the developer control parallel execution on an accelerator, independently of CPU parallelism?
- Fusion and pipelining: how can the developer enable function fusion and heterogeneous-hardware pipelining?

Requirements:
- cmake >= 3.19
- C++11
- gtest (master branch)
- libxsmm (master branch)
Share & Enjoy 😃