GSoC 2019 Report of Tensor Core: Linear Algebra - sympy/sympy GitHub Wiki

This page is served as a summary of the work that has been done and still needs to be done regarding the project of Tensor Core: Linear Algebra at the end of the GSoC 2019 program.

About me

I am Zhiqi KANG, a 4th-year undergraduate from Université de Technologie de Compiègne, France. My preferred programming language is Python and Java.

Project plan

The major goal of this project is to improve the newly implemented module Array by adding new features and ameliorating its efficiency.

  • Phase 1
    1. Replace dense array by sparse array to improve the efficiency while executing
    2. Implement new format of data structure for sparse arrays
  • Phase 2
    1. Modify operators to a NumPy-like behavior
    2. Implement lazy-evaluation for array operation
  • Phase 3
    1. Implement code generation for array module about different languages
    2. implement integration over arrays
    3. Extra work if the time permits

Some work has not yet been completed, especially for the phase 3 due to a lack of time after some extra implementation and encountered difficulties. I will continously contribute to the Array module to accomplish these tasks.

Pull request

Merged

Implementation of a new class ArrayComprehension

This task is aiming to offer new features to other students in the GSoC 2019 for some specific purposes, which was not included in the initial plan. This class is a sympified version of Python's list comprehension. The main purpose was to enable the use of symbol in a list comprehension. The example of Table in Methematica was being followed during the implementation. Besides, a complementary class ArrayComprehensionMap was implemented to map external functions such as lambda to a ArrayComprehension .

Related PRs:

#16845 Added ArrayComprehension class

#16969 Added functions to ArrayComprehension

#17408 Added support for lambda in ArrayComprehension

Replacing dense array by sparse array

In many cases in SymPy, a sparse array is cast to a dense array, which will lead to a extra storage cost and a loss of efficiency. The sparse array is a special type of array whose elements are mostly(generally more than 50%) zeros. As a result, storing only the non-zero value would provide an advantage in economising the space. A good efficiency can also be found in some operations since the algorithm was optimized.

Related PRs:

#16937 Added case in derive_by_arrayfor sparse array

#17000 Added cases for sprase arrays in tensorproduct

#17014 Added cases for sparse arrays in mul and rmul

#17026 Added cases in NDimArray operations for sparse array

#17035 Added cases in permutedims for sparse arrays

Implementation of a NumPy-like behavior

Since Array module is newly developed, it can accept modification to follow the example of standrd library such as NumPy. A more standardized syntax and behavior can bring out a good consistancy with other libraries and a improved effeciency while operating. In this task what was mostly considered is a NumPy-like behavior in getting items from an array.

Related PRs:

#17226 Modified behavior of getitem for arrays

#17282 Fixed bug of getitem for single item array

#17393 Fixed bug of index with slicing in Array

Implementation of a lazy-evaluation operators

A lazy-evaluation allows the value to be calculated and returned only if needed, which can save a great deal of storage compared with a eager-evaluation where all operations are executed at once and stored in the cache waiting to be used. Since the operations often involve high dimension Array, it is pratical to implement lazy operator in this module. Typically, a new class Flatten was created to flatten the array in a lazy-evalution way.

Related PRs:

#17282 New iter for Array module

In progress

Implementation of new data structure for sparse array

Although the sparse array can be efficient in storage, when it comes to vector operations, the default data structure in SymPy(termed as Dictionary of Keys) does not have the best performance. In order to solve the problem, some new formats of storage were introduced in order to provide more possibilities for specific operations: some would be advantageous in incremental construction(such as LIL), while others would be more efficient in vector products(such as CSR).

Related PRs:

#17160 Implemented LIL format for sparse array

Future work

  • Implementation of code generation of Array module
  • Implementation of integration over arrays

Conclusion

Participating in this program allows me to learn a lot of things, from how to contribute to a open source community to how to ameliorate the algorithm to improve the efficiency of code. I would like to thank my mentors Francesco and Sartaj who have been patient and helpful during the entire project. I would like to thank the SymPy community and all the members who have offered me help, it is a great pleasure to join such a excellent community.There is still a lot for me to learn and to practice. To sum up, I will keep contributing to SymPy in the future and always remember the precious experience that I have accumulated during the summer.