GSoC 2018 Application Ravicharan: Group Theory - sympy/sympy GitHub Wiki

About Me

Personal Info

Heading Details
Name Ravicharan
University Indian Institute of Information Technology, Allahabad
Email [email protected]
Github gandle RavicharanN
Blog https://ravicharann.github.io/blog
Time zone IST (UTC +5:30)

Personal Background

I am a sophomore at Indian Institute of Information Technology, Allahabad pursuing ‘Information Technology’. I was introduced to programming in the freshman year of my college and my clean interest in math from the childhood helped me get acquainted with programming easily. I’ve been programming for 2 years now.

I started off with practicing competitive style programming and then moved on towards developing mini projects of my own which helped me get hands on multiple programming languages other than C and C++.

I’m familiar with multiple programming languages, but primarily, my proficiency is inclined towards Python and C++.

I work on MacOS (Sierra currently) with Visual Studio Code as my primary code editor. I was introduced to VS Code a few months ago and started using it ever since then because of its wide range of available plugins and customization flexibility.

Technical Background

Courses opted at university

  • Theory of computation
  • Discrete Mathematics and Graph Theory
  • Mathematics - 1: Univariate and multivariate calculus, Sequence and series, Calculus on Vector Fields
  • Statistics and Probability
  • Data Structures and Algorithms
  • Object Oriented Methodologies
  • Advanced Linear Algebra
  • Convex Optimisation (Ongoing - Current semester)
  • Design and Analysis of Algorithms (Ongoing - Current Semester)

Additionally, I have a good knowledge of Group Theory and Abstract Algebra. So, there would be no problem in understanding the new parts of group theory if necessary.

Previous Projects

Here is a list of few projects that I’ve worked on in the past one year. A few other minor projects can be found in my Github repositories.

  • Ad.ai: (Github Link - Ad.ai)    An NPL integrated messenger bot which helps doctors predict the depression levels of a patient based on the conversation it has with the patient. Also, the bot collects the data from the user and predicts if the depression can is a recurrence or a non-recurrence depression using an existing dataset that was available online. Implemented as a part of a hackathon

  • StalkHub: (Github Link - Stalkhub)    A consolidated portal that keeps track of the information about the performance in programming platforms(like Codechef, Codeforces, SPOJ...etc) of all the members of an organization or a school. Major work on this has been done in the last summer.

  • Sensitive Data Trim: (Github Link - SenstiveDataTrim)    An application that uses Luhn Algorithm and other generalized representations to detect the sensitive data like credit card numbers and account numbers and trims them off. This was made as a competition held on the Topcoder platform

Patch Requirement

  • #14250 - Merged - Add sampling for Gamma inverse Disribution
  • #13986 - Merged - Add functionality for infinity norm of matrices
  • #14369 - Open - Add support for Quantity(name, dim, scale_factor, abbrev)
  • #13963 - Duplicated - Docstring Fix. First PR

I would be submitting more patches once the proposal is submitted as I would be free during the first week of April.

The project

Abstract

This project application mainly aims at improving the Group Theory part of the combinatorics module. It is majorly divided into 3 phases :

  • Phase1:    1. Aims in implementing an automaton for word reduction    2. Implementation of algorithms to compute and check the isomorphism between 2 groups.

  • Phase2:    1. Implementation of the modified Todd-Coxeter algorithm for subgroup presentation.    2. Polycyclic presentation of finitely presented groups.    3. Implementation of the power-conjugate presentation and the corresponding algorithms.

  • Phase3:    1. Rewriting systems for power-conjugate presentation    2. Computing the quotient of groups

Implementation

Implementation of an automaton for word reduction

The current implementation of the word reduction in rewriting systems is implemented by reducing words based on the set of rules (S). As suggested by valglad in GSOC 2017’s report, an automaton can be used for the word reduction method. In the previous implementation of the Knuth-Bendix process, most of the time is taken up in reducing the words to equivalent irreducible forms using the set of rules. Using the automaton for the word reduction method helps in completing the task of word reduction in a much more efficient way.

The implementation can be done in the following way: Let, S be the set of rules, which will be returned by the rewriting system method. A rule (x, y) in S says that an element 'x' that can be reduced to 'y'. Let, M be the finite state automata. The non-final states in M are essentially the left-hand side of the rules, because, the left-hand side of the rules can further be reduced.

Consider the example, w = abc with the set of rules S and an automaton M. Let b be the first occurrence of a left-hand rule in w from the given set of rules S. Thus this can be reduced into another form based on the corresponding rule. As mentioned in the handbook, when the word w is fed into the automaton M, and when we read the symbol b, we find a dead state for the prefix ab of w. Then we locate a rule (b, x) that belongs to S such that b can be reduced to another symbol x.

This process is put in a loop until we get a ‘w’ that can’t be reduced further. Thus, the reduced word is obtained. An explanation for the usage of an automaton in word reduction has been given in detail in the section 13.1.3 of the Handbook.

Isomorphism between two groups

This could essentially be split into 2 parts. The first one is to check whether the given two groups are isomorphic to each other. A random_isomorphism_test method is mentioned in the section 11.4.2 of the Handbook to check if the given groups are isomorphic. However, for better efficiency, we can directly check if the isomorphism exists between the groups for certain special cases. For example, isomorphism could directly be decided between two prime groups of the same order.

The second part is to compute the isomorphism between two groups if at all there exists an isomorphism between them, else return None. GAP implements a method called IsomorphismGroups for this purpose. Similarly, a group_isomorphism method can be implemented in Sympy for this purpose. Additionally, a method is_isomorphism will be implemented to check for the isomorphism before the computation. If the is_isomorphism method returns False, the group_ismorphism will return None.

The implementation of the group_isomorphism method can be similar to that of the algorithm used by the GAP engine.

gap> G:=DihedralGroup(8);
<pc group of size 8 with 3 generators>
gap> H:=Group( (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) );
Group([ (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) ])
gap> IsomorphismGroups(H,G);
[ (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) ] -> [ f1*f3, f1*f2 ]
gap> T:=Group([ (1,7,4,3)(2,8,6,5), (1,4)(2,6)(3,7)(5,8) ]);
Group([ (1,7,4,3)(2,8,6,5), (1,4)(2,6)(3,7)(5,8) ])
gap> IsomorphismGroups(T,G);
fail

GAP implements an efficient way to verify the isomorphism between two small groups. Similar to that of GAP, an IdGroups method can be implemented in Sympy which computes the isomorphism for smaller groups. The implementation can be done by first checking the size of the group and then verifying the isomorphism using a suitable algorithm between the two (Random Isomorphism test & IdGroups) depending upon the size of the group.

But, the implementation of id_groups requires a dependency on the SmallGroup library of GAP. Currently, GAP's way of doing this is that it fetches and returns the kth group of a particular order from the predefined catalog they have. But, implementation of a custom database for the Sympy engine could really take up a large part of the timeline. Since we already have another alternative for checking the isomorphism, I have added a buffer period in the timeline, in which I can try to implement the IdGroup method if we could somehow make use of the GAP catalog of the SmallGroup class else I could implement a few other things like I said in the 'BufferPeriod' section.

Implementation of the modified Todd-Coxeter algorithm:

Let G be the given group and H be the subgroup of the given group. Now the modified Todd-Coxeter algorithm is used to compute the presentation of the given generating set rather than the Schreier set. An algorithm has been provided for the modified Todd-Coxeter algorithm in Section 5.3.2 of the Handbook.

An attempt has been made to implement this method in the 2016’s GSoC project but was left out after then. Since this is a pretty straightforward algorithm, the implementation of the algorithm can be done right from scratch rather than modifying the previous patch.

Polycyclic presentation of finitely generated groups:

Implement a Polycyclic group - PcGroup class. A few methods of the pc-groups require similar implementation as that of the FpGroups, so, PcGroup can be implemented as a child of the FpGroup such that it will inherit the methods required.

Additionally, algorithms for a few methods like computing the polycyclic generating sequence of a polycyclic group require the given group to be an FpGroup. So it would be better to implement the PcGroup such that it inherits the properties of the FpGroup class

Implement a method compute_pcgs to compute the polycyclic generating sequence of a given group. The method returns a sequence of FreeGroupElements which serve as the generating sequence for the polycyclic group. An additional method is_polycyclic will be implemented to check if the Group passed into the compute_pcgs method is a polycyclic group. The compute_pcgs method will throw an error if the group that is sent as an argument into the method is not polycyclic.

The compute_pcgs method requires an implementation of the polycyclic quotient method, mentioned in this paper, to compute the quotients, as per the definition of the polycyclic sequences. Additionally a method is_pcgs can be implemented to check if the given series generate a polycyclic group.

>>G = FpGroup((1,2,3,4),(1,2));   
>>p = Pcgs(G);
[ (3,4), (2,4,3), (1,4)(2,3), (1,2)(3,4) ]
>>IsPcgs( p );
true
>> p[1];
(3,4)

A few elementary methods can be implemented once we have the polycyclic generating sequence of the group G. A few of the methods that can be implemented are, relative_order, is_finite_order_pcgs, is_prime_order_pcgs.

Relative Order can be computed by the straight implementation of the definition as mentioned in the handbook, however, the implementation can be referred from GAP source as well. The is_finite_order_pcgs method can be computed by checking if all the elements in the relative order of the given Polycyclic group are finite. And similarly, if all the elements in the relative order list are prime it is a Prime ordered Pcgs.

>> G = Group( (1,2,3,4),(1,2) );
>> p = Pcgs(G);
>> RelativeOrders(p);
[ 2, 3, 2, 2 ]
>> IsFiniteOrdersPcgs(p);
True
>> IsPrimeOrders(p);
True 

By Theorem 8.8 in the Handbook, every polycyclic sequence can represent a unique polycyclic group. Thus a method will also be implemented to produce the polycyclic series when a sequence is given. This has already been implemented in GAP and it can be replicated in the Sympy engine.

Implement a method pc-presentation, power conjugate presentation, that takes in a polycyclic group and a set of exponents as values and returns a 4-tuple which contains the polycyclic presentation, polycyclic sequence, relative orders and the setOfExponents.

Implement the collection algorithm. This algorithm provides an efficient way to solve the word problem. The central idea is to eliminate all the minimal uncollected subwords by applying the polycyclic relations, the final word that we get once the algorithm terminates will be collected word. A detailed implementation has been mentioned in section 8.1.3 of the Handbook.

Rewriting systems for polycyclic groups:

Rewriting systems has already been implemented but it was not specific to the polycyclic groups. A method is_consistent shall be implemented to check if the given power-conjugate presentation is consistent. Checking the consistency of a power-conjugate presentation is a special case of confluence testing in the Knuth-Bendix completion process. The implementation is mentioned in detail in Section 12.4 of the Handbook.

Computing the Quotients of groups

Implementation of the Quotient algorithms to compute the quotient groups of the finitely presented groups. All the quotient methods can be implemented as methods in the FpGroup class.

A maximal_abelian_quotient(free_group) method will be implemented to compute the maximal abelian quotient of a given free_group. Gap provides an implementation of this method as shown below:

gap> f:=FreeGroup(2);;fp:=f/[f.1^6,f.2^6,(f.1*f.2)^12];
<fp group on the generators [ f1, f2 ]>
gap> hom:=MaximalAbelianQuotient(fp);
[ f1, f2 ] -> [ f1, f3 ]

An epimorphism_group method will be implemented which computes an epimorphism from an FpGroup ‘F’ to a p-group ‘P’ with a class ‘C’ which will be a quotient for the Fp-group. This method can be implemented as epimorphism(fpGrp, p-group, class=None). Here, the default class will be None and when no class is passed into the function, the computed epimorphism will be the largest finite p-group quotient. A detailed implementation of this algorithm is given in Section 9.1.1 of the Handbook.

A method epimorphism_nilpotent_quotient( FpGrp, n ) called is implemented in GAP which returns an epimorphism on the class n finite nilpotent quotient of the finitely presented group Fpgrp. If n is omitted, the largest finite nilpotent quotient is returned.

Buffer Period

As mentioned previously, a buffer period is given just as a contingency if we decide not to implement the id_group method and the SmallGroups class. This period will be a half a week before the Phase-1 evaluation. I noticed that the group theory part of the combinatorics module lacks proper documentation in specific areas. This can be improved during this period. Also, the corresponding docstrings with the implementation examples can be provided for all the classes and methods included.

Proposed Timeline

Here is the tentative timeline that goes in line with the plan of my implementation.

Community Bonding Period: April 23rd - May 14th

Goal: Interact with the community. Look into the raw implementation of a few methods that GAP implements.

  • I will be having the end semester examinations between April 27th and May 8th. However, I can guarantee that I can stay active on the Gitter channel and interact with my mentor.
  • Setup a blog to post the work done during GSoC period from time to time.

Phase-1


Week 1: May 14th - May 20th

Goal: Implement the automaton for word reduction.

  • As mentioned, an automaton for word reduction can be implemented during this period.
  • There are ways other mentioned to improve the Knuth-Benedix completion process algorithm. This can be looked into if there is sufficient time after the implementation of the automaton.
  • Test the code using GAP and update the blog.

Week 2: May 21st - May 28th

Goal: Implement an IsomorphismGroups method for the computation of Isomorphism between two groups.

  • Look into GAP implementation of the IsomorphismGroups method.
  • Integrate this method into the Sympy engine.
  • Finish up the work, test the code and update the blog.

Week 3 & 4: May 29th - June 11th

Goal: Complete the whole of Isomorphism between two groups section

  • Implement the random isomorphism test.
  • Try integrating the SmallGroups catalog of GAP into Sympy.Implement the contingency work mentioned in the Buffer Period section.
  • Test the code using GAP. Update the blog.
  • Submit the current status of the project for phase-1 evaluation.

Phase 2


Week 5: June 12th - June 19th

Goal: Implementation of modified Todd-Coxeter algorithm

  • Implement the modified Todd-Coxeter algorithm for subgroup representation.
  • Test the code using GAP, Update the blog.

Week 6 and 7: June 20 - July 4th

Goal: Polycyclic presentation of finite group and related algorithms

  • Implement a PcGroup class as a child of the FpGroup class.
  • Implement methods to compute the polycyclic generating sequence for a given group.
  • Implement an is_polycyclic method to check if the given group is polycyclic.
  • Implement other elementary methods like RelativeOrder, isPrimeOrder...etc
  • Test the code using GAP. Update the blog.

Week 8: July 3rd - July 11th

Goal: Implement power conjugate presentation.

  • Implement other elementary methods of Pc-Groups as in the GAP library.
  • Implement methods for computing the power conjugate presentations.
  • Implement the collection algorithm to solve the word problem.
  • Test the code using GAP. Update the blog.
  • Submit the project for phase 2 evaluation.

Phase 3


Week 9: July 12th - July 18th

Goal: Rewriting system for polycyclic groups

  • Use pc-presentations for rewriting system.
  • Implement a method is_confluent to check if the given presentation is consistent.
  • Test the code using GAP. Update the blog.

Week 10 & 11: July 19th - July 31st

Goal: Implementation of Quotient groups

  • Implement the quotient methods using the corresponding algorithms.
  • Implement the MaximalAbelianQuotient(free_group) method for the computation of the abelian quotient.
  • Implement the epimorphism group method Epimorphism(fpGrp, p-group, class=None).
  • Finish up implementing the other quotient methods using the algorithms mentioned.
  • Test the code. Update the blog.

Final Week

  • Complete the unfinished work, if any.
  • Tidy up the code quality and documentation of the code implemented during the GSoC period.
  • Submit the code for final evaluation.

A PR will be sent every week with the corresponding implementation of that week's plan along with the unit tests.

Commitment to the project

I would be having my end semester examination during the community bonding period. However, I will have the time to interact with the community and my mentor.

My summer break will start by the time the community bonding period ends and I can devote my complete to the project during my holidays. I can work for 7-8 hours during the weekdays and 3-4 hours during the weekends. As a whole, I can spend 40-50 hours in a week.

Post GSoc

  • Continue contributing to the community.
  • Try improving the other sub-modules of the library.
  • Review a few patches from time to time.
  • I recently looked into the idea of one-one pattern matching using an external library, MatchPy. I can as well try contributing to this after the GSoC period ends.

References