GSoC 2018 Application Ravicharan: Group Theory - sympy/sympy GitHub Wiki
About Me
Personal Info
Heading | Details |
---|---|
Name | Ravicharan |
University | Indian Institute of Information Technology, Allahabad |
[email protected] | |
Github gandle | RavicharanN |
Blog | https://ravicharann.github.io/blog |
Time zone | IST (UTC +5:30) |
Personal Background
I am a sophomore at Indian Institute of Information Technology, Allahabad pursuing ‘Information Technology’. I was introduced to programming in the freshman year of my college and my clean interest in math from the childhood helped me get acquainted with programming easily. I’ve been programming for 2 years now.
I started off with practicing competitive style programming and then moved on towards developing mini projects of my own which helped me get hands on multiple programming languages other than C and C++.
I’m familiar with multiple programming languages, but primarily, my proficiency is inclined towards Python and C++.
I work on MacOS (Sierra currently) with Visual Studio Code as my primary code editor. I was introduced to VS Code a few months ago and started using it ever since then because of its wide range of available plugins and customization flexibility.
Technical Background
Courses opted at university
- Theory of computation
- Discrete Mathematics and Graph Theory
- Mathematics - 1: Univariate and multivariate calculus, Sequence and series, Calculus on Vector Fields
- Statistics and Probability
- Data Structures and Algorithms
- Object Oriented Methodologies
- Advanced Linear Algebra
- Convex Optimisation (Ongoing - Current semester)
- Design and Analysis of Algorithms (Ongoing - Current Semester)
Additionally, I have a good knowledge of Group Theory and Abstract Algebra. So, there would be no problem in understanding the new parts of group theory if necessary.
Previous Projects
Here is a list of few projects that I’ve worked on in the past one year. A few other minor projects can be found in my Github repositories.
-
Ad.ai: (Github Link - Ad.ai) An NPL integrated messenger bot which helps doctors predict the depression levels of a patient based on the conversation it has with the patient. Also, the bot collects the data from the user and predicts if the depression can is a recurrence or a non-recurrence depression using an existing dataset that was available online. Implemented as a part of a hackathon
-
StalkHub: (Github Link - Stalkhub) A consolidated portal that keeps track of the information about the performance in programming platforms(like Codechef, Codeforces, SPOJ...etc) of all the members of an organization or a school. Major work on this has been done in the last summer.
-
Sensitive Data Trim: (Github Link - SenstiveDataTrim) An application that uses Luhn Algorithm and other generalized representations to detect the sensitive data like credit card numbers and account numbers and trims them off. This was made as a competition held on the Topcoder platform
Patch Requirement
- #14250 - Merged - Add sampling for Gamma inverse Disribution
- #13986 - Merged - Add functionality for infinity norm of matrices
- #14369 - Open - Add support for Quantity(name, dim, scale_factor, abbrev)
- #13963 - Duplicated - Docstring Fix. First PR
I would be submitting more patches once the proposal is submitted as I would be free during the first week of April.
The project
Abstract
This project application mainly aims at improving the Group Theory part of the combinatorics module. It is majorly divided into 3 phases :
-
Phase1: 1. Aims in implementing an automaton for word reduction 2. Implementation of algorithms to compute and check the isomorphism between 2 groups.
-
Phase2: 1. Implementation of the modified Todd-Coxeter algorithm for subgroup presentation. 2. Polycyclic presentation of finitely presented groups. 3. Implementation of the power-conjugate presentation and the corresponding algorithms.
-
Phase3: 1. Rewriting systems for power-conjugate presentation 2. Computing the quotient of groups
Implementation
Implementation of an automaton for word reduction
The current implementation of the word reduction in rewriting systems is implemented by reducing words based on the set of rules (S). As suggested by valglad in GSOC 2017’s report, an automaton can be used for the word reduction method. In the previous implementation of the Knuth-Bendix process, most of the time is taken up in reducing the words to equivalent irreducible forms using the set of rules. Using the automaton for the word reduction method helps in completing the task of word reduction in a much more efficient way.
The implementation can be done in the following way: Let, S be the set of rules, which will be returned by the rewriting system method. A rule (x, y) in S says that an element 'x' that can be reduced to 'y'. Let, M be the finite state automata. The non-final states in M are essentially the left-hand side of the rules, because, the left-hand side of the rules can further be reduced.
Consider the example, w = abc
with the set of rules S and an automaton M. Let b be the first occurrence of a left-hand rule in w from the given set of rules S. Thus this can be reduced into another form based on the corresponding rule. As mentioned in the handbook, when the word w is fed into the automaton M, and when we read the symbol b, we find a dead state for the prefix ab
of w. Then we locate a rule (b, x)
that belongs to S such that b
can be reduced to another symbol x
.
This process is put in a loop until we get a ‘w’ that can’t be reduced further. Thus, the reduced word is obtained. An explanation for the usage of an automaton in word reduction has been given in detail in the section 13.1.3 of the Handbook.
Isomorphism between two groups
This could essentially be split into 2 parts. The first one is to check whether the given two groups are isomorphic to each other. A random_isomorphism_test method is mentioned in the section 11.4.2 of the Handbook to check if the given groups are isomorphic. However, for better efficiency, we can directly check if the isomorphism exists between the groups for certain special cases. For example, isomorphism could directly be decided between two prime groups of the same order.
The second part is to compute the isomorphism between two groups if at all there exists an isomorphism between them, else return None
. GAP implements a method called IsomorphismGroups
for this purpose. Similarly, a group_isomorphism
method can be implemented in Sympy for this purpose. Additionally, a method is_isomorphism
will be implemented to check for the isomorphism before the computation. If the is_isomorphism
method returns False
, the group_ismorphism
will return None
.
The implementation of the group_isomorphism
method can be similar to that of the algorithm used by the GAP engine.
gap> G:=DihedralGroup(8);
<pc group of size 8 with 3 generators>
gap> H:=Group( (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) );
Group([ (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) ])
gap> IsomorphismGroups(H,G);
[ (1,5)(2,3)(4,8)(6,7), (1,2)(3,8)(4,6)(5,7) ] -> [ f1*f3, f1*f2 ]
gap> T:=Group([ (1,7,4,3)(2,8,6,5), (1,4)(2,6)(3,7)(5,8) ]);
Group([ (1,7,4,3)(2,8,6,5), (1,4)(2,6)(3,7)(5,8) ])
gap> IsomorphismGroups(T,G);
fail
GAP implements an efficient way to verify the isomorphism between two small groups. Similar to that of GAP, an IdGroups
method can be implemented in Sympy which computes the isomorphism for smaller groups. The implementation can be done by first checking the size of the group and then verifying the isomorphism using a suitable algorithm between the two (Random Isomorphism test & IdGroups) depending upon the size of the group.
But, the implementation of id_groups
requires a dependency on the SmallGroup
library of GAP. Currently, GAP's way of doing this is that it fetches and returns the kth group of a particular order from the predefined catalog they have. But, implementation of a custom database for the Sympy engine could really take up a large part of the timeline. Since we already have another alternative for checking the isomorphism, I have added a buffer period in the timeline, in which I can try to implement the IdGroup
method if we could somehow make use of the GAP catalog of the SmallGroup
class else I could implement a few other things like I said in the 'BufferPeriod' section.
Implementation of the modified Todd-Coxeter algorithm:
Let G be the given group and H be the subgroup of the given group. Now the modified Todd-Coxeter algorithm is used to compute the presentation of the given generating set rather than the Schreier set. An algorithm has been provided for the modified Todd-Coxeter algorithm in Section 5.3.2 of the Handbook.
An attempt has been made to implement this method in the 2016’s GSoC project but was left out after then. Since this is a pretty straightforward algorithm, the implementation of the algorithm can be done right from scratch rather than modifying the previous patch.
Polycyclic presentation of finitely generated groups:
Implement a Polycyclic group - PcGroup
class. A few methods of the pc-groups require similar implementation as that of the FpGroups, so, PcGroup
can be implemented as a child of the FpGroup
such that it will inherit the methods required.
Additionally, algorithms for a few methods like computing the polycyclic generating sequence of a polycyclic group require the given group to be an FpGroup. So it would be better to implement the PcGroup
such that it inherits the properties of the FpGroup
class
Implement a method compute_pcgs
to compute the polycyclic generating sequence of a given group. The method returns a sequence of FreeGroupElements
which serve as the generating sequence for the polycyclic group. An additional method is_polycyclic
will be implemented to check if the Group passed into the compute_pcgs
method is a polycyclic group. The compute_pcgs
method will throw an error if the group that is sent as an argument into the method is not polycyclic.
The compute_pcgs
method requires an implementation of the polycyclic quotient method, mentioned in this paper, to compute the quotients, as per the definition of the polycyclic sequences. Additionally a method is_pcgs
can be implemented to check if the given series generate a polycyclic group.
>>G = FpGroup((1,2,3,4),(1,2));
>>p = Pcgs(G);
[ (3,4), (2,4,3), (1,4)(2,3), (1,2)(3,4) ]
>>IsPcgs( p );
true
>> p[1];
(3,4)
A few elementary methods can be implemented once we have the polycyclic generating sequence of the group G. A few of the methods that can be implemented are, relative_order
, is_finite_order_pcgs
, is_prime_order_pcgs
.
Relative Order can be computed by the straight implementation of the definition as mentioned in the handbook, however, the implementation can be referred from GAP source as well. The is_finite_order_pcgs
method can be computed by checking if all the elements in the relative order of the given Polycyclic group are finite. And similarly, if all the elements in the relative order list are prime it is a Prime ordered Pcgs.
>> G = Group( (1,2,3,4),(1,2) );
>> p = Pcgs(G);
>> RelativeOrders(p);
[ 2, 3, 2, 2 ]
>> IsFiniteOrdersPcgs(p);
True
>> IsPrimeOrders(p);
True
By Theorem 8.8 in the Handbook, every polycyclic sequence can represent a unique polycyclic group. Thus a method will also be implemented to produce the polycyclic series when a sequence is given. This has already been implemented in GAP and it can be replicated in the Sympy engine.
Implement a method pc-presentation
, power conjugate presentation, that takes in a polycyclic group and a set of exponents as values and returns a 4-tuple which contains the polycyclic presentation, polycyclic sequence, relative orders and the setOfExponents.
Implement the collection algorithm. This algorithm provides an efficient way to solve the word problem. The central idea is to eliminate all the minimal uncollected subwords by applying the polycyclic relations, the final word that we get once the algorithm terminates will be collected word. A detailed implementation has been mentioned in section 8.1.3 of the Handbook.
Rewriting systems for polycyclic groups:
Rewriting systems has already been implemented but it was not specific to the polycyclic groups. A method is_consistent
shall be implemented to check if the given power-conjugate presentation is consistent. Checking the consistency of a power-conjugate presentation is a special case of confluence testing in the Knuth-Bendix completion process. The implementation is mentioned in detail in Section 12.4 of the Handbook.
Computing the Quotients of groups
Implementation of the Quotient algorithms to compute the quotient groups of the finitely presented groups. All the quotient methods can be implemented as methods in the FpGroup
class.
A maximal_abelian_quotient(free_group)
method will be implemented to compute the maximal abelian quotient of a given free_group. Gap provides an implementation of this method as shown below:
gap> f:=FreeGroup(2);;fp:=f/[f.1^6,f.2^6,(f.1*f.2)^12];
<fp group on the generators [ f1, f2 ]>
gap> hom:=MaximalAbelianQuotient(fp);
[ f1, f2 ] -> [ f1, f3 ]
An epimorphism_group
method will be implemented which computes an epimorphism from an FpGroup ‘F’ to a p-group ‘P’ with a class ‘C’ which will be a quotient for the Fp-group. This method can be implemented as epimorphism(fpGrp, p-group, class=None)
. Here, the default class will be None
and when no class is passed into the function, the computed epimorphism will be the largest finite p-group quotient. A detailed implementation of this algorithm is given in Section 9.1.1 of the Handbook.
A method epimorphism_nilpotent_quotient( FpGrp, n )
called is implemented in GAP which returns an epimorphism on the class n finite nilpotent quotient of the finitely presented group Fpgrp. If n is omitted, the largest finite nilpotent quotient is returned.
Buffer Period
As mentioned previously, a buffer period is given just as a contingency if we decide not to implement the id_group
method and the SmallGroups
class. This period will be a half a week before the Phase-1 evaluation. I noticed that the group theory part of the combinatorics module lacks proper documentation in specific areas. This can be improved during this period. Also, the corresponding docstrings with the implementation examples can be provided for all the classes and methods included.
Proposed Timeline
Here is the tentative timeline that goes in line with the plan of my implementation.
Community Bonding Period: April 23rd - May 14th
Goal: Interact with the community. Look into the raw implementation of a few methods that GAP implements.
- I will be having the end semester examinations between April 27th and May 8th. However, I can guarantee that I can stay active on the Gitter channel and interact with my mentor.
- Setup a blog to post the work done during GSoC period from time to time.
Phase-1
Week 1: May 14th - May 20th
Goal: Implement the automaton for word reduction.
- As mentioned, an automaton for word reduction can be implemented during this period.
- There are ways other mentioned to improve the Knuth-Benedix completion process algorithm. This can be looked into if there is sufficient time after the implementation of the automaton.
- Test the code using GAP and update the blog.
Week 2: May 21st - May 28th
Goal: Implement an IsomorphismGroups
method for the computation of Isomorphism between two groups.
- Look into GAP implementation of the
IsomorphismGroups
method. - Integrate this method into the Sympy engine.
- Finish up the work, test the code and update the blog.
Week 3 & 4: May 29th - June 11th
Goal: Complete the whole of Isomorphism between two groups section
- Implement the random isomorphism test.
- Try integrating the
SmallGroups
catalog of GAP into Sympy.Implement the contingency work mentioned in the Buffer Period section. - Test the code using GAP. Update the blog.
- Submit the current status of the project for phase-1 evaluation.
Phase 2
Week 5: June 12th - June 19th
Goal: Implementation of modified Todd-Coxeter algorithm
- Implement the modified Todd-Coxeter algorithm for subgroup representation.
- Test the code using GAP, Update the blog.
Week 6 and 7: June 20 - July 4th
Goal: Polycyclic presentation of finite group and related algorithms
- Implement a
PcGroup
class as a child of theFpGroup
class. - Implement methods to compute the polycyclic generating sequence for a given group.
- Implement an
is_polycyclic
method to check if the given group is polycyclic. - Implement other elementary methods like
RelativeOrder
,isPrimeOrder
...etc - Test the code using GAP. Update the blog.
Week 8: July 3rd - July 11th
Goal: Implement power conjugate presentation.
- Implement other elementary methods of Pc-Groups as in the GAP library.
- Implement methods for computing the power conjugate presentations.
- Implement the collection algorithm to solve the word problem.
- Test the code using GAP. Update the blog.
- Submit the project for phase 2 evaluation.
Phase 3
Week 9: July 12th - July 18th
Goal: Rewriting system for polycyclic groups
- Use pc-presentations for rewriting system.
- Implement a method
is_confluent
to check if the given presentation is consistent. - Test the code using GAP. Update the blog.
Week 10 & 11: July 19th - July 31st
Goal: Implementation of Quotient groups
- Implement the quotient methods using the corresponding algorithms.
- Implement the
MaximalAbelianQuotient(free_group)
method for the computation of the abelian quotient. - Implement the epimorphism group method
Epimorphism(fpGrp, p-group, class=None)
. - Finish up implementing the other quotient methods using the algorithms mentioned.
- Test the code. Update the blog.
Final Week
- Complete the unfinished work, if any.
- Tidy up the code quality and documentation of the code implemented during the GSoC period.
- Submit the code for final evaluation.
A PR will be sent every week with the corresponding implementation of that week's plan along with the unit tests.
Commitment to the project
I would be having my end semester examination during the community bonding period. However, I will have the time to interact with the community and my mentor.
My summer break will start by the time the community bonding period ends and I can devote my complete to the project during my holidays. I can work for 7-8 hours during the weekdays and 3-4 hours during the weekends. As a whole, I can spend 40-50 hours in a week.
Post GSoc
- Continue contributing to the community.
- Try improving the other sub-modules of the library.
- Review a few patches from time to time.
- I recently looked into the idea of one-one pattern matching using an external library, MatchPy. I can as well try contributing to this after the GSoC period ends.
References
- [1] Derek F.Holt, Bettina Fick, Eamonn A.O'Brian. Handbook of Computational Group Theory.
- [2] GAP reference manual
- [3] Groupprops subwiki for rewriting systems
- [4] Paper on Polycyclic Quotient Algorithm
- [5] Valeriia Gladkova's GSoC'17 report
- [6] Valeriia Gladkova's GSoC'17 proposal
- [7] Gaurav Dhingra's proposal