Research on GitHub Repositories - bounswe/2021SpringGroup9 GitHub Wiki
We explored GitHub to find useful repositories, on any subject, that follow good software engineering practices.
Below is the documentation of our research:
Transformers
Backed by two of the most popular deep learning libraries, PyTorch and TensorFlow, Transformers offers its users easy-to-use, state-of-the-art Natural Language Processing (NLP) models. It is a “must know” repository for anyone who is interested in or working on ML/IE related subjects.
- Since Information Retrieval and ML are crucial technologies in computer science, almost every project needs to use these techniques and pre-trained models.
- Transformers allows users to use these pre-trained models easily, thanks to its clear README and its tutorial.
- The README even offers a couple of online demos showing how to use the features properly, and a quick tour with example code snippets showing how to get started with Transformers.
- Model architectures and the relevant paper for each of them are also provided in the README.
Detectron
Detectron is an object detection system, written in Python and built on Caffe2, that detects objects and states what they are.
- It includes trained models here: Model Zoo
- A setup guide is provided here: Installation
- There is a guide about the general use of it here: Getting Started
- References for the project are stated in README.md here: References
There is now a newer version of Detectron named Detectron2, which is a ground-up rewrite.
- It uses PyTorch as its deep learning framework instead of Caffe2 and is faster.
- Its documentation is good and more extensive than Detectron's: Detectron 2 Documentation
Projects such as these inspire further research and many new projects, since they have a wide range of uses. Some that build on the original Detectron are listed here: Research projects
Z3 Theorem Prover
Z3 Theorem Prover is developed at Microsoft as a powerful tool for software verification and theorem proving. The project is strongly research-oriented, which makes it stand out. Z3 can solve a wide variety of mathematical problems very quickly, and the team is actively working to extend its capabilities. The prover has bindings for multiple programming languages, such as Python, C++, Java and Julia. I discovered Z3 when I was trying to develop a solver for a puzzle similar to Sudoku.
- The README explains the installation for the different programming languages well.
- There are links to papers, slides, blogs and social media.
- Written and video tutorials are easily accessible.
- The repository contains contribution guidelines, an FAQ and documentation pages.
- There's a webpage to try the system online.
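To give a feel for the Python binding mentioned above, here is a minimal sketch that solves a small Sudoku-like puzzle: a 3x3 magic square. It assumes the `z3-solver` package is installed; the function name `solve_magic_square` and the choice of puzzle are ours for illustration and do not come from the Z3 repository.

```python
def solve_magic_square():
    """Fill a 3x3 grid with the numbers 1..9 so every row, column and diagonal sums to 15."""
    # Assumption: the z3-solver package is installed. The import is kept
    # inside the function so the module still loads when z3 is missing.
    from z3 import Distinct, Int, Solver, Sum, sat

    # One integer variable per cell (the variable names are illustrative).
    cells = [[Int(f"cell_{r}_{c}") for c in range(3)] for r in range(3)]
    flat = [cell for row in cells for cell in row]

    solver = Solver()
    for cell in flat:
        solver.add(cell >= 1, cell <= 9)   # each cell holds a digit 1..9
    solver.add(Distinct(*flat))            # no digit is used twice

    for i in range(3):
        solver.add(Sum(cells[i]) == 15)                          # row i
        solver.add(Sum([cells[r][i] for r in range(3)]) == 15)   # column i
    solver.add(Sum([cells[i][i] for i in range(3)]) == 15)       # main diagonal
    solver.add(Sum([cells[i][2 - i] for i in range(3)]) == 15)   # anti-diagonal

    if solver.check() != sat:
        return None
    model = solver.model()
    return [[model[cell].as_long() for cell in row] for row in cells]
```

Calling `solve_magic_square()` returns one valid grid as a list of three rows. Which of the possible solutions comes back is up to the solver.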
Ruffle
Ruffle is a project that aims to implement an emulator of the recently discontinued Adobe Flash Player using WebAssembly. Using newer and safer technologies like WebAssembly and Rust helps avoid the security issues that were abundant in Flash. I found this project while searching for a way to play Age of Wars (2007) again.
- Documentation is available in the GitHub wiki section.
- There is a demo link in the README file.
- GitHub Discussions are actively used.
- It has an active Discord server for contributors and users.
- Bonus points for using Rust.
Awesome Python
Awesome Python is a repository consisting of a curated list of awesome Python libraries, frameworks, software and resources. Its motto is "Life is short, you need Python." which I strongly agree with. I think that Python provides a lot of useful libraries, and although I use Python frequently, there are many libraries, software packages and frameworks that I don't know about. This repository helps its users find the most suitable Python frameworks and libraries for their projects.
When you open the repository, it welcomes you with a detailed yet easy-to-read README file. Here is what I liked about the README:
- In the first line, it explains what the repository does in one simple sentence. This helps users understand the purpose of the repository as soon as they start reading.
- It has a table of contents so that users can find the frameworks and libraries they need by choosing a topic. For example, if I want a tool for Data Visualization, the README directs me to the list of Data Visualization tools, which contains Matplotlib, Pygal, Dash and many more.
- It also has a GitHub Pages site, which looks simple and contains a lot of useful information.
I think every Python user should keep this repository at hand.
Apache Spark
Spark is an analytics engine that provides very useful tools for large-scale data processing. It's a must-have in many areas that require manipulating and inspecting large data sets. What I like most about Spark is that it provides high-level APIs for different programming languages such as Java, Scala and Python (I am familiar with the Scala API). It also provides other tools such as Spark SQL and DataFrames. The documentation is pleasing to the eye, easy to understand and well organized.
- The README file begins with a brief summary of Apache Spark.
- After that, it describes how to build Spark and integrate it into your project for various programming languages.
- Lastly, it mentions some of the example programs included in Spark (such as the Pi example) and how to run them.
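To illustrate the kind of computation behind the Pi example, here is a minimal plain-Python sketch of a Monte Carlo estimate of pi written in a map/reduce style. It does not use Spark or any Spark API; the names `is_inside_unit_circle` and `estimate_pi` are hypothetical, and in Spark the same map and reduce steps would be distributed across a cluster rather than run in one process.

```python
import random
from functools import reduce
from operator import add


def is_inside_unit_circle(rng: random.Random) -> int:
    """Draw one random point in the square [-1, 1] x [-1, 1].

    Returns 1 if the point falls inside the unit circle, otherwise 0.
    """
    x = rng.uniform(-1.0, 1.0)
    y = rng.uniform(-1.0, 1.0)
    return 1 if x * x + y * y <= 1.0 else 0


def estimate_pi(num_samples: int, seed: int = 0) -> float:
    """Estimate pi from the fraction of random points that land in the circle."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    # "Map" step: turn every sample index into a 0/1 hit.
    hits = map(lambda _: is_inside_unit_circle(rng), range(num_samples))
    # "Reduce" step: add up the hits.
    inside = reduce(add, hits, 0)
    # The circle covers pi/4 of the square, so scale the hit ratio by 4.
    return 4.0 * inside / num_samples
```

Calling `estimate_pi(100_000)` should return a value close to 3.14; the exact digits depend on the seed and the number of samples.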
I think everyone who is interested in DB management, ML and data analytics should become familiar with Spark, which is a very powerful tool for any computer scientist or engineer. Melih Özcan
Dear ImGui
Dear ImGui is a GUI library for C++. It is an immediate mode GUI library: instead of drawing window contents with the operating system's API and waiting for interrupts, window data is expressed as vertex buffers which can be rendered using a graphics library (e.g. SDL, OpenGL), and window contents are specified and updated inside a loop. That means the entire GUI is rebuilt at each iteration of the loop. This has some performance cost, but it allows for a more dynamic GUI that is also easier to build. Furthermore, the GUI windows can be placed on top of other graphical programs that use the same graphics library (e.g. games, real-time 3D applications, fullscreen applications).
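The immediate mode idea can be sketched with a small toy model in plain Python. This is not Dear ImGui's API: `Frame`, `build_ui` and `run` are hypothetical names, rendering is replaced by a list of draw commands, and mouse input is simulated by a set of clicked labels per frame.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Everything the GUI emits during one iteration of the main loop."""
    clicked: set = field(default_factory=set)     # simulated input for this frame
    commands: list = field(default_factory=list)  # draw commands, rebuilt every frame

    def text(self, content: str) -> None:
        self.commands.append(("text", content))

    def button(self, label: str) -> bool:
        # Declaring the button and asking "was it clicked?" is a single call.
        self.commands.append(("button", label))
        return label in self.clicked


def build_ui(frame: Frame, state: dict) -> None:
    """Describe the whole GUI from scratch; called once per frame."""
    frame.text(f"Counter: {state['count']}")
    if frame.button("Increment"):
        state["count"] += 1


def run(clicks_per_frame):
    """Run the loop once per entry, where each entry lists the labels clicked in that frame."""
    state = {"count": 0}  # application state lives outside the GUI
    rendered = []
    for clicks in clicks_per_frame:
        frame = Frame(clicked=set(clicks))
        build_ui(frame, state)           # the GUI is rebuilt on every iteration
        rendered.append(frame.commands)  # a real backend would draw these
    return state, rendered
```

For instance, `run([[], ["Increment"], ["Increment"], []])` simulates four frames with clicks in the second and third. No widget objects persist between frames: only the application state does, which is what makes this style easy to build and change.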
- Easy-to-read README file which explains the project, gives many examples, and includes a FAQ
- Well documented and comprehensive Wiki section
- Good use of Issues section that contains both issues from the users and discussions among developers
DBeaver
DBeaver is a multi-platform database tool. It supports more than 80 databases. Its main goal is usability. Developers, SQL programmers, database administrators and analysts can all use DBeaver professionally, because it is a universal database management tool. I liked this repository for several reasons:
- The README file gives a simple explanation of what the application is and how to start using it.
- It is well documented in the Wiki. The topics are well organized and easy to find.
- Almost every Wiki page has visual aids that make the documentation easier to understand.
- Incomplete headings are shown in red in the Wiki, so users know that they are a work in progress.
Neural Doodle
Neural Doodle turns two-bit doodles into masterpieces using deep neural networks. The algorithm extracts annotated patches from the style image and transfers them to the target image. I liked this repository because it is very clear and because image processing is an area I am interested in.
- Setup guide is provided: Installation
- Example pictures and videos of Neural Doodle in the README file make the repository easy to understand.
- Problems that may arise and their solutions are shared: Troubleshooting
I recommend this repository to anyone with an interest in image processing.
Reference: https://github.com/alexjc/neural-doodle