Google Summer of Code 2020 Ideas
We are hoping to participate in GSoC 2020 as a sub-organization under the Python Software Foundation. You can find out more about that at http://python-gsoc.org/.
Tern analyzes container images for license information. It reports the OS that your container image is based on, the packages that were installed, their versions, and their associated licenses. It can also work with other license scanners to find this information at the file level. If you give it a Dockerfile, Tern builds the container image and analyzes it, reporting which line in the Dockerfile brought in which packages, so engineers can decide how best to write their Dockerfile to containerize their applications. It supports a number of reporting formats, including SPDX. Tern can be used during cloud native application development, as part of a CI/CD pipeline, or as a containerized service.
Tern mostly works on Docker images at this time. We cannot run our unit tests against images already available on Docker Hub, because those images may change without warning. Hence we have to create our own images to test against. The same problem applies to functional tests that target specific situations. This gives you an opportunity to get hands-on with building containers using Docker, basic docker CLI commands, and Docker container image layouts, along with Python 3.
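As a rough illustration of what a fully controlled test image could look like, the sketch below writes a build context whose Dockerfile starts `FROM scratch` and adds one `COPY` instruction (and so one layer) per fixture file. The helper names `write_build_context` and `build_image` are hypothetical and not part of Tern; the only external command assumed is the standard `docker build -t <tag> <context>`.

```python
import subprocess
from pathlib import Path


def write_build_context(context_dir, files):
    """Write fixture files and a Dockerfile with one COPY per file.

    `files` maps a flat file name (no subdirectories) to its text content.
    Starting FROM scratch means nothing in the image comes from a registry.
    """
    context = Path(context_dir)
    context.mkdir(parents=True, exist_ok=True)
    lines = ["FROM scratch"]
    for name, content in files.items():
        (context / name).write_text(content, encoding="utf-8")
        # Each COPY instruction produces its own image layer.
        lines.append(f"COPY {name} /{name}")
    dockerfile = context / "Dockerfile"
    dockerfile.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return dockerfile


def build_image(context_dir, tag):
    """Build the image with the docker CLI (needs a running Docker daemon)."""
    subprocess.run(["docker", "build", "-t", tag, str(context_dir)], check=True)
```

Because every file in such an image is checked into the test suite, the layer contents only change when the fixtures do. A real test image for Tern would likely need an actual package database inside it, which this sketch does not attempt.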
Related issues:
Tern needs to be able to get license information for packages installed with language package managers. We will aim to support the Ruby and Node package managers, with Go as a stretch goal if you want to take it on. This gives you an opportunity to improve the accuracy of Tern's default method of analysis and tackle some advanced Python coding along the way.
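To give a feel for the kind of lookup involved, here is a minimal sketch for the Node case. It assumes the packages live in a `node_modules` directory and reads the `name`, `version`, and `license` fields from each `package.json`. It is not how Tern does or will do this; the function names are hypothetical, and joining legacy `licenses` entries with `OR` is an assumption made for illustration.

```python
import json
from pathlib import Path


def normalize_license(manifest):
    """Return a license string from a parsed package.json, or None."""
    lic = manifest.get("license")
    if isinstance(lic, str):
        return lic
    # Deprecated form: {"license": {"type": "MIT", "url": "..."}}
    if isinstance(lic, dict) and lic.get("type"):
        return lic["type"]
    # Deprecated form: {"licenses": [{"type": "MIT"}, {"type": "Apache-2.0"}]}
    legacy = manifest.get("licenses")
    if isinstance(legacy, list):
        types = [e["type"] for e in legacy if isinstance(e, dict) and e.get("type")]
        if types:
            return " OR ".join(types)  # assumption: treat as alternatives
    return None


def collect_node_licenses(node_modules):
    """List name, version and license for top-level and scoped packages."""
    root = Path(node_modules)
    manifests = sorted(root.glob("*/package.json")) + sorted(root.glob("@*/*/package.json"))
    results = []
    for manifest_path in manifests:
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed manifests
        results.append({
            "name": manifest.get("name", manifest_path.parent.name),
            "version": manifest.get("version"),
            "license": normalize_license(manifest),
        })
    return results
```

In a container image the same files would have to be read from the unpacked layer filesystem rather than from the host, and nested `node_modules` directories are ignored here.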
Related issues:
The metadata for the layers in an analyzed container image is stored in a cache, which is just a YAML file. The goal is to convert this cache into a key-value store database. The first step is to pick a good key-value store library to use (which will require some architectural discussions with the maintainers and a proposal for the chosen implementation). The next step is to migrate the codebase to the new implementation. This means that all the functions used to query the database (notably those in cache.py and some functions in common.py) will need to be updated. Working on this allows you to learn some container image fundamentals and key-value stores, and to get familiar with Tern's core codebase.
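The shape of such a migration can be sketched with Python's built-in `sqlite3` module standing in for whichever key-value library is eventually chosen. The sketch assumes the YAML cache has already been loaded into a dict keyed by a layer identifier; the `LayerCache` class, its method names, and the metadata layout are hypothetical and do not reflect the actual functions in cache.py.

```python
import json
import sqlite3


class LayerCache:
    """Hypothetical key-value cache: layer identifier -> JSON metadata."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS layers "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, layer_id, metadata):
        # The connection as a context manager commits on success.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO layers (key, value) VALUES (?, ?)",
                (layer_id, json.dumps(metadata)),
            )

    def get(self, layer_id):
        row = self.conn.execute(
            "SELECT value FROM layers WHERE key = ?", (layer_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def migrate(self, old_cache):
        """Copy every entry of a dict loaded from the old YAML cache."""
        for layer_id, metadata in old_cache.items():
            self.put(layer_id, metadata)
```

Keeping callers behind a small `get`/`put` interface like this would let the backing store be swapped later without touching the query sites again, which is one reason to settle the interface before choosing the library.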
Related issues:
There are other open issues available to work on. Some of these issues are proposals from the community for which there is no roadmap to implementation. If you're up for a challenge you can pick those issues. If you have an idea to propose (one such proposal is a UI to read the data Tern produces), go ahead and submit it as an issue to discuss!
Development requires a Linux distro (we typically use Ubuntu) with Docker installed. Once you are ready, you can follow the README and see if you can run Tern on a Docker image of your choice. You can find images to play with on hub.docker.com.
The maintainers and contributors are on a Slack channel. Follow these instructions to get access. Once you are in, join the channel #tern. We're looking forward to seeing you!
Discussions about specific issues happen on GitHub issues. You can also file an issue to ask a question or propose an idea you would like to work on.
Instructions on how to apply are found on the Python GSoC website. Our sub-org name is "tern". Don't forget to include it in your application title!