GSoC 2026 - s3tools/s3cmd GitHub Wiki

Welcome to the S3cmd Google Summer of Code 2026 projects page.


Contributor's guide

We are quite open and don't require many formalities for you to apply for a GSoC project with us.
Below you will find more information to help you determine whether we could be a good fit for each other.

What is required

  • A good knowledge of Python
  • A basic knowledge of Git and GitHub
  • An understanding of what an API is and how to interact with a server
  • Being comfortable using command line tools
  • Curiosity

Some experience dealing with various versions of Python running on multiple operating systems (Linux, Mac, Windows) would be a great plus.
Previous experience with S3cmd, S3, Object Storage, or cloud services is NOT REQUIRED to apply but would be appreciated.
The subject is usually fun and easy to pick up, even when you are new to it.

Note:

Although S3cmd is a client for "Object Storage" services, you can expect to develop and test it at little or no cost:

  • s3cmd is written entirely in Python, has a very limited number of basic dependencies, and doesn't need compilation, so very few computing resources are needed.
  • Small open source, S3-compatible servers can easily be run locally (e.g., Minio).
  • Cloud object storage services usually offer very comfortable "free tiers".

Apply for a project

You can find a list of suggested project ideas in the "Idea List" section further down this page, but we also encourage candidates to come up with their own project idea.

How to apply:

  1. Try to understand the project and, ideally, give s3cmd a try.
  2. Read the GSoC timeline and the contributor responsibilities to confirm your eligibility.
  3. (Recommended) Open a new issue here with the "[GSoC2026]" tag in the title to introduce yourself:
    • Who are you?
    • What is your background?
    • In which country are you located? Which time zone?
    • What is your motivation to become a contributor for the S3cmd organization?
    • Which projects are you interested in and why?
    • What is your projected availability during the program to complete the project?
  4. Submit your application to the Google system before the deadline on March 31 (18:00 UTC). All applications must go through Google's application system; we can't accept any application unless it is submitted there.

Feel free to send an email to florent AT sodria.com if you want to talk privately, ask questions, or discuss a possible application.


Idea List

Project 1 - Create a new cache feature backed by an embedded database

  • Desirable skills: Python, DB, SQLite, S3 API
  • Estimated duration: 350 hours
  • Difficulty: hard
  • Mentor: @fviard

To be able to synchronize local and remote files, we have to compare the file "hash" from both sides.
This requires us to do an expensive "recalculation" of local files "hashes" at each run.
Performance can be improved a lot by using a cache of local files "hashes" to avoid this recalculation.

s3cmd already has a "cache" feature, but its implementation is very inefficient.
It is a single raw text file based on a "pickle" serialization of the in-memory file list.
We could considerably improve performance, reliability and memory usage by developing a brand new cache logic that uses an embedded database (SQLite, MDB, LMDB, ...) to store the information.
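As a rough illustration of the idea, here is a minimal sketch of a hash cache backed by SQLite from the Python standard library. The table layout and the function names (`open_cache`, `cached_md5`) are assumptions made for this example, not existing s3cmd code. A cached hash is reused only while the file size and modification time are unchanged.

```python
import hashlib
import os
import sqlite3


def open_cache(db_path):
    """Open (or create) the cache database. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_hash ("
        " path TEXT PRIMARY KEY,"
        " size INTEGER NOT NULL,"
        " mtime_ns INTEGER NOT NULL,"
        " md5 TEXT NOT NULL)"
    )
    return conn


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large files are never fully loaded."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_md5(conn, path):
    """Return (md5, cache_hit) for a local file."""
    stat = os.stat(path)
    row = conn.execute(
        "SELECT md5 FROM file_hash"
        " WHERE path = ? AND size = ? AND mtime_ns = ?",
        (path, stat.st_size, stat.st_mtime_ns),
    ).fetchone()
    if row is not None:
        return row[0], True
    md5 = md5_of_file(path)
    with conn:  # commits the insert, or rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO file_hash (path, size, mtime_ns, md5)"
            " VALUES (?, ?, ?, ?)",
            (path, stat.st_size, stat.st_mtime_ns, md5),
        )
    return md5, False
```

Because each lookup is a single indexed query, the whole file list never has to be loaded into memory, which is the main difference from a pickle-based file.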

In addition, there is a limitation of the S3 protocol regarding big files (i.e., multipart files) that prevents us from retrieving the "hash" of the remote side for such a file.
If the new cache system could record some info about the remote side, the performance could be boosted even more.

Project 2 - Add a server "profile" option to adapt the behavior to the specifics of a given target server/service type

  • Desirable skills: Python, API, CLI, S3 API
  • Estimated duration: 350 hours
  • Difficulty: medium
  • Mentor: @fviard

Originally, s3cmd was developed to only interact with the Amazon AWS S3 service.
Little by little, many other cloud services appeared that offered an S3-compatible interface.
At the same time, many OSS and proprietary software stacks were created to self-host S3-compatible servers.

Sadly, so far, s3cmd has remained a "one size fits all" client for S3 services, using the lowest common denominator in terms of API usage for all servers and services.
For example, we expect all services to use "MD5" for "file hash" calculation.
Likewise, we may not take advantage of more interesting APIs or API versions provided by some services, because they are not widely available.

The purpose of this project is to offer users a way to select a preset "profile" for the service they are using.
Each profile will have a predetermined set of dynamic behavior settings, similar to "feature flags".

The main goal of this project is to create the profile option and the general logic.
Secondary goals are to create some dynamic behaviors using these profiles, and to create profiles for the most common server/service types (e.g., "aws", "gcs", "digitalocean", "scaleway", "ibmcos", "minio", "radosgw", ...).
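To make the "feature flags" idea more concrete, here is a minimal sketch of what a profile registry could look like. Everything in it is an assumption for illustration: the `ServerProfile` class, the flag names, and the two placeholder profiles are hypothetical, and the flag values say nothing about any real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerProfile:
    """Feature flags for one server type. All flag names are hypothetical."""
    name: str
    etag_is_md5: bool = True
    use_list_objects_v2: bool = False


# Placeholder entries; the values are NOT statements about any real service.
PROFILES = {
    "generic": ServerProfile(name="generic"),
    "example-service": ServerProfile(
        name="example-service", etag_is_md5=False, use_list_objects_v2=True
    ),
}


def get_profile(name="generic"):
    """Look up a profile by name, with a readable error for unknown names."""
    try:
        return PROFILES[name]
    except KeyError:
        known = ", ".join(sorted(PROFILES))
        raise ValueError(
            f"unknown profile {name!r}; known profiles: {known}"
        ) from None


def choose_list_call(profile):
    """Example of code branching on a flag rather than on a service name."""
    return "ListObjectsV2" if profile.use_list_objects_v2 else "ListObjects"
```

Keeping the branching on flags rather than on service names means a new profile can be added as data, without touching the code that consumes the flags.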

Project 3 - Add shell scripts for command line auto-completion

  • Desirable skills: Shell, Bash, ZSH, Python
  • Estimated duration: 175 hours
  • Difficulty: medium
  • Mentor: @fviard

It would be nice to have proper shell scripts that provide command line auto-completion for s3cmd.
They should auto-complete commands but also be able to retrieve "remote" path suggestions when possible. This project would probably require more "shell" skills than "Python" skills.

We would like to have an auto-completion script at least for Bash and ZSH,
but it would be OK if a contributor wants to do a smaller project by supporting only a single shell type (Bash or ZSH).

Related: https://github.com/s3tools/s3cmd/issues/985, https://github.com/s3tools/s3cmd/issues/1092
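The completion scripts themselves would be written in shell, but the candidate-selection logic can be sketched in Python. In the sketch below, the `complete` function and the `list_remote` callback are hypothetical names introduced for this example, and `COMMANDS` is only a small subset of the s3cmd commands. A shell completion function could pass the words typed so far and print the returned candidates.

```python
# A small subset of s3cmd commands, for illustration only.
COMMANDS = ("del", "get", "ls", "mb", "put", "rb", "sync")


def complete(words, list_remote):
    """Return completion candidates.

    words: the words typed after "s3cmd"; the last one is being completed.
    list_remote: hypothetical callback returning remote paths for a prefix.
    """
    current = words[-1] if words else ""
    if len(words) <= 1:
        # Still on the first word: complete the command name.
        return [cmd for cmd in COMMANDS if cmd.startswith(current)]
    if current.startswith("s3://"):
        # Later words that look like remote URIs: ask the server.
        return [path for path in list_remote(current) if path.startswith(current)]
    # Anything else (local paths) is left to the shell's default completion.
    return []
```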

Project 4 - Modernize s3cmd with a Python 3 refactor and an extra asyncio interface

  • Desirable skills: Python, API, CLI, S3 API
  • Estimated duration: 350 hours
  • Difficulty: medium
  • Mentor: @fviard

Over time, the codebase has accumulated legacy patterns (including historical Python 2 compatibility and older internal abstractions), which makes maintenance and evolving the tool harder than it needs to be.

The purpose of this project is to produce a “new generation” of s3cmd that is Python 3–only and easier to extend, while preserving the UX people rely on. In addition, the project will introduce a first-class asynchronous (asyncio) interface for library usage, alongside the existing synchronous behavior, so that s3cmd can be embedded cleanly into modern Python apps and automation.

Key goals:

  • Drop Python 2 compatibility code paths and modernize Python style (typing where beneficial, clearer modules, simpler flow).
  • Refactor internals toward a cleaner separation between:
    • CLI argument parsing / output formatting
    • core “operations” (S3 actions like put/get/sync/ls/etc.)
    • S3 transport/client layer (S3 API calls, retries, pagination, error mapping)
  • Add an asyncio-based API (e.g., AsyncS3Client / async operations) that mirrors the synchronous API where possible.
  • Keep the CLI stable (or introduce changes only behind clearly documented flags), with a strong focus on avoiding regressions.
  • Improve automated tests and CI coverage around refactored components to make future contributions safer.

The hardest part of this project is not the mechanical Python 3 migration itself, but designing the new internal structure and async interface in a way that stays understandable, doesn’t duplicate code unnecessarily, and doesn’t break existing workflows.
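One possible way to mirror a synchronous API without duplicating its logic is a thin wrapper that runs each blocking call in a worker thread. The sketch below assumes Python 3.9+ (for `asyncio.to_thread`). `S3Client` here is an in-memory stand-in written for this example, not the real s3cmd client, and a native async transport could be a better design than this wrapper.

```python
import asyncio


class S3Client:
    """In-memory stand-in for a synchronous client (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def ls(self, prefix=""):
        return sorted(key for key in self._objects if key.startswith(prefix))


class AsyncS3Client:
    """Async facade: same method names, each call runs in a worker thread."""

    def __init__(self, sync_client):
        self._sync = sync_client

    async def put(self, key, data):
        await asyncio.to_thread(self._sync.put, key, data)

    async def get(self, key):
        return await asyncio.to_thread(self._sync.get, key)

    async def ls(self, prefix=""):
        return await asyncio.to_thread(self._sync.ls, prefix)


async def demo():
    client = AsyncS3Client(S3Client())
    # The two uploads are scheduled concurrently.
    await asyncio.gather(client.put("a/1", b"x"), client.put("a/2", b"y"))
    return await client.ls("a/")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off is that the synchronous client must be safe to call from several threads at once, which is part of the design work this project would have to address.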

Project 5 - Rework the command line parser to be more user friendly

  • Desirable skills: Python, Shell
  • Estimated duration: 175 hours
  • Difficulty: medium
  • Mentor: @fviard

s3cmd supports a huge number of commands and flags from the command line.
After adding so many features, the help is really crowded and it might be hard for a new user to understand how to use a command, or which flag might be relevant. The purpose of this project would be to rework the parser, re-organise commands, and maybe group them, in order to be able to provide a proper "help" per command that will not be crowded with useless flags.

Related: https://github.com/s3tools/s3cmd/issues/1035
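As an illustration of per-command help, here is a minimal sketch using `argparse` sub-parsers from the standard library. The program name `s3cmd-sketch`, the choice of commands, and the way flags are assigned to each command are assumptions for this example, not a proposal for the final layout.

```python
import argparse


def build_parser():
    """Return (parser, commands); each command has its own flags and help."""
    parser = argparse.ArgumentParser(
        prog="s3cmd-sketch", description="Per-command help sketch."
    )
    sub = parser.add_subparsers(dest="command", required=True, metavar="COMMAND")

    ls = sub.add_parser("ls", help="List buckets or objects")
    ls.add_argument("uri", nargs="?", default=None, help="s3://BUCKET[/PREFIX]")
    ls.add_argument("--recursive", action="store_true", help="List recursively")

    put = sub.add_parser("put", help="Upload files")
    put.add_argument("files", nargs="+", help="Local files to upload")
    put.add_argument("uri", help="Destination s3://BUCKET[/PREFIX]")
    put.add_argument("--acl-public", action="store_true", help="Make objects public")

    return parser, {"ls": ls, "put": put}


if __name__ == "__main__":
    parser, _ = build_parser()
    print(parser.parse_args())
```

With this structure, `s3cmd-sketch put --help` would show only the flags registered on the `put` sub-parser, while the top-level help stays a short list of commands.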

Other ideas

Additionally, you can look at the open issues labeled "feature-request" to find alternative project ideas: https://github.com/s3tools/s3cmd/issues?q=is%3Aopen+is%3Aissue+label%3Afeature-request