ETL ESB - sgml/signature GitHub Wiki

History

| Decade | Equivalent to Airbyte | Why it fits | Status | Open-Source Alternatives at the Time |
| --- | --- | --- | --- | --- |
| 1960s | Custom COBOL/Assembler batch jobs | Early data integration was hand-coded batch processes moving records between mainframes and tape. | Inactive (legacy only, no active development) | None (integration was bespoke, proprietary mainframe code) |
| 1970s | IBM Information Management System (IMS) and Customer Information Control System (CICS) batch utilities | Enterprises used IBM's IMS databases and CICS transaction systems with utilities to extract and load data. | Partially active (IMS and CICS still exist, but the batch ETL utilities are legacy) | None (open source had not yet reached enterprise ETL) |
| 1980s | SAS Data Integration / early ETL utilities | SAS and similar tools offered reusable scripts for extraction and transformation. | Active (SAS still maintained, though niche) | Early Unix shell scripting, awk/sed pipelines (community-driven but not formal ETL tools) |
| 1990s | Informatica PowerCenter (1993), IBM DataStage (1997) | Commercial ETL platforms matured, providing graphical interfaces and connectors. | Active (still maintained, enterprise use) | Perl, Bash, and awk scripts used for DIY ETL; no formal open-source ETL platforms yet |
| 2000s | Talend Open Studio (2006), Pentaho Kettle (2001) | Open-source ETL tools appeared, democratizing integration. | Active (Talend and Pentaho still maintained, though less dominant) | Talend, Pentaho Kettle (a.k.a. PDI), CloverETL (community edition) |
| 2010s | Apache NiFi (2014), Singer.io (2017), Fivetran (2012) | Cloud-native and open-source pipelines emerged; Singer's tap/target spec resembles Airbyte's connectors. | Active (NiFi, Fivetran, and Singer maintained, though Singer is less active) | Apache NiFi, Singer.io, Apache Sqoop, Luigi, Airflow (for orchestration) |

Non-HTTP Transports

| Alternative Submission Method | Transport Mechanism | Example Free Server (Pi 1 default-ready) | GitHub URL | License |
| --- | --- | --- | --- | --- |
| FTP uploads | File Transfer Protocol | vsftpd | https://github.com/vsftpd/vsftpd | GPL-2.0 |
| SFTP uploads | SSH File Transfer Protocol | Dropbear (SSH server; SFTP needs a separate sftp-server binary) | https://github.com/mkj/dropbear | MIT-style |
| Raw TCP sockets | Custom socket protocols | Netcat (listen mode) | https://github.com/openbsd/src/tree/master/usr.bin/nc | BSD |
| UDP datagrams | Lightweight transport | BIND (DNS server) | https://github.com/isc-projects/bind9 | MPL-2.0 |
| Bluetooth data transfer | Short-range wireless | BlueZ stack | https://github.com/bluez/bluez | GPL-2.0 |
| Email submission | SMTP transport | Exim | https://github.com/Exim/exim | GPL-2.0 |
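
As an illustration of the raw TCP row, the sketch below submits a payload over a plain socket using only Python's standard library. The function names (`submit_over_tcp`, `receive_once`, `demo_roundtrip`) and the loopback listener are hypothetical and exist only for this example; in practice a listener such as Netcat in listen mode would play the receiving role.

```python
import socket
import threading


def receive_once(server_sock: socket.socket, sink: list) -> None:
    """Accept one connection and collect everything the client sends."""
    conn, _addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # an empty read means the client closed its side
                break
            chunks.append(chunk)
    sink.append(b"".join(chunks))


def submit_over_tcp(host: str, port: int, payload: bytes) -> None:
    """Send one payload over a plain TCP connection, then close it."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


def demo_roundtrip(payload: bytes) -> bytes:
    """Start a loopback listener, submit a payload, return what arrived."""
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        listener = threading.Thread(target=receive_once, args=(server, received))
        listener.start()
        submit_over_tcp("127.0.0.1", port, payload)
        listener.join(timeout=5)
    return received[0]
```

Calling `demo_roundtrip(b"id=1,name=test\n")` should return the same bytes. Closing the connection is what marks the end of the submission here; a real protocol would more likely use a length prefix or a delimiter.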

EDI

                 [ Open Source EDI Projects (Non-JS) ]
                               |
   -------------------------------------------------------------------
   |                           |                                    |
[ Python ]                 [ Java ]                             [ Python ]
   |                           |                                    |
   |                           |                                    |
[ pyx12 ]                 [ Smooks ]                           [ bots-edi ]
 "Parses/validates X12     "Transforms X12 into XML/JSON        "EDI translator with
  healthcare sets"          for healthcare flows"               mapping & routing"
   |                           |                                    |
   |                           |                                    |
   -----------------------------+------------------------------------
                               |
                               V
                 [ Healthcare Transaction Sets ]
   -------------------------------------------------------------------
   |                           |                                    |
[ 837 Claims ]            [ 835 Remittance ]                  [ 834 Enrollment ]
 "Patient billing &        "Electronic remittance advice       "Benefit enrollment
  insurance claims"         for payments"                      and member updates"
   |                           |                                    |
   |                           |                                    |
   -----------------------------+------------------------------------
                               |
                               V
                 [ Shared Capabilities Across Projects ]
   -------------------------------------------------------------------
   | - Parse X12 segments (ISA, GS, ST...)                         |
   | - Validate compliance (997/999 acknowledgments)               |
   | - Map EDI → internal models (JSON, CSV, DB)                   |
   | - Support batch transport (SFTP, AS2, TCP/IP, WebDAV)         |
   | - Transform data (XML, XSLT)                                  |
   -------------------------------------------------------------------
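
The first and third shared capabilities (parsing segments and mapping them to an internal model) can be sketched in a few lines of Python. This is a minimal illustration, not how pyx12, Smooks, or bots-edi work internally. It assumes `*` as the element separator and `~` as the segment terminator, whereas a real interchange declares its delimiters in the ISA header. The sample data and the function names are made up for the example.

```python
from typing import Dict, List

# Made-up, abbreviated sample: one functional group wrapping one 837 set.
SAMPLE = (
    "GS*HC*SENDER*RECEIVER*20240101*1200*1*X*005010X222A1~\n"
    "ST*837*0001~\n"
    "BHT*0019*00*REF123*20240101*1200*CH~\n"
    "SE*3*0001~\n"
    "GE*1*1~\n"
)


def split_segments(raw: str, element_sep: str = "*",
                   segment_term: str = "~") -> List[List[str]]:
    """Split raw X12 text into segments, each a list of elements."""
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip()  # tolerate newlines after each terminator
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments


def transaction_set_ids(segments: List[List[str]]) -> List[str]:
    """Return the transaction set code (e.g. 837) from every ST segment."""
    return [seg[1] for seg in segments if seg[0] == "ST" and len(seg) > 1]


def to_records(segments: List[List[str]]) -> List[Dict[str, object]]:
    """Map segments to plain dicts that serialize easily to JSON or CSV."""
    return [{"id": seg[0], "elements": seg[1:]} for seg in segments]
```

With the sample above, `transaction_set_ids(split_segments(SAMPLE))` yields `["837"]`. Validation (997/999 acknowledgments) and loop-aware parsing are out of scope for this sketch.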

Comparison of Open Source Service Bus Implementations

| Feature | Cadence | Titanoboa |
| --- | --- | --- |
| Language | Go | Clojure (JVM-based) |
| Workflow Model | Event-driven workflow execution | Low-code workflow orchestration |
| Database Support | MySQL, PostgreSQL, Cassandra | Any relational DB via JDBC |
| Security Mechanisms | TLS encryption, authentication via IAM | User authentication, token security, role-based access |
| Scalability | Highly scalable via microservices | Modular, scales based on workflow complexity |
| Fault Tolerance | Durable execution with automatic retries | Workflow recovery and rollback mechanisms |
| State Management | Built-in state persistence | Supports external state persistence |
| Memory Requirements | Lightweight (~MBs) | Scales dynamically (~MBs-GBs) |
| Use Case | Distributed workflow engine for async tasks | Low-code service bus for orchestrating workflows |
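
The fault tolerance and state management rows describe one general idea: record each completed step so a failed run can retry and resume instead of starting over. The sketch below illustrates that idea in plain Python. It does not use the Cadence or Titanoboa APIs; `run_workflow`, its parameters, and the in-memory `state_store` dict (a stand-in for a database) are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[dict], dict]]


def run_workflow(steps: List[Step], state_store: Dict[str, dict], run_id: str,
                 max_attempts: int = 3, backoff_seconds: float = 0.0) -> dict:
    """Run steps in order, retrying failures and skipping finished steps."""
    saved = state_store.setdefault(run_id, {"done": [], "data": {}})
    for name, func in steps:
        if name in saved["done"]:
            continue  # finished in an earlier run, so skip it on resume
        for attempt in range(1, max_attempts + 1):
            try:
                # Pass a copy so a failed attempt cannot corrupt saved data.
                saved["data"] = func(dict(saved["data"]))
                saved["done"].append(name)  # checkpoint the completed step
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up; the checkpoint allows a later resume
                time.sleep(backoff_seconds * attempt)  # linear backoff
    return saved["data"]
```

Calling `run_workflow` again with the same `run_id` and store skips every step already listed under `done`, which is the resume behavior a durable engine provides through persistent storage.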

References

https://www.freecodecamp.org/news/sqlalchemy-makes-etl-magically-easy-ab2bd0df928/

https://dev.to/zchtodd/sqlalchemy-performance-anti-patterns-and-their-fixes-4bmm

https://stackoverflow.com/questions/19334604/creating-seed-data-in-a-flask-migrate-or-alembic-migration

https://news.ycombinator.com/item?id=19098246

https://hakibenita.com/fast-load-data-python-postgresql

https://docs.konghq.com/hub/kong-inc/openid-connect/support/

https://www.ibm.com/docs/en/datapower-gateway/10.5.x?topic=gateway-programming-model-gatewayscript