There was a time when it was common practice to manually manage and track versions of software code across developers. It was meticulous, cumbersome, error-prone and, frankly, an unpleasant experience for any developer. The speed at which businesses evolve makes it a significant challenge to release and manage code at an equivalent velocity. Version control systems, while not perfect, overcome many of these issues. Most let you roll back to previous changes, merge new features of a product and centralize committed source code, and they open the door to continuous deployment. Git is a popular option in this space because of its strong presence in open source projects, its branch-and-merge capabilities and its distributed model.

Can the same benefits be applied to the data domain, particularly to extract, transform and load (ETL) routines, which are typically written as code and change rapidly to keep up with business requirements? The answer is YES!

ETL Version Control

In an active analytics pipeline, ETL development code changes rapidly throughout its lifecycle. This code is analogous to the crown jewels of software development. ETL vendors have recently acknowledged this by releasing source control plugins for external tools or by developing in-house version control systems. Tools such as Talend, DataStage and SSIS support version control to varying degrees. Recently, we worked on a client project using the Microsoft stack and used SSIS with GitHub for source control.