Another month has passed, so here we are with a new roundup!
In early May, we released version 0.21. See the release announcement for details.
We would like to thank everyone who tested 0.21 and created feature requests and bug tickets. Fixing the bugs and implementing the feature requests will keep us busy for another couple of weeks.
Apart from the topics discussed below, we have continued our major code cleanup and refactoring. We don’t want to bore you with the technical stuff; those interested in the deep details can join the discussions in our #dev channel.
Here’s what happened in May!
Apache Beam Support
We now have integrated support for Apache Beam. Beam is an advanced unified programming model that lets you implement batch and streaming data processing jobs once and run them on any execution engine Beam supports.
The Beam integration comes with a number of additional transforms:
BigQuery input and output: read from and write to Google BigQuery tables
GCP PubSub subscribe and publish: read from and write to Google Cloud Pub/Sub
Kafka Consume/Produce: read from and write to Kafka topics
Beam Input/Output: define where Beam should read files from or write files to
Beam Timestamp: add timestamps to a bounded data source
Beam Window: create a Beam window
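To make the Timestamp and Window transforms a bit more concrete, here is a minimal conceptual sketch in plain Python. It does not use Hop or the Beam SDK; the function names `window_start` and `assign_fixed_windows` and the sample events are made up for illustration. It only shows the general idea: once every row carries a timestamp, rows can be grouped into fixed-size windows.

```python
from collections import defaultdict


def window_start(timestamp_s: int, size_s: int) -> int:
    """Return the start of the fixed window that contains the timestamp."""
    return timestamp_s - (timestamp_s % size_s)


def assign_fixed_windows(events, size_s):
    """Group (timestamp_s, value) pairs by window [start, start + size_s)."""
    windows = defaultdict(list)
    for timestamp_s, value in events:
        windows[window_start(timestamp_s, size_s)].append(value)
    return dict(windows)


# Hypothetical bounded input: each row already has a timestamp in seconds.
events = [(0, "a"), (59, "b"), (60, "c"), (125, "d")]
windows = assign_fixed_windows(events, 60)

for start in sorted(windows):
    print(f"[{start}, {start + 60}) -> {windows[start]}")
```

With 60-second windows, the first two rows land in the same window and the other two each get their own. Real Beam windowing offers more options (sliding and session windows, for example), which this sketch leaves out.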
The Beam integration adds four pipeline run configurations:
Beam DataFlow pipeline engine: run pipelines on Google Cloud Dataflow
Beam Direct pipeline engine: a local pipeline engine provided by the Apache Beam community as a way of testing pipelines
Beam Spark pipeline engine: run pipelines on Apache Spark
Beam Flink pipeline engine: run pipelines on Apache Flink
The support for these 4 additional engines brings us closer to the "design once, run anywhere" goal we share with Apache Beam. With Hop’s native local and remote pipeline run configurations, we now have 6 supported engines to run your pipelines on.
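As a rough illustration of what "design once, run anywhere" means, the toy sketch below keeps one pipeline definition and hands it to two interchangeable runners. This is plain Python, not Hop or Beam code; `pipeline`, `run_local`, `run_threaded` and `RUN_CONFIGURATIONS` are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def pipeline(row: str) -> str:
    """The pipeline logic, defined once: trim and upper-case a row."""
    return row.strip().upper()


def run_local(transform, rows):
    """Run the transform sequentially in the current thread."""
    return [transform(row) for row in rows]


def run_threaded(transform, rows, workers=4):
    """Run the same transform on a thread pool; map() keeps input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, rows))


# Hypothetical "run configurations": a name mapped to an execution strategy.
RUN_CONFIGURATIONS = {"local": run_local, "threaded": run_threaded}

rows = [" hop ", "beam", " spark"]
results = {name: run(pipeline, rows) for name, run in RUN_CONFIGURATIONS.items()}

for name, output in results.items():
    print(name, output)
```

The pipeline itself never changes; only the run configuration you pick decides how and where it executes.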
With all of the other work that needed to be done, plugin porting continued but at a slower pace than before.
Database plugins: all done
Workflow actions: all done
Transform actions: 100 plugins done (many plugins contain multiple transforms), 37 to go
This is the first time we have a "Community Input" section in this monthly overview.
The projects we discuss below are at different maturity stages, but are a clear indication of the growth and activity we’re seeing in the community. Thanks guys, keep up the good work! We’ll continue to work with the community to move these projects forward and keep you updated on progress.
Although not directly related to a project, we’d like to give a shout-out to Nicolas Adment. Nicolas has been all over the place in everything Hop-related since early this year. A big thank you, Nicolas!
We started working to bring a web UI to Hop. There is still some work to do, but the WebHop container works, starts in a couple of seconds, and the project is becoming usable.
A big thank you to Hiromu Hota for working with Matt on this!
Check out this project on GitHub.
The goal of the hop-docker project is to allow Hop to run in both short and long-lived containers.
Check this project out:
GitHub repository: https://github.com/project-hop/hop-docker
Hop UIT - Ultimate Migration Tool
Uwe Geercken picked up the task to build a migration tool to import Kettle/PDI jobs and transformations to Hop workflows and pipelines.
We’ll work with Uwe to make hop-uit fully functional so you can start migrating your Kettle/PDI projects to Hop soon. In the meantime, don’t hesitate to give hop-uit a try and file any bugs you find or feature requests you have in JIRA.
Check the project out on GitHub
All actions and transforms, even the ones that haven’t been ported to Hop yet, are now documented.
Since we’ll start to add a lot of documentation in the coming weeks, we integrated a search engine into the Hop website. Although we’ll work on easy-to-use navigation and an overall documentation structure, sometimes search is just the easiest way of finding what you’re looking for.
If you are willing to help write documentation, contact us and we’ll be happy to get you started.
While we’re preparing for a 0.30 release, we’ll continue our code cleanup, bug hunting and documentation efforts!
Next up are a configuration system to change and manage options, porting the final transforms, and integrating VFS in Hop.