CANA Data Engineer
We’re looking for an experienced data pipeline builder and wrangler who enjoys optimizing data systems and building them from the ground up. The CANA Data Engineer will build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of large, disconnected data sources.
The CANA Data Engineer will build processes that support data transformation, data structures, metadata, dependency, and workload management, and will identify, design, and implement internal process improvements.
The position requires a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field. Specific skills include knowledge of and experience with tools such as: big data tools (Hadoop, Spark, Kafka); relational SQL and NoSQL databases (Postgres, Cassandra); data pipeline and workflow management tools (Azkaban, Luigi, Airflow); AWS cloud services (EC2, EMR, RDS, Redshift); stream-processing systems (Storm, Spark Streaming); object-oriented and functional programming languages (Python, Java, C++, Scala); and message queuing systems. If you are interested in learning more about the position, connect with me!
CANA Resource Lead