The Network Engineering department provides in-depth analysis of worldwide telecommunications networks. As a Data Engineer, you will join a team responsible for designing, building, and maintaining datasets and data pipelines. If you're keen on working with big datasets, using modern technologies to collect and integrate data from various sources, and uncovering non-obvious insights from the collected data, join us — even if you are just beginning your data journey!
We work with:
Python and/or Scala for data processing
Large datasets stored in various databases (blob: HDFS+Parquet, S3; PostgreSQL; MongoDB)
Airflow for data pipeline orchestration
Kubernetes / Docker for containerization
Selected BI tools for data visualization (Tableau, Power BI, Superset)
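To give a feel for the blob stores listed above (HDFS + Parquet, S3): partitioned datasets are typically laid out in Hive-style directory structures, which lets engines skip irrelevant partitions entirely. A minimal pure-Python sketch of that idea — the bucket, table name `events`, and partition columns here are hypothetical:

```python
def partition_path(root: str, table: str, dt: str, country: str) -> str:
    """Build a Hive-style partition path, e.g. .../events/dt=2024-01-01/country=PL."""
    return f"{root}/{table}/dt={dt}/country={country}"

def prune(paths: list[str], dt: str) -> list[str]:
    """Keep only the partitions matching the requested date -- the idea
    behind partition pruning: unmatched directories are never read."""
    return [p for p in paths if f"dt={dt}" in p]

paths = [
    partition_path("s3://bucket", "events", "2024-01-01", "PL"),
    partition_path("s3://bucket", "events", "2024-01-02", "PL"),
]
print(prune(paths, "2024-01-01"))
# ['s3://bucket/events/dt=2024-01-01/country=PL']
```

In practice Spark and Airflow-driven jobs read and write such layouts for you; the sketch only shows why a good partitioning scheme keeps queries over large datasets cheap.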
What we expect:
good Python knowledge (even better if combined with data libraries: pandas, NumPy)
knowledge of Scala is a plus; an interest in learning Scala is key!
familiarity with Apache Spark and column-oriented data formats (Parquet / ORC)
familiarity with partitioning, indexing, and retention strategies
good knowledge of SQL in the context of working with tables larger than 1 million rows
ability to work with HDFS and S3 storage
nice to have: Azure in the context of data processing (Databricks, ADLS, Data Factory, Synapse)
Master’s or Bachelor’s degree in Computer Science, Software Technology or equivalent education