Data Pipeline Design and Development: Design, develop, and maintain robust and scalable data pipelines to ingest, transform, and serve heterogeneous data, both structured and unstructured (such as satellite images), from various actors in the space sector. This includes managing real-time and batch data flows, as well as integrating various data sources.
Data Management and Data Ops: Implement Data Ops practices to ensure the quality, reliability, and availability of data throughout its lifecycle. This includes metadata management, data governance, data quality, version management, performance monitoring, and resolving data pipeline issues. To support this, you will ensure that our data observability tool is properly implemented and used.
Cloud Data: Use services and tools from the Google Cloud Platform (GCP) to design and implement efficient and scalable data solutions. Leverage tools and services such as Airflow and Bigtable to meet storage, processing, and data provisioning needs. You can manage your own CI/CD pipelines on GitLab.
Data Architecture and Data Mesh: You will implement Medallion and Data Mesh architectures and help roll out these approaches across our data infrastructure. You will work closely with the Ops team to design and implement a scalable, decentralized, domain-oriented architecture built around data products.
Collaboration and Knowledge Sharing: Collaborate with business teams, data scientists, and other members of the Data team to understand data needs, propose suitable solutions, and share knowledge about best practices in data management and development.