Development Profiles (IT)
You will join a multicultural company where you can put your skills into practice.
At Data Performance team, we are building Intel Data Store (IntelDS), a global Data Lake for enterprise data. It is a Big Data platform fully hosted on AWS and connected today to more than 40 data sources.
The job purpose is to support the big data engineering team building and improving IntelDS by:
• Connecting new sources to enrich the data scope of the platform.
• Designing and developing new features, based on consumer application requests, to ingest data into the different layers of IntelDS.
• Automating the integration and delivery of data objects and data pipelines.
The duties and responsibilities of this job are to prepare data and make it available in an efficient and optimized format for our different data consumers, ranging from BI and analytics to data science applications. It requires working with the technologies currently used by IntelDS, in particular Apache Spark, Presto, and Redshift in an AWS environment. This includes:
• You will design and develop new data ingestion patterns into the IntelDS Raw and/or Unified data layers, based on the requirements for connecting new data sources or building new data objects. Working with ingestion patterns allows the data pipelines to be automated.
• You will participate in and apply DevSecOps practices by automating the integration and delivery of data pipelines in a cloud environment. This can include the design and implementation of end-to-end data integration tests and/or CI/CD pipelines.
• You will analyse existing data models, and identify and implement performance optimizations for data ingestion and data consumption. The objective is to accelerate data availability within the platform and to consumer applications.
• You will support client applications in connecting and consuming data from the platform, and ensure they follow our guidelines and best practices.
• You will participate in monitoring the platform and debugging detected issues and bugs.
You are the person we are looking for if you:
• You have a minimum of 1-2 years' prior experience as a data engineer, with proven experience in Big Data and Data Lakes in a cloud environment.
• You hold a Bachelor's or Master's degree in computer science or applied mathematics (or equivalent).
• You have proven experience working with data pipelines / ETL / BI regardless of the technology.
• You have proven experience working with AWS, including at least 4 of: Redshift, S3, EMR, CloudFormation, DynamoDB, RDS, Lambda.
• You have experience with Big Data technologies and distributed systems: at least one of Spark, Presto, or Hive.
• You have experience with the Python language, both scripting and object-oriented programming.
• You are an expert in SQL for data warehousing (Redshift in particular is a plus).
• You are familiar with Git and Linux; experience with CI/CD pipelines is a plus.
• You have a strong systems/process orientation, with demonstrated analytical thinking, organizational skills, and problem-solving skills.
• You are able to self-manage, prioritize and execute tasks in a demanding environment.
• You have a strong consultancy orientation and experience; the ability to form collaborative, productive working relationships across diverse teams and cultures is a must.
• You are willing and able to train and teach others.
• You are able to facilitate meetings and follow up with resulting action items.
What's in it for me?
• Permanent contract
• Competitive salary according to experience
• Training plan and access to our training platform, where you can develop your professional and personal skills
• "SEvoluciona" policies: work-life balance, flexitime
• Flexible compensation plan: restaurant tickets, health and life insurance, etc.
• Career path opportunities within a multinational company
• And more benefits depending on the site!