As a Big Data/PySpark Engineer at Avanade, you will have a deep understanding of the architecture, performance characteristics and limitations of modern storage and computational frameworks, with experience implementing solutions that leverage HDFS/Hive, Spark/MLlib, Kafka and similar tools. You will have knowledge of Apache Spark and/or Python programming, and deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment and loading the results into target data destinations.
You will also have knowledge of Python packaging, Azure/Data Lake and Databricks, along with experience with ML. You will manage large volumes of structured and unstructured data, and extract and clean data to make it amenable to analysis. 80% travel is required.
Day-to-day, you will:
- Give colleagues and clients the tools to find and use data for routine and non-routine analysis
- Use your sound eye for business to translate business requirements into technical solutions
- Analyse current business practices, processes and procedures to spot future opportunities
- Assess client needs to build bespoke data design services
- Create the building blocks for transforming enterprise data solutions
- Design and build modern data pipelines, data streams, and data service Application Programming Interfaces (APIs)
- Craft the architectures, data warehouses and databases that support access and Advanced Analytics, and bring them to life through modern visualization tools
- Implement effective metrics and monitoring
- Be comfortable making your own decisions and guiding your colleagues
- Travel as needed
- Transforming business needs into technical solutions
- Mapping data and analytics
- Data profiling, cataloguing and mapping to enable the design and build of technical data flows
- Use validated methods to solve business problems using Azure Data and Analytics services in combination with building data pipelines, data streams and system integration
- Knowledge of multiple Azure data applications
- Experience in preparing data for pipelines, and in building pipelines and architecture
- Experience with the following: Apache Spark, Python, PySpark, Hadoop, SQL and Azure or AWS
- Pluses: Cosmos DB, TensorFlow, Apache Airflow, Snowflake and/or Linux
- Spark (PySpark), HDFS, Kafka and other high-volume data tools
- SQL and NoSQL storage tools, such as MySQL, Postgres, Cassandra, MongoDB and Elasticsearch
- HDFS/Hive; Spark/MLlib; Kafka, etc.
- Working experience with Linux OS (Red Hat/Ubuntu)
- Excellent communication skills (speaking, presenting)
Hands-on experience using Big Data and statistical analysis tools such as Hadoop/Spark and SQL
Must have experience transforming data at scale using Spark/PySpark
You probably have a Bachelor's or Master's degree in a quantitative field such as computer science, applied mathematics, statistics or machine learning, or an equivalent combination of education and experience. You're likely to be a Microsoft Certified Solutions Associate, Microsoft Certified Solutions Expert, and/or Database Administrator already, and you have been in a similar professional position for around five to seven years.
Important Note about this Future Opportunity:
We are actively recruiting and interviewing for this 'Future Opportunity' position; however, we will not be extending offers at the present time.
The outbreak and spread of COVID-19 has created uncertainty for many, and during this period Avanade is focused on the personal safety and well-being of our employees and candidates. The good news is that Avanade is a 38,000-person organization that depends on new ways of working every day and we've been relying on our workplace experience to empower our employees - wherever they're working - for 20 years now. Thanks to our workplace platforms - the likes of Office 365, Microsoft Teams, SharePoint and more - we've been able to continue delivering work seamlessly and connecting with talent to explore opportunities for tomorrow.
What does that mean for you? It means you can apply and interview virtually, via video, for a future career opportunity without pressure to make a decision. It means that you will have the chance to connect with leaders and hiring managers at your own pace.
We encourage you to speak candidly with your recruiter about your career aspirations and expectations throughout this recruiting process. In return, we are committed to being transparent with you about our intent and goals around this position.