Walmart
Sunnyvale, CA, USA
Duties:
Build streaming pipelines using technologies such as Spark Streaming or Kafka.
Develop data pipelines and processing layers using programming languages such as Scala and Python.
Apply SQL and NoSQL database expertise to work with databases such as Cassandra, BigQuery, and Cosmos DB.
Implement the workflow management tool Apache Airflow to orchestrate and optimize data processing.
Develop enterprise data warehouses using technologies such as BigQuery.
Run Spark and Hadoop workloads on platforms such as Dataproc to enhance data processing capabilities.
Utilize Big Data technologies such as Spark, PySpark, Hive, and SQL to architect and design scalable, low-latency, and fault-tolerant data processing pipelines.
Implement data governance practices, including data quality, metadata management, and security measures.
Optimize complex queries across large datasets for efficient data processing.
Leverage Kafka for data streaming and processing.
Collaborate with...
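The streaming-pipeline duties above center on windowed aggregation over event streams, the core operation a Spark Streaming or Kafka consumer job performs at scale. A minimal pure-Python sketch of a tumbling-window count illustrates the idea; the event data and window size here are hypothetical, not taken from any actual Walmart pipeline:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds=60):
    """Group (timestamp, key) events into fixed-size tumbling windows
    and count occurrences per key within each window -- the same shape
    of computation a Spark Structured Streaming job runs distributedly."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Bucket each event by the start of the window containing it.
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start][key] += 1
    # Return plain dicts, sorted by window start, for deterministic output.
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# Hypothetical click events: (unix_timestamp, page)
events = [(0, "home"), (10, "cart"), (59, "home"), (61, "home"), (130, "cart")]
print(tumbling_window_counts(events))
# {0: {'home': 2, 'cart': 1}, 60: {'home': 1}, 120: {'cart': 1}}
```

In a real deployment the same logic would be expressed with Spark's windowed `groupBy` over a Kafka source, so that state, fault tolerance, and late-data handling are managed by the engine rather than by hand.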
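The Airflow duty above amounts to declaring tasks and their upstream dependencies as a DAG and letting the scheduler run them in dependency order. A minimal sketch of that ordering guarantee, using only the standard library's `graphlib` (task names are hypothetical placeholders, not an actual production workflow):

```python
from graphlib import TopologicalSorter

# Hypothetical DAG mirroring an Airflow workflow:
# each task maps to the set of tasks that must finish before it starts.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load_bigquery": {"transform"},
    "data_quality_check": {"load_bigquery"},
}

# static_order() yields the tasks in an order that respects every
# dependency edge -- the same guarantee Airflow enforces per DAG run.
order = list(TopologicalSorter(dag).static_order())
print(order)
# ['extract', 'transform', 'load_bigquery', 'data_quality_check']
```

In Airflow itself the equivalent structure is written with operators and `>>` dependency arrows inside a `DAG` definition; the scheduler then adds retries, backfills, and parallel execution of independent branches on top of this ordering.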