Internship - Big Data Development Engineer - Hangzhou

    Department introduction:

    The PDA (Portal and Data Analytics) team is responsible for designing and building the big data analytics platform for Cisco collaboration products. Based on the differing needs of customers, product management, executives, and engineering teams, we analyze customer usage, service quality, and product operation data to support business decisions with data science.

    Responsibilities:

    • Work with cross-functional teams to understand business requirements
    • Translate business requirements into system and technical requirements
    • Participate in the design of Big Data batch and real-time processing solutions
    • Program and test data mining and data processing jobs
    • Understand and implement relevant machine learning models and algorithms
    • Create relevant technical documents

    Requirements:

    Meet, or show strong potential to meet, the requirements below.

    Common requirements:

    • Master's degree in Mathematics, Computer Science, Software Engineering, Information Management, or Statistics
    • Intense interest in big data technology
    • Good written and verbal English communication skills

    Technical requirements:

    • Proficient understanding of distributed computing principles
    • Proficiency with Hadoop v2, MapReduce, HDFS
    • Knowledge of or experience with building batch-processing and stream-processing systems, using solutions such as Spark, Storm, or Spark-Streaming
    • Good knowledge of Big Data querying tools, such as Pig, Hive, Impala
    • Knowledge of or experience with integrating data from multiple data sources
    • Knowledge of or experience with NoSQL databases, such as HBase, Cassandra
    • Knowledge of or experience with various ETL techniques and frameworks, such as Flume
    • Knowledge of or experience with various messaging systems, such as Kafka
    • Knowledge of or experience with Machine Learning toolkits, such as SparkML, Mahout
    • Knowledge of or experience with Cloudera

    Location:

    Hangzhou