Spark The Definitive Guide Spark权威指南中文笔记

目前在做Spark开发,所用到的参考资料便是Spark: The Definitive Guide。目前暂时没有中文版,为了记录学习和工作的过程,顺便等待中文版的推出,便将每章节的学习过程总结下来,以加深印象。

前6章不再赘述,前面的技术大牛已经翻译和整理笔记完毕,下面放出地址:
1-6章以及部分12章RDD翻译 by: 刺客五六柒
Spark: The Definitive Guide 2019中文版-开源翻译项目
4-6章以及部分7章学习笔记 by: lzw2016
《Spark: The Definitive Guide 》Spark权威指南学习计划

该书的源码及数据集已经在Github中:
https://github.com/databricks/Spark-The-Definitive-Guide

在目前的开发过程中涉及到最主要的是7-9章,及从不同的数据源获取数据和对已处理好的DF或者RDD进行操作。我将着重在这三章整理笔记。接下来会拓展更多章节。
第10章以及12-14章将会记录,同时流处理的20-23章也会记录.

点击如下链接会跳转到简书
目前暂时发表于简书

目录

  • 大数据和Spark概述
    Chapter 1 to 2:了解Apache Spark
    Chapter 3:了解Spark的工具集
  • 结构化API——DataFrames, SQL, and Datasets
    Chapter 4:结构化API预览
    Chapter 5:基本结构化API操作
Welcome to this first edition of Spark: The Definitive Guide! We are excited to bring you the most complete resource on Apache Spark today, focusing especially on the new generation of Spark APIs introduced in Spark 2.0. Apache Spark is currently one of the most popular systems for large-scale data processing, with APIs in multiple programming languages and a wealth of built-in and third-party libraries. Although the project has existed for multiple years—first as a research project started at UC Berkeley in 2009, then at the Apache Software Foundation since 2013—the open source community is continuing to build more powerful APIs and high-level libraries over Spark, so there is still a lot to write about the project. We decided to write this book for two reasons. First, we wanted to present the most comprehensive book on Apache Spark, covering all of the fundamental use cases with easy-to-run examples. Second, we especially wanted to explore the higher-level “structured” APIs that were finalized in Apache Spark 2.0—namely DataFrames, Datasets, Spark SQL, and Structured Streaming—which older books on Spark don’t always include. We hope this book gives you a solid foundation to write modern Apache Spark applications using all the available tools in the project. In this preface, we’ll tell you a little bit about our background, and explain who this book is for and how we have organized the material. We also want to thank the numerous people who helped edit and review this book, without whom it would not have been possible.
Work with all aspects of batch processing in a modern Java environment using a selection of Spring frameworks. This book provides up-to-date examples using the latest configuration techniques based on Java configuration and Spring Boot. The Definitive Guide to Spring Batch takes you from the “Hello, World!” of batch processing to complex scenarios demonstrating cloud native techniques for developing batch applications to be run on modern platforms. Finally this book demonstrates how you can use areas of the Spring portfolio beyond just Spring Batch 4 to collaboratively develop mission-critical batch processes. You’ll see how a new class of use cases and platforms has evolved to have an impact on batch-processing. Data science and big data have become prominent in modern IT and the use of batch processing to orchestrate workloads has become commonplace. The Definitive Guide to Spring Batch covers how running finite tasks on cloud infrastructure in a standardized way has changed where batch applications are run. Additionally, you’ll discover how Spring Batch 4 takes advantage of Java 9, Spring Framework 5, and the new Spring Boot 2 micro-framework. After reading this book, you’ll be able to use Spring Boot to simplify the development of your own Spring projects, as well as take advantage of Spring Cloud Task and Spring Cloud Data Flow for added cloud native functionality. Includes a foreword by Dave Syer, Spring Batch project founder. What You’ll Learn Discover what is new in Spring Batch 4 Carry out finite batch processing in the cloud using the Spring Batch project Understand the newest configuration techniques based on Java configuration and Spring Boot using practical examples Master batch processing in complex scenarios including in the cloud Develop batch applications to be run on modern platforms Use areas of the Spring portfolio beyond Spring Batch to develop mission-critical batch processes Who This Book Is For Experienced Java and Spring coders new to the Sp
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值