Azkaban Overview

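Azkaban is a distributed workflow manager built at LinkedIn to solve the problem of Hadoop job dependencies. LinkedIn had jobs that needed to run in order, ranging from ETL jobs to data analytics products.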

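Features

Provides a friendly visual (web) interface
Makes uploading workflows easy (packaged as a zip archive)
Lets you define dependencies between tasks
Permission settings
Modular design
Tasks can be stopped and started at any time
Logs can be viewed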

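Compared with Oozie, Azkaban is a lightweight scheduling tool. Oozie offers more features; if you only need the mainstream features rather than the niche ones, Azkaban is a good choice.

Functionality: both can schedule MapReduce, Java, and script workflow tasks, and both support timed scheduling
Usage: Azkaban passes parameters directly; Oozie supports both direct parameters and EL expressions
Scheduling: Azkaban triggers tasks by time; Oozie triggers tasks by time or by data availability
Resources: Azkaban has strict permission control; Oozie does not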

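Installing Azkaban

Preparation

First create a directory to hold Azkaban, with these three parts:
azkaban: the main program
azkaban-executor: the executor
azkaban-web: the web server
Then create a database named azkaban in MySQL and import the Azkaban SQL script into it:
source /root/training/azkaban/azkaban-2.5.0/create-all-sql-2.5.0.sql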
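
The database step above can be sketched as a small script. The database name azkaban and the schema file name come from the text; the helper file name init-azkaban.sql, the root login, and the assumption that the script runs from the directory containing the schema file are illustrative only.

```shell
#!/bin/sh
# Sketch: write a helper SQL file that creates the azkaban database and
# imports the schema script shipped with Azkaban 2.5.0.
# Assumption: the schema file is in the current directory unless
# SCHEMA_SQL is set to another path.
SCHEMA_SQL=${SCHEMA_SQL:-create-all-sql-2.5.0.sql}
INIT_SQL=init-azkaban.sql   # hypothetical helper file name

cat > "$INIT_SQL" <<EOF
CREATE DATABASE IF NOT EXISTS azkaban;
USE azkaban;
SOURCE $SCHEMA_SQL;
EOF

# Apply it by hand against your MySQL server (prompts for the password).
echo "To apply: mysql -u root -p < $INIT_SQL"
```

Writing the statements to a file first keeps the step repeatable; CREATE DATABASE IF NOT EXISTS avoids an error when the database is already there.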

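Installation and deployment

Create the SSL (secure connection) keystore

[root@bigdata211 ~]# yum install keytool (can be skipped if a JDK is installed, since keytool ships with the JDK)
[root@bigdata211 ~]# keytool -keystore keystore -alias jetty -genkey -keyalg RSA (enter the password twice; the other fields can be left blank; the system time must be synchronized)
[root@bigdata211 ~]# tzselect (choose the region, country, and city, then confirm with yes)
[root@bigdata211 ~]# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime (set the time zone)
[root@bigdata211 ~]# sudo date -s "2018-11-28 20:43:23" (set the time)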
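
For scripted setups, the keystore can also be generated without prompts. This is a minimal sketch, not the procedure used in this post: the password changeit and every field of the distinguished name are placeholders, and keytool rejects passwords shorter than six characters, so pick a real value that satisfies that rule.

```shell
#!/bin/sh
# Sketch: generate the Jetty keystore non-interactively.
# Assumptions: STOREPASS and the -dname fields are placeholders.
KEYSTORE=keystore
STOREPASS=changeit

if command -v keytool >/dev/null 2>&1; then
  # Create the key pair only if the keystore file does not exist yet.
  [ -f "$KEYSTORE" ] || keytool -genkeypair -alias jetty -keyalg RSA \
    -keystore "$KEYSTORE" -storepass "$STOREPASS" -keypass "$STOREPASS" \
    -dname "CN=bigdata211, OU=dev, O=example, L=Shanghai, ST=Shanghai, C=CN"
  # List the entry to confirm that the jetty alias exists.
  keytool -list -keystore "$KEYSTORE" -storepass "$STOREPASS" -alias jetty
else
  echo "keytool not found; install a JDK first"
fi
```

Whatever password is used here has to be the one configured for Jetty in azkaban.properties.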

Modify the configuration

azkaban-web:

conf/azkaban.properties:
azkaban.name (name)
azkaban.color (color)
default.timezone.id=Asia/Shanghai (time zone)
mysql.user=root (database user name)
mysql.password=tiger (database password)
jetty.password=tiger
jetty.keypassword=tiger
jetty.trustpassword=tiger (keystore passwords; use the ones entered in keytool)
conf/azkaban-users.xml:
<user username="root" password="tiger" roles="admin,metrics" />

azkaban-executor:

conf/azkaban.properties:
default.timezone.id=Asia/Shanghai (time zone)
mysql.database=azkaban (database name)
mysql.user=root (database user name)
mysql.password=tiger (database password)
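
Before starting the services, it can help to confirm that the edited keys are really present in each properties file. The helper below is a hypothetical convenience function, not part of Azkaban; it only checks that a line starting with key= exists.

```shell
#!/bin/sh
# Sketch: report which of the given keys are missing from a properties file.
# Usage: check_props FILE KEY...
# Returns 0 when every key is present, 1 otherwise.
check_props() {
  file=$1
  shift
  missing=0
  for key in "$@"; do
    # A key counts as set when some line starts with "key=".
    # The dots in the key match any character, which is good enough
    # for a quick sanity check.
    if ! grep -q "^${key}=" "$file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

A possible call, run from the azkaban-executor directory: check_props conf/azkaban.properties default.timezone.id mysql.user mysql.password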

Start the web server:

[root@bigdata211 ~]# bin/azkaban-web-start.sh

Start the executor:

[root@bigdata211 ~]# bin/azkaban-executor-start.sh

Open the web UI: https://192.168.247.211:8443

Using Azkaban

Single job

Create a project: Create Project
Upload the project:

Create the job file and package it as a zip archive: command.zip

command.job:
type=command
command=echo 'zfhzxg'

Run the project: Execute Flow
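
The packaging step can be done from the command line. A minimal sketch, assuming the zip utility is available on the machine where the archive is built:

```shell
#!/bin/sh
# Sketch: write command.job and package it as command.zip for upload.
cat > command.job <<'EOF'
type=command
command=echo 'zfhzxg'
EOF

# zip may not be installed everywhere; skip packaging if it is missing.
if command -v zip >/dev/null 2>&1; then
  zip -q command.zip command.job
else
  echo "zip not found; package command.job with another archiver"
fi
```

The resulting command.zip is the file to select in the upload dialog of the web UI.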

Multiple jobs (with dependencies)

Create a project

Upload the project: bf.zip

f.job:
type=command
command=echo 'zfhzxg'
b.job:
type=command
dependencies=f
command=echo 'henshuai'

Run the project
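
A mistyped job name in a dependencies line is easy to miss until the upload fails. The function below is a hypothetical pre-upload check, not an Azkaban tool: for every .job file in a directory it verifies that each comma-separated name after dependencies= has a matching .job file.

```shell
#!/bin/sh
# Sketch: verify that every dependency names an existing job file.
# Usage: check_deps DIR
# Returns 0 when all dependencies resolve, 1 otherwise.
check_deps() {
  dir=$1
  status=0
  for job in "$dir"/*.job; do
    [ -f "$job" ] || continue   # no .job files in the directory
    # Take the value after "dependencies=" and split it on commas.
    deps=$(sed -n 's/^dependencies=//p' "$job" | tr ',' ' ')
    for dep in $deps; do
      if [ ! -f "$dir/$dep.job" ]; then
        echo "$(basename "$job"): unknown dependency '$dep'"
        status=1
      fi
    done
  done
  return "$status"
}
```

Running check_deps . in the directory that holds f.job and b.job should print nothing when the names line up. Note that it cannot catch a misspelled dependencies key, only misspelled job names.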

Using other components in a job

Create a project

Upload the project: hdfs.zip

hdfs.job:
type=command
command=/root/training/hadoop-2.7.3/bin/hdfs dfs -mkdir /azkaban

Run the project
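
Running this flow a second time fails, because hdfs dfs -mkdir reports an error when the directory already exists. One possible variant, sketched here under the hypothetical file name hdfs_mkdir.job, adds the -p flag so that reruns succeed:

```shell
#!/bin/sh
# Sketch: a rerunnable variant of the HDFS job.
# With -p, mkdir succeeds even if /azkaban already exists.
cat > hdfs_mkdir.job <<'EOF'
type=command
command=/root/training/hadoop-2.7.3/bin/hdfs dfs -mkdir -p /azkaban
EOF
```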

Using MapReduce in a job

Create a project

wc.job:
type=command
command=/root/training/hadoop-2.7.3/bin/hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /azwc/in /azwc/out

Package the jar and the job file together in one zip archive, wc.zip: wc.job, hadoop-mapreduce-examples-2.7.3.jar
Upload the project
Run the project
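
A MapReduce job refuses to start when its output directory already exists, so the flow above only works once per /azwc/out. One way to make it repeatable, sketched here as an assumption rather than the setup used in this post, is a cleanup job that wc depends on. The job name clean is hypothetical.

```shell
#!/bin/sh
# Sketch: a two-job flow in which a cleanup job runs before wordcount.
# clean.job removes the old output directory; -f suppresses the error
# when the directory does not exist yet.
cat > clean.job <<'EOF'
type=command
command=/root/training/hadoop-2.7.3/bin/hdfs dfs -rm -r -f /azwc/out
EOF

# wc.job is the job from the text plus a dependency on clean.
cat > wc.job <<'EOF'
type=command
dependencies=clean
command=/root/training/hadoop-2.7.3/bin/hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /azwc/in /azwc/out
EOF
```

Both job files and the examples jar would then go into the same zip archive.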

Using Hive in a job

Create the project and the SQL file

azhive.sql:
use default;
drop table aztest;
create table aztest(id int, name string) row format delimited fields terminated by ",";
load data inpath 'azdata/user.txt' into table aztest;
create table azres as select * from aztest;
insert overwrite directory '/azdata/userout' select count(*) from aztest;
hivef.job:
type=command
command=/root/training/hive/bin/hive -f 'azhive.sql'

Package the SQL file and the job file together in one zip archive, hivef.zip: hivef.job, azhive.sql
Upload the project
Run the project
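
The SQL script expects a comma-delimited input file with an integer id and a name on each line. A minimal sketch for producing test input follows; the three rows are made-up sample data.

```shell
#!/bin/sh
# Sketch: create a small comma-delimited input file whose columns match
# the aztest table (an integer id, then a name).
cat > user.txt <<'EOF'
1,tom
2,jerry
3,alice
EOF
```

The file then has to be uploaded (for example with hdfs dfs -put) to the HDFS location that the load data inpath statement reads. Because load data inpath moves the file into the Hive warehouse rather than copying it, the upload has to be repeated before each rerun of the flow.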