Source material (navigation pages):

What is HAWQ?

HAWQ Architecture

How HAWQ Manages Resources

Understanding the Fault Tolerance Service

Table Distribution and Storage

Choosing the Table Distribution Policy

Using PXF with Unmanaged Data

PXF External Tables and API

High Availability, Redundancy and Fault Tolerance

In a typical HAWQ deployment, each slave node has one physical HAWQ segment, an HDFS DataNode and a NodeManager installed. Masters for HAWQ, HDFS and YARN are hosted on separate nodes.

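The NodeManager, the HAWQ segment, and the DataNode run on the same machine.

The YARN ResourceManager, the HAWQ master, and the NameNode are each deployed on separate machines.

HAWQ caches the containers it obtains from YARN and manages those resources internally with its own resource manager.

Software components in HAWQ:

HAWQ Master:

The master is the entry point of the HAWQ system. It accepts client connections and is responsible for parsing and optimizing SQL statements, dispatching them to segments, coordinating execution, and presenting the final results to the client. Clients can connect to the master with psql or through JDBC or ODBC. The master is also where the global system catalog resides.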

Global system catalog:

The global system catalog is the set of system tables that contain metadata about the HAWQ system itself.

The master holds no user data; user data is stored on HDFS.

The master authenticates client connections, processes incoming SQL commands, distributes workload among segments, coordinates the results returned by each segment, and presents the final results to the client program.

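HAWQ Segment:

The segment is the unit that processes data in parallel. Each machine has only one segment, which can start multiple Query Executors (QEs) for a query slice. The machine then behaves as if it hosted multiple virtual segments, which allows the available resources to be managed better.

Virtual segment: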

A virtual segment behaves like a container for QEs. Each virtual segment has one QE for each slice of a query. The number of virtual segments used determines the degree of parallelism (DOP) of a query.
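The relationship above is simple arithmetic, and a small Python sketch can make it concrete. The `QueryPlan` class and its field names are hypothetical and are not part of HAWQ; the only rules modeled are the two stated above: the number of virtual segments gives the DOP, and each virtual segment runs one QE per slice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical summary of a planned query (not a HAWQ structure)."""
    slices: int            # number of slices the plan is divided into
    virtual_segments: int  # virtual segments allocated to the query


def degree_of_parallelism(plan: QueryPlan) -> int:
    # The DOP is determined by the number of virtual segments.
    return plan.virtual_segments


def total_query_executors(plan: QueryPlan) -> int:
    # Each virtual segment has one QE for each slice of the query.
    return plan.virtual_segments * plan.slices


example = QueryPlan(slices=3, virtual_segments=8)
print(degree_of_parallelism(example), total_query_executors(example))
```

With 3 slices and 8 virtual segments, this model gives a DOP of 8 and 24 QEs in total.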

How a segment differs from the master:

Is stateless.
Does not store the metadata for each database and table.
Does not store data on the local file system.

The master sends the relevant metadata to the segments together with the SQL request. The metadata includes the HDFS URL of the table's data, and each segment uses that URL to access the data.

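HAWQ Interconnect

The interconnect is HAWQ's networking layer. When a user connects to the database and submits a query, the system starts a set of processes on each segment node to handle it. The interconnect exchanges data between segments and also handles communication with the underlying network devices.

By default the interconnect uses UDP, with HAWQ adding its own packet verification on top of UDP. This makes the reliability equivalent to TCP, while performance and scalability exceed those of TCP. With TCP, a HAWQ system cannot exceed 1000 segment nodes; with the default UDP protocol there is no such limit.

HAWQ Resource Manager

HAWQ caches the resources it obtains from YARN to keep query latency as low as possible. It can also be configured in standalone mode, in which case HAWQ manages resources itself without going through YARN.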

See How HAWQ Manages Resources for more details on HAWQ resource management.

HAWQ Catalog Service

The HAWQ catalog service stores all metadata, such as UDF/UDT information, relation information, security information and data file locations.

HAWQ Fault Tolerance Service

The HAWQ fault tolerance service (FTS) is responsible for detecting segment failures and accepting heartbeats from segments.

See Understanding the Fault Tolerance Service for more information on this service.

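HAWQ Dispatcher

The dispatcher assigns the execution plan to a subset of segments and coordinates their execution. The dispatcher and the resource manager are the main components responsible for dynamically scheduling queries and resources.

Table Distribution and Storage:

Except for system tables, HAWQ stores table data on HDFS. When a user creates a table, its metadata is saved to the local file system of the machine hosting the master, and the corresponding data is saved to HDFS. To simplify data management, all data for a table is stored under a single HDFS directory.

HAWQ has two table storage formats, AO (Append-Only) and Parquet. Data files are splittable, so HAWQ can use multiple virtual segments to process a single data file in parallel, which increases query parallelism.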

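Table Distribution Policy

The default policy is random distribution.

Compared with MPP systems that distribute table data by hash, random distribution has some advantages. When the cluster expands, HAWQ can use the additional resources without redistributing data, and redistributing a very large table is expensive. Also, when HDFS redistributes its data, for example during a rebalance or after DataNode failures, randomly distributed tables keep better data locality than hash-distributed tables. This situation is quite common in large clusters.

On the other hand, hash distribution performs better than random distribution for some queries, such as certain TPC-H-style queries. Users should choose the distribution policy that fits their use case.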
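The redistribution cost of hash distribution can be illustrated with a short Python sketch. It assumes a simplified policy in which a row's bucket is `crc32(key) % buckets`; this is an illustrative stand-in and not HAWQ's actual hash function, and the key names are made up. Under that assumption, changing the bucket count changes the bucket of many rows, whereas a randomly distributed table has no key-to-bucket mapping that would need to be preserved.

```python
import zlib


def hash_bucket(key: str, buckets: int) -> int:
    # Illustrative hash placement: crc32 of the key modulo the bucket count.
    return zlib.crc32(key.encode("utf-8")) % buckets


def rows_to_move(keys, old_buckets: int, new_buckets: int) -> int:
    # A row must be moved when its bucket differs after the resize.
    return sum(
        1
        for key in keys
        if hash_bucket(key, old_buckets) != hash_bucket(key, new_buckets)
    )


keys = [f"order-{i}" for i in range(1000)]  # made-up distribution keys
moved = rows_to_move(keys, old_buckets=4, new_buckets=6)
print(f"{moved} of {len(keys)} hash-distributed rows change bucket")
```

The exact count depends on the keys, but with a modulo-based placement most rows typically land in a different bucket after the resize, which is why growing a hash-distributed table is costly.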

See Choosing the Table Distribution Policy for more details.

Data Locality

HAWQ considers three aspects when allocating data blocks to virtual segments:

Ratio of local read
Continuity of file read
Data balance among virtual segments
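As a rough illustration of how two of these aspects can pull against each other, the following Python sketch assigns blocks greedily. It is a toy model, not HAWQ's allocation algorithm: the function name, the input shapes, and the greedy rule are all assumptions, and continuity of file read is not modeled. Each block goes to the least-loaded virtual segment on a host that holds a replica, and falls back to the least-loaded virtual segment overall when no local one exists.

```python
def assign_blocks(blocks, vseg_hosts):
    """Greedy toy allocator (hypothetical, not HAWQ's algorithm).

    blocks:     list of (block_id, set_of_replica_hosts)
    vseg_hosts: dict mapping virtual segment id -> host name
    Returns (assignment, local_read_ratio, load_per_vseg).
    """
    load = {vseg: 0 for vseg in vseg_hosts}
    assignment = {}
    local_reads = 0
    for block_id, replica_hosts in blocks:
        # Virtual segments that could read this block from their own host.
        local = [v for v, host in vseg_hosts.items() if host in replica_hosts]
        candidates = local if local else list(vseg_hosts)
        # Prefer the least-loaded candidate to keep data balanced.
        chosen = min(candidates, key=lambda v: (load[v], v))
        assignment[block_id] = chosen
        load[chosen] += 1
        if local:
            local_reads += 1
    ratio = local_reads / len(blocks) if blocks else 1.0
    return assignment, ratio, load


vsegs = {"v0": "hostA", "v1": "hostB"}
blocks = [
    ("b0", {"hostA"}),
    ("b1", {"hostB"}),
    ("b2", {"hostA", "hostB"}),
    ("b3", {"hostC"}),  # no virtual segment on hostC: remote read
]
assignment, ratio, load = assign_blocks(blocks, vsegs)
print(assignment, ratio, load)
```

In this example three of the four blocks are read locally (ratio 0.75) and each virtual segment ends up with two blocks.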

External Data Access

HAWQ can access data in external files using the HAWQ Extension Framework (PXF). PXF is an extensible framework that allows HAWQ to access data in external sources as readable or writable HAWQ tables. PXF has built-in connectors for accessing data inside HDFS files, Hive tables, and HBase tables. PXF also integrates with HCatalog to query Hive tables directly. See Using PXF with Unmanaged Data for more details.

Users can create custom PXF connectors to access other parallel data stores or processing engines. Connectors are Java plug-ins that use the PXF API. For more information see PXF External Tables and API.

Physical Segments and Virtual Segments

In HAWQ, only one physical segment needs to be installed on one host, in which multiple virtual segments can be started to run queries. HAWQ allocates multiple virtual segments distributed across different hosts on demand to run one query. Virtual segments are carriers (containers) for resources such as memory and CPU. Queries are executed by query executors in virtual segments.

Note: In this documentation, when we refer to segment by itself, we mean a physical segment.

Virtual Segment Allocation Policy

Different numbers of virtual segments are allocated based on virtual segment allocation policies. The following factors determine the number of virtual segments that are used for a query:

Resources available at the query running time
The cost of the query
The distribution of the table; in other words, randomly distributed tables and hash distributed tables
Whether the query involves UDFs and external tables
Specific server configuration parameters, such as default_hash_table_bucket_number for hash table queries and hawq_rm_nvseg_perquery_limit
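How such factors might combine can be sketched in Python. This is a deliberately simplified model and not HAWQ's actual policy: the function, its arguments, and the rule of taking the minimum are assumptions made for illustration, and UDFs and external tables are left out. Only the two parameter names, default_hash_table_bucket_number and hawq_rm_nvseg_perquery_limit, come from the list above.

```python
def choose_vseg_count(
    is_hash_distributed: bool,
    bucket_number: int,        # stands in for default_hash_table_bucket_number
    cost_based_estimate: int,  # hypothetical estimate derived from query cost
    available_vsegs: int,      # hypothetical count the resource manager can grant
    per_query_limit: int,      # stands in for hawq_rm_nvseg_perquery_limit
) -> int:
    # Assumption: hash tables aim for their bucket number, while random
    # tables aim for an estimate based on the cost of the query.
    wanted = bucket_number if is_hash_distributed else cost_based_estimate
    # Assumption: the request is capped by available resources and by the
    # per-query limit, and a query always gets at least one virtual segment.
    return max(1, min(wanted, available_vsegs, per_query_limit))


print(choose_vseg_count(True, 6, 20, 16, 512))
print(choose_vseg_count(False, 6, 20, 16, 512))
```

In this model the hash-distributed case is driven by the bucket number, while the random case is driven by the cost estimate and then capped by what is available.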

HDFS Catalog Cache

The master uses this cache to hold the data location information it obtains from the NameNode, which speeds up HDFS RPC handling.

HDFS catalog cache is a caching service used by HAWQ master to determine the distribution information of table data on HDFS.

HDFS is slow at RPC handling, especially when the number of concurrent requests is high. In order to decide which segments handle which part of data, HAWQ needs data location information from HDFS NameNodes. HDFS catalog cache is used to cache the data location information and accelerate HDFS RPCs.
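The caching idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not HAWQ's implementation: the class name, the `fetch_from_namenode` callback, and the invalidation method are hypothetical, and the stand-in lookup function returns made-up host names instead of calling HDFS.

```python
class BlockLocationCache:
    """Toy cache of file path -> block locations (hypothetical)."""

    def __init__(self, fetch_from_namenode):
        self._fetch = fetch_from_namenode  # the slow lookup being avoided
        self._cache = {}
        self.rpc_calls = 0  # how many times the slow lookup was used

    def locations(self, path):
        # Only go to the NameNode when the path is not cached yet.
        if path not in self._cache:
            self.rpc_calls += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def invalidate(self, path):
        # Drop a stale entry, e.g. after the file's data has changed.
        self._cache.pop(path, None)


def fake_namenode_lookup(path):
    # Stand-in for an HDFS RPC; returns made-up DataNode host names.
    return ["datanode1", "datanode2", "datanode3"]


cache = BlockLocationCache(fake_namenode_lookup)
cache.locations("/hawq/table_a/1")
cache.locations("/hawq/table_a/1")  # served from the cache
print(cache.rpc_calls)
```

Repeated lookups of the same path reach the NameNode stand-in only once, which is the effect the catalog cache is meant to achieve for concurrent queries.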