Configuration | Apache Phoenix

Phoenix provides many knobs and dials for tuning the system to run well on your cluster. Configuration is done through a series of Phoenix-specific properties set in the client-side and server-side hbase-site.xml files. In addition to these properties, all of the standard HBase configuration properties apply, with the most important ones documented here.
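As a minimal sketch of how client-side settings might be assembled in code, the example below builds a java.util.Properties object keyed by property names from this page. The class name PhoenixClientProps and the chosen values are hypothetical, and passing such a Properties object to the JDBC driver instead of editing hbase-site.xml is an assumption about the deployment, not something this page states.

```java
import java.util.Properties;

final class PhoenixClientProps {
    // Collects client-side overrides. Property names are taken from the table on this page;
    // the values passed in are purely illustrative.
    static Properties clientOverrides(long queryTimeoutMs, int threadPoolSize) {
        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", Long.toString(queryTimeoutMs));
        props.setProperty("phoenix.query.threadPoolSize", Integer.toString(threadPoolSize));
        return props;
    }

    public static void main(String[] args) {
        Properties props = clientOverrides(120_000L, 64);
        // Assumption: with the Phoenix JDBC driver on the classpath, these overrides could be
        // handed to DriverManager.getConnection(url, props). That call is omitted here so the
        // sketch runs without any Phoenix dependency.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Server-side properties would still need to be set in the hbase-site.xml of each region server.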

The table below outlines the full set of Phoenix-specific configuration properties and their defaults.

Property Description Default

data.tx.snapshot.dir Server-side property specifying the HDFS directory used to store snapshots of the transaction state. No default value. None

data.tx.timeout Server-side property specifying the timeout in seconds for a transaction to complete. Default is 30 seconds. 30

phoenix.query.timeoutMs Client-side property specifying the number of milliseconds after which a query will timeout on the client. Default is 10 min. 600000

phoenix.query.keepAliveMs Maximum time in milliseconds that excess idle threads will wait for new tasks before terminating when the number of threads is greater than the cores in the client-side thread pool executor. Default is 60 sec. 60000

phoenix.query.threadPoolSize Number of threads in client side thread pool executor. As the number of machines/cores in the cluster grows, this value should be increased. 128

phoenix.query.queueSize Max queue depth of the bounded round robin backing the client side thread pool executor, beyond which an attempt to queue additional work is rejected. If zero, a SynchronousQueue is used instead of the bounded round robin queue. The default value is 5000. 5000
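The interaction between phoenix.query.threadPoolSize, phoenix.query.queueSize, and phoenix.query.keepAliveMs can be sketched with a plain java.util.concurrent.ThreadPoolExecutor. This is an illustration only: ClientPoolSketch is a hypothetical class, and an ArrayBlockingQueue stands in for Phoenix's bounded round robin queue, which is not reproduced here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ClientPoolSketch {
    // queueSize == 0 selects a SynchronousQueue (direct hand-off, no buffering);
    // otherwise a bounded queue of that depth is used. ArrayBlockingQueue is a stand-in.
    static ThreadPoolExecutor build(int threadPoolSize, int queueSize, long keepAliveMs) {
        BlockingQueue<Runnable> queue;
        if (queueSize == 0) {
            queue = new SynchronousQueue<Runnable>();
        } else {
            queue = new ArrayBlockingQueue<Runnable>(queueSize);
        }
        // The default rejection policy throws RejectedExecutionException once the
        // pool is saturated and the queue is full.
        return new ThreadPoolExecutor(threadPoolSize, threadPoolSize,
                keepAliveMs, TimeUnit.MILLISECONDS, queue);
    }

    public static void main(String[] args) {
        // Defaults from the table: 128 threads, queue depth 5000, keep-alive 60000 ms.
        ThreadPoolExecutor pool = build(128, 5000, 60_000L);
        System.out.println("queue capacity: " + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```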

phoenix.stats.guidepost.width Server-side parameter that specifies the number of bytes between guideposts. A smaller amount increases parallelization, but also increases the number of chunks which must be merged on the client side. The default value is 100 MB. 104857600

phoenix.stats.guidepost.per.region Server-side parameter that specifies the number of guideposts per region. If set to a value greater than zero, then the guidepost width is determined by MAX_FILE_SIZE of table / phoenix.stats.guidepost.per.region. Otherwise, if not set, then the phoenix.stats.guidepost.width parameter is used. No default value. None
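The rule for choosing between the two guidepost settings can be written out as a small calculation. The sketch below is an assumption-laden illustration: GuidepostWidth is a hypothetical class, a value of zero or less models "not set", and the 10 GiB MAX_FILE_SIZE is an example figure, not a Phoenix default.

```java
final class GuidepostWidth {
    // Default of phoenix.stats.guidepost.width from the table (100 MB).
    static final long DEFAULT_WIDTH_BYTES = 104_857_600L;

    // If guidepostsPerRegion > 0, the width is MAX_FILE_SIZE / guidepostsPerRegion;
    // otherwise the configured phoenix.stats.guidepost.width applies.
    static long effectiveWidth(long maxFileSizeBytes, int guidepostsPerRegion, long configuredWidthBytes) {
        if (guidepostsPerRegion > 0) {
            return maxFileSizeBytes / guidepostsPerRegion;
        }
        return configuredWidthBytes;
    }

    public static void main(String[] args) {
        long tenGiB = 10L * 1024 * 1024 * 1024; // example MAX_FILE_SIZE, chosen for illustration
        System.out.println(effectiveWidth(tenGiB, 20, DEFAULT_WIDTH_BYTES)); // 536870912
        System.out.println(effectiveWidth(tenGiB, 0, DEFAULT_WIDTH_BYTES));  // 104857600
    }
}
```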

phoenix.stats.updateFrequency Server-side parameter that determines the frequency in milliseconds at which statistics will be refreshed from the statistics table and subsequently used by the client. The default value is 15 min. 900000

phoenix.stats.minUpdateFrequency Client-side parameter that determines the minimum amount of time in milliseconds that must pass before statistics may again be manually collected through another UPDATE STATISTICS call. The default value is phoenix.stats.updateFrequency / 2. 450000

phoenix.stats.useCurrentTime Server-side parameter that if true causes the current time on the server-side to be used as the timestamp of rows in the statistics table when background tasks such as compactions or splits occur. If false, then the max timestamp found while traversing the table over which statistics are being collected is used as the timestamp. Unless your client is controlling the timestamps while reading and writing data, this parameter should be left alone. The default value is true. true

phoenix.query.spoolThresholdBytes Threshold size in bytes after which results from queries executed in parallel are spooled to disk. Default is 20 MB. 20971520

phoenix.query.maxSpoolToDiskBytes Maximum size in bytes up to which results from queries executed in parallel may be spooled to disk; above this size the query will fail. Default is 1 GB. 1024000000

phoenix.query.maxGlobalMemoryPercentage Percentage of total heap memory (i.e. Runtime.getRuntime().maxMemory()) that all threads may use. Only coarse-grained memory usage is tracked, mainly accounting for memory usage in the intermediate map built during group by aggregation. When this limit is reached, clients block while attempting to get more memory, essentially throttling memory usage. Defaults to 15%. 15

phoenix.query.maxGlobalMemorySize Max size in bytes of total tracked memory usage. By default not specified, however, if present, the lower of this parameter and the phoenix.query.maxGlobalMemoryPercentage will be used

phoenix.query.maxGlobalMemoryWaitMs Maximum amount of time that a client will block while waiting for more memory to become available. After this amount of time, an InsufficientMemoryException is thrown. Default is 10 sec. 10000

phoenix.query.maxTenantMemoryPercentage Maximum percentage of phoenix.query.maxGlobalMemoryPercentage that any one tenant is allowed to consume. After this percentage, an InsufficientMemoryException is thrown. Default is 100% 100
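How the three memory limits combine can be sketched as arithmetic. The example below is a simplified reading of the descriptions above, not Phoenix's implementation: MemoryBudget is a hypothetical class, a size of zero or less models "not specified", and the tenant limit is treated as a percentage of the resulting global limit.

```java
final class MemoryBudget {
    // Global limit: the lower of the percentage-based limit and the absolute size,
    // when the absolute size is specified (modelled here as a value > 0).
    static long globalLimitBytes(long maxHeapBytes, int globalPercentage, long maxGlobalMemorySize) {
        long byPercentage = maxHeapBytes * globalPercentage / 100;
        if (maxGlobalMemorySize > 0) {
            return Math.min(byPercentage, maxGlobalMemorySize);
        }
        return byPercentage;
    }

    // Per-tenant limit as a percentage of the global limit (assumed interpretation).
    static long tenantLimitBytes(long globalLimitBytes, int tenantPercentage) {
        return globalLimitBytes * tenantPercentage / 100;
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        long global = globalLimitBytes(heap, 15, 0L); // table defaults: 15%, size unset
        System.out.println("global limit bytes: " + global);
        System.out.println("tenant limit bytes: " + tenantLimitBytes(global, 100));
    }
}
```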

phoenix.query.dateFormat Default pattern to use for conversion of a date to/from a string, whether through the TO_CHAR(<date>) or TO_DATE(<date-string>) functions, or through resultSet.getString(<date-column>). Default is yyyy-MM-dd HH:mm:ss.SSS yyyy-MM-dd HH:mm:ss.SSS

phoenix.query.dateFormatTimeZone A timezone id that specifies the default time zone in which date, time, and timestamp literals should be interpreted when interpreting string literals or using the TO_DATE function. A time zone id can be a timezone abbreviation such as "PST", or a full name such as "America/Los_Angeles", or a custom offset such as "GMT-9:00". The time zone id "LOCAL" can also be used to interpret all date, time, and timestamp literals as being in the current timezone of the client. GMT

phoenix.query.timeFormat Default pattern to use for conversion of TIME to/from a string, whether through the TO_CHAR(<time>) or TO_TIME(<time-string>) functions, or through resultSet.getString(<time-column>). Default is yyyy-MM-dd HH:mm:ss.SSS yyyy-MM-dd HH:mm:ss.SSS

phoenix.query.timestampFormat Default pattern to use for conversion of TIMESTAMP to/from a string, whether through the TO_CHAR(<timestamp>) or TO_TIMESTAMP(<timestamp-string>) functions, or through resultSet.getString(<timestamp-column>). Default is yyyy-MM-dd HH:mm:ss.SSS yyyy-MM-dd HH:mm:ss.SSS

phoenix.query.numberFormat Default pattern to use for conversion of a decimal number to/from a string, whether through the TO_CHAR(<decimal-number>) or TO_NUMBER(<decimal-string>) functions, or through resultSet.getString(<decimal-column>). Default is #,##0.### #,##0.###
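To see what the default patterns produce, the sketch below applies them with the standard java.text classes. Whether Phoenix uses these exact classes internally is an assumption; DefaultFormats is a hypothetical helper, and Locale.US is fixed only so the output does not depend on the machine's locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DefaultFormats {
    // Formats a date with the given pattern in the given time zone id (e.g. "GMT").
    static String formatDate(Date date, String pattern, String timeZoneId) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone(timeZoneId));
        return format.format(date);
    }

    // Formats a number with the given decimal pattern using US grouping and decimal symbols.
    static String formatNumber(double value, String pattern) {
        DecimalFormat format = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(value);
    }

    public static void main(String[] args) {
        // Epoch millisecond 0 rendered with the default date pattern and the default GMT zone.
        System.out.println(formatDate(new Date(0L), "yyyy-MM-dd HH:mm:ss.SSS", "GMT")); // 1970-01-01 00:00:00.000
        System.out.println(formatNumber(1234567.5, "#,##0.###")); // 1,234,567.5
    }
}
```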

phoenix.mutate.maxSize The maximum number of rows that may be batched on the client before a commit or rollback must be called. 500000

phoenix.mutate.batchSize The number of rows that are batched together and automatically committed during the execution of an UPSERT SELECT or DELETE statement. This property may be overridden at connection time by specifying the UpsertBatchSize property value. Note that the connection property value does not affect the batch size used by the coprocessor when these statements are executed completely on the server side. 1000

phoenix.query.maxServerCacheBytes Maximum size (in bytes) of a single sub-query result (usually the filtered result of a table) before compression and conversion to a hash map. Attempting to hash an intermediate sub-query result of a size bigger than this setting will result in a MaxServerCacheSizeExceededException. Default 100MB. 104857600

phoenix.coprocessor.maxServerCacheTimeToLiveMs Maximum living time (in milliseconds) of server caches. A cache entry expires after this amount of time has passed since last access. Consider adjusting this parameter when a server-side IOException("Could not find hash cache for joinId") happens. Getting warnings like "Earlier hash cache(s) might have expired on servers" might also be a sign that this number should be increased. 30000

phoenix.query.useIndexes Client-side property determining whether or not indexes are considered by the optimizer to satisfy a query. Default is true true

phoenix.index.failure.handling.rebuild Server-side property determining whether or not a mutable index is rebuilt in the background in the event of a commit failure. Only applicable for indexes on mutable, non transactional tables. Default is true true

phoenix.index.failure.block.write Server-side property determining whether or not writes to the data table are disallowed in the event of a commit failure until the index can be caught up with the data table. Requires that phoenix.index.failure.handling.rebuild is true as well. Only applicable for indexes on mutable, non transactional tables. Default is false false

phoenix.index.failure.handling.rebuild.interval Server-side property controlling the millisecond frequency at which the server checks whether or not a mutable index needs to be partially rebuilt to catch up with updates to the data table. Only applicable for indexes on mutable, non transactional tables. Default is 10 seconds. 10000

phoenix.index.failure.handling.rebuild.overlap.time Server-side property controlling how many milliseconds to go back from the timestamp at which the failure occurred when a partial rebuild is performed. Only applicable for indexes on mutable, non transactional tables. Default is 1 millisecond. 1

phoenix.index.mutableBatchSizeThreshold Number of mutations in a batch beyond which index metadata will be sent as a separate RPC to each region server as opposed to included inline with each mutation. Defaults to 5. 5

phoenix.schema.dropMetaData Determines whether or not an HBase table is dropped when the Phoenix table is dropped. Default is true true

phoenix.groupby.spillable Determines whether or not a GROUP BY over a large number of distinct values is allowed to spill to disk on the region server. If false, an InsufficientMemoryException will be thrown instead. Default is true true

phoenix.groupby.spillFiles Number of memory mapped spill files to be used when spilling GROUP BY distinct values to disk. Default is 2 2

phoenix.groupby.maxCacheSize Size in bytes of pages cached during GROUP BY spilling. Default is 100 MB 102400000

phoenix.groupby.estimatedDistinctValues Number of estimated distinct values when a GROUP BY is performed. Used to perform initial sizing with growth of 1.5x each time reallocation is required. Default is 1000 1000

phoenix.distinct.value.compress.threshold Size in bytes beyond which aggregate operations which require tracking distinct value counts (such as COUNT DISTINCT) will use Snappy compression. Default is 1 MB 1024000

phoenix.index.maxDataFileSizePerc Percentage used to determine the MAX_FILESIZE for the shared index table for views relative to the data table MAX_FILESIZE. The percentage should be estimated based on the anticipated average size of a view index row versus the data row. Default is 50%. 50

phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs Time in milliseconds after which the server-side metadata cache for a tenant will expire if not accessed. Default is 30 minutes. 1800000

phoenix.coprocessor.maxMetaDataCacheSize Max size in bytes of total server-side metadata cache after which evictions will begin to occur based on least recent access time. Default is 20 MB 20480000

phoenix.client.maxMetaDataCacheSize Max size in bytes of total client-side metadata cache after which evictions will begin to occur based on least recent access time. Default is 10 MB 10240000

phoenix.sequence.cacheSize Number of sequence values to reserve from the server and cache on the client when the next sequence value is allocated. Only used if not defined by the sequence itself. Default is 100 100

phoenix.clock.skew.interval Delay interval (in milliseconds) when opening SYSTEM.CATALOG to compensate for possible clock skew when SYSTEM.CATALOG moves among region servers. 2000

phoenix.index.failure.handling.rebuild Boolean flag that turns on or off the automatic rebuild of a failed index, starting from the point at which some updates failed to be written to the index. true

phoenix.index.failure.handling.rebuild.interval Time interval (in milliseconds) at which the background index rebuild job checks whether there is an index to be rebuilt. 10000

phoenix.index.failure.handling.rebuild.overlap.time The index rebuild job rebuilds an index starting from the time at which it failed minus this time interval (in milliseconds), creating a time overlap that prevents missed updates when there is clock skew. 300000

phoenix.query.force.rowkeyorder Whether or not a non aggregate query returns rows in row key order for salted tables. For versions prior to 4.4, use phoenix.query.rowKeyOrderSaltedTable instead. Default is true. true

phoenix.connection.autoCommit Whether or not a new connection has auto-commit enabled when it is created. false

phoenix.table.default.store.nulls The default value of the STORE_NULLS flag used for table creation which determines whether or not null values should be explicitly stored in HBase. This is a client side parameter.
Available starting from Phoenix 4.3. false

phoenix.table.istransactional.default The default value of the TRANSACTIONAL flag used for table creation which determines whether or not a table is transactional. This is a client side parameter.
Available starting from Phoenix 4.7. false

phoenix.transactions.enabled Determines whether or not transactions are enabled in Phoenix. A table may not be declared as transactional if transactions are disabled. This is a client side parameter.
Available starting from Phoenix 4.7. false

phoenix.mapreduce.split.by.stats Determines whether to use the splits determined by statistics for MapReduce input splits. Default is true. This is a server side parameter.
Available starting from Phoenix 4.10. Set to false to enable behavior from previous versions. true

phoenix.log.level

Client-side property enabling query (only SELECT statement) logging. The logs are written to the SYSTEM.LOG table (requires a user to have W access on SYSTEM.LOG table).
Possible values:

Property value	Logging Details
`OFF`	No logging
`INFO`	Enables query logging
`DEBUG`	More details on Query (Explain plan, HBase Scan Details etc)
`TRACE`	Logs query bind parameters as well.

Available starting from Phoenix 4.14.

WARNING: Enabling this feature may leak sensitive information to anyone who can access the SYSTEM.LOG table.

OFF

phoenix.log.sample.rate Client-side property controlling the probability of logging a query to the query log. Set to a value between 0.0 (no queries) and 1.0 (100% of queries).
Available starting from Phoenix 4.14. 1.0
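The effect of a sample rate between 0.0 and 1.0 can be illustrated with a simple probabilistic check. This is a generic sampling sketch under stated assumptions, not Phoenix's query logger: QueryLogSampler is a hypothetical class and the comparison against a uniform random draw is one common way to implement such a rate.

```java
import java.util.Random;

final class QueryLogSampler {
    private final double sampleRate;
    private final Random random;

    QueryLogSampler(double sampleRate, Random random) {
        if (sampleRate < 0.0 || sampleRate > 1.0) {
            throw new IllegalArgumentException("sample rate must be between 0.0 and 1.0");
        }
        this.sampleRate = sampleRate;
        this.random = random;
    }

    // nextDouble() is uniform in [0.0, 1.0), so a rate of 0.0 never logs
    // and a rate of 1.0 always logs.
    boolean shouldLog() {
        return random.nextDouble() < sampleRate;
    }

    public static void main(String[] args) {
        QueryLogSampler sampler = new QueryLogSampler(0.25, new Random());
        int logged = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.shouldLog()) {
                logged++;
            }
        }
        System.out.println("logged roughly a quarter of queries: " + logged);
    }
}
```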